publication of the International Legal Technology Association
Issue link: https://epubs.iltanet.org/i/535467
ILTA WHITE PAPER: JUNE 2015 WWW.ILTANET.ORG 28
WHEN MACHINE INTELLIGENCE JOINS YOUR PROFESSIONAL SERVICES TEAM

The reason for this is simple: Different review methods make different mistakes. Human reviewers tend to make random mistakes. TAR systems typically make systematic errors, getting entire classifications of documents right or wrong. By combining different methods into the workflow, each serves as a check against the others.

Because TAR does not make the same class of errors as search terms and human review, it makes a valuable addition to privilege and other data protection workflows, provided the technology can deal with low prevalence and be efficiently deployed.

Precision is somewhat less important when your task is to protect documents. However, protection workflows include much human review, so including unnecessary junk quickly gets expensive. You want to achieve a fairly high level of precision, but recall is still the metric to focus on.

KNOWLEDGE-GENERATION TASKS: The final task is where the name "discovery" originated. What stories do these documents tell? What can we learn from them? For knowledge-generation, we do not particularly care about recall. We do not want all the documents about a topic, just the best ones: the ones that will end up in front of deponents or used at trial. Precision and relevance are therefore the most important metrics. You do not want to waste your time going through junk, duplicative or less relevant documents.

TAR can be of critical help in prioritizing the document population by issue and concentrating the most interesting documents at the top of the list so attorneys can quickly learn what they need to litigate the case.

One problem is that TAR algorithms rank documents according to their likelihood of getting a thumbs-up or thumbs-down from a human reviewer. They do not rank documents based on their degree of interest. Some documents could be easy to predict as responsive, but not very interesting.
Other documents could be extremely interesting, but harder to predict because they are so unusual. In practice, however, the more interesting documents cluster near the top of the ranking. Interesting documents sort higher this way because they contain stronger terms and concepts and more of them. TAR's ability to concentrate the interesting documents near the top of a ranked list makes it a useful addition to knowledge-generation workflows.

MACHINES AND HUMANS IN HARMONY

This framework can help you think about, develop and evaluate different discovery workflows when machine intelligence joins your team. The critical factor in your success will be designing workflows that most effectively use all the tools and resources at your disposal. TAR is a powerful addition to your team. Welcome the machines aboard!

Recall and precision are two crucial metrics for measuring the effectiveness and defensibility of TAR processes.
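
For readers who want to see how these two metrics are calculated, here is a minimal Python sketch. It uses the standard definitions: precision is the share of flagged documents that are truly relevant, and recall is the share of truly relevant documents that were flagged. It also computes precision within the top k documents of a ranked list, which is one simple way to check how well a ranking concentrates useful documents near the top. The function names, document IDs and counts are hypothetical, chosen only for illustration; they do not come from any TAR product or from the article.

```python
from typing import Iterable, Sequence, Tuple


def precision_recall(flagged: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a set of flagged document IDs."""
    flagged_set = set(flagged)
    relevant_set = set(relevant)
    hits = len(flagged_set & relevant_set)  # flagged documents that are truly relevant
    precision = hits / len(flagged_set) if flagged_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def precision_at_k(ranked: Sequence[str], relevant: Iterable[str], k: int) -> float:
    """Return the fraction of the top-k ranked documents that are relevant."""
    relevant_set = set(relevant)
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_set) / len(top)


# Hypothetical toy data: four truly relevant documents in the population.
relevant = {"d1", "d2", "d3", "d4"}
# Documents a workflow flagged for protection or production (assumed).
flagged = ["d1", "d2", "d3", "d7", "d8", "d9"]
# A ranked list such as a TAR tool might produce (assumed ordering).
ranked = ["d1", "d7", "d2", "d3", "d8", "d4", "d9", "d5"]

p, r = precision_recall(flagged, relevant)
top4 = precision_at_k(ranked, relevant, 4)
print(f"precision={p:.2f} recall={r:.2f} precision@4={top4:.2f}")
```

In this toy example the workflow finds three of the four relevant documents (recall 0.75) but half of what it flags is junk (precision 0.50). A protection workflow would focus on raising the first number, while a knowledge-generation workflow would care more about the quality of the top of the ranking.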