Digital White Papers

LPSCLD21

A publication of the International Legal Technology Association



Factors to Consider

The optimal solution to any discovery challenge can be identified by considering a few data points specific to that matter and the data collected for it. Some of those separate yet interacting considerations include:

1. Time: How long will it take to achieve key milestones – starting review, understanding the content of a document collection and, ultimately, production – and, when training is required, how long will it take subject matter experts (SMEs) to train a system?

2. Cost: Setting aside the hard costs associated with in-house staff and attorneys, vendors and document reviewers, always consider the opportunity cost of diverting SMEs away from other tasks to train a predictive model. Also consider whether the selected approach supports early estimation of the size of the review population (volume) and the number of responsive documents expected to be found (yield); such estimates help plan any review as efficiently and cost-effectively as possible. (A sketch of a volume-and-yield estimate appears at the end of this section.)

3. Knowledge About the Matter: The degree to which the facts of a matter are known before document review begins can affect a team's ability to train a model. Prior knowledge may also affect how quickly the team needs access to "the right documents" to inform both tactical and strategic decisions, as well as its tolerance for finding surprises in the data late in the review process.

4. Standards for Quality: As TAR becomes more prevalent in the compliance and discovery arenas, so does the practice of setting precision and recall targets for satisfying discovery obligations. Where minimum thresholds for acceptable quality are known, they can influence the selection of both technology and workflow. (Both metrics are illustrated in a sketch at the end of this section.)

5. Facts About the Document Collection: Important factors include the collection's completeness – is all the data that needs to be evaluated available, or is the TAR solution expected to accommodate rolling ingestion of data? Additionally, richness, the prevalence of responsive material in a population, can influence the performance of different technologies and workflows and greatly affect time to completion.

The TAR Landscape

Armed with information about the case, the document set to be reviewed and the variables outlined above, teams can make an informed decision about which TAR solution is the best fit.

TAR 1.0

Predictive coding, or TAR 1.0, uses both relevant and nonrelevant training documents – the training set – to prime a system to classify documents. Typically, the training set is coded by an SME so that the system can replicate the expert's knowledge. Methods for identifying a training set may be based on random sampling, uncertainty sampling or a combination of the two. With TAR 1.0 solutions, training is a finite process that precedes the scoring or coding of all documents. The predictive model and its associated scores are frozen once training is complete, so changes to either the SME's understanding of
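To make the volume-and-yield estimate mentioned under Cost concrete, here is a minimal sketch in Python. It assumes a simple random sample from the collection and a normal-approximation confidence interval; the function name, the sample counts and the collection size are illustrative assumptions, not figures from any actual matter or tool.

```python
import math

def estimate_yield(population_size, sample_size, responsive_in_sample, z=1.96):
    """Project richness (prevalence) and yield from a simple random sample,
    with a normal-approximation 95% confidence interval."""
    richness = responsive_in_sample / sample_size
    margin = z * math.sqrt(richness * (1 - richness) / sample_size)
    low = max(0.0, richness - margin)
    high = min(1.0, richness + margin)
    return {
        "richness": richness,
        "expected_yield": round(population_size * richness),
        "yield_range": (round(population_size * low),
                        round(population_size * high)),
    }

# Illustration: 60 of 1,500 randomly sampled documents were responsive
# in a 1,000,000-document collection.
print(estimate_yield(1_000_000, 1_500, 60))
# richness 0.04 -> roughly 40,000 responsive documents expected,
# give or take about 10,000 at 95% confidence
```

An estimate like this, made before review begins, is what lets a team size the review effort and budget rather than discovering the yield only as coding progresses.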
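The two quality metrics named under Standards for Quality can be stated precisely. The sketch below assumes counts taken from a validation sample; the function names and example numbers are hypothetical.

```python
def precision(true_positives, false_positives):
    """Of the documents identified as responsive, the fraction that truly are."""
    return true_positives / (true_positives + false_positives)

def recall(true_positives, false_negatives):
    """Of all truly responsive documents, the fraction that were found."""
    return true_positives / (true_positives + false_negatives)

# Illustration: a validation sample shows 80 documents correctly identified
# as responsive, 20 nonresponsive documents swept in, and 20 responsive
# documents that the process missed.
print(precision(80, 20))  # 0.8
print(recall(80, 20))     # 0.8
```

Recall targets (often the metric courts and regulators focus on) measure completeness of the production, while precision measures how much nonresponsive material the workflow sweeps in.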
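Finally, a minimal sketch of the TAR 1.0 pattern described above: a finite, SME-coded training set, one-time model training, and a single scoring pass over the collection with the frozen model. The choice of library (scikit-learn) and all variable names are assumptions for illustration; commercial TAR 1.0 platforms implement their own classifiers and workflows.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# The SME-coded training set: document text plus a responsive (1) /
# nonresponsive (0) call. A real training set would hold hundreds or
# thousands of coded documents.
training_texts = ["text of an SME-coded responsive document",
                  "text of an SME-coded nonresponsive document"]
training_labels = [1, 0]

# Train once. In TAR 1.0, training is finite and precedes scoring.
vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(training_texts), training_labels)

# Score the entire collection with the now-frozen model; the scores
# will not change unless the model is retrained from scratch.
collection_texts = ["text of an unreviewed document", "another one"]
scores = model.predict_proba(vectorizer.transform(collection_texts))[:, 1]

# Uncertainty sampling (one way to build the training set) would route
# the documents whose scores fall closest to 0.5 back to the SME.
```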
