Peer to Peer Magazine

March 2013

The quarterly publication of the International Legal Technology Association

Issue link: https://epubs.iltanet.org/i/116777

Those of us who look back objectively at the implementation of email archiving solutions view them as a learning experience. In hindsight, it is easy to see that large volumes of email data (e.g., 20+ GB user mailboxes) forced us into action without the time or resources to develop a strategy around the "how" element of managing the data. By focusing only on the "what" and "where" aspects of managing email filing, we failed to determine the most appropriate way to move matter content into one manageable, easily accessible location. Because of the need to proactively manage risk, improve client service and minimize costs, many firms then adopted the strategy of collecting their electronic content in a single data source, typically their document management system (DMS). However, extracting matter-specific content from an unstructured data source, such as an email archive, is a difficult task at best.

THE NEED FOR MORE CAPABILITIES

Firm personnel dealing with this issue quickly realized that processes had to be put in place to facilitate the filing of new inbound and outbound email messages, or that content would end up in some large, disparate data repository. Technology vendors responded by providing tools to partially automate the filing process. However, the vendor solutions were not sophisticated enough to deliver the needed processing analysis and automated workflow capabilities. In addition, inconsistent file management practices, competing workloads, and technical and operational constraints often created roadblocks to technology adoption.

The industry required technologies and processes that could codify data categorization rules and workflows. Firms needed a breakthrough in big data analytics that offered scalability, real-time data insight and data correlation, so that predictive data could augment their categorization processes.
Despite conventional wisdom, strong statistical skills alone will not solve email data analysis issues. First-generation, analysis-driven vendor categorization tools relied on simple statistical methods and on algorithms that demanded heavy CPU and disk resources. This worked for small data sets but proved ineffective and expensive for large volumes of email data. The new challenge became how to increase and improve automated decision-making tied to data analytics without requiring complex and expensive data analytics software.

The barriers to applying analytics are the time and resources needed, as well as the understanding of what types of analytics to apply, when to apply them and how. We needed to develop new methods to improve daily filing decision-making processes. Proven methods to advance the decision process, even in situations where data collection time is limited, include historical email filing behavior diagnostics and analysis of non-email metadata repositories. Automated filing into DMS folders based on historical email filing patterns has been available for some time, but few users judge the tool's accuracy and workflow as sufficient. Emerging solutions focus on the use of analytics and metrics to improve the probability of accurate email categorization, thus increasing the quality of the decision-making process.

CONTENT ANALYSIS COMES INTO PLAY

To analyze the large flow of email data effectively, firms need to ensure their solutions use techniques that provide the most appropriate data analytics and metadata management in support of the intake and processing of data from varied sources.
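As a rough illustration of filing suggestions driven by historical filing behavior, the sketch below counts where messages from each sender were filed in the past and suggests the most common folder only when that history is consistent enough. It is a minimal example under stated assumptions, not a description of any vendor's product: the function names, the sender-only feature and the 0.6 confidence threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_history(filed):
    """Count past filing decisions per sender.

    `filed` is an iterable of (sender_address, folder) pairs; both the
    shape of this input and the folder names are assumptions.
    """
    history = defaultdict(Counter)
    for sender, folder in filed:
        history[sender.lower()][folder] += 1
    return history


def suggest_folder(history, sender, min_confidence=0.6):
    """Return (folder, confidence) for a sender, or None if unsure.

    Confidence is the share of the sender's past messages that went to
    the most frequently used folder. The threshold is illustrative only.
    """
    counts = history.get(sender.lower())
    if not counts:
        return None  # no filing history for this sender
    folder, hits = counts.most_common(1)[0]
    confidence = hits / sum(counts.values())
    if confidence < min_confidence:
        return None  # history is too mixed to suggest a folder
    return folder, confidence


past = [
    ("client@example.com", "Matter-1001"),
    ("client@example.com", "Matter-1001"),
    ("client@example.com", "Matter-2002"),
    ("vendor@example.com", "Admin"),
]
history = build_history(past)
suggestion = suggest_folder(history, "Client@Example.com")
```

Returning nothing when the history is mixed leaves the decision with the user, which reflects the accuracy concern raised above. A fuller approach would weigh more metadata than the sender alone.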
CHARACTERISTICS OF EMERGING TOOLS

• Server-side processing of messages
• Metadata analyses that improve filing predictions and learn over time
• Automated processing of email content
• Historical email message tagging
• Tag encryption to prevent sensitive information from being exposed to external recipients
• Distributable configurations that facilitate consistent behavior
• Dynamic background email deduping to optimize storage
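To make the deduping characteristic listed above concrete, here is a minimal sketch of one common approach: compute a fingerprint from normalized message fields and keep only the first message seen for each fingerprint. The field names (`sender`, `subject`, `body`) and the normalization rules are assumptions chosen for illustration, not the behavior of any particular tool.

```python
import hashlib


def fingerprint(sender, subject, body):
    """Hash normalized message fields so trivial differences
    (letter case, surrounding whitespace) do not defeat matching."""
    normalized = "\n".join(
        part.strip().lower() for part in (sender, subject, body)
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(messages):
    """Keep the first message for each fingerprint, preserving order.

    `messages` is a list of dicts with hypothetical keys
    'sender', 'subject' and 'body'.
    """
    seen = set()
    unique = []
    for msg in messages:
        key = fingerprint(msg["sender"], msg["subject"], msg["body"])
        if key not in seen:
            seen.add(key)
            unique.append(msg)
    return unique


inbox = [
    {"sender": "a@example.com", "subject": "Draft", "body": "See attached."},
    {"sender": "A@Example.com", "subject": "draft ", "body": "See attached."},
    {"sender": "b@example.com", "subject": "Draft", "body": "See attached."},
]
unique_messages = dedupe(inbox)
```

In this sample the second message differs from the first only in case and trailing whitespace, so it is treated as a duplicate. A production system would likely also consider attachments and message identifiers.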
