Peer to Peer Magazine

March 2013

The quarterly publication of the International Legal Technology Association

Issue link: https://epubs.iltanet.org/i/116777

By combining content analysis, application management knowledge and case experience, you can gain in-depth insight into real-time email data analysis. One challenge in managing large volumes of data lies in understanding which analytics to apply where, and which resources are required to achieve accurate and actionable insights.

Content analysis can be approached from a quantitative or a qualitative perspective. Previously, firms handled high email volumes with a purely quantitative approach, as many quickly devised remedies when faced with the email upsurge. A qualitative approach is where the future lies. Technology management must know what, where, when and how analytics should be applied to improve decision-making and organizational effectiveness.

When managing email, content must be viewed from both a conceptual and a contextual standpoint to achieve proper data categorization. For example, if the topic of an email message is "Star Wars," the tools should be able to determine, from the additional context in the message, whether it is discussing the movie or the space race. Contextual data analysis gives the resulting data set more value than words alone. Content without context is not only less than ideal; it is detrimental to an organization's categorization efforts.

When categorizing data, the process should provide a means of describing the data to enhance understanding and foster knowledge. For example, if the text states "Did you see the girl with the telescope?" without further information, we cannot determine whether the girl has a telescope or a telescope was used to see the girl.

MAPPING OUT THE CONTENT

Early data analytics processes, such as deductive content analysis, categorized metadata into predefined groupings.
More recent and sophisticated tools use inductive content analysis to create content mappings dynamically based on recurring values (i.e., words and phrases). For example, email messages containing the phrase "shoddy construction practices" are filed consistently into matters related to construction litigation. Once content mapping is deployed, the data within each map can be further categorized by additional conceptual and contextual values. These techniques are used to develop probability values that predict the most likely email filing location.

During the processing of unstructured email, extracting content meaning, creating appropriate data mappings and migrating the data to storage are critical. This work is typically executed as a single pass during migration from the email message's original creation or receipt state. This is the optimal time to perform the initial "raw data" real-time analysis, as the data is streamed toward its destination. However, obtaining relevant information by extracting data from unstructured email messages poses a considerable technical challenge.

Leaders in automated email data processing use data extraction cycles to advance their understanding of the data by adding structure to it via:

• Text pattern-matching, such as regular expressions (patterns that match text fitting a provided specification), to identify small- or large-scale structures
• A table-based approach to identify common sections within a limited data collection (e.g., email domain names in the To: and From: address headers)
• Text analytics to understand the text and link it to other information

DOING IT IN REAL TIME

More recent and refined tools that use real-time analysis to develop predictive data are available to manage content across multiple data stores, enhancing speed, focus and relevance without user interaction.
As these advanced tools become available, the following characteristics are surfacing:

• System architectural designs that employ server-side processing of messages to minimize time-intensive data processing while managing email messages in Microsoft Outlook
• Automated processing of email content and metadata analyses that improve filing predictions
• Messages tagged with metadata (e.g., client and matter) through relational metadata analysis and assignment without message and metadata correlation (for example, using the content in the message itself to correlate with historical email messages to come up with the best possible location for filing)
• Historical email message-tagging metadata storage to improve predictive filing accuracy
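
To make the ideas above concrete, the following minimal Python sketch combines the extraction techniques described earlier (regular-expression pattern-matching and a table-based lookup of address domains) with a simple frequency vote over previously filed messages to produce probability values for a filing location. It is an illustration under stated assumptions, not how any particular product works: the domain table, phrase patterns, matter names and the voting scheme are all hypothetical, and a real tool would likely use richer text analytics.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lookup table: address domain -> client name (assumption).
DOMAIN_TO_CLIENT = {
    "acme-builders.example": "Acme Builders",
    "harborbank.example": "Harbor Bank",
}

# Hypothetical phrase patterns -> concept label (assumption).
PHRASE_PATTERNS = {
    "construction-litigation": re.compile(
        r"shoddy\s+construction\s+practices", re.IGNORECASE
    ),
    "loan-dispute": re.compile(r"\bloan\s+default\b", re.IGNORECASE),
}

# Captures the domain part of any address found in a header value.
ADDRESS_DOMAIN = re.compile(r"@([A-Za-z0-9.-]+)")


def extract_features(message):
    """Add structure to a raw message (a dict of header/body strings)."""
    features = set()
    # Table-based approach: map To:/From: domains to known clients.
    for header in ("from", "to"):
        for domain in ADDRESS_DOMAIN.findall(message.get(header, "")):
            client = DOMAIN_TO_CLIENT.get(domain.lower())
            if client:
                features.add(("client", client))
    # Pattern-matching: look for recurring phrases in subject and body.
    text = message.get("subject", "") + "\n" + message.get("body", "")
    for label, pattern in PHRASE_PATTERNS.items():
        if pattern.search(text):
            features.add(("phrase", label))
    return features


def build_history(filed_messages):
    """Count how often each feature was filed to each location."""
    history = defaultdict(Counter)
    for message, location in filed_messages:
        for feature in extract_features(message):
            history[feature][location] += 1
    return history


def predict_location(message, history):
    """Return (location, probability) pairs, most likely first."""
    votes = Counter()
    for feature in extract_features(message):
        votes.update(history.get(feature, {}))
    total = sum(votes.values())
    if total == 0:
        return []  # nothing in the history matches this message
    return [(loc, count / total) for loc, count in votes.most_common()]


# Usage with made-up historical filings.
history = build_history([
    ({"from": "pm@acme-builders.example", "to": "atty@firm.example",
      "subject": "Site report",
      "body": "More shoddy construction practices found."},
     "Acme/Matter-001"),
    ({"from": "pm@acme-builders.example", "to": "atty@firm.example",
      "subject": "Invoice", "body": "Attached."},
     "Acme/Matter-002"),
])
new_message = {"from": "atty@firm.example", "to": "pm@acme-builders.example",
               "subject": "Re: site",
               "body": "Evidence of shoddy construction practices."}
print(predict_location(new_message, history))
```

In this sketch the new message matches both a known client domain and a recurring phrase, so the location that shares both features in the history receives the larger share of the vote. Storing the tagged history, as the last characteristic above suggests, is what allows the predictions to improve as more messages are filed.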
