Digital White Papers

Page 45 of 71

ILTA White Paper | Information Governance

Moving Mountains: Massive Data Remediation Project Provides Playbook for Defensible Disposal

and outside experts to collaborate on a plan of attack. The project combined technical experts, a managed review team and machine learning technology to bring the organization's data stores back under control, in a way that allowed for retention of legacy data that held continued business value, was under legal hold or was required for regulatory compliance.

Over the course of the project, the team built a content classification engine that could separate data to be retained from data to be deleted, based on carefully defined parameters that categorized the data by group, sensitivity level and retention period.

By marshalling a suite of technologies alongside information governance best practices and document review, the organization defensibly identified more than 700 million emails that were no longer needed, saving an estimated $10 million.

A powerful and repeatable process also emerged, one that can guide other organizations in defensibly disposing of large quantities of data. It includes the following steps:

• Secure executive sponsorship: Working closely with outside partners and cross-department stakeholders, the team evangelized the importance of the project to executive leadership. C-suite buy-in meant that the team would have the budget it needed to execute, as well as the authority to make important decisions and socialize the program across the organization.

• Development of an email classification tool: Using existing classification products, the team created a robust classification model able to analyze and categorize one billion emails across 140 record categories and to apply retention rules and time periods to them.

• Extensive interviews with subject matter experts: The team engaged key individuals in each of the organization's business units to identify and define record categories. This provided insight into the types of emails the organization had in its stores, who sent them, keywords regularly used and timeframes or seasonality associated with them. The subject matter experts also shared information about other records and systems they used, which helped with the creation of a broader enterprise data map.

• Creation of a file plan: Based on subject matter expert input, the team outlined a file plan that defined record categories. The plan provided guidance on appropriate retention periods and risk levels associated with unique documents.

• Combination of document review and advanced analytics: To narrow a set of 300,000 emails from the archive down to the several hundred most relevant examples for each record category, the team had experienced document reviewers apply analytics to streamline review. This approach to the 'needle in the haystack' challenge surfaced 40,000 highly relevant emails, which were used to train the knowledge base and enable the classification model to make precise decisions about the information funneling into it.
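To make the link between a file plan and a disposal decision concrete, the following is a minimal sketch in Python. It is not the team's classification engine: the category name, the seven-year retention period, and the names RecordCategory, Email and disposition are hypothetical and chosen only for illustration. It assumes each email already carries a category label from a classifier and a legal hold flag, and it shows only the final step, in which a legal hold always forces retention and unclassified emails are routed to human review.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class RecordCategory:
    """One hypothetical file plan entry (values are illustrative only)."""
    name: str
    sensitivity: str      # e.g. "low", "medium", "high"
    retention_years: int  # how long emails in this category are kept


@dataclass(frozen=True)
class Email:
    sent: date
    category: Optional[str]  # label from the classifier, None if unclassified
    legal_hold: bool = False


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def disposition(email: Email, file_plan: Dict[str, RecordCategory], today: date) -> str:
    """Return "retain", "review" or "dispose" for a single email."""
    if email.legal_hold:
        return "retain"  # a legal hold overrides retention expiry
    entry = file_plan.get(email.category)
    if entry is None:
        return "review"  # no matching file plan entry: send to human review
    expiry = add_years(email.sent, entry.retention_years)
    return "retain" if today < expiry else "dispose"


# Usage with a one-entry, made-up file plan.
file_plan = {"invoices": RecordCategory("invoices", "medium", 7)}
today = date(2019, 6, 1)
old_invoice = Email(sent=date(2010, 1, 15), category="invoices")
print(disposition(old_invoice, file_plan, today))
```

Keeping the legal hold check ahead of the retention lookup means a hold can never be bypassed by an expired retention period, which is the property that makes a disposal decision defensible.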

Digital White Papers - IG19
