P2P

Spring2020

Peer to Peer: ILTA's Quarterly Magazine

Issue link: https://epubs.iltanet.org/i/1227987

Consequently, I could very quickly look at a Bad Boy Act and characterize it.

• I was the only person training AI Software on the Bad Boy Acts and took a very consistent position on the classifications. Having more than one person training AI Software could easily lead to a different result.

Data Set Issues:

Collecting a quality data set is essential to a successful AI Software training project. Had I only used samples of Guaranties from one particular lender to train AI Software, then AI Software would only be able to recognize the language that specific lender used to describe the Bad Boy Act. I never used more than 3-5 Guaranties from any particular lender. Since these were final negotiated Guaranties, there was variation in the language used to describe each Bad Boy Act within a particular lender's set of Guaranties. I ended up using samples from approximately 20 different lenders to train AI Software.

The market study for Guaranties focused on the frequency with which we saw each Bad Boy Act in a Guaranty. Some Bad Boy Acts appear in 100% of our Guaranties, while others appear in fewer than one-third of them. It was very easy to train AI Software on the Bad Boy Acts that appeared more frequently because every time I uploaded a Guaranty to AI Software, I would train it on all of the Bad Boy Acts contained therein. The more frequently a provision occurred, the more quickly AI Software learned to recognize it.

It was much more difficult to train AI Software on Bad Boy Acts that appeared less frequently because it was challenging to locate a sufficient number of Guaranties containing examples of the more obscure Bad Boy Acts. Since I had been collecting data for several years on which Guaranties contained each Bad Boy Act, I could quickly home in on the Guaranties that had these clauses and add them to my training data set.
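The data set bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow of the AI Software product itself: the record layout, the lender names, the "Act A/B/C" labels, and all function names are hypothetical. It caps the number of Guaranties taken from any one lender, counts how many training examples each Bad Boy Act ends up with, and uses the tracking records to look up which Guaranties contain a rare clause.

```python
from collections import Counter

# Hypothetical tracking records: one entry per Guaranty, with its lender and
# the Bad Boy Acts it contains. Every name and value here is illustrative.
guaranties = [
    {"id": "G1", "lender": "Lender A", "acts": {"Act A", "Act B"}},
    {"id": "G2", "lender": "Lender A", "acts": {"Act A"}},
    {"id": "G3", "lender": "Lender A", "acts": {"Act A", "Act C"}},
    {"id": "G4", "lender": "Lender B", "acts": {"Act A", "Act B"}},
    {"id": "G5", "lender": "Lender C", "acts": {"Act A", "Act C"}},
]


def select_training_set(records, max_per_lender):
    """Keep at most max_per_lender Guaranties from each lender, in input order."""
    taken = Counter()
    selected = []
    for rec in records:
        if taken[rec["lender"]] < max_per_lender:
            selected.append(rec)
            taken[rec["lender"]] += 1
    return selected


def examples_per_act(records):
    """Count how many of the given Guaranties contain each Bad Boy Act."""
    counts = Counter()
    for rec in records:
        counts.update(rec["acts"])
    return counts


def underrepresented(counts, minimum):
    """Return the acts that have fewer than `minimum` training examples."""
    return sorted(act for act, n in counts.items() if n < minimum)


def guaranties_containing(records, act):
    """Use the tracking records to list every Guaranty that contains `act`."""
    return [rec["id"] for rec in records if act in rec["acts"]]


# Example run with a cap of 2 per lender (an assumed value for this small sample).
training_set = select_training_set(guaranties, max_per_lender=2)
counts = examples_per_act(training_set)
for act in underrepresented(counts, minimum=2):
    print(act, "needs more examples; candidates:", guaranties_containing(guaranties, act))
```

The cap of 2 and the minimum of 2 are chosen only to suit the five sample records; a real project would set both to match its own data set.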
Without this back-up data, it would have been far more difficult to train the AI on Bad Boy Acts that appear less often in our Guaranties.

How Many Guaranties Did It Take for AI Software to Learn?

One of the questions I had going into this project was how many examples of a Bad Boy Act AI Software would need to see before it began to recognize it. AI Software has two different metrics for gauging accuracy: precision and recall. Precision is the percentage of the results AI Software returns that are actually relevant, while recall is the percentage of all the relevant items that AI Software actually finds. For example, if ten Guaranties contain a given Bad Boy Act and AI Software flags eight passages, six of them correctly, precision is 75% (6 of 8) and recall is 60% (6 of 10). I found that it took between 40 and 50 examples of a Bad Boy Act before AI Software got to a precision and recall number that I was comfortable with.

The project plan for maintaining this market study required that a first-year associate review each Guaranty and use AI Software to help facilitate that review. The associate was going to make the final judgment call as to whether AI Software correctly identified the Bad Boy Act in the Guaranty. Given this additional level of review, I was comfortable with the precision and recall numbers I achieved after training AI Software on 50 Guaranties.

It took several weeks to train AI Software on the various Bad Boy Acts. I probably spent at least 50 hours over that time, which included:

• Sessions with AI Software staff to teach me about collecting a data set and to help me understand the mechanics of training AI Software;

• Establishing a quality data set of Guaranties with a large enough variety to properly train AI Software; and

• Supplementing the data set to locate more obscure examples of certain Bad Boy Acts.

Implementing a market study using AI Software + first-year associates:

The project plan to maintain the Guaranty market study involved (i) training first-year associates on how to recognize and classify each Bad Boy Act
