In a recent interview, Nina Koteva, Product Manager for Document Intelligence, and Lea Strohm from the Data and Model Ethics team at Thomson Reuters discuss how Document Intelligence is using AI and machine learning to enable legal professionals to gain immediate insights from their contracts and documents.
Koteva focuses on operationalizing machine learning models and building upon and maintaining the services that power Document Intelligence.
Strohm: What is Document Intelligence?
Koteva: You can think of Document Intelligence as a contract review tool. It enables legal professionals to search, analyze and gain immediate insights from all of their contracts. Right now, the tool is powered primarily by machine learning models that are trained by Practical Law attorney editors.
Strohm: Why is AI a good technological solution? And could you explain a little bit what type of AI technologies you’re actually leveraging within Document Intelligence?
Koteva: We have several models that are powered by AI and that are part of Document Intelligence, but the technologies that we use really depend on the use case. They range from models with simpler architectures that still do the job accurately for users to models that use the latest large language model (LLM) architectures.
It’s really a big range, but our primary model – which is extracting information from text – is built in-house and trained by our Practical Law attorney editors and Document Intelligence subject matter experts (SMEs).
Strohm: How does Document Intelligence work?
Koteva: Document Intelligence is a bit of an umbrella term for our suite of products. Document Intelligence is the customer-facing application, and then there’s Document Understanding, which is our end-to-end solution, where we’ve automated the creation of the model that I mentioned previously for extracting information from text.
We have an internal tool called Label Insights, where those SMEs can load the data and define what information they need to extract, creating a kind of ontology out of it. Then, they can label the data themselves and train the model. A big focus after that is how they validate the model and how they come to trust it. Over the past year, we’ve been developing more features around that to build trust and make quicker decisions on what goes into production.
Once a model is promoted to production, we have automatic deployment capabilities via Label Insights. Those models are then available via application programming interfaces (APIs) to be integrated into the different product suites.
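The ontology and labeling step described above can be illustrated with a minimal sketch. It is not the Label Insights implementation, which the interview does not detail; the class names (`Ontology`, `FieldDefinition`, `LabeledSpan`), the `label` helper, and the sample lease text are all hypothetical and chosen only to show how a set of fields to extract might be defined and then attached to spans of text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldDefinition:
    """One piece of information an SME wants extracted (hypothetical schema)."""
    name: str
    description: str


@dataclass
class Ontology:
    """The set of fields defined for a single document type."""
    document_type: str
    fields: dict[str, FieldDefinition] = field(default_factory=dict)

    def add_field(self, name: str, description: str) -> None:
        self.fields[name] = FieldDefinition(name, description)


@dataclass(frozen=True)
class LabeledSpan:
    """A character range in a document, tagged with an ontology field."""
    field_name: str
    start: int
    end: int


def label(ontology: Ontology, text: str, field_name: str, snippet: str) -> LabeledSpan:
    """Label the first occurrence of `snippet` in `text` with `field_name`."""
    if field_name not in ontology.fields:
        raise KeyError(f"{field_name!r} is not defined in the ontology")
    start = text.find(snippet)
    if start == -1:
        raise ValueError(f"snippet not found: {snippet!r}")
    return LabeledSpan(field_name, start, start + len(snippet))


# Hypothetical ontology for one document type.
ontology = Ontology("commercial_lease")
ontology.add_field("term_length", "How long the lease runs")
ontology.add_field("governing_law", "Jurisdiction whose law applies")

text = "This lease runs for five years and is governed by the laws of New York."
span = label(ontology, text, "term_length", "five years")
print(span.field_name, text[span.start:span.end])
```

Rejecting labels that are not in the ontology keeps the training data consistent with what the SMEs defined up front, which is one plausible reason to define the ontology before labeling begins.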
Strohm: Can you give us some more background on the SMEs’ roles in the technology?
Koteva: The contracts that are loaded into Document Intelligence come in different types, for example, sale of goods agreements, commercial leases, etc. What we’ve seen is that the best way to extract this information is to train a model per document type, so we needed to be able to scale very quickly. Once the model architecture is developed by the data scientists, we can give it to the SMEs, who have the legal background and know what they’re expecting to see from the model in terms of quality.
In terms of the SME experience within the tool, we are really looking at this as a product. We have demo sessions with our SMEs to get regular feedback from them – and we really try to focus on their main pain points and their entire experience.
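The "one model per document type" approach mentioned in this answer can be pictured as a simple lookup from document type to model. The sketch below is illustrative only: `ModelRegistry`, `lease_model`, and the `"commercial_lease"` key are hypothetical names, and the stand-in "model" is a trivial keyword check rather than a trained extractor.

```python
from typing import Callable, Dict

# An "extractor" here is any callable mapping contract text to extracted fields.
Extractor = Callable[[str], Dict[str, str]]


class ModelRegistry:
    """Maps each document type to the model trained for it (hypothetical design)."""

    def __init__(self) -> None:
        self._models: Dict[str, Extractor] = {}

    def register(self, document_type: str, model: Extractor) -> None:
        self._models[document_type] = model

    def extract(self, document_type: str, text: str) -> Dict[str, str]:
        try:
            model = self._models[document_type]
        except KeyError:
            raise LookupError(f"no model trained for {document_type!r}") from None
        return model(text)


def lease_model(text: str) -> Dict[str, str]:
    # Stand-in for a trained model: reports whether the text mentions rent.
    return {"mentions_rent": str("rent" in text.lower())}


registry = ModelRegistry()
registry.register("commercial_lease", lease_model)
result = registry.extract("commercial_lease", "The Tenant shall pay Rent monthly.")
print(result)
```

With this shape, supporting a new document type means training and registering one more model, without touching the models already in place.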
Strohm: Are there user experience steps that you’ve taken to enable these SMEs to trust the models?
Koteva: Trust is always a tricky point, and we have features around validation. For example, after the model is trained, let’s actually see it compared to the golden truth. How does the model perform for each piece of information that is extracted?
We look at the overall performance, and then the SME is able to zoom in and really drill down into the details. Something else that we’ve recently been working on is a feature around topic coverage and discovery. For example, let’s say the SMEs have identified that the model is incorrect. This feature can enable them to see samples or pieces of text that the model thinks they need to be close to, and they can quickly check if this is really the case, or if they need to change the data and the labeling.
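The validation idea in this answer, comparing model output with the reference labels and then drilling down per extracted field, can be sketched as a per-field precision and recall calculation. This is an assumption about how such a comparison might be done, not a description of the product: the function name `per_field_scores`, the exact-match comparison, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# An extraction is (document_id, field_name, extracted_value).
Extraction = Tuple[str, str, str]


def per_field_scores(
    gold: Iterable[Extraction], predicted: Iterable[Extraction]
) -> Dict[str, Dict[str, float]]:
    """Compute exact-match precision and recall separately for each field."""
    gold_by_field: Dict[str, Set[Extraction]] = defaultdict(set)
    pred_by_field: Dict[str, Set[Extraction]] = defaultdict(set)
    for item in gold:
        gold_by_field[item[1]].add(item)
    for item in predicted:
        pred_by_field[item[1]].add(item)

    scores: Dict[str, Dict[str, float]] = {}
    for name in sorted(set(gold_by_field) | set(pred_by_field)):
        g, p = gold_by_field[name], pred_by_field[name]
        correct = len(g & p)  # predictions that match a reference label exactly
        scores[name] = {
            "precision": correct / len(p) if p else 0.0,
            "recall": correct / len(g) if g else 0.0,
        }
    return scores


# Hypothetical reference labels and model predictions for two documents.
gold = [
    ("doc1", "term_length", "five years"),
    ("doc1", "governing_law", "New York"),
    ("doc2", "term_length", "two years"),
    ("doc2", "governing_law", "Delaware"),
]
predicted = [
    ("doc1", "term_length", "five years"),
    ("doc1", "governing_law", "New York"),
    ("doc2", "term_length", "ten years"),  # wrong value
    ("doc2", "governing_law", "Delaware"),
]

scores = per_field_scores(gold, predicted)
for name, metrics in scores.items():
    print(name, metrics)
```

Breaking the scores out by field lets a reviewer see that, in this made-up data, `governing_law` is extracted reliably while `term_length` is not, which an overall average would hide.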
Strohm: Last question. What is something you and your team are extremely excited about?
Koteva: Going back to the external-facing application, we are really excited about the new initiative for drafting and the partnership with Microsoft Copilot. We are very eager to see how it’s going to play out.
Learn more about how Document Intelligence can help you advise with confidence, seize opportunities, mitigate risk and identify savings.