Project Summary
Commercial lease agreements can be tedious to analyse, with their extensive length (40-60 pages)
and obscure language. Our AI solution simplifies and automates the preparatory tasks involved in
sifting through them. Users can effortlessly send large volumes of documents to an email address
and receive the extracted information in the desired format. The service integrates with external
services like Google Sheets so that users can customise their queries and toggle parameters.
Overall, our AI solution has dramatically streamlined our client's administrative tasks,
enabling them to refocus their time and energy on their legal tasks instead.
Client Overview
Pubs Advisory Service specialises in providing expert guidance, information, and representation
services to a diverse range of clients within the pub sector. With a solid reputation as the
leading advisory and guidance organisation for pubs in the UK, Pubs Advisory Service proudly holds
a prominent market position and is the go-to choice for pubs seeking comprehensive legal
support.
Challenges
People entering the Licensed Trade sector need detailed independent advice on a wide range of
issues. Access to such pre-entry advice can prove invaluable for conducting thorough due
diligence and making informed decisions, ultimately setting the stage for successful pub
ownership and prosperous growth. However, because each commercial lease agreement is unique and
requires individual attention, this detailed advice can be costly and time-consuming to prepare.
Previous methods to perform document searches on commercial lease agreements often involved
scanning them – using Adobe's inbuilt Optical Character Recognition (OCR) to transcribe
handwriting – and performing traditional keyword searches (e.g. ctrl+F) throughout the PDF.
However, this method is flawed on several fronts:
- Commercial lease agreements often contain handwritten notes, which are not accurately
transcribed when using standard OCR software.
- The keywords used in the search bar may not appear in the text exactly as searched, and
ctrl+F-type searches cannot detect variations in phrasing or interpret natural language
questions.
- The information being searched for is not usually accessible in one place, but is instead
scattered throughout the entire document, requiring the user to piece it together manually.
Pubs Advisory Service realised that they needed a smarter approach to extract information from
lease agreements. They asked OpenKit to develop an AI search tool able to perform natural
language searches on large volumes of documents.
An example document with handwriting
Deliverables
1. Scalable and Cost-Effective Analysis Tool
Create a scalable, low-cost administrative tool for lease agreement analysis with direct citations.
2. Accurate Information Extraction
Maximise accuracy of extracted information and prevent hallucinations outside the document's scope.
3. Adaptable System
Build a dynamic system capable of adapting to various use cases.
4. Seamless Integration
Allow easy integration into existing business tools without extensive additional training.
5. Showcase Generative AI Benefits
Demonstrate practical advantages of generative AI solutions in real business scenarios.
Our Approach
Consultancy
We began by conducting in-depth consultations with Pubs Advisory Service to gain a comprehensive
understanding of their industry and the obstacles they encounter. OpenKit's main goal was to
explore how the current landscape of generative AI could be harnessed to enhance productivity
and overall business performance for Pubs Advisory Service, and provide them with a substantial
competitive edge.
This unique solution required thorough exploration and prototyping. As such, a portion of the
project was allocated to experimenting with various approaches, tools and methodologies.
First, OpenKit analysed the business landscape and gained a deep and nuanced understanding of
the client's day-to-day business activities. We maintained an open dialogue with the client,
discussing the different tools and services that we could offer, ranging from Smart Query
Systems to advanced autonomous agents. Because the tool had to be user-friendly and integrate
easily into the daily workflows of the client and their customers, we settled on an email
interface through which users could effortlessly send questions and receive answers. We
determined that avoiding the need to build a new interface from scratch would significantly
reduce development time and costs for our client.
This collaborative process enabled us to establish a clear project goal and scope, which guided
our research and development efforts, resulting in a refined final product.
Design
OpenKit has existing experience using Large Language Models (LLMs) to build automation tools, so
we had strong foundations to build on when approaching the problem. We quickly entered an R&D
phase and identified several key aspects for the initial design:
- An email interface system capable of receiving large commercial lease agreements (attachments
of up to 50 MB)
- A cloud-based API for Open Source and proprietary LLM inference, which makes the solution
accessible online from any device
- State-of-the-art information extraction which could perform just as well with Open Source
models (allowing clients to remain independent from third-party proprietary models)
Development
Features of the service:
1. High Performance
Processing extensive legal documents rapidly; extracting key information like clauses, obligations, and rights; accelerating the decision-making process by drastically reducing the need for manual review.
2. High Scalability
Handling a large volume of documents simultaneously, useful for high throughput requirements.
3. Accuracy and Consistency
Reducing human error in document analysis by consistently applying the same criteria for information extraction; ensuring uniformity across multiple documents.
4. Flexibility
Adjustable parameters to refine both the searches and the responses provided by the LLM.
Testing
We developed this Smart Query System iteratively, alternating between development and testing
phases. After Pubs Advisory Service first tested the system's performance, they provided
detailed feedback, and we refined and improved the system in response. This open exchange
allowed us to make the system much more robust and optimise it for live use.
Notable improvements included allowing users to customise their prompts without the need for any
technical expertise. We did this by integrating a signposted Google Sheets interface through
which they could update their questions and toggle certain parameters:
Google Sheets interface for customising prompts
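As a rough illustration of how spreadsheet rows could drive the prompts, the sketch below turns a 2D array of cell values into a list of active questions. The column layout (question text in the first column, a TRUE/FALSE toggle in the second) and the names `PromptRow`, `parsePromptRows` and `activeQuestions` are assumptions made for this example, not the layout of the client's actual sheet.

```typescript
// Hypothetical sheet layout (an assumption for illustration):
// column A = question text, column B = "TRUE"/"FALSE" toggle.
// Rows are represented as a 2D array of strings, one inner array per row.

interface PromptRow {
  question: string;
  enabled: boolean;
}

// Convert raw rows into typed prompt settings.
function parsePromptRows(rows: string[][]): PromptRow[] {
  return rows
    .slice(1) // skip the header row
    .map(([question = "", enabled = ""]) => ({
      question: question.trim(),
      enabled: enabled.trim().toUpperCase() === "TRUE",
    }))
    .filter((row) => row.question.length > 0); // drop rows without a question
}

// Keep only the questions the user has switched on.
function activeQuestions(rows: string[][]): string[] {
  return parsePromptRows(rows)
    .filter((row) => row.enabled)
    .map((row) => row.question);
}

const exampleRows: string[][] = [
  ["Question", "Enabled"],
  ["What is the rent review period?", "TRUE"],
  ["Who is responsible for repairs?", "FALSE"],
  ["", "TRUE"],
];

console.log(activeQuestions(exampleRows));
```

Keeping the parsing step separate from the filtering step means a non-technical user can add, reword or disable questions in the sheet without any change to the code that builds the prompts.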
We also improved the presentation and legibility of the email response:
Example of an improved email response
Developmental Challenges
We faced a range of challenges during the completion of this project, which allowed us to
demonstrate our expert problem-solving abilities.
- OCR Limitations: The documents we initially handled were pre-scanned by Adobe's OCR
  service. However, the software's limitations led to multiple scanning errors, especially
  when it came to parsing tables and deciphering handwritten information.
  - First Solution: OpenKit's first solution was a retroactive corrective approach, using
    GPT-4. The LLM would parse the document a first time, highlighting OCR and formatting
    issues, then attempt to correct the distortions before passing the rectified output to a
    smaller LLM.
  - Second Solution: We introduced an AI OCR system to re-scan the full document, removing
    the need for Adobe's OCR. We incorporated AWS Textract into our pre-processing pipeline,
    which improved data quality by filling in missing information, reducing noise, and
    resolving inconsistencies.
- Context Length Limitations: We faced challenges in determining the context length we
  needed from our chosen LLM. Most Open Source models are limited to shorter context lengths,
  meaning they can only consider a limited amount of text and would be incapable of handling
  documents as lengthy as commercial lease agreements.
  Solution: In order to make the service equally efficient on Open Source LLMs and
  proprietary models like GPT-4, we developed a technique which used word embeddings and
  separated the documents into individual 'chunks,' each capable of being processed by a more
  lightweight LLM.
- Scalability: Once Pubs Advisory Service began testing the system and saw how valuable and
  impressive the results were, they were eager to expand its use across their customer base.
  However, the system was at the time incapable of serving high volumes of users
  simultaneously.
  Solution: OpenKit designed and developed a serverless scheduling system which queued all
  requests upon reception, allowing it to process them individually. This ensured reliable,
  failsafe performance even when handling a varying number of concurrent documents.
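The chunking approach described under Context Length Limitations can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenKit's implementation: `toyEmbed` is a hypothetical stand-in that hashes words into a count vector, whereas a real system would obtain embeddings from an embedding model, and the chunk size, overlap and function names are all invented for the example.

```typescript
// Split text into overlapping chunks of at most `size` words.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last chunk reached the end
  }
  return chunks;
}

// Toy stand-in for an embedding model (an assumption for illustration):
// hashes each word into one of `dims` buckets and counts occurrences.
function toyEmbed(text: string, dims = 64): number[] {
  const vec: number[] = [];
  for (let i = 0; i < dims; i++) vec.push(0);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % dims;
    }
    vec[hash] += 1;
  }
  return vec;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k chunks most similar to the question.
function topChunks(chunks: string[], question: string, k: number): string[] {
  const queryVec = toyEmbed(question);
  return chunks
    .map((chunk) => ({ chunk, score: cosine(toyEmbed(chunk), queryVec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

const lease =
  "The tenant shall pay the annual rent in quarterly instalments. " +
  "The landlord shall insure the premises against fire and flood. " +
  "The tenant shall keep the interior of the premises in good repair.";

const leaseChunks = chunkWords(lease, 12, 2);
console.log(topChunks(leaseChunks, "Who must insure the premises?", 1));
```

Only the highest-ranked chunks would then be passed to the LLM, which keeps each request within a short context window regardless of how long the full lease is.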
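The queueing idea described under Scalability, where requests are queued on arrival and processed one at a time, can be illustrated with a small in-memory sketch. The real system is described as serverless, so it presumably relies on managed infrastructure; the `SerialQueue` class, the `makeJob` helper and the file names below are hypothetical and exist only to show the ordering behaviour.

```typescript
type Job<T> = () => Promise<T>;

// Minimal in-memory FIFO queue (an assumption for illustration):
// each job starts only after every earlier job has settled.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(job: Job<T>): Promise<T> {
    const result = this.tail.then(job);
    // Swallow errors on the internal chain so that one failed job
    // does not prevent later jobs from running.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(): Promise<string[]> {
  const queue = new SerialQueue();
  const log: string[] = [];

  // Build a fake document-processing job that records when it starts and ends.
  const makeJob = (name: string, ms: number): Job<string> => async () => {
    log.push(`start ${name}`);
    await sleep(ms);
    log.push(`end ${name}`);
    return name;
  };

  // Both requests arrive at once, but the second waits for the first.
  await Promise.all([
    queue.enqueue(makeJob("lease-a.pdf", 30)),
    queue.enqueue(makeJob("lease-b.pdf", 10)),
  ]);
  return log;
}

demo().then((log) => console.log(log));
```

Because the second job is chained behind the first, the log shows `lease-a.pdf` starting and ending before `lease-b.pdf` starts, even though the second job is shorter.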
Key Successes
Our client tested our AI solution against similar existing market solutions. The results were
extremely positive, as our solution outperformed competitors in terms of accuracy, reliability,
and cost-effectiveness. This evaluation solidified our solution's position as the preferred
choice for Pubs Advisory Service.
- Overcoming complex challenges to build a bespoke, innovative AI solution
- Outperforming existing services on key factors, including performance and price
- Establishing a standard for high scalability and reliability
- Optimising the ease of use by relying on well-known email and spreadsheet interfaces
Next Steps
Pubs Advisory Service was highly impressed by the results of our collaboration and has expressed
an eagerness to continue working with OpenKit on several forthcoming projects.
Technologies Used
- OpenAI LLMs
- Supabase
- TypeScript
- Node.js
Client Testimonial
"OpenKit provided us with a robust and innovative back-office tool to tackle the wide range of
commercial agreements we need to examine. Their deep understanding of our business needs,
coupled with their expertise in GPT and Cloud (AWS) services, enabled them to swiftly navigate
complexities and deliver a bespoke AI solution tailored to our operations. They explained
detailed, complex issues in straightforward terms we could understand. No question we asked of
them was avoided: the engagement was efficient and inspired confidence from top to bottom.
The team at OpenKit demonstrated a high level of professionalism and adaptability throughout,
ensuring a smooth project delivery despite unforeseen challenges. They went above and beyond to
ensure the solution was optimal and offered invaluable post-launch support to guarantee our
satisfaction.
We are excited to continue our journey with OpenKit and are very much looking forward to working
with them again. Their proficiency in software development and AI technologies, coupled with
their professional and client-focused approach, makes them an ideal partner for any business
seeking bespoke AI solutions."