The brief
The client reached out to WebbyLab to build a convenient tool for analyzing and researching markets in different countries before launching new medical products there.
Challenge
The main challenge in building the market access tool was to combine the different classifiers of illnesses, procedures, and medicines used in various countries in one system, and to tune the search so that it returns all the essential data about a disease and its related procedures and drugs.
Solution
Based on the client’s need to build a high-quality search for different types of medical classifiers, we have created a system that provides an aggregated medical knowledge database and a search engine that can find the most relevant matches in this database using artificial intelligence.
In the first stage, we focused on creating an aggregated database of documents and testing the capabilities of full-text search. To improve search quality, we tried various advanced settings and methods, such as word stemming, multilingual search, and generating additional input with AI models.
At this stage, the search returned up to 50% of the required results. Further optimization was blocked by two problems: the source data lacked relations, and full-text search has limited capabilities, so it could not find semantically correct results in the complex medical domain.
That’s why we started researching new technologies that could help us reach the desired results.
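One way to read a figure such as "up to 50% of the required results" is as recall: the share of documents expected for a query that the search actually returns. The sketch below is a minimal illustration under that assumption. The function name and document IDs are hypothetical, and it is not the evaluation code used on the project.

```python
def recall(expected_ids, returned_ids):
    """Share of the expected documents that appear in the returned results."""
    expected = set(expected_ids)
    if not expected:
        raise ValueError("expected_ids must not be empty")
    found = expected & set(returned_ids)
    return len(found) / len(expected)


# Hypothetical test query: four documents are expected, the search returns three,
# and only two of the returned documents are among the expected ones.
expected = ["doc-a", "doc-b", "doc-c", "doc-d"]
returned = ["doc-a", "doc-c", "doc-x"]

score = recall(expected, returned)
print(f"recall = {score:.0%}")
```

Averaging this value over a fixed set of test queries gives a single number that can be compared between search configurations.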
In the second stage of the project, we turned to RAG (Retrieval-Augmented Generation), a technique that combines the capabilities of a pre-trained large language model with an external data source. The main idea was to give an AI model a domain knowledge database to search within.
This database must contain the data in a format that AI can work with: vectors, also called embeddings. Embeddings are numerical representations of concepts, stored as sequences of numbers, which make it easy to establish relationships between those concepts. This was exactly what we needed: a tool for building relationships between medical concepts.
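To make the idea concrete, the sketch below compares embeddings with cosine similarity, a common way to measure how close two vectors are. The three-dimensional vectors and the concept names are made up for illustration. Real embedding models produce vectors with hundreds or thousands of dimensions, and this is not the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values, chosen by hand for the example).
embeddings = {
    "type 2 diabetes": [0.9, 0.1, 0.0],
    "metformin": [0.8, 0.3, 0.1],
    "hip replacement": [0.0, 0.2, 0.9],
}

# Rank every concept by its similarity to the query concept.
query = embeddings["type 2 diabetes"]
ranked = sorted(
    embeddings,
    key=lambda name: cosine_similarity(query, embeddings[name]),
    reverse=True,
)
print(ranked)
```

In this toy data, the drug lands next to the disease it treats, while the unrelated procedure ends up last. That is the kind of relationship a vector search relies on.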
One way to build embeddings is to train an LLM. It may be more effective, but it is very costly and time-consuming. Another way is to use pre-trained embedding models that already contain information about data relationships. We chose the latter and generated our own embeddings database using the OpenAI ADA text model.
In our case, users also did not need generative AI answers but rather structured results with references to real documents. That’s why we decided to use a separate service that can search inside the embedding database. We selected Azure Cognitive Search because our system was already connected to this cloud provider and its features were sufficient for our needs.
We configured hybrid search for our system, which combines full-text and vector search. We also set up semantic ranking, an Azure feature that retrieves more semantically relevant data. This way, we significantly increased search efficiency and retrieved up to 95% of all expected search results.
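A hybrid search has to merge two ranked lists, one from full-text search and one from vector search, into a single result list. A widely used method for this is Reciprocal Rank Fusion (RRF), which Azure's documentation names as its approach for hybrid queries. The sketch below illustrates the idea only. The function name, the document IDs, and the constant `k = 60` are assumptions for the example, and the code reflects neither the service's internal implementation nor our configuration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked high in several lists rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for the same query from the two search modes.
full_text_results = ["doc-a", "doc-b", "doc-c"]
vector_results = ["doc-c", "doc-a", "doc-d"]

fused = reciprocal_rank_fusion([full_text_results, vector_results])
print(fused)
```

Documents found by both modes (`doc-a` and `doc-c` here) outrank those found by only one, which is why hybrid search can surface results that either mode alone would rank low or miss.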
Software architecture model
The system consists of a single-page Frontend Application connected with a monolithic Backend Application and several additional modules used to parse, enrich, and index medical data.
The system uses a set of 3rd-party services, such as:
- Embedding AI model (OpenAI ADA)
- Generative AI model (OpenAI GPT)
- Search service (Azure Cognitive Search)
- Translation service (DeepL, Azure Translator)
You can see the relations between these components in the diagram below.
Key features
The developed market access planning tool features the following:
- AI multilingual search in the knowledge base of disease classifiers, procedures, and drugs for their treatment in different countries
- Search suggestions shown at the start of the search to help the user find the desired result
- Saving search history so users can easily return to previous results
- Saving the search result to a PDF report for further information analysis
- Authorization and user management on the platform to distribute access between users focused on different countries
- A tool to get more information about reimbursement options in different countries
- A tool for assessing how significantly a new drug would affect treatment
Results
We delivered a new product for searching and analyzing medical market information:
- Unified and combined 32 classification documents from different countries into one knowledge base
- Implemented AI-based search for 7 countries and in 6 languages
As a result, the client can analyze different medical markets in one place without having to search for information from different sources.