Guidelines for Building a Multilingual Query Resolution System Like Zepto's From the Ground Up
Zepto, a popular online retail platform, has taken a significant step toward improving user experience by implementing a multilingual query resolution system. The system combines Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to handle misspellings and improve search precision across multiple languages.
The system works end-to-end, starting from fuzzy query detection and ending with query correction. It uses LLMs to understand and interpret user queries contextually in various languages, going beyond simple keyword matching. By integrating RAG, the system augments the LLM with real-time retrieval of relevant product information and query intents from a knowledge base, improving accuracy and relevance.
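The end-to-end flow described above can be sketched in a few lines. This is an illustrative stand-in, not Zepto's implementation: the LLM correction step is stubbed out with a canned lookup table, and the retrieval step is a toy word-overlap ranker.

```python
# Minimal sketch of the detect -> retrieve -> correct pipeline. All names are
# illustrative; a lookup table stands in for the actual LLM + RAG step.
CORRECTIONS = {"kele chips": "banana chips"}

def retrieve_candidates(query: str, catalog: list[str], k: int = 3) -> list[str]:
    # Toy retrieval: rank catalog entries by word overlap with the query.
    q_words = set(query.lower().split())
    scored = sorted(catalog,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def resolve_query(query: str, catalog: list[str]) -> str:
    # Stage 1: normalize the fuzzy query (stand-in for the LLM's
    # contextual interpretation).
    corrected = CORRECTIONS.get(query.lower(), query)
    # Stage 2: retrieve relevant products to ground the final answer.
    candidates = retrieve_candidates(corrected, catalog)
    return candidates[0] if candidates else corrected

catalog = ["banana chips 100g", "potato chips", "kale salad mix"]
print(resolve_query("kele chips", catalog))
```

In the production system each stage would be an LLM or vector-database call; the point here is only the shape of the pipeline.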
Employing stepwise prompting techniques, the system iteratively refines query understanding and generates corrected or optimized query outputs in a structured JSON format. This approach allows the system to handle misspellings, phonetic typing, and close variants of product names, ensuring relevant search results despite user input errors.
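A structured JSON output is only useful if it is validated before reaching downstream search components. The sketch below shows one way to enforce that; the field names `corrected_query`, `intent`, and `confidence` are assumptions for illustration, since Zepto's actual schema is not public.

```python
import json

# Hypothetical output schema; the real field names are not public.
REQUIRED_FIELDS = {"corrected_query", "intent", "confidence"}

def parse_llm_output(raw: str) -> dict:
    """Parse the LLM's JSON reply and reject malformed or incomplete output."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"LLM output missing fields: {sorted(missing)}")
    return data

reply = '{"corrected_query": "banana chips", "intent": "snack", "confidence": 0.92}'
result = parse_llm_output(reply)
print(result["corrected_query"])
```

Rejecting incomplete replies early keeps the rest of the search stack from acting on partial corrections.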
As a result, this intelligent multilingual query resolution system has significantly enhanced Zepto's search quality, leading to a reported 7.5% lift in conversion rates on their platform.
The system also uses a vector database to store and index product embeddings. The embeddings are created with FastEmbed and stored in ChromaDB, using BAAI/bge-small-en-v1.5, a compact but strong English text embedding model that performs well in this multilingual setting.
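The core operation the vector database performs is nearest-neighbour search over embeddings. The pure-Python sketch below stands in for the FastEmbed + ChromaDB pipeline: toy 3-dimensional vectors replace real bge-small-en-v1.5 embeddings (which are 384-dimensional), and a cosine-similarity scan replaces ChromaDB's indexed lookup.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by both vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# "Index": product name -> toy embedding, as the vector DB would store it.
index = {
    "banana chips": [0.9, 0.1, 0.0],
    "potato chips": [0.2, 0.9, 0.1],
    "kale salad":   [0.1, 0.2, 0.9],
}

def query(q_vec, k=2):
    # Rank all products by cosine similarity to the query embedding.
    ranked = sorted(index, key=lambda name: cosine(index[name], q_vec),
                    reverse=True)
    return ranked[:k]

print(query([0.8, 0.2, 0.1]))
```

In production the query vector would come from embedding the (corrected) user query with the same model used to embed the catalog, so that both live in the same vector space.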
The components are chained together using LangChain Expression Language (LCEL) to create a seamless flow from query to final result. User feedback is used to continuously improve the system: adding new few-shot examples and synonyms, and squashing bugs.
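LCEL chains components with the `|` operator (roughly `prompt | llm | parser`). The stdlib sketch below reimplements that pipe pattern to show the idea; `Step` and the stage functions are illustrative, not LangChain's actual classes, and the LLM is replaced by a canned reply.

```python
import json

class Step:
    """Tiny stand-in for an LCEL runnable: a callable composable with '|'."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose two steps: run self, feed its output into other.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

build_prompt = Step(lambda q: f"Interpret the query: {q}")
fake_llm = Step(lambda p: '{"corrected_query": "banana chips"}')  # canned reply
parse = Step(json.loads)

chain = build_prompt | fake_llm | parse
print(chain.invoke("kele chips")["corrected_query"])
```

The appeal of this style is that each stage stays independently testable while the chain reads as a single left-to-right data flow.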
One of the key challenges the system addresses is the handling of vernacular terms and phonetic typing, which traditional search systems often struggle with. The LLM prompt is designed to instruct the LLM to act as an expert query interpreter, base its decision on the list of retrieved products, and return a structured JSON object with specific fields.
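A prompt along the lines described above might be assembled as follows. The wording, field names, and function are assumptions for illustration, not Zepto's production prompt.

```python
# Hypothetical prompt template; the exact production wording is not public.
PROMPT_TEMPLATE = """You are an expert e-commerce query interpreter.
The user typed: "{query}"
Retrieved products:
{products}
Base your decision only on the retrieved products above. Return a JSON object
with the fields "corrected_query", "intent", and "confidence"."""

def build_prompt(query: str, retrieved: list[str]) -> str:
    # Render the retrieved products as a bullet list inside the prompt.
    products = "\n".join(f"- {p}" for p in retrieved)
    return PROMPT_TEMPLATE.format(query=query, products=products)

print(build_prompt("kele chips", ["banana chips 100g", "potato chips"]))
```

Grounding the instruction in the retrieved product list is what keeps the LLM's correction tied to items the catalog can actually serve.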
In a demonstration, the system correctly interpreted a misspelled query like "kele chips" as "banana chips", ensuring relevant search results. It also showed its ability to handle multilingual queries, disambiguate ambiguous ones, and produce structured, auditable outputs, demonstrating a clear path to better user experience and higher search conversion rates.
In conclusion, Zepto's innovative multilingual query resolution system, powered by LLMs and RAG, is a significant step forward in enhancing the shopping experience on e-commerce platforms, especially in a multilingual setting where spelling errors can vary widely.