Our first prototype
After two weeks of coding, we had our first prototype. This version allowed users to upload multiple PDFs and query them via a search bar. The system generated responses using Retrieval-Augmented Generation (RAG), drawing on several open-source large language models and embedding models. To find relevant articles, we ran a similarity search against a Postgres vector database that stored all the embeddings. We built the application using the Streamlit framework.
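To make the retrieval step concrete, here is a minimal sketch of similarity search followed by prompt assembly. It is not our actual implementation: the function names (`cosine_similarity`, `top_k_chunks`, `build_prompt`), the prompt wording, and the three-dimensional toy vectors are all assumptions for illustration. In the prototype the embeddings came from an embedding model and the ranking ran inside Postgres rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunks, k=3):
    """Return the k (score, text) pairs most similar to the query.

    `chunks` is a list of (text, embedding) pairs (hypothetical layout).
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, retrieved):
    """Assemble a RAG prompt from the retrieved chunks (wording is illustrative)."""
    context = "\n\n".join(text for _, text in retrieved)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy data: real embeddings have hundreds of dimensions.
toy_chunks = [
    ("chunk about solar cells", [1.0, 0.0, 0.0]),
    ("chunk about wind turbines", [0.0, 1.0, 0.0]),
    ("chunk about batteries", [0.7, 0.7, 0.0]),
]
toy_query = [0.9, 0.1, 0.0]

retrieved = top_k_chunks(toy_query, toy_chunks, k=2)
print(build_prompt("How do solar cells work?", retrieved))
```

The resulting prompt would then be passed to the language model. Doing the ranking in the database instead of in application code avoids loading every embedding into memory for each query.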
Our prototype successfully answered questions on topics explicitly mentioned in the documents. We also experimented with different embedding models to compare their performance. Since our queries were in Swedish while the papers were in English, we noticed that the language of a text chunk often influenced the retrieval results more than its context. For instance, text chunks containing Swedish words were prioritized. Moreover, the generated answers were frequently returned in English.