This is a submission for the Open Source AI Challenge with pgai and Ollama
What I Built
Imagine having an assistant that, given a research topic or paper, instantly connects you with the papers most relevant to you, saving hours otherwise spent sifting through services like arXiv!
This project is a research paper recommendation system that uses retrieval-augmented generation (RAG) with PostgreSQL, pgvectorscale, and a language model (choose from GPT-4o mini, Claude 3.5, or Llama 3.2 3B). Given a *single* arXiv paper ID, the system finds similar research articles using vector embeddings, allowing users to dive deeper into related works, spot trends, and explore different approaches to the topic.
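To make "finds similar research articles using vector embeddings" concrete, here is a minimal in-memory sketch of the ranking idea: order papers by cosine distance to the query paper's embedding and keep the closest few. The paper ids and 2-dimensional vectors are made up for illustration, and the helper names are hypothetical; the actual project does this search inside PostgreSQL rather than in Python.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_similar(query_vec, corpus, k=10):
    """Return the ids of the k papers closest to query_vec, nearest first."""
    ranked = sorted(corpus, key=lambda pid: cosine_distance(query_vec, corpus[pid]))
    return ranked[:k]


# Toy "embeddings" with placeholder ids (assumptions, not real papers).
corpus = {
    "paper-a": [0.9, 0.1],
    "paper-b": [0.0, 1.0],
    "paper-c": [0.5, 0.5],
}
query = [1.0, 0.0]

print(top_k_similar(query, corpus, k=2))  # ['paper-a', 'paper-c']
```

A real embedding has hundreds or thousands of dimensions, which is why an approximate index in the database is used instead of sorting every paper like this.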
Demo
Hosted GUI:
https://tinyurl.com/timescalechallengeanyademo
Note: The hosted demo may be unavailable due to Gradio's 72-hour URL limit (see the Colab-hosted, self-runnable source code below). It can be slow when multiple users are connected, and some recent arXiv URLs/papers don't work.
Top 10 Similar Paper Summary/Analysis Demo
Colab Notebook Source Code (Try it: ~10min):
https://tinyurl.com/timescalechallengeanyanotebook
- NOTE: The default configuration uses Ollama, but OpenAI, or Anthropic Claude with Cohere embeddings, is preferred because of context length limitations with pgai and Ollama embeddings (with Ollama, the LLM similarity analysis/question step covers 3 papers instead of 10; see Final Thoughts).
Tools Used
- pgvector & pgvectorscale: Backbone for storing and searching vector embeddings of arXiv paper texts; each paper is converted into a vector representation. A DiskANN (or IVFFlat) index is used to group and index the embeddings.
- pgai: Used for generating embeddings and answering questions about research documents. pgai acts as a gateway to OpenAI, Anthropic, Cohere, and Ollama.
Final Thoughts
- Additional unique aspects of this project:
- Use of Postgres stored functions to call pgai functions in 'function mode', enabling users without any direct access to the database or pgai to build a RAG application (superior security).
- Integration with OpenAI, Anthropic, and local Ollama APIs.
- Learnings
- The learning curve for the implementation was moderate, but I learned a lot from exploring Timescale's GitHub documentation and writing stored functions (with ChatGPT's help; I took my database systems course more than a year ago and am a little rusty). In the future, I should add further documentation to the notebook and review it for inefficiencies.
- Feedback
- pgvector's indexes are limited to 4,000 embedding dimensions (2,000 if the full-precision vector type is used), falling short of OpenAI's 4096. I wrote additional code to trim the output, which complicated the implementation.
- pgai's Ollama integration may have context length issues (there are no such issues when I use Ollama's interface directly), which limited the later question-answering function to 3 papers. With Anthropic/Cohere, more papers could be included.
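The trimming step mentioned in the feedback above might look something like the sketch below: keep the first `max_dims` components of an embedding and re-normalise the result to unit length so cosine comparisons stay well behaved. The function name and the default of 2,000 are assumptions, and this is not necessarily how the notebook does it. Plain truncation also discards information, so whether it is acceptable depends on the embedding model.

```python
import math


def trim_embedding(vec, max_dims=2000):
    """Keep the first max_dims components and rescale to unit length."""
    trimmed = list(vec[:max_dims])
    norm = math.sqrt(sum(x * x for x in trimmed))
    if norm == 0:
        # An all-zero vector cannot be normalised; return it unchanged.
        return trimmed
    return [x / norm for x in trimmed]


# A 4096-dimensional vector is cut down to fit a 2,000-dimension limit.
full = [1.0] * 4096
short = trim_embedding(full)
print(len(short))  # 2000
```

Every stored vector and every query vector has to go through the same trimming, otherwise the dimensions no longer line up at search time.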