fadingNA

New Feature: Caching LLM Responses with a Redis Instance

First of all, this is my third pull request of my Hacktoberfest contributions in October 2024. I had the chance to contribute to a big project called DocsGPT; the project overview is at the link below.

arc53 / DocsGPT

Chatbot for documentation that allows you to chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow.

DocsGPT 🦖

Open-Source Documentation Assistant

DocsGPT is a cutting-edge open-source solution that streamlines the process of finding information in project documentation. With its integration of powerful GPT models, developers can easily ask questions about a project and receive accurate answers.

Say goodbye to time-consuming manual searches, and let DocsGPT help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.



Roadmap

You can find our roadmap here. Please don't hesitate to contribute or create issues, it helps us improve DocsGPT!

Our Open-Source Models Optimized for DocsGPT:

| Name | Base Model | Requirements (or similar) |
| --- | --- | --- |
| Docsgpt-7b-mistral | Mistral-7b | |





Beginning

  • I began by checking out the issues listed in the repository. I found a feature request that the maintainers needed help with, and I thought, "Why not give it a shot?" So, I joined their Discord channel and started chatting with the maintainers about the new feature they wanted, as well as the coding style they preferred. Here's the issue I tackled:

    🚀 Feature: Caching for DocsGPT #1295

    pabik avatar
    pabik posted on

    🔖 Feature description

    We need to implement caching for DocsGPT to improve performance and efficiency. If the same question is asked, using the same source and the same LLM, the result should be retrieved from the cache rather than triggering a new API call.

    Redis is already configured and used for Celery tasks, so the cache system should leverage Redis for storing and retrieving these cached responses.

    🎤 Why is this feature needed?

    This feature will improve the performance of DocsGPT by avoiding redundant API calls for identical requests. It will reduce response time for repeated queries, lower API costs, and improve user experience, especially for frequently asked questions.

    When users ask similar questions repeatedly using the same data source, there's no need to re-run the same logic each time (at least for some time period). By introducing caching, we can streamline this process.

    ✌️ How do you aim to achieve this?

    The implementation will involve:

    1. Using Redis as the caching layer to store LLM responses, indexed by the combination of question, source, and LLM used.
    2. Checking the cache before executing new LLM queries to see if a cached result is available.
    3. Serving the result from the cache if an identical question is found, otherwise proceeding with the usual query process and then storing the result for future use.

    This is a challenging task, and we'd love to collaborate on it. You can contribute directly in this issue or join the discussion in our Discord (collaborative-issues).

    We also encourage splitting this issue into smaller, manageable tasks, but please link them back to this original issue for tracking purposes.

    🔄️ Additional Information

    No response

    👀 Have you spent some time to check if this feature request has been raised before?

    • [X] I checked and didn't find similar issue

    Are you willing to submit PR?

    None

    [View the issue on GitHub](https://github.com/arc53/DocsGPT/issues/1295)
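The core idea in the issue is that a cached entry is indexed by the combination of question, source, and LLM. Here's a minimal sketch of how such a key could be built; the helper name `make_cache_key` and the `llm_cache:` prefix are my own illustrative choices, not necessarily what DocsGPT uses.

```python
import hashlib
import json

def make_cache_key(question: str, source: str, llm_name: str) -> str:
    # Serialize the three inputs deterministically so identical requests
    # always map to the same key, then hash to keep the key short.
    payload = json.dumps(
        {"question": question, "source": source, "llm": llm_name},
        sort_keys=True,
    )
    return "llm_cache:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Example: make_cache_key("How do I deploy?", "docs/index.md", "docsgpt-7b-mistral")
```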



  • After some back-and-forth discussions, I got a better understanding of what was needed. I also learned a lot about the project's structure. It turns out the repository includes both the UI and the server, plus a bunch of services like PostgreSQL and Redis running in Docker containers. It was a bit overwhelming at first, but I got the hang of it.

    Caching docsgpt #1308

    What kind of change does this PR introduce? (New Feature Caching)

    The changes are applied in the BaseLLM class to ensure that all LLM queries (both standard and streaming) benefit from:

    • Caching of responses to improve performance.
    • Token usage tracking for monitoring API costs.
    • The concrete LLM implementations now automatically apply caching and token tracking without modifying their core logic.

    Why was this change needed? (You can also link to an open issue here)

    • #1295

    Other information

    • The addition of caching and token usage tracking was necessary to improve performance and reduce redundant API calls for LLM queries. This change also allows monitoring of token usage for better cost management. By caching the results of similar requests, repeated queries can retrieve cached responses, thus saving time and reducing API costs.

    Additionally, the use of decorators makes the code more modular, allowing the caching and token tracking logic to be applied across different LLM implementations without modifying each one.
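To make the decorator idea concrete, here is a rough sketch of how a Redis-backed caching decorator can wrap the base class so that every concrete LLM inherits the behaviour. The names (`cache_llm_response`, `_raw_gen`, `model_name`) and the one-hour TTL are illustrative assumptions, not the exact DocsGPT code; token usage tracking would hook into the same wrapper.

```python
import functools
import json
import redis

# DocsGPT already runs Redis for Celery; assuming a local instance here.
redis_client = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cache_llm_response(func):
    """Serve a stored answer when the same model/prompt pair was seen before."""
    @functools.wraps(func)
    def wrapper(self, prompt, *args, **kwargs):
        key = "llm_cache:" + json.dumps(
            {"model": getattr(self, "model_name", type(self).__name__), "prompt": prompt},
            sort_keys=True,
        )
        cached = redis_client.get(key)
        if cached is not None:
            return cached                        # cache hit: no API call
        result = func(self, prompt, *args, **kwargs)
        redis_client.set(key, result, ex=3600)   # cache miss: keep for one hour
        return result
    return wrapper

class BaseLLM:
    model_name = "base"

    @cache_llm_response
    def gen(self, prompt):
        # Concrete LLMs only override _raw_gen; the decorator wraps them all.
        return self._raw_gen(prompt)

    def _raw_gen(self, prompt):
        raise NotImplementedError
```

Because the decorator sits on the base class, no concrete LLM has to change its own logic to pick up caching, which is the modularity the PR description refers to.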


  • Diving into the code

    • It didn't take long for me to get comfortable with the server architecture, which is written in Python 3 and runs a Flask server. It was a fun task, and I really enjoyed seeing how everything connects: from the abstract base class to the concrete LLM implementations, with a decorator function applying caching to all LLM instances.
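To show that abstract-to-concrete flow end to end without needing a running Redis, here is a self-contained toy version that swaps Redis for an in-memory dict; the class and function names are hypothetical and only illustrate the pattern:

```python
import functools

def with_cache(func):
    store = {}  # in-memory stand-in for the Redis layer
    @functools.wraps(func)
    def wrapper(self, prompt):
        key = (type(self).__name__, prompt)
        if key not in store:
            store[key] = func(self, prompt)
        return store[key]
    return wrapper

class BaseLLM:
    @with_cache
    def gen(self, prompt):
        return self._raw_gen(prompt)   # every concrete LLM inherits the cached path

    def _raw_gen(self, prompt):
        raise NotImplementedError

class EchoLLM(BaseLLM):
    def _raw_gen(self, prompt):
        print("pretend API call")
        return f"echo: {prompt}"

llm = EchoLLM()
llm.gen("hello")   # prints "pretend API call" and stores the answer
llm.gen("hello")   # answered from the cache, no second "API call"
```

The second call never reaches `_raw_gen`, which is exactly the behaviour the Redis-backed cache provides, except shared across processes and with an expiry.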

    What I Learned

    This experience taught me a lot about handling server-side caching and working within a larger, collaborative project. Here are some key takeaways:

    • Communication is key: Talking directly with the maintainers helped me understand exactly what was needed, saving time and avoiding confusion.

    • Project structure matters: Understanding how different parts of the project interact, like how the server communicates with PostgreSQL and Redis, made a huge difference.

    • Patience is a virtue: Large projects can be complex, and it takes time to navigate them. But the satisfaction of making a contribution is totally worth it!

    What I'd Do Differently Next Time

    • If I could go back, I would spend a bit more time exploring the codebase before diving into my changes; it would have made the process easier. I would definitely keep up the open communication with the maintainers, since it made everything so much easier, but I would also research more about how caching behaves when requests run in parallel.

    Conclusion

    • The overall scope of this pull request was quite big: I had to organize the Redis logic so that the LLM responds from the cache whenever the same conversation pattern has occurred before. It was not just about writing code. I had to think about how to structure the cache efficiently, so that it could store and retrieve responses quickly. This meant considering different conversation scenarios and making sure that the cache would be hit only when it made sense, preventing unnecessary API calls.
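For instance, giving every cached entry an expiry window is one simple way to make sure a hit only happens "for some time period", as the issue puts it. A tiny sketch, assuming a local Redis instance and an illustrative key name:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

key = "llm_cache:example"                      # illustrative key, not DocsGPT's real format
if r.get(key) is None:
    r.setex(key, 3600, "cached LLM answer")    # keep the entry for one hour
print(r.ttl(key))                              # seconds remaining before the entry expires
```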
