Software development heavily relies on third-party libraries and packages.
They evolve constantly: new methods are exposed, APIs change to support OS updates, and more. This poses a problem for coding copilots built on LLMs trained on data with a knowledge cut-off date (October 2023 for most of the latest models).
At CommandDash, we’re committed to solving this.
A Big Bump in the Road to Automating Software Development
Developers today can’t build apps without incorporating at least a dozen libraries. This integration is a fundamental aspect of software development.
While AI excels at generating code it has seen thousands of examples of, it struggles to integrate these packages reliably: the generated integration code is often outdated or simply wrong.
This isn’t acceptable and stands in the way of autonomous software development.
Efforts Being Made to Solve This
Many copilots are attempting to tackle this issue but often encounter fundamental problems:
| Approach | Benefits | Limitations |
| --- | --- | --- |
| Enabling web search | Answers simple questions, such as the latest version of a package. | Fails with complex queries. High-level results aren't useful for integration. |
| Indexing documentation | Provides additional context to LLMs, improving answer quality by 20-30%. | Many sites lack sitemaps or sit behind paywalls. Documentation is often poorly written. Valuable information is buried in support tickets. |
| Indexing GitHub repositories | Provides deeper answers by referring to code and issues. | Limited to open-source projects and frameworks. |
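To make the middle row concrete, here is a minimal sketch of documentation indexing, assuming a sentence-transformers embedding model (an illustrative choice, not any particular copilot's stack): split pages into chunks, embed them, and pull the nearest chunks into the LLM's context.

```python
# Minimal documentation-indexing sketch. The embedding model and chunking
# strategy are illustrative assumptions, not a specific copilot's pipeline.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

def index_docs(pages: list[str], chunk_size: int = 500):
    """Split raw documentation pages into fixed-size chunks and embed them."""
    chunks = [page[i:i + chunk_size]
              for page in pages
              for i in range(0, len(page), chunk_size)]
    embeddings = model.encode(chunks, normalize_embeddings=True)
    return chunks, embeddings

def retrieve(query: str, chunks, embeddings, k: int = 3):
    """Return the k chunks most similar to the query by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q  # unit vectors, so dot product == cosine
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```

The retrieved chunks are what gets prepended to the LLM's prompt as extra context, which is where the quality lift in the table comes from.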
In general, trust in these approaches is extremely low, and developers end up consulting documentation themselves.
Our Vision of Solving This: Infrastructure + Generated RAG
At CommandDash, our primary goal is to address the infrastructural issues to lay the groundwork for building world-class integrating copilots. We aim to create custom agents for all libraries using multiple data sources:
- Scraping data from package managers such as NPM, PyPI, and Pub.dev (see the registry sketch after this list).
- Regularly re-indexing open-source repositories, including code, issues, PRs, wikis, and more.
- Building AI-assisted website scrapers to handle even the most challenging websites.
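As a flavour of the first item: the public package registries already expose JSON APIs. Here is a hedged sketch against the npm registry; PyPI (`https://pypi.org/pypi/<name>/json`) and Pub.dev offer similar endpoints.

```python
# Sketch of pulling fresh package metadata from the public npm registry.
# Error handling is deliberately minimal; a production scraper would also
# paginate, rate-limit, and cache.
import requests

def latest_npm_version(package: str) -> str:
    """Fetch the latest published version of an npm package."""
    resp = requests.get(f"https://registry.npmjs.org/{package}", timeout=10)
    resp.raise_for_status()
    return resp.json()["dist-tags"]["latest"]

print(latest_npm_version("react"))  # prints whatever version is current today
```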
We collect this data in parallel with user demands and interactions, enabling the community to contribute trusted sources like blogs and example repositories.
Generated RAG
Once we have all the data, RAG alone takes us only so far. From our observations:
| When RAG works | When RAG fails |
| --- | --- |
| Single-scoped questions (e.g., how does {x} method work?) | Almost every other time. |
Why?
- It’s extremely difficult to find the right pieces of documentation from high-level user requirements. Our Langchain agent struggles to convert instructions like “write me code to count apples in an image” into correctly implemented code (see the toy retrieval sketch after this list).
- Even with the right documentation pieces, LLMs are poor at converting them into integration code; they work far better from examples than from prose documentation.
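A toy illustration of the first failure mode (the doc chunks are paraphrased for illustration, and the embedding model is an arbitrary open one, not a benchmark of anyone's product):

```python
# Toy demonstration of the retrieval gap between single-scoped questions
# and high-level requirements.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

doc_chunks = [
    "HumanMessage content may be a list of text and image_url parts for multimodal chat.",
    "ChatPromptTemplate composes system and human messages into a reusable prompt.",
    "Use bind_tools to attach structured tools to a chat model.",
]
doc_vecs = model.encode(doc_chunks, normalize_embeddings=True)

def best_match(query: str):
    """Return the highest-scoring chunk and its cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return doc_chunks[scores.argmax()], float(scores.max())

# Naming the API directly tends to land on the right chunk with a high score...
print(best_match("How do I pass an image to HumanMessage?"))
# ...while the high-level requirement shares almost no surface vocabulary
# with the multimodal-chat chunk it actually needs.
print(best_match("write me code to count apples in an image"))
```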
This is why we need Generated RAG.
Generated RAG uses a small model continuously fine-tuned on a library’s data to generate examples for specific user use cases within ~1 second.
For instance, when a user asks, “write me code to count apples in an image”, the model fine-tuned on Langchain’s data should generate 2-3 examples of:
- “Multimodal chat requests”,
- “Counting prompt instructions”,
- “Code implementation for the best multimodal models”
Then, a larger model (such as Llama 3) can use these examples as references, instead of documentation pieces, to generate a perfectly working piece of code.
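Put together, the pipeline looks roughly like this. It is a sketch against an OpenAI-compatible chat API; the model names (“dash-langchain-small” and “llama3-70b”) are placeholders, not real CommandDash endpoints.

```python
# Hedged sketch of the two-stage Generated RAG pipeline described above.
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint is configured

def generated_rag(user_request: str) -> str:
    # Step 1: a small, library-specific fine-tuned model generates 2-3
    # concrete usage examples for the request.
    examples = client.chat.completions.create(
        model="dash-langchain-small",  # hypothetical fine-tuned model
        messages=[{
            "role": "user",
            "content": f"Generate 2-3 short Langchain code examples relevant to: {user_request}",
        }],
    ).choices[0].message.content

    # Step 2: a larger model writes the final integration code, grounded in
    # the generated examples rather than raw documentation chunks.
    final = client.chat.completions.create(
        model="llama3-70b",  # placeholder for a Llama 3-class model
        messages=[{
            "role": "user",
            "content": f"Using these examples as references:\n{examples}\n\n"
                       f"Write working code to: {user_request}",
        }],
    ).choices[0].message.content
    return final

print(generated_rag("write me code to count apples in an image"))
```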
We’re still on the journey to achieving this. Try CommandDash and share your thoughts with us.