Hey devs!
We just open-sourced ContextCheck, a framework for testing and evaluating LLMs, RAG pipelines, and chatbots 🚀
What it does:
- Generates queries and handles completions
- Detects regressions and hallucinations
- Runs penetration tests
- Works in CI pipelines (YAML-configurable; see the sketch below)
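To give a feel for the YAML-driven workflow, here's a rough sketch of what a test-scenario file *might* look like. The layout and field names below are purely illustrative, not ContextCheck's actual schema; check the repo docs for the real format.

```yaml
# Hypothetical test scenario -- field names are illustrative,
# NOT ContextCheck's actual schema. See the repo docs.
name: kb-assistant-regression
endpoint:
  provider: openai
  model: gpt-4o-mini
tests:
  - query: "What is our refund policy?"
    expect:
      contains: "30 days"            # simple substring assertion
      max_hallucination_score: 0.2   # flag ungrounded answers
  - query: "Summarize the onboarding guide."
    expect:
      min_similarity: 0.8            # semantic similarity vs. a reference answer
```

The idea is that a CI step runs a file like this against your endpoint and fails the build when a regression or hallucination threshold is tripped.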
We built it while developing our AI Knowledge Base Assistant, where testing and validating LLM outputs was a constant headache. Now it's out there for you to use, break, and improve.
Try it out and let us know what you think! ➡️ GitHub repo