
Natalie Fagundo for Inductor

Originally published at inductor.ai

Newsletter - April Edition

Announcements

We’ve gone live with our new developer tool for evaluating, improving, and observing your LLM applications – both during development and in production. For teams who are serious about shipping production-ready LLM apps (i.e., apps that are high-quality, trustworthy, and cost-effective), we’ve built Inductor to help you do that far more rapidly, easily, and reliably.

Interested in early access to Inductor? Request access here.

A note from our founder

The developer community has been building web applications for well over 25 years. During that time, we’ve created a set of highly effective practices and tools for building web apps productively, and for reliably delivering great experiences to users. This has been critical to web apps becoming a ubiquitous way in which developers (and businesses) innovate and deliver value to their users.

We as a community are now at the beginning of that journey for LLM-powered applications – a different (and much newer) class of applications that are poised to have at least as much impact as web apps, if not much more. The underlying tech (i.e., large language models) is powerful, but it remains challenging and time-consuming to build and ship LLM apps that reliably deliver high-quality, high-value experiences to users. We’ve seen too many teams struggle to do this (and have experienced this pain ourselves).

At Inductor, we’re working to solve this problem. The work required to build a production-ready LLM app differs in fundamental ways from the work of building other types of applications, such as web apps. In particular, LLM applications require iterative development driven by experimentation and evaluation, because they cannot be written up front to guarantee desired behavior (e.g., an LLM’s inputs cannot be designed a priori to guarantee its outputs). The only way to build a high-quality LLM application is to iterate and experiment your way to success, powered by data and rigorous evaluation; it is then essential to observe and understand live usage to detect issues and fuel further improvement. Today, doing all of this is too often slow and painful.

We’ve been building Inductor to address this. We’re excited to say that we’ve recently gone live with our new product, and Inductor is already being used by customers to build and deliver production-ready LLM apps. And, we’re working on a whole lot more – we’ll keep you in the loop along the way.

Release Notes

We’ve been busy building – below are some of the exciting new capabilities that we’ve recently added to Inductor (see our demo video for an overview of Inductor’s features beyond those listed below):

  • Use Inductor’s application-level Hyperparameters to automatically screen across different versions of your LLM app in order to rapidly determine which is best for your needs. This enables you to super-quickly and systematically test variants of any facet of your LLM app, such as different models, prompts, RAG strategies, or pre- or post-processing approaches (see the sketch after this list for a conceptual illustration).
  • Add LLM-powered quality measures to your test suites (or live traffic) to automate and scale up human-style evaluations. Inductor automatically determines the degree of alignment between your LLM evals and any corresponding human evals, in order to ensure that your LLM-powered evaluations rigorously reflect your human definitions of quality.
  • Use Inductor’s rich suite of sharing functionality to securely collaborate with team members or other stakeholders to get feedback and analyze results.
  • Run quality measures (of any type: function, human, or LLM) on live executions to automatically and continuously assess quality on your live traffic. Filter live executions by quality measures to rapidly diagnose issues and areas for improvement.

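To make the ideas above a bit more concrete, here is a minimal, purely illustrative sketch in plain Python. It is not Inductor’s API: `run_llm_app`, `llm_judge`, and the hyperparameter grid are hypothetical stand-ins for your own code, and the alignment calculation is shown as a simple agreement rate under those assumptions.

```python
# Conceptual sketch only: NOT Inductor's API. It illustrates two of the ideas
# described above, screening hyperparameter variants of an LLM app and
# measuring how well LLM-powered evals agree with human evals.
from itertools import product

# Hypothetical hyperparameter grid: every combination is one app variant.
MODELS = ["model-a", "model-b"]
PROMPT_TEMPLATES = [
    "Answer concisely: {question}",
    "Answer step by step: {question}",
]

def run_llm_app(model: str, prompt_template: str, question: str) -> str:
    """Placeholder for your LLM app; swap in your real model call."""
    return f"[{model}] response to: {prompt_template.format(question=question)}"

def llm_judge(question: str, answer: str) -> bool:
    """Placeholder LLM-powered quality measure returning pass/fail."""
    return len(answer) > 0  # stand-in logic only

def screen_variants(test_questions: list[str]) -> dict[tuple[str, str], float]:
    """Run every (model, prompt) variant over the test questions; record pass rates."""
    results: dict[tuple[str, str], float] = {}
    for model, template in product(MODELS, PROMPT_TEMPLATES):
        passes = [llm_judge(q, run_llm_app(model, template, q)) for q in test_questions]
        results[(model, template)] = sum(passes) / len(passes)
    return results

def eval_alignment(llm_verdicts: list[bool], human_verdicts: list[bool]) -> float:
    """Fraction of cases where the LLM-powered eval agrees with the human eval."""
    agree = sum(l == h for l, h in zip(llm_verdicts, human_verdicts))
    return agree / len(llm_verdicts)
```

In practice, Inductor handles the variant screening, eval execution, alignment measurement, and result sharing described in the list above; the sketch is only meant to show the underlying idea.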