What is LangSmith and why should I care as a developer?

#openai #langchain #llms #ai

I have said it before and I will say it again: the tooling around large language models (LLMs) is still in its infancy. Because LLMs are inherently dynamic, traditional software tooling is often ill-equipped to handle them out of the box.

Enter LangChain and LangSmith.

In this post, we will explore the latest product by the team that created LangChain (the most popular LLM software tool) and see what new parts of the LLM stack LangSmith hopes to tackle.

If you are new to LLM development, the first place to start is with LangChain itself. I wrote up a comprehensive intro with details on what problems it can solve.

[Quick note: I am writing this article to reflect my personal views as I explore the LLM ecosystem and this is not intended to represent the views of my employer, hence being on my personal blog]

What is LangSmith? 🤔

When LangChain was originally created, the goal was to lower the barrier to entry for building prototypes. Despite some pushback on the viability of LangChain as a tool, I think it has largely delivered on this goal. The next problem to tackle after prototypes is getting these applications into production in a reliable and maintainable way. The simple mental model is:

LangChain = prototyping
LangSmith = production

But what are the production challenges that need to be solved which were not as relevant in prototyping?

Reliability: it is deceptively easy to build something that works well for a simple, constrained example, but it is still quite hard today to build LLM applications with the consistency that most companies would want.

To tackle this, LangSmith provides new features around five core pillars:

Debugging
Testing
Evaluating
Monitoring
Usage metrics

A huge part of the value add for LangSmith is being able to do all of these things from a simple and intuitive UI, which significantly reduces the barrier to entry for those without a software background.

There are also a lot of things about LLMs that are not intuitive when you look at them from a numbers perspective, so being able to see visualizations through a UI will be useful (e.g. how temperature affects the model's output distribution). I personally find that a polished UI can actually accelerate my prototyping and work, since doing everything in code can often be cumbersome.

Further, being able to visualize each step the LLM system goes through, including a complex chain of commands, can be super useful for understanding why you are getting the output that you are. As you build more complex workflows, it can be hard to tell exactly which queries are moving through which flows, so a simple UI that shows this and logs the historical data is going to be a value add from day one.
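To make the temperature example concrete, here is a minimal sketch of how temperature reshapes a probability distribution. It assumes the standard temperature-scaled softmax; the logits are made-up numbers for three imaginary candidate tokens, and nothing here is LangSmith or LangChain code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.1]

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Low temperatures pile almost all of the probability onto the top-scoring token, while higher temperatures flatten the distribution. That is exactly the kind of thing that is easier to see in a chart than in a list of numbers.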
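As a rough illustration of what logging a chain involves, here is a small sketch that records the input, output, and duration of every step in a pipeline. It is not the LangSmith API: `TracedChain`, `StepRecord`, and the three step functions (including the fake model call) are all hypothetical names invented for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StepRecord:
    """One logged step: what went in, what came out, and how long it took."""
    name: str
    input: str
    output: str
    seconds: float


@dataclass
class TracedChain:
    """Runs (name, function) steps in order and keeps a record of each one."""
    steps: list
    records: list = field(default_factory=list)

    def run(self, value):
        for name, fn in self.steps:
            start = time.perf_counter()
            result = fn(value)
            elapsed = time.perf_counter() - start
            self.records.append(StepRecord(name, value, result, elapsed))
            value = result  # the output of one step feeds the next
        return value


# Hypothetical steps; fake_llm stands in for a real model call.
def build_prompt(question):
    return f"Answer briefly: {question}"


def fake_llm(prompt):
    return f"[model reply to: {prompt}]"


def clean_output(text):
    return text.strip("[]")


chain = TracedChain([
    ("build_prompt", build_prompt),
    ("llm", fake_llm),
    ("clean_output", clean_output),
])
chain.run("What is LangSmith?")

for record in chain.records:
    print(f"{record.name}: {record.input!r} -> {record.output!r}")
```

Even in this toy version, the record list answers the "why did I get this output?" question one step at a time. A hosted UI adds the storage, search, and visualization on top.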

Who is competing with LangSmith?

While they are not direct competitors thus far, it makes a lot of sense for organizations like Vercel (who have the AI SDK) to build similar features into their platform, given their desire to be the place for AI builders. I would imagine that other platforms will build similar tooling over the next 3–6 months, given how much potential the market for these tools has.

Vercel is still more focused on the deployment and serving aspect of LLMs today, since that is more aligned historically with their core product, but it would make sense to extend the AI SDK to do more over time.

While LangSmith does not appear to go deep on embeddings yet, there does seem to be a ton of natural crossover between this and the many embeddings providers who are differentiating with a batteries-included UI. Ecosystems like LlamaIndex would benefit from this type of product development, but it is unclear whether they can stay differentiated over time, as the problem space seems to be very similar.

Despite this, it is nice to see LangSmith still wanting to connect with as many tools as possible. In the launch blog post, they mentioned integrations with OpenAI evals as well as multiple fine-tuning providers that will enable developers to export data and directly train on it. These types of integrations seem like they will not only generate a ton of developer goodwill but also serve as a lightweight moat over time (connecting things is not always easy).

What I want from LangSmith 👀

The main ask I have is extensibility. I really do think there could be orders of magnitude more impact if the core of LangSmith could be built into other applications and services. For example, allowing developers to sign in with their LangChain account and monitor their LLMs on Vercel, with the AI SDK and deployment information all in one place, would be extremely valuable.

What it takes to be differentiated over a long period of time

I am very excited about LangSmith, hence spending the time to write this up. I think it solves a bunch of real problems that developers and builders have when trying to go into production. The real long-term question still remains: "Is there enough here to build a long-term, defensible business?"

I do not have a crystal ball (shockingly), but my general mindset today is that many of the current features of LangSmith are table stakes for developers. Most LLM providers will want to build similar features into their platforms over time. That doesn’t mean that LangSmith cannot succeed, though. Just look at Terraform by HashiCorp: it is the glue that sits in between all the cloud providers, and it solves a large enough problem that HashiCorp became a publicly traded company. But LangSmith will need to continue to expand in scope in order to be competitive with multiple providers and other tooling ecosystems.

You got this, Harrison!