Emma Ham

I know you want to build AI applications too : LangChain



Hey there!

How have you been?

After posting about ChatGPT prompt patterns and seeing your incredible responses,
it's clear to me that our interest isn't just in using AI tools like ChatGPT.
We're also keen on figuring out how we can use AI technologies like LLMs (Large Language Models) to build AI-powered applications.
If that sounds like you, then you've come to the right place🚀🚀!

Because today, what we will be looking at together might be exactly what you've been looking for, and a game changer!


So, let's see what's on the menu today!


The core idea of this post is to introduce "LangChain", at a high level, as a tool for crafting applications powered by large language models (LLMs).

We'll begin by unpacking the essence of LangChain—the 'what' and 'why' that define its core. We’ll delve a little into its inner workings and uncover the advantages it offers.

And then we will briefly look into the main components of LangChain, exploring what LangChain is made of and what makes it work the way it does.

What won't be included in this particular post are a detailed implementation guide and demo projects, which I will be writing about in a follow-up post:)

But first, hold on a second, let me just put this out there.
I am not an AI expert or an ML engineer. Actually, I am far from it.
So, obviously, I don’t know what I am talking about. What a disappointment😞😞😞

But you know, that’s also exactly why I chose this topic for today’s post. I’ve always been a passionate user of LLM tools like ChatGPT and Bard, but the idea of building something with them always felt like a huge project that was too intimidating for me to even try.

And I thought
"There’s gotta be an easier way to do this👿"

Since I discovered LangChain a few weeks ago, it has been such an interesting learning journey. More than anything, I wanted to share this experience with you, so I hope you find something useful in this article!


1. What is LangChain, anyway?


If you look at the official documentation, LangChain is introduced as a framework designed to simplify the process of building LLM-powered applications.

So, it's not a kind of LLM or an AI chatbot itself. Instead, it's more like a support layer that stands between us and the LLMs, making it easier for us to work with LLMs and get better results.

There are two things that make LangChain special.

First, it's data-aware.
This means you can feed your own data into large language models (LLMs). Instead of relying solely on the information they were trained with, you can link these models to your own data to get responses that are specifically meaningful to you.

The second standout feature is something called "Agents".
Agents elevate the application beyond simply responding to questions like a chatbot. They add the ability to perform actions on our behalf. Whether it's sending an email, updating a database, or buying products online, the potential uses are both endless and incredibly powerful.


Basic data flow of LangChain


The basic flow goes something like this:

1. Question
We start by asking an LLM a question.
By this point, we've already loaded our data into LangChain, typically into a vector store.

2. Similarity/Vector search
Based on the question asked, LangChain carries out a similarity search to pull out the relevant information.
This step is crucial for enabling the LLM to give better responses, because the relevant information is passed to it as context.

3. Crafting an answer
When the LLM gets the question along with this background information, it crafts an answer.

4. Actions
Depending on our needs, we can then let the model take a set of actions based on this process.
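
To make the flow above a bit more concrete, here is a tiny sketch in plain Python. It is not LangChain code: the `embed`, `cosine` and `top_k` helpers and the sample documents are made up for illustration, and the word-count "embedding" is only a stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Step 2: return the k documents most similar to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


# Hypothetical data that was "loaded" before the question arrives.
documents = [
    "The office is closed on public holidays.",
    "Refunds are processed within five business days.",
    "Our support team answers emails on weekdays.",
]

question = "How long do refunds take?"           # Step 1: the question
context = top_k(question, documents, k=1)         # Step 2: similarity search
prompt = f"Context: {context[0]}\n\nQuestion: {question}"  # Step 3: what the LLM would receive
```

In a real application the `prompt` string would be sent to the LLM, which is where the answer (and any follow-up action) comes from.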

Some use cases
Use cases vary, but some really common ones are SQL QA, where we ask a question about the data in a SQL database and get back a natural-language answer, QA over documents, summarisation, AI chatbots, data analysis and so on.


2. But still, why use it?

But “Why” LangChain though?

It sounds like we could achieve all this by using an LLM directly, without any additional helper.

True, but I also believe that LangChain, like any new technology, was introduced to solve certain challenges that current large language models face.

To see why something like LangChain might be helpful, let’s take ChatGPT as an example.


1. Limitation of real-time data access
GPT models such as GPT-3.5, which many of us use, are limited by the information they were trained on, with a knowledge cutoff of January 2022. The newer version of GPT includes content up to April 2023, offering slightly more up-to-date information, but it comes with a subscription cost.

2. Token Limitation
Each model has a maximum number of tokens (pieces of text) that it can handle in a single question and its response. This limitation can affect the model's understanding of questions and its ability to generate meaningful answers.

3. Hallucination
"Hallucination" in AI is also an issue. A Purdue University study last year found that when ChatGPT was tested with 517 Stack Overflow programming questions, it returned inaccurate or broken code in over half the cases.

Typical Approaches


In machine learning development, issues like these are typically approached through fine-tuning, N-shot learning, and in-context learning.

Fine-tuning tweaks a model's parameters to better perform specific tasks. N-shot learning leverages a few examples to refine the model's task-specific responses. In contrast, in-context learning embeds the necessary information within the prompt itself, bypassing additional training—this can be more efficient than fine-tuning, which requires direct model adjustments and can be resource-intensive.

LangChain is designed to facilitate in-context learning effectively.
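
Here is a rough sketch of what the two prompt-based approaches can look like in practice. The helper names `few_shot_prompt` and `in_context_prompt` and the prompt wording are my own assumptions for illustration, not part of LangChain; the point is only that both work by changing the prompt text rather than the model.

```python
def few_shot_prompt(examples: list[tuple[str, str]], question: str) -> str:
    """N-shot: show the model a few worked question/answer pairs first."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"


def in_context_prompt(context: str, question: str) -> str:
    """In-context: put the facts the model needs directly into the prompt."""
    return f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"


# Hypothetical inputs, just to show the shape of each prompt.
shots = few_shot_prompt([("2+2?", "4")], "3+3?")
grounded = in_context_prompt("The meeting moved to Friday.", "When is the meeting?")
```

Neither function touches the model's parameters, which is why this route is so much cheaper than fine-tuning.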


Coming back to limitations


LangChain offers some smart solutions to tackle these issues:

For accessing information, you can use your own data or a search agent (like a Google search agent) with LangChain to do real-time searches.

For token limitation, LangChain provides text splitters whose parameters let us split a big chunk of information into smaller chunks in a clever way, helping maintain the performance of LLMs.

For Hallucination, LangChain has effective methods to shape prompts depending on the model type and includes built-in instructions to minimise errors.
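
To give a feel for the text-splitting idea, here is a minimal sketch of a character-based splitter with overlap. It is a simplified stand-in, not LangChain's implementation: `split_text` is a hypothetical helper, and I only borrowed the parameter names `chunk_size` and `chunk_overlap` because LangChain's splitters use the same vocabulary.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters, so a sentence cut at a
    chunk boundary still appears with some of its surrounding context.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early enough that the last chunk is not just a repeat of the overlap.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

Real splitters are smarter about where they cut (paragraphs, sentences, words), but the size and overlap trade-off is the same.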

What else?


We all remember the recent drama at OpenAI.
There has never been a time quite like now, when everything is changing so quickly. That can make us worry about whether we always need to be on alert when we rely on LLMs, because we just don’t know what’s going to happen!

One of my favourite parts of LangChain is that it is unbelievably easy to migrate from one LLM to another. It literally takes just a few lines of code changes, because whatever happens behind the scenes is all handled by LangChain!

How cool is this? On top of that, LangChain is supported by helpful tools like LangSmith. As the small screenshot shows, it is quite handy when debugging an application, because you can see what was passed as the prompt and how the whole process played out.
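
Going back to the model-swapping point for a moment, here is a small sketch of why a common interface makes migration cheap. Everything here is hypothetical: `ChatModel`, `FakeProviderA`, `FakeProviderB` and `summarise` are made-up names, and the fake providers just echo the prompt instead of calling a real API.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only thing the application needs from a model: text in, text out."""

    def invoke(self, prompt: str) -> str: ...


class FakeProviderA:
    def invoke(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"


class FakeProviderB:
    def invoke(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"


def summarise(model: ChatModel, text: str) -> str:
    """Application code depends on the interface, not on a specific provider."""
    return model.invoke(f"Summarise: {text}")


# Switching providers is a one-line change; summarise() stays untouched.
first = summarise(FakeProviderA(), "hello")
second = summarise(FakeProviderB(), "hello")
```

A framework that wraps each provider behind the same interface gives you this kind of swap without rewriting the rest of the application.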


3. 6 Main Components of LangChain

So far, we have looked at how LangChain works from a high level. Now let’s take a look at the 6 main components of LangChain that make this work.


- 💡 LLM: The engine of LangChain
There are plenty of models that LangChain can integrate for us, including GPT models, PaLM, Gemini, and open-source models like Llama.

- 💡 Prompt: A command to the LLM
I've been saying "question" up until now for convenience's sake, but essentially, a prompt is a command we give to an LLM to get the output we want. This can be a question, some instructions, or context we provide.

- 💡 Indexing: Organising the data for LLM
Indexing is about organising and structuring our data so the LLM can work with it effectively.
We touched on text splitters before, but there are also document loaders, vectorstores, and much more!

- 💡 Memory: What makes a conversation?
Many LLM applications will have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation. LangChain provides utilities for adding memory to a system. 

For example, LangChain offers several different memory types.
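
As a rough illustration of the simplest kind of memory, here is a sketch of a buffer that stores every turn and replays it into the next prompt. `ConversationBuffer` is a hypothetical class written for this post, not LangChain's own memory API, and the "Human"/"AI" labels are just an assumed convention.

```python
class ConversationBuffer:
    """Hypothetical minimal memory: keep every turn and replay it into the prompt."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def add(self, role: str, text: str) -> None:
        """Record one turn of the conversation."""
        self.turns.append((role, text))

    def as_prompt(self, new_message: str) -> str:
        """Build a prompt containing the whole history plus the new message."""
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"Human: {new_message}")
        return "\n".join(lines)


memory = ConversationBuffer()
memory.add("Human", "My name is Emma.")
memory.add("AI", "Nice to meet you, Emma!")
prompt = memory.as_prompt("What is my name?")
```

Because the earlier turns travel inside the prompt, the model can answer the last question even though it keeps no state of its own. The downside is that the prompt keeps growing, which is why other strategies, such as summarising older turns, exist.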

- 💡 Chain: Chaining complex operations into a single line!

It is probably the most important concept, because LangChain itself is named after this component!

In LangChain, a 'chain' links a series of operations together. Each operation's output is directly fed into the next as its input. This sequence continues from one operation to the next until the final result is produced.

Chains are the component that makes the complicated data flow from us to the LLM more straightforward.

Here is a very simple example of a chain:

```python
chain = prompt | llm | output_parser
```

As you can see, "|" is the chaining operator, which binds the operations into a single line.
As explained above, each operation's output becomes the input for the next operation.


This is a very simple example, but it still explains the core concept of a chain.
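
If you are curious how a "|" can glue steps together at all, here is a toy sketch using Python's `__or__` operator overloading. This is not how LangChain is implemented internally; `Step` is a hypothetical class, and the fake `llm` step just returns a canned string instead of calling a model.

```python
class Step:
    """A hypothetical pipeline step that can be combined with '|'."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        """Run this step on a single input."""
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt = Step(lambda topic: f"Tell me a joke about {topic}")
llm = Step(lambda text: f"  LLM response to: {text}  \n")  # stand-in for a real model
output_parser = Step(lambda text: text.strip())            # tidy up the raw output

chain = prompt | llm | output_parser
result = chain.invoke("cats")
```

Calling `chain.invoke("cats")` runs the three steps in order, each one receiving the previous step's output, which is exactly the flow described above.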

- 💡 Agents: Let it act for you!

The core idea of agents is to have the language model select which actions to execute.
We are not hard-coding it to perform certain actions; rather, we let the model use its own reasoning skills to decide on the correct action to take.

It sounds like magic, but if you take a look at this simple code snippet, it is not so magical after all.

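```python
from langchain.agents import AgentType, initialize_agent
from langchain.tools import StructuredTool


def plus(a: float, b: float) -> float:
    """Add two numbers; the agent can call this as a tool."""
    return a + b


# `llm` is assumed to be a chat model created earlier.
agent = initialize_agent(
    llm=llm,
    verbose=True,
    # The original snippet passed the bare AgentType enum; a concrete member is
    # required. This one supports multi-argument tools and is only one option.
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    tools=[
        StructuredTool.from_function(
            func=plus,
            name="Sum Calculator",
            description="Use this to perform sums of two numbers. This tool takes two arguments, both should be numbers.",
        )
    ],
)
agent.invoke(prompt)  # `prompt` is the instruction string we want the agent to handle
```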

Take a look at "tools=": we pass a list of tools when we initialise the agent.
In the example above, we pass a tool backed by a function called "plus" and tell the LLM to use it to "perform sums of two numbers".

When we invoke this agent like this:

```python
agent.invoke(
    "Give me sum of 3,32,4234,32543534,43,1,0,123,344,12,11,5,78,03,12."
)
```

The LLM will evaluate this command and choose to use the provided tool when performing the sum operations.

Through the LangChain component called "Agent", we are enabling the LLM to use its reasoning skills to decide what to do in order to return the desired output.

In real-world scenarios, this could be sending an email, updating a database, making a network request, purchasing products online and much more!
Very powerful, isn't it?👏👏👏
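
To demystify the agent idea a little further, here is a toy version of the "pick a tool, then run it" loop in plain Python. Big caveat: the decision step below is faked with simple keyword matching (`fake_llm_decide`), whereas in a real agent the LLM reads the tool descriptions and makes that choice itself. `Tool`, `fake_llm_decide` and `run_agent` are all hypothetical names for this sketch.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    description: str
    func: Callable[..., float]


def plus(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


def fake_llm_decide(instruction: str, tools: list[Tool]) -> tuple[Optional[Tool], list[float]]:
    """Stand-in for the model's reasoning: choose a tool and its arguments.

    A real agent would ask the LLM; here we match the tool name as a keyword
    and pull the numbers out of the instruction with a regular expression.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", instruction)]
    for tool in tools:
        if tool.name in instruction.lower():
            return tool, numbers
    return None, numbers


def run_agent(instruction: str, tools: list[Tool]):
    """Let the (fake) model pick a tool, then execute it with the chosen arguments."""
    tool, args = fake_llm_decide(instruction, tools)
    if tool is None:
        return "No suitable tool found."
    return tool.func(*args)


tools = [Tool(name="sum", description="Use this to perform sums of two numbers.", func=plus)]
result = run_agent("Give me the sum of 3 and 4", tools)
```

The interesting part in a real agent is that nobody writes the `if` that picks the tool: the model decides based on the descriptions we give it.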


4. Example DocsAssistantAI


Let's imagine DocsAssistantAI uses LangChain to analyze documents for us:

a. We start by uploading any document, whether it's a calendar event, an email, an article, or a sales report.

b. Next, we break down the document into smaller sections to help the LLM process the information more efficiently.

c. These pieces are then converted into vectors (embeddings), a numerical format that makes it possible to search them by similarity.

d. When we ask a question, the system searches for and provides relevant context to the LLM, along with the query and chat history.

e. The LLM then gives us an answer.
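
Step d is where everything comes together, so here is a small sketch of how the retrieved context, the chat history and the new question might be packed into one prompt. `build_rag_prompt`, the prompt wording and the sample data are assumptions of mine for illustration; a real application would typically use a prompt template for this.

```python
def build_rag_prompt(
    question: str,
    context_chunks: list[str],
    chat_history: list[tuple[str, str]],
) -> str:
    """Combine retrieved context, earlier turns and the new question into one prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    history = "\n".join(f"{role}: {text}" for role, text in chat_history)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Chat history:\n{history}\n\n"
        f"Question: {question}"
    )


# Hypothetical values standing in for the output of the earlier steps.
rag_prompt = build_rag_prompt(
    question="What were Q3 sales?",
    context_chunks=["Q3 sales were 1.2M.", "Q2 sales were 1.0M."],
    chat_history=[("Human", "I uploaded the sales report."), ("AI", "Got it!")],
)
```

The LLM in step e only ever sees this final string, which is why the quality of the retrieved chunks matters so much.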

While this is a simplified view, LangChain allows for much more. For instance, during the "retrieve our data and send the relevant extracts to the LLM" step, we can refine how we present information to the LLM to optimize the interaction.



5. What's coming next?

In today's article, we've explored the world of "LangChain" - covering its essence, benefits, functionality, and providing an illustrative example to spark your imagination.

We intentionally skipped the technical deep dive, such as setting up an OpenAI API key, downloading LangChain, and integrating it with an LLM model, among other details. This more intricate exploration is reserved for a future post, should you be keen on a step-by-step implementation guide.

I hope this introduction has illuminated the path for your next development venture with LangChain. For those eager to delve further, here are some insightful articles to guide your upcoming projects!

LangChain Official Document
LangChain Concepts Tutorial on YouTube
Medium article series on each component
Retrieval Augmented Generation (RAG) Using LangChain
Implementing RAG with Langchain and Hugging Face


Wrap!

🚀🚀🚀🚀🚀🚀🚀
That's a wrap!

I hope you've enjoyed diving into the world of LangChain as much as I did when I first stumbled upon it. Whether you're an AI engineer or not, we're all part of a world brimming with exciting and interesting discoveries every single day.

I'm eager to hear about your latest interests in AI development. What's been capturing your attention lately? Share your thoughts here, and let's foster a vibrant discussion. Also, stay tuned for our next post, where we'll delve deeper into LangChain with more exciting insights and information. Let's keep the conversation going!

🚀🚀🚀🚀🚀🚀🚀
