From API to Agent: Insight into Engineering Design of LangChain

Author: Florian

I'd like to attempt to introduce LangChain to "AI laypersons" (of course, I am one too) with logic as clear as possible, approaching its design and use from an engineering perspective. This article can also serve as a "10-minute quick start" manual for LangChain, intended to help those in need bootstrap their own AI engineering projects.

All example code mentioned in the article is hosted on GitHub: https://github.com/TuGraph-contrib/langchain-demo

For those who like hands-on experience, you can follow along and replicate the steps as you read.

1. Introduction

1.1 What's LangChain?

Before we begin, it's necessary to start with a definition (the "What"). LangChain is an open-source AI engineering development framework built around large language models, initiated by Harrison Chase (a Harvard graduate) at the end of October 2022. You can also ask AI:

Me: What is LangChain?

GPT-4: LangChain is an open-source language model toolkit framework designed to help researchers and developers build, experiment, and deploy natural language processing (NLP) centered applications more easily. It offers various components and tools that enable users to leverage recent advancements in language models, such as large Transformer models, and it can be integrated with platforms like Hugging Face.
The core idea of LangChain is to treat language models as collaborative tools, allowing developers to create systems that handle complex tasks and interface efficiently with various data sources and application programming interfaces (APIs). This framework aims to simplify the linking of different input and output streams and the implementation of language models in these streams.

As the name implies, the "Lang" in LangChain clearly refers to large language models, and "Chain" means linking these models with other components to build AI engineering applications. How does LangChain achieve this? Before answering, we need to first define what engineering is and what AI engineering is.

1.2 What's AI Engineering?

Let's first review the encyclopedic definition of "engineering":

Engineering refers to the process by which an organized group of people, guided by a set of envisioned goals and applying relevant scientific knowledge and technological means, transforms existing entities (natural or man-made) into products with expected use value.

In this context, the "goal" defines the problem domain to be addressed, informs the top-level design and capability boundaries of the effort, and determines the final form of the "product". Improving the ROI of the "people" involved is where the value of engineering design lies. The "entity" is the raw material fed into the engineering process, while "science | technology" is the foundation for its orderly operation; applying them appropriately improves overall output efficiency.

Key Concepts of Engineering

Hence, we can interpret the key concepts of "AI engineering" as follows:

  • Goal: Specific AI user needs to be addressed, such as content creation, intelligent Q&A, document summarization, image recognition, and more.
  • People: The specific roles involved in implementing AI engineering, which can be programmers, or the R&D teams and startup companies developing AI applications.
  • Science | Technology: Obviously, the large models and their supporting tool services, as well as the computer science theory behind them.
  • Entity: Existing documents, knowledge bases, business data, and other production materials.
  • Product: Concrete products that meet the target needs, such as chatbots, content generation tools, etc.

1.3 How to Design LangChain?

Therefore, if we were the designers of LangChain, aiming to create a universal AI engineering framework, we would need to answer the following questions:

  1. 【Goal | Product】 What are the design goals of LangChain, and which AI engineering problems can it solve?
  2. 【People】 How should the programming interface of LangChain be defined to improve the R&D efficiency of AI engineers?
  3. 【Entity | Science | Technology】 How should the core components of LangChain be abstracted to enhance the framework's scalability?

Of course, with the benefit of hindsight ("Monday morning quarterbacking"), these questions now have fairly clear answers:

  1. As an AI engineering framework, LangChain is actually an extension and supplement to LLM capabilities. If LLM is likened to the human brain, then LangChain is the torso and limbs, assisting the LLM in the "dirty work" beyond "thinking". Its capability boundaries depend only on the level of intelligence of LLM and the richness of the toolset that LangChain can offer.
  2. LangChain provides a declarative programming language called LCEL (LangChain Expression Language), which reduces the R&D costs for AI engineers.
  3. LangChain offers six core abstractions—Models, Prompts, Indexes, Memory, Chains, Agents—to build complex AI applications while maintaining good scalability.

Clearly, understanding LLM as the basis of LangChain's capabilities is a prerequisite for grasping LangChain's engineering design. Next, let us start with the most basic use of the LLM API and step by step understand the engineering construction process of LangChain and the design philosophy behind it.

2. Environment

3. Design

There's a popular term in the field of architectural design—Lego architecture, which can also be known as plug-and-play architecture. In simple terms, it's about finding a unified rule and feasible path to construct complex systems through reasonable abstraction of the system's basic components. This approach aims to reduce the complexity of system implementation while enhancing the overall extensibility of the system. (This is an unofficial expression, but as long as you can get my point...)

LangChain, in fact, also follows the principles of Lego architecture. Of course, as one of the most critical Lego components, the capabilities of LLMs are naturally the first we need to understand. So, let's start with OpenAI's API!

3.1 API

The text generation model service is the core API service provided by OpenAI, and it has gone through several iterations since the release of ChatGPT.

3.1.1 Chat Completion API

The most recent iteration is the Chat Completion API, which is now the core endpoint for applications to interact with the LLM.

Example code:

import os
import requests

# API Key
api_key = os.getenv('OPENAI_API_KEY')

# Header information
headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {api_key}'
}

# Prepare data
data = {
    'model': 'gpt-4',
    'messages': [{'role': 'user', 'content': 'What is graph computing?'}],
    'temperature': 0.7
}

# Call the API
url = 'https://api.openai.com/v1/chat/completions'
response = requests.post(url, json=data, headers=headers)
answer = response.json()['choices'][0]['message']['content']
print(answer)

Example code output:

Graph computing is a computation model that deals with data in large-scale graph structures and performs various complex algorithms and calculations. This computational model is mainly used in fields such as social network analysis, web search, bioinformatics, network routing optimization, data mining, and more. The core of the graph computing model is to represent data as graph structures (nodes and edges), which better reveals the relationships and interactions between data sets. In graph computing, algorithms usually run in an iterative manner, with each iteration updating the state of the nodes in the graph until a certain stopping condition is met.

3.1.2 Completion API

The earlier Completion API, marked as legacy since July 2023, differs significantly from the latest Chat Completion API in its parameters and result format. Most notably, the prompt is passed as plain text rather than as Message objects.

# Prepare data
data = {
    'model': 'gpt-3.5-turbo-instruct',
    'prompt': ['What is graph computing?'],
    'max_tokens': 1024
}

# Call the API
url = 'https://api.openai.com/v1/completions'
response = requests.post(url, json=data, headers=headers)
answer = response.json()['choices'][0]['text']
print(answer)

Beyond text generation, OpenAI also offers an array of LLM peripheral services to aid in the construction of more complex AI applications, such as function calls, embeddings, fine-tuning, and multimodality. For more details, refer to the content in OpenAI API Documentation.
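For instance, a call to the embeddings endpoint looks very similar to the chat calls above. The following is a minimal sketch that reuses the requests setup and headers from the earlier example (text-embedding-ada-002 is just one of the available embedding models):

# Prepare data for the embeddings endpoint
data = {
    'model': 'text-embedding-ada-002',
    'input': 'What is graph computing?'
}

# Call the API
url = 'https://api.openai.com/v1/embeddings'
response = requests.post(url, json=data, headers=headers)
embedding = response.json()['data'][0]['embedding']
print(len(embedding))  # a 1536-dimensional vector for this model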

3.2 Chat

With the release of ChatGPT at the end of November 2022, the gates of AI truly opened to the public, and the smart conversation feature left the deepest impression on users. The OpenAI Chat Completion API supports passing in message history, making it easy to implement a simple dialogue service.

Example code:

# Conversation history
messages = []

def chat_with_ai(message):
    # Record history
    messages.append({'role': 'user', 'content': message})
    print(f'me: {message}')

    # Dialogue request
    data = {
        'model': 'gpt-4',
        'messages': messages,
        'temperature': 0.7
    }
    url = 'https://api.openai.com/v1/chat/completions'
    response = requests.post(url, json=data, headers=headers)

    # Parse the answer
    if response.status_code == 200:
        answer = response.json()['choices'][0]['message']['content']
        messages.append({'role': 'assistant', 'content': answer})
        print(f"ai: {answer}")
    else:
        print(f'Error: {response.status_code}', response.json())

# Multi-turn conversation
chat_with_ai('What is graph computing?')
chat_with_ai('What question did I ask just now?')

Example code output:

me: What is graph computing?

ai: Graph computing is a computational model that deals with the computation and analysis of large-scale graphical structured data. In this model, data is represented as graphs, with nodes representing entities and edges representing relationships between entities. Graph computing can be used to solve many real-world problems such as social network analysis, network routing, bioinformatics, etc. One of the major challenges of graph computing is how to efficiently process large-scale graphical data and provide fast computational and analytical results.

me: What question did I ask just now?

ai: The question you asked was: "What is graph computing?"

3.3 SDK

Up to this point, we have been building LLM engineering capabilities with OpenAI's most primitive RESTful API, without even using the SDK provided by OpenAI. Clearly, this is not an efficient approach. Using the LangChain-OpenAI integration package langchain-openai that we installed earlier can significantly reduce this development overhead.

Example code:

from langchain_openai import ChatOpenAI

# Invoke the Chat Completion API
llm = ChatOpenAI(model_name='gpt-4')
response = llm.invoke('What is graph computing?')
print(response)

Example code output:

content='Graph computing is a computational model primarily utilized for the computation and analysis of graph-structured data. The object of graph computing is a graph, which consists of nodes and edges; nodes represent entity objects, while edges represent the relationships between these entities. Graph computing is mainly applied to solve problems with complex entity relationships and dense interconnections, such as social network analysis, network topology analysis, and recommendation systems. The key task in graph computing is to discover and extract the implicit knowledge and information in the graph by calculating the nodes and edges.'

3.4 IO

For text generation model services, both inputs and outputs are fundamentally strings. Calling LLM services directly and rawly therefore requires a lot of repetitive text processing for input formatting and output parsing. LangChain has this covered with its Prompt and OutputParser abstractions, and users can select the specific implementation type they need.

IO Data Abstraction[6]

Example code:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create Prompt
prompt = ChatPromptTemplate.from_template("{question}")

# Create Output Parser
output_parser = StrOutputParser()

# Invoke LLM
message = prompt.invoke({'question': 'What is graph computing?'})
response = llm.invoke(message)
answer = output_parser.invoke(response)
print(answer)

3.5 Chain

The model IO components do reduce the redundant text processing work, but the calling form is still not concise enough. This is where LangChain's key concept comes in: the Chain.

3.5.1 HelloWorld

The LangChain Expression Language (LCEL) overloads Python's __or__ operator to build a pipe mechanism equivalent to Unix pipes, giving LLM calls a far more concise form.

Example code:

# Create a Chain
chain = prompt | llm | output_parser

# Invoke the Chain
answer = chain.invoke({'question': 'What is graph computing?'})
print(answer)

Up to this point, we have finally seen the LangChain version of "HelloWorld"...
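To make the design idea behind LCEL concrete, here is a simplified toy illustration (not LangChain's actual implementation) of how overloading __or__ lets objects be composed with the | operator, Unix-pipe style:

# A toy "runnable" that can be composed with the | operator
class Runnable:
    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other -> a new Runnable that applies self first, then other
        return Runnable(lambda value: other.invoke(self.invoke(value)))


double = Runnable(lambda x: x * 2)
increment = Runnable(lambda x: x + 1)
pipeline = double | increment
print(pipeline.invoke(3))  # 7

LangChain's prompts, models, and output parsers are all "runnables" in this sense, which is why prompt | llm | output_parser composes into a single invocable Chain.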

3.5.2 RunnablePassthrough

Of course, to simplify a Chain's invocation format, you can also use RunnablePassthrough to pass the upstream input straight through.

Example code:

from langchain_core.runnables import RunnablePassthrough

# Create Chain
chain = {"question": RunnablePassthrough()} | prompt | llm | output_parser

# Invoke Chain
answer = chain.invoke('What is graph computing?')
print(answer)

3.5.3 DAG

Additionally, Chains can fork and merge, combining into a more complex DAG (Directed Acyclic Graph) computational graph.

Example code:

from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create Output Parser
output_parser = StrOutputParser()

# Create Prompts
topic_prompt = ChatPromptTemplate.from_template("Generate a name for '{input}'")
good_prompt = ChatPromptTemplate.from_template("List the benefits of {topic}:")
bad_prompt = ChatPromptTemplate.from_template("List the drawbacks of {topic}:")
summary_prompt = ChatPromptTemplate.from_messages(
    [
        ("ai", "{topic}"),
        ("human", "Benefits:\n{good}\n\nDrawbacks:\n{bad}"),
        ("system", "Generate final conclusion"),
    ]
)

# Create composite Chain
topic_chain = topic_prompt | llm | output_parser | {"topic": RunnablePassthrough()}
goods_chain = good_prompt | llm | output_parser
bads_chain = bad_prompt | llm | output_parser
summary_chain = summary_prompt | llm | output_parser
chain = (
    topic_chain
    | {
        "good": goods_chain,
        "bad": bads_chain,
        "topic": itemgetter("topic"),
    }
    | summary_chain
)

# Invoke chain
answer = chain.invoke({"input": 'common fruits'})
print(answer)

Example code output:

Apples are a nutrient-rich fruit that offer a multitude of benefits including aiding digestion, protecting the heart, lowering the risk of diabetes, strengthening the immune system, assisting with weight loss, safeguarding vision, preventing asthma, fighting cancer, and enhancing memory. However, excessive consumption or improper eating methods can also lead to adverse effects, such as triggering allergies, causing diarrhea, damaging teeth, potentially carrying pesticide residues, affecting normal dietary intake and calcium absorption, increasing the risk of dental cavities, and causing stomach pain. Therefore, while we enjoy the benefits of apples, it is also important to be mindful of the quantity and the correct way to consume them.

By invoking chain.get_graph().print_ascii(), you can view the computational graph structure of a Chain. Using LangSmith, of course, provides clearer tracking of the results at each step of the computation.

LangChain Computational Graph and Trace

Tip: To enable LangSmith, you must apply for a LangChain API key (AK) and set it in the environment variables:

export LANGCHAIN_TRACING_V2="true"

export LANGCHAIN_API_KEY="<Your-LangChain-AK>"

3.5.4 LangGraph

Building upon LCEL, one can indeed describe fairly complex LangChain computational graph structures, but DAGs have an inherent design limitation: they cannot express "loops". In response, the LangChain community introduced a new project, LangGraph, which aims to build computational graph structures on top of LangChain that support loops and span multiple chains, describing more intricate and even automated AI engineering application logic, such as intelligent agents. For specific usage, refer to the LangGraph documentation.
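To give a feel for this, below is a minimal sketch of a graph containing a loop, something a pure DAG cannot express. It is based on my reading of the LangGraph StateGraph API at the time of writing; please treat the exact calls as an assumption and consult the LangGraph documentation for authoritative usage:

from typing import TypedDict

from langgraph.graph import END, StateGraph


# Graph state: a simple counter
class State(TypedDict):
    count: int


def increment(state: State) -> State:
    return {'count': state['count'] + 1}


def should_continue(state: State) -> str:
    # Loop back to the same node until the counter reaches 3
    return 'increment' if state['count'] < 3 else END


graph = StateGraph(State)
graph.add_node('increment', increment)
graph.set_entry_point('increment')
graph.add_conditional_edges('increment', should_continue)

app = graph.compile()
print(app.invoke({'count': 0}))  # {'count': 3}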

LangGraph states that its design philosophy is inspired by Pregel and Beam, providing a computation capability that supports multi-step iteration. This part of the design is very similar to our own work, TuGraph, a graph computing engine that unifies stream, batch, and graph computation. Friends who are interested can visit the TuGraph Analytics project to learn more.

3.6 Memory

Through the Chain, LangChain effectively strings the LLM together with IO components in an orderly "workflow", providing the ability to build complex AI engineering processes. As we know, the text generation service of an LLM has no built-in memory; users must manage the conversation history themselves. Introducing the Memory component therefore significantly expands the capability boundary of AI engineering.

Question and Answer Processing with Memory[6]

3.6.1 Memory Interface

LangChain's BaseMemory interface provides a unified abstraction for Memory (still in Beta as of v0.1.12) and offers implementations of various types of Memory components. Here we choose the simplest implementation type, ConversationBufferMemory.

It's important to note that to apply a Memory component to a Chain, the Chain must be created using the LLMChain subclass.

Example code:

from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, \
    HumanMessagePromptTemplate
from langchain_openai import ChatOpenAI

# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create Prompt
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name='chat_history'),
    HumanMessagePromptTemplate.from_template('{question}')
])

# Create Memory
memory = ConversationBufferMemory(memory_key='chat_history',
                                  return_messages=True)
# Create LLMChain
llm_chain = LLMChain(llm=llm, memory=memory, prompt=prompt)

# Invoke LLMChain
print(llm_chain.predict(question='What is graph computing?'))
print(llm_chain.predict(question='What question did I just ask?'))

Example code output:

Graph computing is a type of computation primarily dealing with data structures called graphs. A graph is made up of nodes (or vertices) and edges, where nodes represent entities and edges represent the relationships between entities. In graph computing, the main problems to solve are how to perform effective computation and analysis on the structure of the graph.

The question you asked was: "What is graph computing?"

Here we can see that creating a Chain with Memory does not yet have a unified LCEL syntax. LLMChain is invoked with predict rather than invoke, and calling invoke directly returns a structured result rather than a plain string, so LLMChain also cannot be piped into StrOutputParser with the pipe operator. These design issues, I surmise, are part of the reason why the Memory module is still in Beta.
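As a side note, you can inspect what the Memory component has accumulated at any time via its load_memory_variables method:

# Print the conversation history recorded by ConversationBufferMemory
print(memory.load_memory_variables({})['chat_history'])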

3.6.2 History Interface

However, LangChain provides the utility class RunnableWithMessageHistory, which attaches a History to an existing Chain and alleviates the problem above to some extent. It requires a get_session_history function (a lambda in the example) to distinguish between sessions, and the session ID must be supplied through the config parameter at call time.

The session history must be an implementation of the History interface BaseChatMessageHistory, and users can choose different storage backends as needed. For simplicity, the example uses a single, global, in-memory ChatMessageHistory.

Example code:

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, \
    HumanMessagePromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create Output Parser
output_parser = StrOutputParser()

# Create Prompt
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    HumanMessagePromptTemplate.from_template("{question}")
])

# Create Chain
chain = prompt | llm | output_parser

# Add History
history = ChatMessageHistory()
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: history,
    input_messages_key="question",
    history_messages_key="chat_history",
)

# Invoke Chain
print(chain_with_history.invoke({'question': 'What is graph computing?'},
                                config={"configurable": {"session_id": None}}))
print(chain_with_history.invoke({'question': 'What did I ask just now?'},
                                config={"configurable": {"session_id": None}}))

The syntax seems a bit more complicated, but compared to the Memory component, the code structure is somewhat clearer – it's better than nothing…

3.7 RAG

Once equipped with memory, AI engineering indeed finds broader application. Yet in specialized domains, an LLM cannot have learned every nuanced professional detail, so it becomes unreliable when answering specialized queries and may even talk nonsense, a phenomenon known as LLM "hallucination".

Retrieval-Augmented Generation (RAG) combines information retrieval technology with the large model: retrieved documents are provided to the model together with the prompt, producing more reliable answers and effectively mitigating the "hallucination" problem in large model inference.

If LangChain is akin to equipping the LLM (the "brain") with "limbs and torso", RAG endows the LLM with the ability to tap into the "human library of knowledge."

Question and Answer Processing Based on RAG

Compared to prompt engineering, RAG offers a richer context and sample data pool. It does not require users to provide extensive background information and can still produce answers that align with user expectations. Compared to model fine-tuning, RAG enhances the timeliness and reliability of question-and-answer content, while also protecting business data privacy to some extent.

However, because every Q&A round involves retrieving data from an external system, RAG has relatively higher response latency, and feeding external knowledge into the prompt consumes a considerable number of model tokens. Users should therefore choose the appropriate technology for their specific application scenario.

Large Language Model Optimization Technologies

By employing the RunnableParallel construct provided by LCEL, we can clearly describe the computation graph structure of RAG. The critical part is binding the Retriever to the context key, so the incoming question is used to query the Vector Store and the retrieved documents are injected into the prompt.

Example code:

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores.faiss import FAISS
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create Prompt
prompt = ChatPromptTemplate.from_template('Based on the context: {context}\nAnswer: {input}')

# Create Output Parser
output_parser = StrOutputParser()

# Simulate Documents
docs = [Document(page_content="TuGraph is an open-source graph database product by Ant Group")]

# Document Embedding
splits = RecursiveCharacterTextSplitter().split_documents(docs)
vector_store = FAISS.from_documents(splits, OpenAIEmbeddings())
retriever = vector_store.as_retriever()

# Create Chain
chain_no_context = RunnablePassthrough() | llm | output_parser
chain_with_context = (
    {"context": retriever, "input": RunnablePassthrough()}
    | prompt | llm | output_parser
)

# Invoke Chain
print(chain_no_context.invoke('Is TuGraph an open-source graph database by Ant Group?'))
print(chain_with_context.invoke('Is TuGraph an open-source graph database by Ant Group?'))

Example code output:

Ant Group's graph database has not publicly disclosed information indicating its open-source status. The open-source state may change over time and with corporate policy, so it is advised to consult the official information from Ant Group or related open-source platforms to obtain the latest and accurate news.

Ant Group's graph database product TuGraph is open-source.

Write and Query on Vector Storage[6]

Combining the example and the process of storing and retrieving in a vector database, let's briefly understand the critical components in RAG.

  • Document Loader: Retrieves document data from external systems. For simplicity, the example directly constructs a test document object. In practice, LangChain offers an abstract interface for the document loader BaseLoader and numerous implementations, which can be specifically selected based on your needs.
  • Text Splitter: Splits documents into chunks to fit the context window of large models. The example uses the commonly employed RecursiveCharacterTextSplitter. For other options, refer to LangChain's TextSplitter interface and implementations.
  • Embedding Model: A text embedding model that provides the capability to encode text into vectors. Text embedding encoding is performed before document writing and query matching. The example utilizes OpenAI's text embedding model service, with other options referenced through LangChain's Embeddings interface and implementations.
  • Vector Store: A vector storage that provides the capability for vector storage and similarity retrieval (ANN algorithms). The vector stores supported by LangChain can be referenced through the VectorStore interface and implementations. The example employs Meta's Faiss vector database, installable locally via pip install faiss-cpu. It is worth noting that for graph databases, the similarity search problem can be transformed into a graph traversal problem, endowing it with more robust knowledge interpretability. The open-source TuGraph database by Ant Group is currently exploring similar technologies.
  • Retriever: A query handler for vector storage. It is generally obtained from a VectorStore via its as_retriever method, and LangChain's abstract interface for retrievers is BaseRetriever. A small configuration sketch follows this list.
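For instance, the retriever from the example above can also be configured and invoked on its own. A small sketch (the k value is just an illustrative choice):

# Configure the retriever to return the top-2 most similar chunks
retriever = vector_store.as_retriever(search_kwargs={'k': 2})

# Retrieve documents relevant to a query
docs = retriever.invoke('Is TuGraph an open-source graph database by Ant Group?')
print([doc.page_content for doc in docs])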

3.8 Tool

"The ability to use tools is a fundamental difference between humans and animals."

To build more powerful AI engineering applications, having the capability of merely generating text, akin to strategizing on paper, is naturally not sufficient. Tools are not only extensions of the "limbs" but also wings of imagination for the "brain". Utilizing tools allows the potential of AI applications to become truly limitless, enabling us to move from “understanding the world” to “changing the world”.

Here I must mention the function-calling capability provided by OpenAI's Chat Completion API (note that this is not the Assistants API's function calling). Tool definitions can be included in the dialogue request via the tools parameter (the older functions parameter is now deprecated). Based on the prompt, the LLM infers which tools are needed and returns the concrete call arguments; the user then triggers the callback of the relevant tool according to the returned tool-call information. In the next section we will see that this invocation action can be handled autonomously by an Agent.

LLM Tools Execution Flow[6]
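For reference, in terms of the raw Chat Completion API from section 3.1.1, a tool-enabled request looks roughly like the sketch below (the tool schema is hand-written here and follows the OpenAI function-calling format; requests and headers are as defined earlier):

# Describe the tool in the request via the `tools` parameter
data = {
    'model': 'gpt-4',
    'messages': [{'role': 'user', 'content': 'What is the temperature in Hangzhou today?'}],
    'tools': [{
        'type': 'function',
        'function': {
            'name': 'get_temperature',
            'description': 'Get the current temperature of a specified city',
            'parameters': {
                'type': 'object',
                'properties': {'city': {'type': 'string'}},
                'required': ['city']
            }
        }
    }]
}

# Call the API: the model does not answer directly, it returns the tool call it wants made
url = 'https://api.openai.com/v1/chat/completions'
response = requests.post(url, json=data, headers=headers)
print(response.json()['choices'][0]['message']['tool_calls'])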

To simplify the code, we use LangChain's @tool decorator to define a test tool function, "get the current temperature of a specified city". It is then bound to the LLM object via the bind_tools method. Note that JsonOutputToolsParser should be used to parse the resulting output.

Example code:

import random

from langchain_core.output_parsers.openai_tools import JsonOutputToolsParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


# Define Tool
@tool
def get_temperature(city: str) -> int:
    """Get the current temperature of a specified city"""
    return random.randint(-20, 50)


# Create LLM
llm = ChatOpenAI(model_name='gpt-4')

# Create JSON Output Parser
output_parser = JsonOutputToolsParser()

# Create Chain
chain = (
    RunnablePassthrough()
    | llm.bind_tools(tools=[get_temperature])
    | output_parser
)

# Invoke Chain
print(chain.invoke('What is the temperature in Hangzhou today?'))

Example code output:

[{'type': 'get_temperature', 'args': {'city': 'Hangzhou'}}]

In fact, LangChain offers extensive support for built-in tools and toolkits. @tool merely provides a streamlined way to create tools; to customize more complex tool behavior, you need to implement the BaseTool interface yourself (a minimal sketch follows). There is also a wealth of implementations under the BaseToolkit interface, covering vector stores, SQL databases, GitHub, and so on, which users can choose or extend according to their own needs.
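As an illustration, a custom tool equivalent to the @tool version above might be sketched by subclassing BaseTool (the class and schema names here are my own):

import random
from typing import Type

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool


class TemperatureInput(BaseModel):
    city: str = Field(description='Name of the city')


class TemperatureTool(BaseTool):
    """Custom tool: get the current temperature of a specified city."""
    name: str = 'get_temperature'
    description: str = 'Get the current temperature of a specified city'
    args_schema: Type[BaseModel] = TemperatureInput

    def _run(self, city: str) -> int:
        # Replace with a real weather service call in practice
        return random.randint(-20, 50)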

3.9 Agent

Artificial General Intelligence (AGI) is almost universally regarded as the ultimate form of AI. By analogy, building intelligent agents (Agents) is currently regarded as the "ultimate form" of AI engineering applications.

3.9.1 What's Agent?

By referencing the definition of an Agent in LangChain, we can get a glimpse of the difference between an Agent and a Chain.

The core idea of agents is to use a language model to choose a sequence of actions to take. In chains, a sequence of actions is hardcoded (in code). In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.

The most characteristic feature of an Agent compared to a Chain is "autonomy": leveraging the reasoning capabilities of the LLM, it can decide on its own which knowledge to acquire and which actions to take, until it reaches the user's ultimate goal.

LangChain Agent Workflow[17]

Therefore, as an intelligent agent, it must have the following core capabilities:

  • Planning: Leveraging the LLM's strong reasoning abilities to plan and self-reflect on task objectives.
  • Memory: Possessing short-term memory (context) and long-term memory (vector storage), as well as rapid knowledge retrieval capabilities.
  • Action: Correctly invoking tools according to the decomposed task requirements in order to achieve the task's goal.
  • Collaboration: Working interactively with other agents to accomplish more complex task objectives.

Core Capabilities of Agent[14]

3.9.2 Create Agent

We continue the example left unfinished in the Tool section, this time with an Agent. Here we create a simple OpenAI tools Agent using the create_openai_tools_agent method, and the AgentExecutor automatically handles the tool invocation. If you want to give the Agent memory, you can still adopt the RunnableWithMessageHistory approach described in the Memory section.

Example code:

import random

from langchain.agents import create_openai_tools_agent, \
    AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, \
    HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

# Create LLM
llm = ChatOpenAI()

# Define Tool
@tool
def get_temperature(city: str) -> int:
    """Get the current temperature of a specified city"""
    return random.randint(-20, 50)


# Create Agent prompt Template
prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template('You are a helpful assistant'),
    MessagesPlaceholder(variable_name='chat_history', optional=True),
    HumanMessagePromptTemplate.from_template('{input}'),
    MessagesPlaceholder(variable_name='agent_scratchpad')
])

# Create Agent
tools = [get_temperature]
agent = create_openai_tools_agent(llm, tools, prompt=prompt)

# Execute Agent
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
print(agent_executor.invoke({'input': 'What is the temperature in Hangzhou today?'})['output'])

Example code output:

> Entering new AgentExecutor chain...

Invoking: get_temperature with {'city': 'Hangzhou'}

16 The temperature in Hangzhou is 16 degrees Celsius today.

> Finished chain.

The temperature in Hangzhou is 16 degrees Celsius today.

LangChain's Hub feature offers a convenient way to manage and share Agent prompt templates. The Agent prompt template in the example code is equivalent to the one defined by hwchase17/openai-tools-agent.

A prompt can be directly referenced through the code prompt = hub.pull("hwchase17/openai-tools-agent").

Definition of hwchase17/openai-tools-agent
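Put together, pulling the shared template and building the Agent looks roughly like this (it assumes the langchainhub package is installed so that hub.pull works):

from langchain import hub

# Pull the shared prompt template from LangChain Hub
prompt = hub.pull('hwchase17/openai-tools-agent')

# Create the Agent with the pulled prompt instead of a hand-written one
agent = create_openai_tools_agent(llm, tools, prompt=prompt)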

4. LangChain Architecture

From API to Agent, we have walked through the entire construction path of AI engineering applications, and I believe everyone now has a deeper understanding of LangChain's design philosophy.

Lastly, let's revisit LangChain's product architecture. Apart from LangServe, which deploys a Chain as a RESTful service and was not covered above, the remaining components should need no further elaboration.

LangChain Architecture[6]
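For completeness, a minimal LangServe sketch (assuming the langserve and fastapi packages are installed) that exposes an LCEL Chain as a RESTful service might look like this:

from fastapi import FastAPI
from langserve import add_routes

# Reuse any LCEL Chain, e.g. the "HelloWorld" chain from section 3.5.1
app = FastAPI()
add_routes(app, chain, path='/graph-qa')

# Start the service with: uvicorn main:app --port 8000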

5. Epilogue

Some of you may wonder why someone who works with graph computing is now dabbling in AI tools. Putting aside grand slogans like "embracing the industry and following trends", purely from a work-need perspective, using AI tools to accelerate content production can indeed save me a significant amount of time and energy. The "leisure" time I gain from this can be excellently spent on parenting...

Online discussion of LangChain, while not overwhelming in volume, is certainly not scarce. As a relative latecomer to the field, having spent many years on the architecture and development of data warehouses, middleware, and cloud products, and more recently on foundational software such as graph computing, I wanted to draw on that engineering experience to understand the design philosophy underlying LangChain and to offer a different perspective from which to observe, experience, and reflect on AI engineering. These past few days, news about "AI programmers" has been everywhere, and a post from Adam Rackis resonated deeply with me: "Be ready to grind & learn deeply. Expertly guiding AI to a good result might be our value-add." Perhaps this is exactly where AI engineering truly proves its worth.

6. Reference

  1. Retrieval-Augmented Generation for Large Language Models: A Survey: https://arxiv.org/abs/2312.10997
  2. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models: https://arxiv.org/pdf/2201.11903.pdf
  3. Tree of Thoughts: Deliberate Problem Solving with Large Language Models: https://arxiv.org/pdf/2305.10601.pdf
  4. ReAct: Synergizing Reasoning and Acting in Language Models: https://arxiv.org/abs/2210.03629
  5. API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs: https://arxiv.org/abs/2304.08244
  6. LangChain Docs: https://python.langchain.com/
  7. OpenAI Docs: https://platform.openai.com/docs
  8. LangGraph Docs: https://python.langchain.com/docs/langgraph
  9. LangChain: https://github.com/langchain-ai/langchain
  10. LangGraph: https://github.com/langchain-ai/langgraph
  11. TuGraph Analytics: https://github.com/TuGraph-family/tugraph-analytics
  12. TuGraph DB: https://github.com/TuGraph-family/tugraph-db
  13. Langchain-Chatchat: https://github.com/chatchat-space/Langchain-Chatchat
  14. LLM Powered Autonomous Agents: https://lilianweng.github.io/posts/2023-06-23-agent/
  15. Emerging Architectures for LLM Applications: https://a16z.com/emerging-architectures-for-llm-applications/
  16. Prompt Engineering Guide: https://www.promptingguide.ai/
  17. Bond Copilot: Unleashing Refinitiv Data Library Search API with AI (LLM): https://developers.lseg.com/en/article-catalog/article/bond-copilot--unleashing-rd-lib-search-api-with-ai-llm-langchain
