DEV Community

Cover image for Integrate Orquesta with LangChain
Olumide Shittu for Orquesta

Posted on

Integrate Orquesta with LangChain

Orquesta provides your product teams with no-code collaboration tooling to experiment, operate, and monitor LLMs and remote configurations within your SaaS. As an LLMOps engineer, using Orquesta, you can easily perform prompt engineering, prompt management, LLMOps, experimentation in production, push new versions directly to production, and have full observability and monitoring.

LangChain is a framework for developing applications powered by large language models. It enables applications that are data-aware to connect a language model to other sources of data, and it allows a language model to interact with its environment.

In this article, you will learn how to integrate Orquesta and LangChain. We will explain how to set a prompt in Orquesta, and request it from LangChain to predict an output. All this is possible with the help of the Orquesta Python SDK and can be implemented in a few easy steps.

Prerequisites

For you to be able to follow along in this tutorial, you will need the following:

  • Jupyter Notebook (or any IDE of your choice).

  • Orquesta Python SDK.

Step 1 - Install SDK and create a client instance

You can easily install the Python SDK and Cohere via the Python package installer pip.

pip install orquesta-sdk
pip install langchain
Enter fullscreen mode Exit fullscreen mode

This will install the Orquesta SDK and LangChain on your local machine, but you need to understand that this command will only install the bare minimum requirements of LangChain. A lot of the value of LangChain comes when integrating it with various model providers, data stores, etc.

Grab your API Key from Orquesta (https://my.orquesta.dev/<workspace-name>/settings/developers ) which will be used to create a client instance.

import os
import time
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.prompts import OrquestaPromptMetrics, OrquestaPromptMetricsEconomics
from orquesta_sdk.helpers import orquesta_openai_parameters_mapper

from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import get_openai_callback
Enter fullscreen mode Exit fullscreen mode

Explanation:

  • Import the time module to calculate the total time for the program to run.

  • The OrquestaClient and the OrquestaClientOptions classes which are already defined in the orquesta_sdk module are imported.

  • To be able to log all the interactions with the LLM, we use the OrquestaPromptMetrics class.

  • Orquesta has many helper functions that map and interface between Orquesta and specific LLM provider, for this integration, we will make use of the orquesta_openai_parameters_mapper helper.

  • The AIMessage class is a message from an AI, HumanMessage is a message from a human, and the SystemMessage is a message for priming AI behaviour, usually passed in as the first of a sequence of input messages.

  • The ChatOpenAI class is an OpenAI Chat large language models API. To be able to use it, you should have the OpenAI Python package installed and the environment variable OPENAI_API_KEY set with your API key.

# Initialize Orquesta client
from orquesta_sdk import OrquestaClient, OrquestaClientOptions

api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)
Enter fullscreen mode Exit fullscreen mode
  • An instance of the OrquestaClient class is created and initialized with the previously configured options object. This client instance can now interact with the Orquesta service using the provided API key for authentication.

  • In the next line of code, we create the instance of the OrquestaClientOptions and configure it with the api_key and the ttl (Time to Live) in seconds for the local cache; by default, it is 3600 seconds (1 hour).

Step 2 - Set up a chat prompt

Set up your chat prompt in the Orquesta dashboard. Make sure it is a chat prompt and not a completion prompt. Set your prompt key and domain (if you have any), and Publish.

Set up a chat prompt

Once that is set up, create your first chat prompt, give it a name prompt, and add all the necessary information. Click on Save.

chat prompt

As you can see from the screenshot, the prompt message is “What is a good name for a company that makes good beard oil”, and the model is openai/gpt-3.5-turbo. Click Save.

Step 3 - Request a variant from Orquesta

To request a specific variant from your newly created prompt, the Code Snippet Generator can easily generate the code for a prompt variant by right-clicking on the prompt or opening the Code Snippet component.

Request a variant from Orquesta

Copy the code snippet and paste it into your editor.

prompt = client.prompts.query(
    key="customer-support-chat",
    context={"environments": ["test"]},
    variables={"customer_name": ""},
    metadata={"chain-id": "js2938js2ja"},
)
Enter fullscreen mode Exit fullscreen mode

Step 4 - Transform the message into LangChain format

The prompt from Orquesta is transformed into a format to pass into LangChain.

# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')

messages = []

for message in prompt.value.get("messages", []):
    role = message.get("role")
    content = message.get("content")

    if role == "system":
        messages.append(SystemMessage(content=content))
    elif role == "user":
        messages.append(HumanMessage(content=content))
    elif role == "assistant":
        messages.append(AIMessage(content=content))

parameters = orquesta_openai_parameters_mapper(prompt.value)

chat = ChatOpenAI(
    temperature=parameters.get("temperature"),
    max_tokens=parameters.get("max_tokens"),
    openai_api_key="api_key",
)

with get_openai_callback() as cb:
    result = chat(messages)

    # End time of the completion request
    end_time = time.time()
    print(f"End time: {end_time}")

    print(result.content)

    # Calculate the difference (latency) in milliseconds
    latency = (end_time - start_time) * 1000
    print(f'Latency is: {latency}')

    economics = OrquestaPromptMetricsEconomics(
        total_tokens=cb.total_tokens,
        completion_tokens=cb.completion_tokens,
        prompt_tokens=cb.prompt_tokens,
    )

    # Report the metrics back to Orquesta
    metrics = OrquestaPromptMetrics(
        economics=economics,
        llm_response=result.content,
        latency=latency
    )

    prompt.add_metrics(metrics=metrics)
Enter fullscreen mode Exit fullscreen mode

Explanation

  • Initialize an empty list named messages, which will store message objects.

  • A for loop iterates through the list of messages obtained from prompt.value. If no messages are found, an empty list is used as a default value.

  • Within the loop, the code extracts the role and content attributes from each message.

  • Depending on the role of the message ("system", "user", or "assistant"), a message object is created and appended to the messages list.

  • Pass in the value of the prompt into the Orquesta OpenAI helper and store them in the parameters variable.

  • A ChatOpenAI object is created with specified parameters, including the temperature and maximum tokens, which affect the behaviour of the language model. The openai_api_key is provided as an argument.

  • The chat object is invoked with the messages list as an argument. This processes the messages using the language model and generates a response.

Finally, the content of the response generated by the language model is printed to the console.

Final predictions

The response from the LLM is “Beard Bliss”.

Wrap up

In conclusion, the integration of Orquesta SDK with LangChain brings forth a powerful synergy that amplifies the capabilities of both platforms, and you have been able to set up a prompt in Orquesta, create a client, connect with LangChain, and get a response from the LangChain OpenAI API.

Links

Check out Orquesta documentation.

Full code

Here is the full code for this tutorial.

import os
import time
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.prompts import OrquestaPromptMetrics, OrquestaPromptMetricsEconomics
from orquesta_sdk.helpers import orquesta_openai_parameters_mapper

from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import get_openai_callback

# Initialize Orquesta client
from orquesta_sdk import OrquestaClient, OrquestaClientOptions

api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)

prompt = client.prompts.query(
    key="customer-support-chat",
    context={"environments": ["test"]},
    variables={"customer_name": ""},
    metadata={"chain-id": "js2938js2ja"},
)

# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')

messages = []

for message in prompt.value.get("messages", []):
    role = message.get("role")
    content = message.get("content")

    if role == "system":
        messages.append(SystemMessage(content=content))
    elif role == "user":
        messages.append(HumanMessage(content=content))
    elif role == "assistant":
        messages.append(AIMessage(content=content))

parameters = orquesta_openai_parameters_mapper(prompt.value)

chat = ChatOpenAI(
    temperature=parameters.get("temperature"),
    max_tokens=parameters.get("max_tokens"),
    openai_api_key="api_key",
)

with get_openai_callback() as cb:
    result = chat(messages)

    # End time of the completion request
    end_time = time.time()
    print(f"End time: {end_time}")

    print(result.content)

    # Calculate the difference (latency) in milliseconds
    latency = (end_time - start_time) * 1000
    print(f'Latency is: {latency}')

    economics = OrquestaPromptMetricsEconomics(
        total_tokens=cb.total_tokens,
        completion_tokens=cb.completion_tokens,
        prompt_tokens=cb.prompt_tokens,
    )

    # Report the metrics back to Orquesta
    metrics = OrquestaPromptMetrics(
        economics=economics,
        llm_response=result.content,
        latency=latency
    )

    prompt.add_metrics(metrics=metrics)
Enter fullscreen mode Exit fullscreen mode

Top comments (0)