Kris Dover

Posted on Dec 14, 2022 • Edited on Dec 16, 2022

Migrating Twilio Flex Webchat from Autopilot to AWS Lex

#learning #career #discuss

Twilio Autopilot is a Natural Language Understanding (NLU) service which I suspect, given Twilio's existing relationship with IBM, is just a façade over the Watson Digital Assistant. As part of a customer support platform, we integrated the default Flex Webchat Studio Flow with Autopilot to provide a simple chatbot. It deflects customer requests to our knowledge base articles so customers can self-serve, and it handles pre-engagement (e.g. selecting an issue to follow up on) before handing off to a customer support representative. This all worked very well, up until we were forced to find a replacement for Autopilot.

In September Twilio announced the end of life of their Autopilot conversational AI Platform:

Twilio's Autopilot product will officially shut down on August 25, 2023. After this date, Autopilot APIs will no longer function, and starting February 25, 2023, Customer Support will cease responding to new requests.

While we don’t have a single in-house replacement product available at this time, we have a number of recommended solutions below for different use cases. For continuity of service, all Autopilot users must migrate their integrations prior to the August 25, 2023 end of service date.

While they haven't provided a replacement or migration strategy, they have given a list of suggested alternatives, from which we chose AWS's Lex NLU service. The only issue was that the AWS Lex channel integration for Twilio is currently limited to SMS.

So we started by doing an investigation into Twilio Webchat's current system architecture to understand what bits would need replacing to support integrating with Lex.

Architecture

Twilio Webchat with Autopilot (existing)

The following is an architectural diagram of how the Twilio Webchat (with Autopilot) currently works, based on the documentation available at How Messaging Works in Twilio Flex. One point to note is that a Twilio Studio flow is at the heart of this design. It orchestrates the initial chat session with the Autopilot chatbot up until the agent hand-off, after which it is removed from the chat channel and the Flex Agent Desktop joins.

Twilio Webchat with Lex Chatbot (new)

This new webchat design uses a new Webhook Flex Flow config to establish the customer chat session with a custom Lex integration webhook (implemented using AWS Lambda), which replaces the Studio Webchat Flow. It is this AWS Lambda webhook which integrates the Twilio Programmable Chat channel with the Lex chatbot and orchestrates the subsequent agent hand-off.

Twilio Webhook Integration with AWS Lambda

Before getting into the code it is worth understanding how the custom webhook is configured into Twilio. As shown in the screenshot below, we modified the stock WebChat Flex Flow and changed the Integration Type from Studio to Webhook, then configured the URL to the AWS Lambda (sitting behind API Gateway) into the Webhook URL field:

Webhook requests from Twilio are URL encoded (i.e. application/x-www-form-urlencoded) so we use a custom Velocity Templating Language (VTL) mapping in API Gateway so that the Lambda receives JSON:
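The effect of that mapping can be sketched in TypeScript for readers who have not worked with VTL. This is only an illustration of the transformation, not the mapping template itself; the function name `formBodyToJson` and the sample field values are made up for the example.

```typescript
// Hypothetical helper (not part of the original Lambda): shows the effect of
// the API Gateway mapping, i.e. a URL-encoded webhook body becoming JSON.
function formBodyToJson(body: string): Record<string, string> {
  const result: Record<string, string> = {};
  // URLSearchParams decodes percent-escapes and treats "+" as a space
  new URLSearchParams(body).forEach((value, key) => {
    result[key] = value;
  });
  return result;
}

// Illustrative payload only; real Twilio requests carry more fields
const sampleBody = "EventType=onMessageSent&Body=Hello+there%21&ChannelSid=CHxxxx";
console.log(formBodyToJson(sampleBody));
```

Doing the conversion in API Gateway rather than in the Lambda keeps the handler signature simple, since it can accept a typed JSON event directly.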

Lambda Webhook Code Explanation

You'll need the following packages installed to get the code working:

npm install @aws-sdk/client-lex-runtime-v2 twilio aws-lambda
npm install -D @types/aws-lambda @twilio-labs/serverless-runtime-types

It is also assumed that the configuration parameters for the Lambda will be passed as environment variables, but for productionised code you might want to use the AWS Parameter Store for any secrets.

A full TypeScript code sample is available in a Gist, but the following is an explanation of the key bits:

Firstly, our Lambda function is configured in API Gateway to be invoked asynchronously (i.e. API Gateway won't wait for the Lambda's result), since Twilio expects the webhook to respond in less than 5 seconds. That is why the handler has a Promise<void> return type:

export async function handler(event: ChatMessageEvent): Promise<void> {
   ....
}

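This is achieved using the following API Gateway Request Header config (an X-Amz-Invocation-Type integration header with the value 'Event' is what requests asynchronous Lambda invocation):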

As a security measure, we validate the signatures in the headers of all Twilio Webhook requests using our TWILIO_AUTH_TOKEN:

  if (
    !twilio.validateRequest(
      process.env.TWILIO_AUTH_TOKEN!,
      request.header["X-Twilio-Signature"]!,
      `https://${request.domainName}${request.requestPath}`,
      chatEvent
    )
  ) {
    console.warn(
      "X-Twilio-Signature validation failed:",
      request.header["X-Twilio-Signature"]
    );
    return;
  }
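
To make it clearer what `validateRequest` is checking, here is a rough sketch of Twilio's documented signing scheme for form-encoded webhooks. The function name `computeTwilioSignature` is hypothetical, and in real code you should keep using the helper library rather than this sketch.

```typescript
import { createHmac } from "node:crypto";

// Sketch of the signing scheme (assumed from Twilio's webhook security docs):
// 1. start with the full request URL, exactly as Twilio called it
// 2. append every POST parameter name and value, sorted by name
// 3. HMAC-SHA1 the result with the auth token and base64-encode the digest
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Illustrative values only
const signature = computeTwilioSignature(
  "example-auth-token",
  "https://example.com/webhook",
  { EventType: "onMessageSent", Body: "Hi" }
);
console.log(signature.length); // a base64-encoded SHA-1 digest is 28 characters
```

Because the URL is part of the signed data, the handler has to rebuild it exactly as Twilio saw it, which is why the code above assembles it from the API Gateway domain name and request path.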

Currently, Twilio only invokes our webhook with onMessageSent events, but it's best to future-proof against this changing by exiting early for unsupported EventTypes.

  if (chatEvent.EventType !== "onMessageSent") {
    return;
  }

This means that the Lex chatbot won't be invoked when the chat is first initiated (i.e. there is no onChannelAdded event), but rather after the first message is sent by the user. In our customised WebChat UI we work around this by auto-sending a hidden "Hi" message on the user's behalf when the chat starts.

We retrieve the chat channel attributes for the request using Twilio's Node Helper Library because it contains the chat pre-engagement details and chat user name:

    const channelCtx = twilio(chatEvent.AccountSid, process.env.TWILIO_AUTH_TOKEN!, {
      edge: "sydney", // Twilio edge location; pick the one closest to your AWS region
    })
      .chat.services(chatEvent.InstanceSid)
      .channels(chatEvent.ChannelSid);

    const channel = await channelCtx.fetch();

    const channelAttribs = JSON.parse(channel.attributes) as {
      // Flex WebChat UI Attributes
      from: string;
      pre_engagement_data: {
        friendlyName?: string;
        question?: string;
        location?: string;
      };
      // Custom attributes
      botBusy: boolean;
    };

Here we set up the LexV2 RecognizeTextCommand with the IDs of our chatbot. For the purposes of this article I assume you already have a chatbot or know how to create one, as there is already good documentation on how to get started with AWS Lex V2.

Things to note are:

we use the unique chat channelSid as our Lex session ID,
pass some of the chatEvent and chatChannel attributes through as Lex requestAttributes so that our Lex Code Hook lambda can use them to retrieve/modify the chat channel, and
pass the entire user chat message as the text of the Lex RecognizeTextCommand.

    const requestAttributes: LexRequestAttributes = {
      channelType: "web",
      channelSid: chatEvent.ChannelSid,
      userIdentifier: chatEvent.ClientIdentity,
      name: channelAttribs?.from,
      subject: (channelAttribs?.pre_engagement_data?.question || "").replace(
        /\n+/g,
        ""
      ),
      instanceSid: chatEvent.InstanceSid,
      accountSid: chatEvent.AccountSid,
    };

    const command: RecognizeTextCommandInput = {
      botId: process.env.BOT_ID,
      localeId: process.env.BOT_LOCALE,
      botAliasId: process.env.BOT_ALIAS_ID,
      sessionId: chatEvent.ChannelSid,
      text: chatEvent.Body,
      requestAttributes,
    };
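
Since `botId`, `botAliasId` and `localeId` come from environment variables, a missing value would only surface as an error from Lex at runtime. One way to fail earlier is sketched below. The helper names `requireSetting` and `buildRecognizeTextInput`, and the local `RecognizeTextInput` interface (a stand-in for the SDK's `RecognizeTextCommandInput`), are assumptions for illustration and are not taken from the Gist.

```typescript
// Local stand-in for the fields of RecognizeTextCommandInput used in this article
interface RecognizeTextInput {
  botId: string;
  botAliasId: string;
  localeId: string;
  sessionId: string;
  text: string;
  requestAttributes: Record<string, string>;
}

type Settings = Record<string, string | undefined>;

// Hypothetical helper: read a required setting or fail fast with a clear error
function requireSetting(env: Settings, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required setting: ${name}`);
  }
  return value;
}

// Hypothetical helper: build the Lex request, using the channel SID as session ID
function buildRecognizeTextInput(
  env: Settings,
  channelSid: string,
  text: string,
  requestAttributes: Record<string, string>
): RecognizeTextInput {
  return {
    botId: requireSetting(env, "BOT_ID"),
    botAliasId: requireSetting(env, "BOT_ALIAS_ID"),
    localeId: requireSetting(env, "BOT_LOCALE"),
    sessionId: channelSid,
    text,
    requestAttributes,
  };
}
```

In the handler this would be called with `process.env` as the first argument; passing the settings in explicitly also makes the function easy to unit test.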

Now we send the Lex command and await the response:

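    // declared outside the try block so the finally block can read it
    let response: RecognizeTextCommandOutput | undefined;
    try {
      console.log("sending: ", command);
      response = await lexClient.recognizeText(command);
      console.log(response);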
    } finally {
      // apply any channel status returned by the chatbot and
      // clear bot busy indicator (don't await)
      const { status } = response?.sessionState?.sessionAttributes || {};
      clearBotBusyPromise = channel.update({
        attributes: JSON.stringify({
          ...channelAttribs,
          botBusy: false,
          status,
        }),
      });
    }

You'll also notice that we set any status returned by the chatbot in the sessionAttributes on the chat channel. This allows the chatbot to signal the end of the chat engagement to the Flex WebChat UI at any time (which changes state to disable further messaging), simply by setting status: 'INACTIVE'.
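
One subtlety in the snippet above: when Lex returns no status, `status` is undefined and `JSON.stringify` drops the key, so any status previously stored on the channel is lost. If you would rather keep the stored value in that case, a variant might look like the sketch below. The function name `nextChannelAttributes` and the `ChannelAttribs` interface are hypothetical.

```typescript
// Minimal stand-in for the channel attributes used in this article
interface ChannelAttribs {
  botBusy?: boolean;
  status?: string;
  [key: string]: unknown;
}

// Hypothetical helper: compute the attributes JSON to store on the channel.
// It always clears botBusy, and only overwrites status when the bot sent one.
function nextChannelAttributes(
  current: ChannelAttribs,
  botStatus?: string
): string {
  const next: ChannelAttribs = { ...current, botBusy: false };
  if (botStatus !== undefined) {
    next.status = botStatus;
  }
  return JSON.stringify(next);
}

// Illustrative usage: the bot ends the engagement
console.log(nextChannelAttributes({ from: "Jane", botBusy: true }, "INACTIVE"));
```

Whether to preserve or reset the status is a design choice; the original code resets it on every bot response.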

At this point, if everything went according to plan we should have a response from our Lex chatbot which we just need to format and send back to our chat user as a message:

    if (response.messages) {
      const from = "Bot";
      for (const msg of response.messages) {
        const { contentType, content, imageResponseCard } = msg;
        if (contentType === "CustomPayload" && content) {
          // display the custom message
          const createOption: MessageListInstanceCreateOptions =
            JSON.parse(content);
          await channelCtx.messages.create({
            from,
            ...createOption,
          });
        } else if (contentType === "PlainText") {
          // display a text message
          await channelCtx.messages.create({
            from,
            body: content,
          });
        } else if (contentType === "ImageResponseCard") {
          // display quick-response buttons
          await channelCtx.messages.create({
            from,
            ...responseCardToTwilioMessage(imageResponseCard!),
          });
        }
      }
    }

This code handles multiple returned Lex messages and covers the three (3) contentTypes currently used in our chatbot:

PlainText - this is the most basic and just consists of plain text which we can directly return as the body of a new Twilio chat message.
ImageResponseCard - this consists of a textual title and one or more option button text/value pairs. We return the textual version of title concatenated with the options, but also add the options into the message attributes so they can be used by our custom Flex WebChat UI code to render quick-response option buttons to the chat user (in place of the textual message body).
CustomPayload - this consists of just JSON data which we parse and pass directly to messages.create(), which allows our chatbot Code Hook to have direct control over all attributes of the created message.
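
The `responseCardToTwilioMessage` helper is not shown above, so here is a minimal sketch of what such a conversion could look like based on the description of ImageResponseCard. The local `ResponseCard` and `CardButton` interfaces stand in for the Lex SDK types, and the `options` attribute name and numbered-list body format are assumptions; the version in the Gist may differ.

```typescript
// Local stand-ins for the Lex V2 ImageResponseCard shape (title plus buttons)
interface CardButton {
  text: string;
  value: string;
}
interface ResponseCard {
  title: string;
  buttons?: CardButton[];
}

// Sketch: build the Twilio message body and attributes for a response card.
// The body is a plain-text fallback; the attributes carry the options so a
// custom WebChat UI can render quick-response buttons instead.
function responseCardToTwilioMessage(card: ResponseCard): {
  body: string;
  attributes: string;
} {
  const options = (card.buttons ?? []).map(({ text, value }) => ({ text, value }));
  const lines = [card.title, ...options.map((o, i) => `${i + 1}. ${o.text}`)];
  return {
    body: lines.join("\n"),
    // Twilio chat message attributes are stored as a JSON string
    attributes: JSON.stringify({ options }),
  };
}
```

Keeping a readable text body means the message still makes sense in any client that ignores the attributes, such as the stock Flex Agent Desktop transcript.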

Finally, the last bit of code in our Lambda handles unregistering the webhook during the execution of an agent handoff Lex chatbot intent, since we don't want the bot responding after an agent is added to the chat channel:

    const { intent } = response.sessionState || {};
    if (intent?.name === "agent-handoff" && intent?.state === "Fulfilled") {
      // remove this integration webhook for agent hand-off
      await channelCtx.webhooks(chatEvent.WebhookSid).remove();
    }

Lex Chatbot Code Hooks

Twilio Autopilot had a concept of Actions, which describe what should be done in response to the NLU engine matching a user message to a given Task. Handoff was one such pre-defined action, via which a chat could be handed off from the Autopilot chatbot to an agent. Redirect was another such action, which allowed custom Twilio Serverless Functions to be invoked as part of completing an Autopilot Task.

In AWS Lex, the Code Hook is a generalised analogue of Autopilot's Actions, just as an Intent is the analogue of a Task. We can configure a given Lambda function to run as a Code Hook under the Alias Language settings:

Lambda Code Hook Explanation

Some sample code demonstrating how to create a Code Hook to do agent-handoff or end a chat session is also provided in a Gist, which again assumes you have created the associated intents in Lex (not covered here).

At the heart of the Code Hook is a switch on the invoked chatbot intent, which either creates a Twilio TaskRouter chat task for agent-handoff or sets an inactive status for end-chat:

export const handler: LexV2Handler = async (event): Promise<LexV2Result> => {
  console.log("event:", event);

  const {
    sessionState: { intent },
  } = event;

  if (intent.name === "agent-handoff") {
    const { accountSid, instanceSid, channelSid: chatChannelSid } =
      event.requestAttributes as LexRequestAttributes;

    // create agent chat task
    await twilio(accountSid, process.env.TWILIO_AUTH_TOKEN!, {
      edge: "sydney",
    })
      .taskrouter.workspaces(process.env.TASKROUTER_WORKSPACE_SID!)
      .tasks.create({
        attributes: JSON.stringify(event.requestAttributes),
        taskChannel: "chat",
        workflowSid: process.env.TASKROUTER_WORKFLOW_SID,
      });

    return plainTextResponse(event, "I'm connecting you with an agent now.");
  }
  if (intent.name === "end-chat") {
    const sessionAttribs =
      intent.confirmationState === "Confirmed"
        ? { status: "INACTIVE" }
        : undefined;

    return codeHookResult(event, "Fulfilled", sessionAttribs);
  }

  throw new Error(`Unsupported intent: ${intent.name}`);
};
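
The `plainTextResponse` and `codeHookResult` helpers are defined in the Gist rather than shown here. As a rough idea of what they need to return, the following sketch builds a Lex V2 code hook response that closes the dialog. The local interfaces are simplified stand-ins for the `aws-lambda` Lex V2 types, and the bodies are an assumption about how such helpers could be written, not a copy of the Gist.

```typescript
// Simplified stand-ins for the Lex V2 code hook event and result types
interface HookIntent {
  name: string;
  state: string;
  [key: string]: unknown;
}
interface HookEvent {
  sessionState: {
    intent: HookIntent;
    sessionAttributes?: Record<string, string>;
  };
}
interface HookMessage {
  contentType: "PlainText";
  content: string;
}
interface HookResult {
  sessionState: {
    sessionAttributes: Record<string, string>;
    dialogAction: { type: "Close" };
    intent: HookIntent;
  };
  messages?: HookMessage[];
}

// Sketch: close the dialog with the given intent state, merging any extra
// session attributes (e.g. { status: "INACTIVE" }) into the existing ones.
function codeHookResult(
  event: HookEvent,
  state: string,
  sessionAttribs?: Record<string, string>,
  messages?: HookMessage[]
): HookResult {
  return {
    sessionState: {
      sessionAttributes: { ...event.sessionState.sessionAttributes, ...sessionAttribs },
      dialogAction: { type: "Close" },
      intent: { ...event.sessionState.intent, state },
    },
    ...(messages ? { messages } : {}),
  };
}

// Sketch: fulfil the intent and reply with a single plain-text message
function plainTextResponse(event: HookEvent, content: string): HookResult {
  return codeHookResult(event, "Fulfilled", undefined, [
    { contentType: "PlainText", content },
  ]);
}
```

Note that the agent-handoff response has to report the intent as Fulfilled, because the webhook Lambda shown earlier only unregisters itself when it sees `intent.state === "Fulfilled"`.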

Conclusion

Having fully transitioned from Twilio Autopilot to AWS Lex V2, the chat user experience has been great, and the change in performance is only slightly noticeable. The difference is mainly a reduction in chatbot responsiveness, caused by our use of the default Twilio Region (i.e. where the data processing happens), which is located on the US East Coast, whereas our AWS (Lex) infrastructure runs out of Sydney (i.e. ap-southeast-2, Asia Pacific). Using Twilio's Sydney edge location has helped to improve this tremendously, and you have probably noticed this configuration at the end of the webhook URLs configured into Twilio and in the Twilio API calls in the Lambdas.

So if you've been considering a similar migration but have been struggling to find information on how to do it, I hope you've found this guide useful. And please feel free to leave any thoughts, improvements, or even corrections you might have in the comments section. Bye for now.