Jimmy Dahlqvist for AWS Community Builders

Posted on Dec 7, 2023

Serverless and event-driven translation bot

#aws #serverless #ai #stepfunctions

In a talk I recently gave at a conference I did some live coding on stage. In that session I created a translation service using AWS and Slack, where you could do translations directly from Slack using a slash command. You also got an audio file where Polly read the translation back to you.

The entire setup was serverless and didn't use much code. Instead I used the SDK and Service integrations in StepFunctions (like I always tend to do), and EventBridge to create an event-driven architecture.

In this post I will explain the solution and how it was set up end to end. All the source code is available on GitHub as well.

Architecture

First of all, let us do an overview of the architecture and the patterns that I use, before we do a deep dive.

In this solution we will combine the best of two worlds: orchestration and choreography. We have four domain services, each responsible for a certain task. They emit domain events, so the Saga pattern is choreographed: services are invoked in different phases, in response to domain events. Within each service, the steps are orchestrated by StepFunctions to run in a certain order.

If we now add some more details to the image above and start laying out the services we use, we have our hook that Slack will invoke on our slash command, which is implemented with API Gateway and Lambda. The translation service is implemented with a StepFunction and Amazon Translate. The text to voice service is also set up with a StepFunction and Amazon Polly. The final service is responsible for communicating back to Slack with both the translated text and the generated voice file.

The services are invoked and communicate in an event-driven way over EventBridge event-buses, both a custom bus and the default bus. The default bus relays messages from S3 when objects are created.
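To make the event flow concrete, here is a minimal sketch in Python of the three domain events that travel over the custom bus. The source and detail-type values are taken from the templates later in this post; the helper name `build_entry`, the request id, and the bus and bucket names (derived from the template defaults) are illustrative assumptions only.

```python
import json

# Source shared by all domain events in this post.
SOURCE = "Translation"


def build_entry(detail_type, detail, bus_name):
    """Build one entry in the shape the EventBridge PutEvents API expects."""
    return {
        "Source": SOURCE,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),
        "EventBusName": bus_name,
    }


# Hypothetical sample values, for illustration only.
bus = "eventdriven-translation-eventbus"
bucket = "eventdriven-translation-translation-bucket"
request_id = "abc123"

saga = [
    # 1. Emitted by the slash command hook, starts the saga.
    build_entry(
        "TranslateText",
        {"Languages": [{"Code": "sv"}], "Text": "Hello", "RequestId": request_id},
        bus,
    ),
    # 2. Emitted by the translation service, once per target language.
    build_entry(
        "TextTranslated",
        {
            "TextBucket": bucket,
            "TextKey": f"{request_id}/sv/text.txt",
            "Language": "sv",
            "RequestId": request_id,
        },
        bus,
    ),
    # 3. Emitted by the text to voice service when the mp3 is ready.
    build_entry(
        "VoiceGenerated",
        {
            "VoiceBucket": bucket,
            "VoiceKey": f"{request_id}/sv/voice_Joanna.mp3",
            "Language": "sv",
            "Voice": "Joanna",
        },
        bus,
    ),
]

for entry in saga:
    print(entry["DetailType"], "->", entry["Detail"])
```

Each service only needs to know the event it reacts to and the event it emits, which is what keeps the services decoupled.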

With that short overview, let us dive deep into the different services, events, logic, and infrastructure.

Common infrastructure

In the common infrastructure we will create the custom EventBridge event-bus, an S3 bucket that we use as intermediate storage for translated text and generated voice, and a SecretsManager secret for the Slack OAuth token.


AWSTemplateFormatVersion: "2010-09-09"
Description: Event-Driven Translation Common Infra
Parameters:
  Application:
    Type: String
    Description: Name of owning application
    Default: eventdriven-translation

Resources:
  ##########################################################################
  #   INFRASTRUCTURE
  ##########################################################################
  TranslationBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      BucketName: !Sub ${Application}-translation-bucket
      NotificationConfiguration:
        EventBridgeConfiguration:
          EventBridgeEnabled: true
      Tags:
        - Key: Application
          Value: !Ref Application

  EventBridgeBus:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Sub ${Application}-eventbus

  SlackBotSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Description: Slack bot oauth token
      Name: /slackbot
      Tags:
        - Key: Application
          Value: !Ref Application

##########################################################################
#  Outputs                                                               #
##########################################################################
Outputs:
  TranslationBucket:
    Description: Name of the bucket to store translations in
    Value: !Ref TranslationBucket
    Export:
      Name: !Sub ${AWS::StackName}:TranslationBucket
  EventBridgeBus:
    Description: The EventBridge EventBus
    Value: !Ref EventBridgeBus
    Export:
      Name: !Sub ${AWS::StackName}:EventBridgeBus
  SlackBotSecret:
    Description: The Slack Bot Secret
    Value: !Ref SlackBotSecret
    Export:
      Name: !Sub ${AWS::StackName}:SlackBotSecret

With this common infrastructure created we can move on.

Slack Integration

Next let's create the Slack Application and the API that the application will call. We'll also create the Notification service that will send messages back to our Slack channel.

Slash command hook API

This will create the API that Slack will send the slash commands to. We will create this using API Gateway with a Lambda function integration, where we will parse the command, send a response to Slack, and post an event onto our custom event-bus that will be the start of our translation Saga. This is only a small part of the architecture.

WARNING!
In this API setup there is no authorization! If you build this for anything other than a demo, make sure you include authorization on your API.
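One way to add a basic check, which is not part of the demo code, is to verify the signature Slack attaches to every request. The sketch below follows Slack's documented v0 signing scheme: an HMAC-SHA256 over `v0:<timestamp>:<raw body>` using the app's signing secret. The function name and the five minute tolerance are assumptions for illustration.

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret, timestamp, body, signature, now=None):
    """Return True if the signature matches Slack's v0 request signing scheme."""
    now = time.time() if now is None else now
    # Reject old requests to limit replay attacks (five minutes tolerance).
    if abs(now - int(timestamp)) > 60 * 5:
        return False
    base_string = f"v0:{timestamp}:{body}".encode("utf-8")
    expected = "v0=" + hmac.new(
        signing_secret.encode("utf-8"), base_string, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, signature)
```

In the Lambda function, `timestamp` and `signature` would come from the `X-Slack-Request-Timestamp` and `X-Slack-Signature` headers, and `body` is the raw request body after base64 decoding but before URL decoding.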


AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Event-driven Slack Bot

Parameters:
  Application:
    Type: String
    Description: Name of the application
  CommonInfraStackName:
    Type: String
    Description: Name of the Common Infra Stack

Globals:
  Function:
    Runtime: python3.9
    Timeout: 30
    MemorySize: 1024

Resources:
  ##########################################################################
  #  WEBHOOK INFRASTRUCTURE                                                #
  ##########################################################################

  ##########################################################################
  #  WebHook HTTP                                                          #
  ##########################################################################
  SlackHookHttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      CorsConfiguration:
        AllowMethods:
          - GET
        AllowOrigins:
          - "*"
        AllowHeaders:
          - "*"

  ##########################################################################
  #  HTTP API Slackhook Lambdas                                           #
  ##########################################################################
  SlackhookFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/SlackhookLambda
      Handler: slackhook.handler
      Events:
        SackhookPost:
          Type: HttpApi
          Properties:
            Path: /slackhook
            Method: post
            ApiId: !Ref SlackHookHttpApi
      Policies:
        - EventBridgePutEventsPolicy:
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
      Environment:
        Variables:
          EVENT_BUS_NAME:
            Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus

##########################################################################
#  Outputs                                                               #
##########################################################################
Outputs:
  ApiEndpoint:
    Description: HTTP API endpoint URL
    Value: !Sub https://${SlackHookHttpApi}.execute-api.${AWS::Region}.amazonaws.com

In the Lambda function we'll decode the slash command, which arrives as a URL-encoded, base64-encoded string of key:value pairs. We create the event that we need, post it on our event-bus, and then return a 200 OK with a message to Slack.


import json
import base64
from urllib import parse as urlparse
import boto3
import os
import re

def handler(event, context):
    # Slack posts the command as a URL-encoded form that arrives base64
    # encoded, so decode it into a dict of key:value pairs.
    msg_map = dict(
        urlparse.parse_qsl(base64.b64decode(str(event["body"])).decode("ascii"))
    )
    text = msg_map.get("text", "err")

    # The text to translate is the first double-quoted part of the command.
    translateText = re.findall(r'"(.*?)"', text)[0]

    # What remains is an optional "to" followed by comma-separated language codes.
    text = text.replace(translateText, "")
    text = text.replace('"', "")
    text = text.replace("to", "").strip()
    languages = text.split(",")

    languageArray = []
    for language in languages:
        language = language.strip()
        languageArray.append(
            {"Code": language},
        )

    commandEvent = {
        "Languages": languageArray,
        "Text": translateText,
        "RequestId": event["requestContext"]["requestId"],
    }

    # Post the TranslateText event that starts the saga.
    client = boto3.client("events")
    client.put_events(
        Entries=[
            {
                "Source": "Translation",
                "DetailType": "TranslateText",
                "Detail": json.dumps(commandEvent),
                "EventBusName": os.environ["EVENT_BUS_NAME"],
            },
        ]
    )
    # Acknowledge the command; the results are posted to the channel later.
    return {"statusCode": 200, "body": "Translating........"}
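To see the decoding step in isolation, the following sketch builds a body of the same shape and reverses the two encodings. The sample command text is a made-up example, and a real Slack request carries more fields than the two shown here.

```python
import base64
from urllib import parse as urlparse

# Hypothetical form body, similar in shape to what Slack posts for a slash command.
raw_body = urlparse.urlencode(
    {"command": "/translate", "text": '"Hello world" to sv,de'}
)
# The body reaches the Lambda function base64 encoded.
event = {"body": base64.b64encode(raw_body.encode("ascii")).decode("ascii")}

# Reverse the two encodings: base64 first, then URL encoding.
decoded = base64.b64decode(event["body"]).decode("ascii")
msg_map = dict(urlparse.parse_qsl(decoded))

print(msg_map["command"])
print(msg_map["text"])
```

Decoding as ASCII is safe here because URL encoding turns any non-ASCII characters into percent escapes, which `parse_qsl` converts back.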

Create Slack command

Now that we have the hook API up and running we can create the actual slash command in Slack. Navigate to Slack API.
Click on Create New App to start creating a new Slack App.

I name my app "Translation", but you can name it however you like. Also associate it with your workspace.

When the app is created, select it in the drop down menu and navigate to "Slash Commands".

Here we create a new Slash Command.

I create the "/translate" command. We need the URL for the API that we created previously; the value is in the Outputs section of the CloudFormation stack. Copy the value for ApiEndpoint, append the /slackhook path, and paste it in the Request URL box. A short description of the command is not mandatory, but I still enter a very basic description. After creation the Slash Command should be visible in the menu.

Next we need to give our application some permissions. That is done from the "OAuth & Permissions" menu. Our app needs "chat:write", "commands", and "files:write"; add these under the Scopes section.

We are almost there now. To get the OAuth token we need, we first need to install the application to our workspace. Navigate to the top and click "Install to workspace".

After a successful installation we should now have the OAuth token that we need.

We need to copy this token and store it in the SecretsManager Secret that was created with the common infrastructure previously, so head over to the AWS Console and SecretsManager. Select the "/slackbot" secret and create a key/value pair with the key "OauthToken" and the value set to the token.

The final step is to navigate to your workspace, find the Translation app under Apps in the left pane, click it, select "Add this app to a channel", and pick the channel of your choice.
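A key/value pair entered in the SecretsManager console is stored as a JSON document, which is why the Lambda functions later in this post parse the secret string as JSON. A minimal sketch of that parsing step, with a hypothetical helper name and a placeholder token:

```python
import json


def parse_oauth_token(secret_string):
    """Extract the Slack token from the secret's JSON key/value document."""
    secret = json.loads(secret_string)
    if "OauthToken" not in secret:
        raise KeyError("secret has no 'OauthToken' key")
    return secret["OauthToken"]


# Placeholder value, not a real credential.
sample = json.dumps({"OauthToken": "xoxb-placeholder"})
print(parse_oauth_token(sample))
```

The key name must match exactly, including the casing, or the lookup fails.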

Translation

That was one long section on how to create and set up your Slack app. But with that out of the way we can now create the Translation service. This service looks like this.

The service starts on an event from the custom EventBridge event-bus, which starts a StepFunction state-machine. Amazon Translate uses Amazon Comprehend to detect the source language and translates the text to the destination language. The translated text is stored in the S3 bucket that we created in the common infrastructure, and finally an event is posted back to the event-bus to move to the next step in our saga pattern. We can actually translate to several languages at once; for this we use the Map state in the state-machine to run the translation logic over an array. The StepFunction state-machine looks like this.

I only use the SDK or Optimized integrations, so there is no need for any code or Lambda functions to perform this task. Less code to manage.

SAM Template:

AWSTemplateFormatVersion: "2010-09-09"
Transform: "AWS::Serverless-2016-10-31"
Description: Translate Text State Machine
Parameters:
  Application:
    Type: String
    Description: Name of owning application
  CommonInfraStackName:
    Type: String
    Description: Name of the Common Infra Stack

Resources:
  ##########################################################################
  ## TRANSLATE STATEMACHINE
  ##########################################################################
  TranslateStateMachineStandard:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: statemachine/translate-broken.asl.yaml
      Tracing:
        Enabled: true
      DefinitionSubstitutions:
        S3Bucket:
          Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
        EventBridgeBusName:
          Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - logs:*
              Resource: "*"
        - Statement:
            - Effect: Allow
              Action:
                - translate:TranslateText
                - comprehend:DetectDominantLanguage
              Resource: "*"
        - S3WritePolicy:
            BucketName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
        - EventBridgePutEventsPolicy:
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
      Events:
        StateChange:
          Type: EventBridgeRule
          Properties:
            InputPath: $.detail
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
            Pattern:
              source:
                - Translation
              detail-type:
                - TranslateText
      Type: STANDARD

StepFunction definition:

Comment: Translation State Machine
StartAt: Debug
States:
  Debug:
    Type: Pass
    Next: Map
  Map:
    Type: Map
    ItemProcessor:
      ProcessorConfig:
        Mode: INLINE
      StartAt: Translate Text
      States:
        Translate Text:
          Type: Task
          Parameters:
            SourceLanguageCode: auto
            TargetLanguageCode.$: $.TargetLanguage
            Text.$: $.Text
          Resource: arn:aws:states:::aws-sdk:translate:translateText
          ResultPath: $.Translation
          Next: Store Translated Text
        Store Translated Text:
          Type: Task
          Parameters:
            Body.$: $.Translation.TranslatedText
            Bucket: ${S3Bucket}
            Key.$: States.Format('{}/{}/text.txt',$.RequestId, $.TargetLanguage)
          Resource: arn:aws:states:::aws-sdk:s3:putObject
          ResultPath: null
          Next: Notify
        Notify:
          Type: Task
          Resource: arn:aws:states:::events:putEvents
          Parameters:
            Entries:
              - Source: Translation
                DetailType: TextTranslated
                Detail:
                  TextBucket: ${S3Bucket}
                  TextKey.$: States.Format('{}/{}/text.txt',$.RequestId, $.TargetLanguage)
                  Language.$: $.TargetLanguage
                  RequestId.$: $.RequestId
                EventBusName: ${EventBridgeBusName}
          End: true
    End: true
    ItemsPath: $.Languages
    ItemSelector:
      TargetLanguage.$: $$.Map.Item.Value.Code
      RequestId.$: $.RequestId
      Text.$: $.Text
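For readers who prefer code, the fan-out done by the Map state can be sketched in Python. This is only an illustration of what `ItemsPath`, `ItemSelector`, and the `States.Format` call compute; the function names and sample values are made up, and nothing here runs in the actual solution.

```python
def build_map_items(event):
    """Mimic the Map state's ItemSelector: one item per requested language."""
    return [
        {
            "TargetLanguage": language["Code"],
            "RequestId": event["RequestId"],
            "Text": event["Text"],
        }
        for language in event["Languages"]
    ]


def text_key(item):
    """Mimic States.Format('{}/{}/text.txt', $.RequestId, $.TargetLanguage)."""
    return f"{item['RequestId']}/{item['TargetLanguage']}/text.txt"


# Hypothetical TranslateText event detail.
event = {
    "Languages": [{"Code": "sv"}, {"Code": "de"}],
    "Text": "Hello world",
    "RequestId": "abc123",
}

for item in build_map_items(event):
    print(item["TargetLanguage"], "->", text_key(item))
```

Because every language gets its own key under the request id, each stored text file triggers its own text to speech execution.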

Text to speech

The next part of the saga is the text to speech service. Here we use Amazon Polly to read the translated text to us.

This service is invoked when the Translation service stores the translated text in the S3 bucket. The S3 event starts a StepFunction state-machine that loads the text and starts a Polly speech synthesis task. The state-machine polls and waits for the task to finish, either completed or failed. The generated speech mp3 file is copied to the same place as the translated text. Finally an event is posted onto the custom event-bus that will invoke the last part of our saga.

Once again I only use the SDK or Optimized integrations, so there is no need for any code or Lambda functions to perform this task. Less code to manage.
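The poll and wait loop that the state-machine implements with a Task, a Choice, and a Wait state corresponds roughly to the Python sketch below. The function name is hypothetical, the status lookup is injected so the loop can run without AWS access, and the `max_polls` limit is an addition of this sketch that the state-machine itself does not have.

```python
import time


def wait_for_task(get_status, wait_seconds=10, max_polls=30, sleep=time.sleep):
    """Poll until the task reports 'completed' or 'failed'."""
    for _ in range(max_polls):
        status = get_status()
        if status in ("completed", "failed"):
            return status
        # Same role as the Wait state: pause before asking again.
        sleep(wait_seconds)
    raise TimeoutError("speech synthesis task did not finish in time")


# Simulated sequence of task statuses, for illustration only.
statuses = iter(["scheduled", "inProgress", "completed"])
result = wait_for_task(lambda: next(statuses), sleep=lambda seconds: None)
print(result)
```

In a Standard workflow the Wait state costs nothing while it waits, which is one reason to keep this loop in the state-machine instead of in a Lambda function.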

SAM template

AWSTemplateFormatVersion: "2010-09-09"
Transform: "AWS::Serverless-2016-10-31"
Description: Generate Voice State Machine
Parameters:
  Application:
    Type: String
    Description: Name of owning application
  CommonInfraStackName:
    Type: String
    Description: Name of the Common Infra Stack

Resources:
  ##########################################################################
  ## VOICE STATEMACHINE
  ##########################################################################
  VoiceStateMachineStandard:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: statemachine/voice-broken.asl.yaml
      Tracing:
        Enabled: true
      DefinitionSubstitutions:
        S3Bucket:
          Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
        EventBridgeBusName:
          Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - logs:*
              Resource: "*"
        - Statement:
            - Effect: Allow
              Action:
                - polly:StartSpeechSynthesisTask
                - polly:GetSpeechSynthesisTask
              Resource: "*"
        - S3CrudPolicy:
            BucketName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
        - EventBridgePutEventsPolicy:
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
      Events:
        StateChange:
          Type: EventBridgeRule
          Properties:
            EventBusName: default
            InputPath: $.detail
            Pattern:
              source:
                - aws.s3
              detail-type:
                - Object Created
              detail:
                bucket:
                  name:
                    - Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
                object:
                  key:
                    - suffix: ".txt"
      Type: STANDARD

StepFunction definition

Comment: Convert text to voice.
StartAt: Set Source Information
States:
  Set Source Information:
    Type: Pass
    ResultPath: $
    Parameters:
      TargetBucket.$: $.bucket.name
      Targetkey.$: States.Format('{}/{}/voice',States.ArrayGetItem(States.StringSplit($.object.key,'/'),0),States.ArrayGetItem(States.StringSplit($.object.key,'/'),1))
      SourceBucket.$: $.bucket.name
      SourceKey.$: $.object.key
      Langaguge.$: States.Format('{}',States.ArrayGetItem(States.StringSplit($.object.key,'/'),1))
    Next: Load Text
  Load Text:
    Type: Task
    Next: Start Speech Synthesis
    Parameters:
      Bucket.$: $.SourceBucket
      Key.$: $.SourceKey
    Resource: arn:aws:states:::aws-sdk:s3:getObject
    ResultPath: $.Text
    ResultSelector:
      Body.$: $.Body
  Start Speech Synthesis:
    Type: Task
    Parameters:
      Engine: neural
      LanguageCode.$: $.Langaguge
      OutputFormat: mp3
      OutputS3BucketName.$: $.TargetBucket
      OutputS3KeyPrefix.$: $.Targetkey
      TextType: text
      Text.$: $.Text.Body
      VoiceId: Joanna
    Resource: arn:aws:states:::aws-sdk:polly:startSpeechSynthesisTask
    ResultPath: $.Voice
    Next: Get Speech Synthesis Status
  Get Speech Synthesis Status:
    Type: Task
    Parameters:
      TaskId.$: $.Voice.SynthesisTask.TaskId
    Resource: arn:aws:states:::aws-sdk:polly:getSpeechSynthesisTask
    ResultPath: $.Voice
    Next: Speech Synthesis Done?
  Speech Synthesis Done?:
    Type: Choice
    Choices:
      - Variable: $.Voice.SynthesisTask.TaskStatus
        StringMatches: completed
        Next: Update Voice Object
        Comment: Completed!
      - Variable: $.Voice.SynthesisTask.TaskStatus
        StringMatches: failed
        Next: Failed
        Comment: Failed!
    Default: Wait
  Update Voice Object:
    Type: Task
    Next: Notify
    ResultPath: null
    Parameters:
      Bucket.$: $.TargetBucket
      CopySource.$: $.Voice.SynthesisTask.OutputUri
      Key.$: States.Format('{}_{}.mp3',$.Targetkey,$.Voice.SynthesisTask.VoiceId)
    Resource: arn:aws:states:::aws-sdk:s3:copyObject
  Notify:
    Type: Task
    Resource: arn:aws:states:::events:putEvents
    Next: Completed
    Parameters:
      Entries:
        - Source: Translation
          DetailType: VoiceGenerated
          Detail:
            VoiceBucket.$: $.TargetBucket
            VoiceKey.$: States.Format('{}_{}.mp3',$.Targetkey,$.Voice.SynthesisTask.VoiceId)
            Language.$: $.Langaguge
            Voice.$: $.Voice.SynthesisTask.VoiceId
          EventBusName: ${EventBridgeBusName}
  Completed:
    Type: Pass
    End: true
  Failed:
    Type: Pass
    End: true
  Wait:
    Type: Wait
    Seconds: 10
    Next: Get Speech Synthesis Status

Posting back to Slack

The final service involved in our saga is the notification service, which posts text and audio back to Slack. This service is invoked by two different domain events: text translated and audio generated. The state-machine needs to handle both and uses a choice state to walk down different paths. In this state-machine we need to use Lambda functions to post to the Slack API. However, with the new HTTPS endpoint integration released at re:Invent 2023 we might be able to remove these as well.

SAM Template

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Event-driven Slack Bot Notification Service

Parameters:
  Application:
    Type: String
    Description: Name of the application
  CommonInfraStackName:
    Type: String
    Description: Name of the Common Infra Stack

Globals:
  Function:
    Runtime: python3.9
    Timeout: 30
    MemorySize: 1024

Resources:
  ##########################################################################
  #   LAMBDA FUNCTIONS                                                     #
  ##########################################################################
  PostToChannelFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/SlackPostToChannel
      Handler: postchannel.handler
      Policies:
        - AWSSecretsManagerGetSecretValuePolicy:
            SecretArn:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:SlackBotSecret
      Environment:
        Variables:
          SLACK_CHANNEL: <your-slack-channel>
          SLACK_BOT_TOKEN_ARN:
            Fn::ImportValue: !Sub ${CommonInfraStackName}:SlackBotSecret

  UploadAudioToChannelFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/UploadAudioToChannel
      Handler: uploadchannel.handler
      Policies:
        - AWSSecretsManagerGetSecretValuePolicy:
            SecretArn:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:SlackBotSecret
        - S3ReadPolicy:
            BucketName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
      Environment:
        Variables:
          SLACK_CHANNEL: <your-slack-channel>
          SLACK_BOT_TOKEN_ARN:
            Fn::ImportValue: !Sub ${CommonInfraStackName}:SlackBotSecret

  ##########################################################################
  #   STEP FUNCTION                                                        #
  ##########################################################################
  NotificationLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub "${Application}/notificationstatemachine"
      RetentionInDays: 5

  SlackNotificationStateMachineStandard:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: statemachine/statemachine.asl.yaml
      DefinitionSubstitutions:
        EventBridgeName:
          Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
        PostToChannelFunctionArn: !GetAtt PostToChannelFunction.Arn
        UploadAudioToChannelFunctionArn: !GetAtt UploadAudioToChannelFunction.Arn
      Events:
        SlackNotification:
          Type: EventBridgeRule
          Properties:
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
            Pattern:
              source:
                - Translation
              detail-type:
                - TextTranslated
                - VoiceGenerated
            RetryPolicy:
              MaximumEventAgeInSeconds: 300
              MaximumRetryAttempts: 2
      Policies:
        - Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Action:
                - "cloudwatch:*"
                - "logs:*"
              Resource: "*"
        - EventBridgePutEventsPolicy:
            EventBusName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:EventBridgeBus
        - LambdaInvokePolicy:
            FunctionName: !Ref PostToChannelFunction
        - LambdaInvokePolicy:
            FunctionName: !Ref UploadAudioToChannelFunction
        - S3ReadPolicy:
            BucketName:
              Fn::ImportValue: !Sub ${CommonInfraStackName}:TranslationBucket
      Tracing:
        Enabled: true
      Logging:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt NotificationLogGroup.Arn
        IncludeExecutionData: true
        Level: ALL
      Type: STANDARD

StepFunction definition

Comment: Translate App Slack Notification service
StartAt: Debug
States:
  Debug:
    Type: Pass
    Next: Event Type ?
  Event Type ?:
    Type: Choice
    Choices:
      - Variable: $.detail-type
        StringEquals: TextTranslated
        Next: Text Translated
      - Variable: $.detail-type
        StringEquals: VoiceGenerated
        Next: Voice Generated
    Default: Unknown Event Type
  Text Translated:
    Type: Pass
    Next: GetObject
    ResultPath: $
    Parameters:
      TextBucket.$: $.detail.TextBucket
      TextKey.$: $.detail.TextKey
      Language.$: $.detail.Language
      RequestId.$: $.detail.RequestId
  GetObject:
    Type: Task
    Parameters:
      Bucket.$: $.TextBucket
      Key.$: $.TextKey
    Resource: arn:aws:states:::aws-sdk:s3:getObject
    ResultSelector:
      Body.$: $.Body
    ResultPath: $.Text
    Next: Post Text To Channel
  Post Text To Channel:
    Type: Task
    Resource: arn:aws:states:::lambda:invoke
    OutputPath: $.Payload
    Parameters:
      Payload.$: $
      FunctionName: ${PostToChannelFunctionArn}
    Retry:
      - ErrorEquals:
          - Lambda.ServiceException
          - Lambda.AWSLambdaException
          - Lambda.SdkClientException
          - Lambda.TooManyRequestsException
        IntervalSeconds: 1
        MaxAttempts: 3
        BackoffRate: 2
    Next: Done
  Done:
    Type: Succeed
  Voice Generated:
    Type: Pass
    ResultPath: $
    Parameters:
      VoiceBucket.$: $.detail.VoiceBucket
      VoiceKey.$: $.detail.VoiceKey
      Language.$: $.detail.Language
      Voice.$: $.detail.Voice
    Next: Upload Audio To Channel
  Upload Audio To Channel:
    Type: Task
    Resource: arn:aws:states:::lambda:invoke
    OutputPath: $.Payload
    Parameters:
      Payload.$: $
      FunctionName: ${UploadAudioToChannelFunctionArn}
    Retry:
      - ErrorEquals:
          - Lambda.ServiceException
          - Lambda.AWSLambdaException
          - Lambda.SdkClientException
          - Lambda.TooManyRequestsException
        IntervalSeconds: 1
        MaxAttempts: 3
        BackoffRate: 2
    Next: Done
  Unknown Event Type:
    Type: Fail
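The routing done by the "Event Type ?" choice state and the two Pass states can be summarised with a small Python sketch. It is an illustration only; the function name and the sample event are made up, while the state and field names come from the definition above.

```python
def route(event):
    """Mimic the Choice state: pick a branch and reshape the event detail."""
    detail_type = event["detail-type"]
    detail = event["detail"]
    if detail_type == "TextTranslated":
        fields = ("TextBucket", "TextKey", "Language", "RequestId")
        return "Text Translated", {name: detail[name] for name in fields}
    if detail_type == "VoiceGenerated":
        fields = ("VoiceBucket", "VoiceKey", "Language", "Voice")
        return "Voice Generated", {name: detail[name] for name in fields}
    # Same role as the 'Unknown Event Type' Fail state.
    raise ValueError(f"Unknown event type: {detail_type}")


# Hypothetical event as delivered by the EventBridge rule.
sample_event = {
    "detail-type": "VoiceGenerated",
    "detail": {
        "VoiceBucket": "eventdriven-translation-translation-bucket",
        "VoiceKey": "abc123/sv/voice_Joanna.mp3",
        "Language": "sv",
        "Voice": "Joanna",
    },
}
branch, payload = route(sample_event)
print(branch, payload["VoiceKey"])
```

Note that this rule has no InputPath, so the state-machine receives the full event envelope and reads the type from `detail-type`.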

Post translated text

import json
import os
import boto3
from botocore.exceptions import ClientError
from slack_sdk import WebClient

SLACK_CHANNEL = os.environ["SLACK_CHANNEL"]

def handler(event, context):
    set_bot_token()

    text = f"{event['Language']}:\n{event['Text']['Body']}"

    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    client.chat_postMessage(channel="#" + SLACK_CHANNEL, text=text)

    return {"statusCode": 200, "body": "Hello there"}


def set_bot_token():
    os.environ["SLACK_BOT_TOKEN"] = get_secret()


def get_secret():
    session = boto3.session.Session()
    client = session.client(service_name="secretsmanager")

    try:
        secretValueResponse = client.get_secret_value(
            SecretId=os.environ["SLACK_BOT_TOKEN_ARN"]
        )
    except ClientError as e:
        raise e

    secret = json.loads(secretValueResponse["SecretString"])["OauthToken"]
    return secret

Upload audio file

import json
import os
import boto3
from botocore.exceptions import ClientError
from slack_sdk import WebClient

SLACK_CHANNEL = os.environ["SLACK_CHANNEL"]

def handler(event, context):
    set_bot_token()

    path = download_audio_file(
        event["VoiceBucket"], event["VoiceKey"], event["Voice"], event["Language"]
    )
    upload_audio_file(event["Language"], path)

    return {"statusCode": 200, "body": "Hello there"}


def download_audio_file(bucket, key, voice, language):
    s3 = boto3.client("s3")
    path = f"/tmp/{language}_{voice}.mp3"
    s3.download_file(bucket, key, path)
    return path

def upload_audio_file(language, path):
    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    client.files_upload(
        channels="#" + SLACK_CHANNEL,
        initial_comment=f"Polly Voiced Translation for: {language}",
        file=path,
    )

    return {"statusCode": 200, "body": "Hello there"}

def set_bot_token():
    os.environ["SLACK_BOT_TOKEN"] = get_secret()

def get_secret():
    session = boto3.session.Session()
    client = session.client(service_name="secretsmanager")

    try:
        secretValueResponse = client.get_secret_value(
            SecretId=os.environ["SLACK_BOT_TOKEN_ARN"]
        )
    except ClientError as e:
        raise e
    secret = json.loads(secretValueResponse["SecretString"])["OauthToken"]
    return secret

Test it

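To test the solution we send a slash command with the pattern /translate "text to translate" language_code_1,language_code_2,language_code_n. The hook also strips the word "to" from the part after the quoted text, so /translate "Hello world" to sv,de works as well.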
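The sketch below shows how a command text of that shape maps to the event that starts the saga. It is a slightly more defensive variant of the parsing in the hook Lambda, not the code from the repository, and the function name is hypothetical: it reports a missing quoted text and only drops a leading "to" keyword.

```python
import re


def parse_command(text):
    """Split '"some text" to sv,de' into the text and a list of language codes."""
    match = re.search(r'"(.*?)"', text)
    if match is None:
        raise ValueError("expected the text to translate in double quotes")
    translate_text = match.group(1)
    # Everything after the closing quote holds the language codes.
    rest = text[match.end():].strip()
    # Drop an optional leading "to" keyword without touching the codes themselves.
    rest = re.sub(r"^to\b", "", rest).strip()
    languages = [{"Code": code.strip()} for code in rest.split(",") if code.strip()]
    return {"Text": translate_text, "Languages": languages}


print(parse_command('"Hello world" to sv,de'))
print(parse_command('"Good morning" fr'))
```

The returned dictionary has the same `Text` and `Languages` fields as the event detail the hook posts; the real event also carries a `RequestId`.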

Final Words

In the era of Generative AI it was interesting to build a solution using the more traditional AI services that have been around for several years. The performance of these is really good, and the translations and voice files are created very quickly. Building this in a serverless and event-driven way creates a cost-effective solution, as always. There are improvements and extensions that can be made to the solution. Stay tuned as I make these changes and update this blog. This solution will also be powering my new feature, turning this blog into a multi-language one.

Don't forget to follow me on LinkedIn and X for more content, and read the rest of my blogs.
