Metahuman with AWS

Introduction

Integrating AWS services with Unreal Engine can significantly enhance your project's capabilities, particularly if the actors in Unreal Engine utilize AWS AI services. While Amazon GameLift is a common service for deploying, operating, and scaling dedicated, low-cost servers in the cloud for session-based multiplayer games in a UE environment, other AWS services can also be integrated based on the use case. Services like EC2, API Gateway, and S3 can seamlessly integrate with Unreal Engine to provide various functionalities.

In our work, we integrated Amazon Transcribe, Amazon Bedrock, and Amazon Polly services into Unreal Engine using the C++ SDK and Python SDK. This integration enables the creation of advanced interactive experiences within the Unreal Engine environment.

MetaHuman is a comprehensive framework that empowers creators to develop and use fully rigged, photorealistic digital humans in projects powered by Unreal Engine. The purpose of our work is to integrate a MetaHuman with AWS to create an LLM (Large Language Model) and RAG (Retrieval-Augmented Generation) powered avatar that can interact and communicate in real time. This post will guide you through the essential steps to set up and use AWS services within Unreal Engine 4.27, covering the necessary configurations, modules, and scripts that make up this powerful combination.


By following this guide, you will learn how to leverage AWS AI services to enhance the interactivity and realism of your Unreal Engine projects, making it possible to create lifelike digital avatars capable of real-time communication.

Prerequisites

Before diving into the integration, ensure you have your AWS STS (Security Token Service) token ready. This token is crucial for secure interactions with AWS services. Follow the steps below to generate and configure your AWS STS token:

Generate AWS STS Token

First, you need to configure your AWS config and credentials files (in the ~/.aws/ directory) as described in the AWS CLI documentation.

Obtain the tokens by running the following command:

aws sts get-session-token --profile <profilename> --serial-number <your mfa device in aws> --token-code <MFA_CODE> 

Set Environment Variables:
After obtaining your tokens, set your environment variables using the following commands:

setx AWS_ACCESS_KEY_ID ************
setx AWS_SECRET_ACCESS_KEY ***********
setx AWS_SESSION_TOKEN ********** 
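To confirm that the temporary credentials are picked up correctly, you can run a quick sanity check with the Python SDK (boto3), which reads AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN from the environment automatically. This is only a minimal sketch, not part of the project itself:

# sanity_check.py - verify that the temporary STS credentials are valid
import boto3

# boto3 reads the credentials from the environment variables set above.
sts = boto3.client("sts")
identity = sts.get_caller_identity()
print("Account:", identity["Account"])
print("ARN:", identity["Arn"])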

Unreal Engine version

Although Unreal Engine 5.4 has been announced, our integration works only with versions 4.27 or 4.26. The reason for this is the compatibility of the C++ SDK integration with these specific versions of Unreal Engine. Ensure you have one of these versions installed before proceeding with the integration.

Modules Overview

The Metahuman project consists of several essential modules, each serving a specific purpose. Let's break down these modules:

Content

Integrating Unreal Engine with AWS is a complex process that requires multiple steps. Here’s a streamlined approach to get you started:

Cloning the AWS SDK Repository
First, check if the SDK repository directory exists. If it does not, clone the AWS SDK from GitHub:

git clone https://github.com/aws/aws-sdk-cpp

Next, run CMake to configure the build system for Visual Studio:

cd aws-sdk-cpp
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019"

After configuring the build system, build the SDK using MSBuild:

msbuild ALL_BUILD.vcxproj /p:Configuration=Release

Finally, install the built SDK:

msbuild INSTALL.vcxproj /p:Configuration=Release

By following these steps, you’ll have the AWS SDK ready for integration with Unreal Engine. This setup ensures you have the necessary tools to proceed with more advanced AWS functionalities within Unreal Engine.

Plugins

Plugins extend the functionality of Unreal Engine, and our project will utilize the following plugin:

CallExePlugin: This custom plugin, created by Jens Krenzin, allows you to run Python scripts within your Unreal Engine project.

The plugin has two types of functionality:

  • Blocking Mode: Unreal Engine pauses while the script runs and resumes normal operation once the script has finished.

  • Non-blocking Mode: The script runs alongside Unreal Engine without interrupting it.

For this project, we use the non-blocking version to ensure that MetaHuman can function without any interruptions.
In this plugin, you need to specify the directory where Python is set up on your local PC as well as the directory of the Python script you want to run. This configuration ensures that the plugin can locate and execute the necessary Python scripts seamlessly within your Unreal Engine project.

Here's a brief overview of how to set it up:

  • Python Directory: Provide the path to your local Python installation.
  • Script Directory: Specify the directory where your Python scripts are located.
By configuring these directories, the CallExePlugin will be able to execute your Python scripts, extending the functionality of your Unreal Engine project without interruptions.
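Conceptually, the difference between the two modes is the difference between waiting on a child process and launching it detached. Here is a rough Python illustration of that idea (the interpreter and script paths are placeholders, not the plugin's actual configuration values):

# launch_modes.py - conceptual illustration of blocking vs. non-blocking launches
import subprocess

python_exe = r"C:\Python39\python.exe"   # placeholder: local Python installation
script = r"C:\Project\Scripts\main.py"   # placeholder: script to execute

# Blocking: the caller waits until the script has finished.
subprocess.run([python_exe, script], check=True)

# Non-blocking: the script runs in parallel and the caller continues immediately.
subprocess.Popen([python_exe, script])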

CallExePlugin Blueprint

Scripts

The Scripts directory houses various Python scripts essential for the project's core features. Here’s an overview of the key script:

main.py: This Python file takes the user's voice input and converts it to text using Amazon Transcribe. The transcribed text is then sent to Amazon Bedrock for further processing.

  • Voice to Text: Amazon Transcribe's real-time feature is used to convert the user's voice into text. This allows for immediate processing and response.
  • Text to LLM: The transcribed text is forwarded to Amazon Bedrock. Bedrock is an AWS service that provides access to multiple large language models (LLMs).

In our project, we use Claude 3 Sonnet, hosted in the Frankfurt Region (eu-central-1).
This seamless integration between Amazon Transcribe and Amazon Bedrock enables real-time interaction, allowing the LLM to provide immediate and relevant responses. This setup ensures a smooth and efficient workflow, enhancing the overall user experience.
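The contents of main.py are not reproduced here, but the Bedrock half of the flow can be sketched with boto3. The snippet below assumes the user's question has already been transcribed by Amazon Transcribe and uses the Claude 3 Sonnet model ID in eu-central-1 (Frankfurt); treat it as a minimal sketch rather than the project's actual script:

# bedrock_reply.py - minimal sketch: send a transcript to Claude 3 Sonnet on Bedrock
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="eu-central-1")  # Frankfurt

def ask_llm(transcript: str) -> str:
    # Build a request in the Anthropic messages format expected by Bedrock.
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": transcript}],
    }
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps(body),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]

print(ask_llm("What does the company do?"))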

Blueprint Communication and Polly Integration

Each avatar has its own Blueprint that must connect to the Scene-Blueprints (BP_UIWidget) for user interaction.

The MetaHuman Blueprint is the main place to set up all AWS services and plugins. Here, we configure Amazon Polly using the C++ SDK.
Amazon Polly provides two important capabilities that make this project work seamlessly. First, Polly can turn text into natural-sounding speech as an audio file. Second, Polly can generate a JSON-formatted list of mouth shapes corresponding to the sounds in that audio file. These mouth shapes are called "visemes." Here is what the visemes for the word "human" look like:

{"time":2,"type":"viseme","value":"o"}
{"time":52,"type":"viseme","value":"k"}
{"time":196,"type":"viseme","value":"p"}

Because this viseme data includes timestamps for each mouth shape, we can use that information to achieve believable lip sync between the audio playback and MetaHuman facial animations.
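In the project, Polly is called from the MetaHuman Blueprint through the C++ SDK. For readability, here is the same pair of requests expressed as a minimal boto3 (Python) sketch: one call for the audio and one for the viseme speech marks (the voice ID is only an example):

# polly_visemes.py - minimal sketch of the two Polly requests (audio + viseme marks)
import json
import boto3

polly = boto3.client("polly")
text = "human"

# Request 1: the speech audio itself.
audio = polly.synthesize_speech(Text=text, VoiceId="Joanna", OutputFormat="mp3")
with open("speech.mp3", "wb") as f:
    f.write(audio["AudioStream"].read())

# Request 2: viseme speech marks (one JSON object per line, timestamps in milliseconds).
marks = polly.synthesize_speech(
    Text=text,
    VoiceId="Joanna",
    OutputFormat="json",
    SpeechMarkTypes=["viseme"],
)
for line in marks["AudioStream"].read().decode("utf-8").splitlines():
    print(json.loads(line))  # e.g. {"time": 2, "type": "viseme", "value": "o"}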

We created a Generate Speech function, which takes the text from our CallExePlugin and creates speech using Amazon Polly. This function is connected to another function that updates the AnimGraph. Specifically, it updates the "Viseme" variable to drive a Blend Pose node. This node determines which one of the possible viseme animation assets will be applied to the MetaHuman's face.

Viseme Animation Blueprint

By leveraging Amazon Polly's capabilities, we ensure that the MetaHuman's lip movements are synchronized with the generated speech, creating a realistic experience.


Conclusion

Integrating AWS with Unreal Engine 4.27 involves several crucial steps, including configuring AWS STS tokens, setting up essential modules, and utilizing plugins and scripts to enhance MetaHuman capabilities.

In our use case, we configured our LLM (Large Language Model) to play the role of a CEO and created a RAG (Retrieval-Augmented Generation) system with the necessary data. This configuration allows our MetaHuman to act as a company's CEO. Users can interact with the avatar by asking questions about the company via voice, and the avatar responds with natural-sounding speech.

Here's a video showing how we interact with our avatar: https://www.youtube.com/watch?v=URaJv-YHVT8

Although I am the one publishing this article, there is a great team behind this project. Thanks for the amazing work: Artur Schneider, Lydia Delyova, Jens Krenzin, and Willi Schwarzbach.
