awedis for AWS Community Builders


AWS Step Functions In-Depth | Serverless

In this article we are going to learn about Step Functions and its main components, and build some examples using the Serverless Framework

The main parts of this article:

  1. About Finite-State Machine
  2. Step Functions (Main components)
  3. Examples

1. What is a Finite-State Machine

A finite-state machine (FSM) is a model that is in exactly one of a finite number of states at any given time. The two keywords that you need to remember are States and Transitions: the FSM changes from one state to another in response to some input, and that change is called a transition
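
To make states and transitions concrete, here is a minimal sketch of an FSM in plain JavaScript. The order workflow, its state names, and its events are hypothetical and only serve as an illustration; this is not how Step Functions is implemented internally.

```javascript
// Hypothetical order workflow: each state maps an input event to the next state.
const transitions = {
  pending: { pay: 'paid', cancel: 'cancelled' },
  paid: { ship: 'shipped', refund: 'cancelled' },
  shipped: {},
  cancelled: {},
};

// Return the next state, or throw if the event is not allowed in the current state.
function transition(state, event) {
  const next = transitions[state] && transitions[state][event];
  if (!next) {
    throw new Error(`No transition from "${state}" on "${event}"`);
  }
  return next;
}

// Feed a sequence of events through the machine, starting from `initial`.
function run(initial, events) {
  return events.reduce(transition, initial);
}

console.log(run('pending', ['pay', 'ship'])); // shipped
```

The whole behaviour lives in the `transitions` table, which is the same idea as a Step Functions definition: the definition declares the states and how to move between them.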

2. Step Functions

Step Functions is an AWS-managed service that uses the Finite-State Machine (FSM) model

Step Functions is an orchestrator that helps you to design and implement complex workflows. When we need to build a workflow or have multiple tasks that need to be orchestrated, Step Functions coordinates between those tasks. It simplifies overall architecture and provides us with much better control over each step of the workflow

Step Functions is built on two main concepts: Tasks and State Machine

All work in the state machine is done by tasks. A task performs work by using an activity or an AWS Lambda function, or passing parameters to the API actions of other services

  • State Types: It's essential to remember that States aren't the same thing as Tasks; a Task is just one of the State types. There are several State types, and each of them has a role to play in the overall workflow.

A state's Type must be one of these values:

  • Task - Represents a single unit of work performed by a state machine
  • Wait - Delays the state machine from continuing for a specified time
  • Pass - Passes its input to its output without performing work. Pass states are useful when constructing and debugging state machines
  • Succeed - Stops an execution successfully
  • Fail - Stops the execution of the state machine and marks it as a failure
  • Choice - Adds branching logic to a state machine
  • Parallel - Can be used to create parallel branches of execution in your state machine
  • Map - Can be used to run a set of steps for each element of an input array. While the Parallel state executes multiple branches of steps using the same input, a Map state will execute the same steps for multiple entries of an array in the state input
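
The difference between Parallel and Map can be sketched in a few lines of plain JavaScript. The helpers `runParallel` and `runMap` are made-up names for illustration, and they run everything sequentially, whereas Step Functions runs branches and iterations concurrently. The sketch only shows the shape of the data going in and out.

```javascript
// Parallel: every branch receives the same input; output is one result per branch.
function runParallel(branches, input) {
  return branches.map((branch) => branch(input));
}

// Map: the same steps run once per array element; output is one result per item.
function runMap(iterator, items) {
  return items.map((item) => iterator(item));
}

const user = { email: 'test@test.com', phone: '0123456789' };
const parallelOutput = runParallel(
  [(u) => `email:${u.email}`, (u) => `sms:${u.phone}`],
  user
);

const mapOutput = runMap((n) => n * 2, [1, 2, 3]);

console.log(parallelOutput); // [ 'email:test@test.com', 'sms:0123456789' ]
console.log(mapOutput); // [ 2, 4, 6 ]
```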

3. Examples

In this part we are going to build 4 step functions

Note: Step Functions definitions use the Amazon States Language, which is JSON-based; with the Serverless Framework they can be written in JSON or YAML

[Figure: overview of the 4 examples]

I- Example

The first example is a simple one: two Lambda functions are orchestrated in sequence. The first adds 10 to our input and passes the result to the second Lambda, which adds 20, so an input of 15 gives a final result of 45

[Figure: flow diagram of Example I]

Step Functions Definition:

firstLambdaARN: &FIRST_ARN arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-firstState
secondLambdaARN: &SECOND_ARN arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-secondState
states:
  ExampleOneSF:
    name: ExampleOneSF
    definition:
      Comment: "Example One"
      StartAt: firstState
      States:
        firstState:
          Type: Task
          Resource: *FIRST_ARN
          Next: secondState
        secondState:
          Type: Task
          Resource: *SECOND_ARN
          Next: success
        success:
          Type: Succeed

Note: here I'm using YAML anchors (&FIRST_ARN) and aliases (*FIRST_ARN) so that each ARN is defined once and reused inside the YAML file

The Two Lambda Functions (firstState & secondState):

module.exports.firstState = async (event) => {
  console.log(event);
  const {
    value,
  } = event;

  const result = 10 + value;
  return {
    value: result
  };
};

module.exports.secondState = async (event) => {
  console.log(event);
  const {
    value,
  } = event;

  const result = 20 + value;
  return {
    value: result,
    status: 'SUCCESS'
  };
};

Inside routes:

firstState:
  handler: src/modules/StepFunction/controller/exampleOne.firstState
  timeout: 300
secondState:
  handler: src/modules/StepFunction/controller/exampleOne.secondState
  timeout: 300

The input:

{
  "value": 15
}

[Figure: Example I workflow inside the console]

The final result is 45: my input was 15, the first state added 10 and the second one added 20

As we can see, it's very easy to pass data between states: the output of one state becomes the input of the next. This makes Step Functions a very useful service for building decoupled architectures
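
The data flow of Example I can be checked locally without deploying anything. The sketch below uses synchronous stand-ins for the two handlers and a hypothetical `runChain` helper that feeds each output into the next state. It only imitates the hand-off between Task states and is not a replacement for running the real state machine.

```javascript
// Synchronous stand-ins for the firstState and secondState handlers above.
const firstState = (event) => ({ value: 10 + event.value });
const secondState = (event) => ({ value: 20 + event.value, status: 'SUCCESS' });

// Run the states in order, feeding each output into the next state as input.
function runChain(states, input) {
  return states.reduce((payload, state) => state(payload), input);
}

const output = runChain([firstState, secondState], { value: 15 });
console.log(output); // { value: 45, status: 'SUCCESS' }
```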

II- Example

In example 2 we are going to imitate a form upload and add the Choice type. First, "validateForm" checks the form; depending on the result, the execution either fails or continues and passes the data to the next Lambda, "processForm", which does some processing. Finally, "uploadForm" could write our data to a database, for example

[Figure: flow diagram of Example II]

Step Functions Definition:

  ExampleTwoSF:
    name: ExampleTwoSF
    definition:
      Comment: "Example Two"
      StartAt: validateForm
      States:
        validateForm:
          Type: Task
          Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-validateForm
          Next: isFormValidated
        isFormValidated:
          Type: Choice
          Choices:
          - Variable: "$.status"
            StringEquals: SUCCESS
            Next: processForm
          - Variable: "$.status"
            StringEquals: ERROR
            Next: fail
        processForm:
          Type: Task
          Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-processForm
          Next: uploadForm
        uploadForm:
          Type: Task
          Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-uploadForm
          Next: isFormUploaded
        isFormUploaded:
          Type: Choice
          Choices:
          - Variable: "$.status"
            StringEquals: SUCCESS
            Next: success
          - Variable: "$.status"
            StringEquals: ERROR
            Next: fail
        success:
          Type: Succeed
        fail:
          Type: Fail

Lambda Functions:

module.exports.validateForm = async (event) => {
  console.log(event);
  // validate form

  // if not valid
  // return {
  //   status: 'ERROR'
  // };

  return {
    status: 'SUCCESS'
  };
};

module.exports.processForm = async (event) => {
  console.log(event);
  // add simple process

  return {
    processData: 1000,
  };
};

module.exports.uploadForm = async (event) => {
  console.log(event);
  // upload data for example to DynamoDB

  return {
    status: 'SUCCESS'
  };
};

[Figure: Example II workflow inside the console]
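
To see how a Choice state such as isFormValidated picks the next state, here is a small sketch in plain JavaScript. The `getPath` and `nextState` helpers are hypothetical and only cover the StringEquals rules and simple "$.field" paths used in Example II; the real service supports many more comparison operators.

```javascript
// Read a simple "$.field" reference from the state input.
function getPath(input, path) {
  return input[path.replace(/^\$\./, '')];
}

// Evaluate StringEquals rules in order; the first match decides the next state.
function nextState(choiceState, input) {
  for (const rule of choiceState.Choices) {
    if (getPath(input, rule.Variable) === rule.StringEquals) {
      return rule.Next;
    }
  }
  if (choiceState.Default) {
    return choiceState.Default;
  }
  // Mirrors the States.NoChoiceMatched error that Step Functions raises.
  throw new Error('States.NoChoiceMatched');
}

const isFormValidated = {
  Type: 'Choice',
  Choices: [
    { Variable: '$.status', StringEquals: 'SUCCESS', Next: 'processForm' },
    { Variable: '$.status', StringEquals: 'ERROR', Next: 'fail' },
  ],
};

console.log(nextState(isFormValidated, { status: 'SUCCESS' })); // processForm
console.log(nextState(isFormValidated, { status: 'ERROR' })); // fail
```

Since the Choice states in Example II have no Default, a Lambda that returns any other status ends the execution with a States.NoChoiceMatched error. Adding a Default field that points to the fail state is one way to make that case explicit.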

III- Example

In this example we are going to use the Wait, Pass and Parallel types. After the user uploads their profile, we wait 5 seconds; if all is good, we notify the user by running two Lambda functions in parallel, one to send an email and one to send an SMS. In addition, there is a Pass state that just adds some data that I defined (admin details...) and passes it on to the Parallel state

Note: I tried to build some simple examples relating to real world features, this may vary based on your needs

[Figure: flow diagram of Example III]

Step Functions Definition:

  ExampleThreeSF:
    name: ExampleThreeSF
    definition:
      Comment: "Example Three"
      StartAt: uploadProfile
      States:
        uploadProfile:
          Type: Task
          Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-uploadProfile
          Next: waitFiveSeconds
        waitFiveSeconds:
          Type: Wait
          Seconds: 5
          Next: addAdminPayload
        addAdminPayload:
          Type: Pass
          Result:
            admin_name: "Admin Name"
            admin_phone: "Admin Number"
            admin_email: "Admin Email"
          ResultPath: "$.adminDetails"
          Next: notifyCustomer
        notifyCustomer:
          Type: Parallel
          End: true
          Branches:
          - StartAt: sendEmail
            States:
              sendEmail:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-sendEmail
                End: true
          - StartAt: sendSMS
            States:
              sendSMS:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-sendSMS
                End: true

Note: Parallel states are used when different types of tasks or workflows need to be performed concurrently

Lambda Functions:

module.exports.uploadProfile = async (event) => {
  console.log(event);

  const {
    data,
  } = event;

  return {
    status: "Profile uploaded",
    data,
  };
};

module.exports.sendEmail = async (event) => {
  console.log(event);
  // Send Email

  return {
    status: `Email sent to ${event.data.email}`,
  };
};

module.exports.sendSMS = async (event) => {
  console.log(event);
  // Send SMS

  return {
    status: `SMS Sent to ${event.data.phone}`,
  };
};

As we can see, the two Lambda tasks take the same input, but each one works with specific attributes of that data: sendEmail needs the email value, whereas sendSMS needs the phone value
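
The sketch below imitates, in plain JavaScript, what the Pass state's ResultPath does to the payload and what both Parallel branches then receive. `applyResultPath` is a hypothetical helper that only handles a top-level path like "$.adminDetails", and the branch functions are synchronous stand-ins for the real handlers.

```javascript
// Apply a top-level ResultPath such as "$.adminDetails": copy the state input
// and attach the result under that key. Nested paths are not handled here.
function applyResultPath(input, resultPath, result) {
  const key = resultPath.replace(/^\$\./, '');
  return { ...input, [key]: result };
}

// What uploadProfile returns for the sample input (the Wait state passes it through).
const uploadProfileOutput = {
  status: 'Profile uploaded',
  data: { name: 'Test User', email: 'test@test.com', phone: '0123456789' },
};

const adminDetails = {
  admin_name: 'Admin Name',
  admin_phone: 'Admin Number',
  admin_email: 'Admin Email',
};

// This is the payload the Parallel state hands to both branches.
const parallelInput = applyResultPath(uploadProfileOutput, '$.adminDetails', adminDetails);

// Stand-ins for the sendEmail and sendSMS handlers.
const sendEmail = (event) => ({ status: `Email sent to ${event.data.email}` });
const sendSMS = (event) => ({ status: `SMS Sent to ${event.data.phone}` });

// The Parallel state's output is an array with one entry per branch, in branch order.
const parallelOutput = [sendEmail, sendSMS].map((branch) => branch(parallelInput));
console.log(parallelOutput);
```

Because the result of a Parallel state is an array of branch outputs, any state placed after notifyCustomer would receive that array rather than a single object.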

Input:

{
  "data": {
    "name": "Test User",
    "email": "test@test.com",
    "phone": "0123456789"
  }
}

[Figure: Example III workflow inside the console]

If any branch fails, the entire Parallel state is considered to have failed. If the error is not handled by the Parallel state itself, Step Functions stops the execution with an error

IV- Example

In this example we are going to use the Map type, which can be used to run a set of steps for each element of an input array. The Map state lets us run the same sequential workflow for multiple items in parallel

I'm going to run 4 concurrent workflows, all of which apply the same business logic: my first state prefixes the orderID with "ID-" (ID-${orderID}) and my second state returns the message "this is order number ID-${orderID}, which has ${quantity} orders"

[Figure: flow diagram of Example IV]

Step Functions Definition:

  ExampleFourSF:
    name: ExampleFourSF
    definition:
      Comment: "Example Four"
      StartAt: uploadData
      States:
        uploadData:
          Type: Map
          InputPath: "$.detail"
          ItemsPath: "$.data"
          MaxConcurrency: 4
          Iterator:
            StartAt: manipulateObject
            States:
              manipulateObject:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-manipulateObject
                Next: createFinalObject
              createFinalObject:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-createFinalObject
                End: true
          ResultPath: "$.detail.data"
          End: true

Note:
InputPath: to select a subset of the input
ItemsPath: to specify a location in the input to find the JSON array to use for iterations
MaxConcurrency: how many invocations of the Iterator may run in parallel
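
How InputPath, ItemsPath and ResultPath fit together in Example IV can be sketched in plain JavaScript. The `getPath`, `setPath` and `runMapState` helpers are hypothetical, support only simple dotted paths, and run the iterations one after another instead of concurrently. The handler functions are synchronous stand-ins for the two Lambdas.

```javascript
// Resolve simple dotted paths such as "$.detail" or "$.data".
function getPath(obj, path) {
  if (path === '$') return obj;
  return path.replace(/^\$\./, '').split('.').reduce((node, key) => node[key], obj);
}

// Return a copy of `obj` with `value` written at a dotted path such as "$.detail.data".
function setPath(obj, path, value) {
  const [head, ...rest] = path.replace(/^\$\./, '').split('.');
  const child = rest.length
    ? setPath(obj[head] || {}, `$.${rest.join('.')}`, value)
    : value;
  return { ...obj, [head]: child };
}

// Stand-ins for the manipulateObject and createFinalObject handlers.
const manipulateObject = ({ orderID, quantity }) => ({
  orderID_manipulated: `ID-${orderID}`,
  quantity,
});
const createFinalObject = ({ orderID_manipulated, quantity }) => ({
  status: `this is order number ${orderID_manipulated}, which has ${quantity} orders`,
});

function runMapState(rawInput) {
  const effectiveInput = getPath(rawInput, '$.detail'); // InputPath
  const items = getPath(effectiveInput, '$.data'); // ItemsPath
  const results = items.map((item) => createFinalObject(manipulateObject(item))); // Iterator
  return setPath(rawInput, '$.detail.data', results); // ResultPath
}

const input = {
  detail: {
    title: 'My fourth example',
    data: [
      { orderID: 1, quantity: 10 },
      { orderID: 2, quantity: 24 },
    ],
  },
};

const output = runMapState(input);
console.log(JSON.stringify(output, null, 2));
```

In this sketch the title is preserved and the original array under detail.data is replaced by the per-item results, which is the effect of writing the Map result back with ResultPath "$.detail.data".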

Lambda Functions:

module.exports.manipulateObject = async (event) => {
  console.log(event);

  const {
    orderID,
    quantity,
  } = event;

  return {
    orderID_manipulated: `ID-${orderID}`,
    quantity,
  };
};

module.exports.createFinalObject = async (event) => {
  console.log(event);

  const {
    orderID_manipulated,
    quantity,
  } = event;

  const status = `this is order number ${orderID_manipulated}, which has ${quantity} orders`;
  return {
    status,
  };
};

The input:

{
  "detail": {
    "title": "My fourth example",
    "data": [
      { "orderID": 1, "quantity": 10 },
      { "orderID": 2, "quantity": 24 },
      { "orderID": 3, "quantity": 32 },
      { "orderID": 4, "quantity": 5 }
    ]
  }
}

[Figure: Example IV workflow inside the console]

The code I use to trigger my Step Functions state machine:

const { StepFunctions } = require('aws-sdk');
const stepFunctions = new StepFunctions();

module.exports.get = async (event) => {
  try {
    // Await the call so we log the response (executionArn, startDate), not a pending Promise
    const stepFunctionResult = await stepFunctions.startExecution({
      stateMachineArn: process.env.EXAMPLE_FOUR_STEP_FUNCTION_ARN,
      input: JSON.stringify({
        detail: {
          title: "My fourth example",
          data: [
            { orderID: 1, quantity: 10 },
            { orderID: 2, quantity: 24 },
            { orderID: 3, quantity: 32 },
            { orderID: 4, quantity: 5 },
          ]
        }
      }),
    }).promise();
    console.log('stepFunctionResult =>', stepFunctionResult);

    return {
      statusCode: 200,
      body: JSON.stringify({
          message: `This is test API`,
      }, null, 2),
    };
  } catch (error) {
    console.log(error);
    // Return an explicit error response instead of resolving with undefined
    return {
      statusCode: 500,
      body: JSON.stringify({ message: 'Failed to start the execution' }),
    };
  }
};

And finally, always make sure your state machines handle errors. Any state can encounter runtime errors, and they can happen for various reasons. By default, when a state reports an error, AWS Step Functions causes the execution to fail entirely. For more about error handling you can visit this link
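
One way to handle errors is the Retry field that Task, Parallel and Map states accept (a Catch field is also available for fallbacks). As a rough illustration of how a Retry rule spaces out its attempts, the sketch below computes the wait before each retry from IntervalSeconds, MaxAttempts and BackoffRate. The `retryDelays` helper is hypothetical, and the defaults used are the ones given in the Amazon States Language. It does not model ErrorEquals matching or any jitter settings.

```javascript
// Compute the wait (in seconds) before each retry attempt for one Retry rule.
// Defaults follow the Amazon States Language: IntervalSeconds 1, MaxAttempts 3, BackoffRate 2.0.
function retryDelays({ IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2.0 } = {}) {
  const delays = [];
  for (let attempt = 0; attempt < MaxAttempts; attempt += 1) {
    // Each retry waits BackoffRate times longer than the previous one.
    delays.push(IntervalSeconds * BackoffRate ** attempt);
  }
  return delays;
}

console.log(retryDelays()); // [ 1, 2, 4 ]
console.log(retryDelays({ IntervalSeconds: 2, MaxAttempts: 4, BackoffRate: 1.5 })); // [ 2, 3, 4.5, 6.75 ]
```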

Conclusion

Step Functions is a very useful service: it helps you build complex features, decouple your code, and create orchestrated services

Through the examples above I tried to showcase how some real-world features can be built. You can end up making thousands of different workflows based on your requirements

For more articles like this, and to keep up with my work, you can follow me on LinkedIn
