How to filter and manipulate AWS Step Functions Input and Output data?

Taavi Rehemägi ・14 min read

In this handbook, we'll explain how to filter and manipulate input and output data in AWS Step Functions.

There's plenty to say about AWS Step Functions, and numerous articles have been written about the service since it was introduced in 2016. Most of these articles might make you think that Step Functions is merely an extension of Lambda that lets you chain several Lambda functions so they call each other.

However, that's not the case, and Step Functions is far more than that. AWS Step Functions lets you design and build the entire flow that executes the modules within your application in a simplified way. This enables developers to focus exclusively on ensuring that every module performs its primary task, without having to worry about connecting each module to all the others.

You can learn more about Step Functions in our Ultimate Guide To Step Functions and in The Best Step Function Use Cases articles.

What is Step Functions input/output?

To effectively design and implement workflows in AWS Step Functions, it's vital to understand how information flows from one state to the next. It's equally important to learn how to filter and manipulate this data.

In the diagram below, you'll notice the movement of JSON information throughout a task state:

InputPath, OutputPath, ResultPath, Parameters, and ResultSelector manipulate JSON as it passes through each state in your workflow.


A path is a string beginning with $ within the Amazon States Language, and you can use it to identify components found inside the JSON text. You can specify a path that will allow you access to subsets of the input when you're determining the values for InputPath, ResultPath, and the OutputPath.

Reference Paths

A reference path is also a path, but its syntax is restricted so that it can identify only a single node within a JSON structure:

  • You can access object fields using only dot (.) and square bracket ([ ]) notation.
  • The operators @ .. , : ? * aren't supported.
  • Functions like length() also aren't supported.

A good example of this is if the state input data has these values:

Moreover, these reference paths will return:
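As a rough illustration of these rules, the Python sketch below resolves reference paths using only dot and square bracket notation. The `get_path` helper and the sample input values are assumptions made for this example; Step Functions evaluates paths internally and does not expose such a function.

```python
import re

def get_path(doc, path):
    """Resolve a reference path (dot and [index] notation only) against doc."""
    if not path.startswith("$"):
        raise ValueError("a path must begin with $")
    node = doc
    # Each token is either .fieldName or [arrayIndex]
    for name, index in re.findall(r"\.(\w+)|\[(\d+)\]", path[1:]):
        node = node[name] if name else node[int(index)]
    return node

# Hypothetical state input, used only for illustration
state_input = {"foo": 123, "bar": ["a", "b", "c"], "car": {"cdr": True}}

print(get_path(state_input, "$.foo"))      # 123
print(get_path(state_input, "$.bar"))      # ['a', 'b', 'c']
print(get_path(state_input, "$.bar[1]"))   # b
print(get_path(state_input, "$.car.cdr"))  # True
```

Each call returns exactly one node, which is the property that distinguishes a reference path from a general path.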

Some states use paths and reference paths to control the flow of a state machine or to configure a state's settings or options.

InputPath, ResultSelector and Parameters

The InputPath, ResultSelector, and Parameters fields let you manipulate JSON as it moves through your workflow. InputPath limits the input that is passed on by filtering the JSON with a path. The ResultSelector field lets you manipulate the state's result before ResultPath is applied. The Parameters field lets you pass a collection of key-value pairs, whose values are either selected from the input with a path or static values that you define in your state machine definition.

AWS Step Functions applies the InputPath field first and the Parameters field second. You can first filter the raw input down to the selection you want with InputPath, and then apply Parameters to manipulate that input further or to add new values. After that, you can use the ResultSelector field to manipulate the state's output before ResultPath is applied.


Use InputPath to select a portion of the state input. For example, suppose that the input to your state includes:

Then, you can apply the InputPath:

Considering the previous InputPath, this is the JSON that's passed as the input:

Note that a path can also yield a selection of values, as in this example:

If you apply the path $.a[0:2], this will be the result:
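The two InputPath cases above can be sketched in Python. This is a minimal approximation, assuming a hypothetical `select` helper and made-up input values; it supports only field access, array indexes, and `[start:stop]` slices, which is a small subset of what Step Functions accepts in InputPath.

```python
import re

# Tokens: .field, [index], or [start:stop]
TOKEN = re.compile(r"\.(\w+)|\[(\d+)\]|\[(\d*):(\d*)\]")

def select(doc, path):
    """Apply a simplified InputPath to doc and return the selected JSON."""
    node = doc
    for name, index, start, stop in TOKEN.findall(path[1:]):
        if name:
            node = node[name]
        elif index:
            node = node[int(index)]
        else:
            # Slice token: an empty bound means "open-ended"
            node = node[int(start) if start else None:int(stop) if stop else None]
    return node

# Hypothetical state input with two datasets
state_input = {
    "comment": "Example for InputPath.",
    "dataset1": {"val1": 1, "val2": 2, "val3": 3},
    "dataset2": {"val1": "a", "val2": "b", "val3": "c"},
}

# Only dataset2 is passed on as the effective input
print(select(state_input, "$.dataset2"))  # {'val1': 'a', 'val2': 'b', 'val3': 'c'}

# A slice yields a selection of values
print(select({"a": [1, 2, 3, 4]}, "$.a[0:2]"))  # [1, 2]
```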


Use the ResultSelector field to manipulate a state's result before ResultPath is applied. The ResultSelector field lets you create a collection of key-value pairs, where the values are either static or selected from the state's result. The output of ResultSelector replaces the state's result and is passed to ResultPath.

ResultSelector is an optional field in these states:

  • Task
  • Parallel
  • Map

Additionally, Step Functions service integrations return metadata in addition to the payload in the result. ResultSelector can select portions of the result and merge them with the state input using ResultPath. This example selects just the resourceType and ClusterId and merges them with the state input from an Amazon EMR createCluster.sync call. Here's the example:

By using ResultSelector, you'll be able to select the resourceType and ClusterId:

Considering the given input, utilization of ResultSelector will produce:
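To make the selection step concrete, here is a small Python sketch of how a ResultSelector could be applied to a task result. The `apply_selector` helper, the shape of the task result, and the cluster ID are assumptions made for illustration; they are not the actual EMR response.

```python
def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.output.ClusterId."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def apply_selector(selector, result):
    """Build a new object: keys ending in .$ are read from result by path."""
    selected = {}
    for key, value in selector.items():
        if key.endswith(".$"):
            selected[key[:-2]] = get_path(result, value)
        else:
            selected[key] = value  # static value, copied as-is
    return selected

# Hypothetical task result shaped like a service-integration response
task_result = {
    "resourceType": "elasticmapreduce",
    "resource": "createCluster.sync",
    "output": {
        "SdkHttpMetadata": {"HttpStatusCode": 200},
        "ClusterId": "j-EXAMPLE12345",
    },
}

result_selector = {
    "ClusterId.$": "$.output.ClusterId",
    "ResourceType.$": "$.resourceType",
}

print(apply_selector(result_selector, task_result))
# {'ClusterId': 'j-EXAMPLE12345', 'ResourceType': 'elasticmapreduce'}
```

The selected object, rather than the full response with its metadata, is what ResultPath would then place into the state input.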


Use the Parameters field to create a collection of key-value pairs that are passed as input. The values can be selected from the input or the context object with a path, or they can be static values that you include in your state machine definition. For key-value pairs whose value is selected using a path, the key name must end in .$.

Take a look at the following input example:

Specifying these parameters within your state machine definition will enable you to select some of the information.

Considering the previous input along with the Parameters field, this is the JSON that'll pass:
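The following Python sketch approximates how a Parameters block is resolved against the state input. The `apply_parameters` helper and the product data are hypothetical and exist only to show the `.$` naming rule at work.

```python
import json

def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.product.details.size."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def apply_parameters(parameters, state_input):
    """Keys ending in .$ are looked up by path; everything else is static."""
    resolved = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            resolved[key[:-2]] = get_path(state_input, value)
        elif isinstance(value, dict):
            resolved[key] = apply_parameters(value, state_input)
        else:
            resolved[key] = value
    return resolved

# Hypothetical state input
state_input = {
    "comment": "Example for Parameters.",
    "product": {
        "details": {"color": "blue", "size": "small", "material": "cotton"},
        "availability": "in stock",
        "sku": "2317",
    },
}

# Hypothetical Parameters block from a state definition
parameters = {
    "comment": "Selecting what I care about.",
    "MyDetails": {
        "size.$": "$.product.details.size",
        "exists.$": "$.product.availability",
        "StaticValue": "foo",
    },
}

print(json.dumps(apply_parameters(parameters, state_input), indent=2))
```

Note that the `.$` suffix is dropped from the key names in the resolved JSON.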

In addition to the provided input, you can access a special JSON object known as the context object. This object includes information about your state machine execution.

It's worth mentioning that the Parameters field can also pass information to connected resources. For example, if your task state orchestrates an AWS Batch job, you can pass the relevant API parameters directly to the API actions of that service.


The ItemsPath field is used in a Map state to select an array within the input. A Map state repeats a set of steps for every item of an array found in the input. By default, a Map state sets ItemsPath to $, which selects the whole input. If the input to the Map state is a JSON array, it runs an iteration for every item in the array and passes that item to the iteration as input.

The ItemsPath field lets you specify the location in the input where the JSON array to iterate over can be found. The value of ItemsPath must be a reference path, and it must identify a value that is a JSON array. Consider input to a Map state that includes two arrays, as in the following example:

In this case, you can specify which array to use for the Map state iterations by selecting it with ItemsPath. This state machine definition uses ItemsPath to select the ThingsPiratesSay array from the input, and it then runs an iteration of the SayWord Pass state for every item in the ThingsPiratesSay array.

When input is processed, ItemsPath is applied after InputPath. It operates on the effective input to the state, that is, the input that remains after InputPath has filtered it.
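The behavior described above can be approximated in Python. In this sketch, `run_map` and `say_word` are hypothetical stand-ins for the Map state and its SayWord Pass state, and the phrases in the two arrays are invented for the example.

```python
def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.ThingsPiratesSay."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def say_word(item):
    """Stand-in for the SayWord Pass state: returns its input unchanged."""
    return item

def run_map(state_input, iterator, items_path="$"):
    """Run one iteration per array item selected by items_path."""
    items = get_path(state_input, items_path)
    if not isinstance(items, list):
        raise TypeError("ItemsPath must identify a JSON array")
    return [iterator(item) for item in items]

# Hypothetical input containing two arrays
state_input = {
    "ThingsPiratesSay": [{"say": "Avast!"}, {"say": "Yar!"}],
    "ThingsGiantsSay": [{"say": "Fee!"}, {"say": "Fi!"}],
}

# Only the pirates' array is iterated; the giants' array is ignored
print(run_map(state_input, say_word, items_path="$.ThingsPiratesSay"))
# [{'say': 'Avast!'}, {'say': 'Yar!'}]
```

With the default `items_path="$"`, the same call would fail for this input because the whole input is an object rather than an array.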


The output of a state can be a copy of its input, the result it produces (for example, the output of a Task state's Lambda function), or a combination of its input and result. ResultPath lets you control which of these is passed to the state output.

These are the state types that can generate a result and can include ResultPath:

  • Task
  • Parallel
  • Pass

Use ResultPath to combine the task result with the task input, or to select just one of them. The path that you provide to ResultPath controls which information passes to the output.

Additionally, ResultPath is limited to reference paths, which restricts it to identifying a single node in the JSON.

Utilizing ResultPath

You can utilize ResultPath to:

  • Replace Input with Result
  • Discard Result and Keep Input
  • Include Result with Input
  • Update a Node in Input with Result
  • Include Input and Error in a Catch

Utilize ResultPath for Replacing the Input with the Result

If you don't specify a ResultPath, the default behavior is the same as if you had specified "ResultPath": "$". This tells the state to replace the whole input with the result, so the state input is replaced entirely by the task result.

In this diagram, you can see how the ResultPath will entirely replace the input with the result of the given task:

Using the Lambda function and the state machine, pass this input:

Then, the Lambda function will provide you with this result:

If ResultPath isn't specified in the state, or if "ResultPath": "$" is set, the state's input is replaced by the Lambda function's result, and the state's output is as follows:

ResultPath gives you a way to include content from the result with the input before it is passed to the output. However, if ResultPath isn't specified, the default is to replace the entire input.

Discarding the Result and Keeping the Original Input

If you set ResultPath to null, the original input is passed straight to the output. With "ResultPath": null (the JSON null value, not the string "null"), the state's input payload is copied directly to the output, regardless of the result.

This diagram showcases how a null ResultPath copies the input straight to the output.
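The difference between the default, "$", and null can be sketched in Python, where a JSON null is parsed as `None`. The `apply_result_path` helper, the state definitions, and the sample values are assumptions made for this illustration, and only these two behaviors are modeled.

```python
import json

def apply_result_path(state_definition, state_input, result):
    """Model only the two behaviors above: replace the input or keep it."""
    # A missing ResultPath behaves like "$"; JSON null arrives here as None
    result_path = state_definition.get("ResultPath", "$")
    if result_path is None:
        return state_input  # discard the result, keep the original input
    if result_path == "$":
        return result       # replace the whole input with the result
    raise NotImplementedError("this sketch covers only '$' and null")

# Hypothetical state input and Lambda result
state_input = {"comment": "This is a test", "who": "AWS Step Functions"}
lambda_result = "Hello, AWS Step Functions!"

default_state = json.loads('{"Type": "Task"}')
replace_state = json.loads('{"Type": "Task", "ResultPath": "$"}')
discard_state = json.loads('{"Type": "Task", "ResultPath": null}')

print(apply_result_path(default_state, state_input, lambda_result))
print(apply_result_path(replace_state, state_input, lambda_result))
print(apply_result_path(discard_state, state_input, lambda_result))
```

The first two calls print the Lambda result, while the third prints the untouched input.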

Utilize ResultPath to Include the Result with the Given Input

This example shows how the ResultPath includes the result within the input.

Using the Lambda function and the state machine, you can pass this input:

The result of the Lambda function will be:

To preserve the input, insert the Lambda function's result into it, and pass the combined JSON to the next state, set ResultPath as follows:

This includes the Lambda function's result with the original input.

The Lambda function's output is appended to the original input as the value of taskresult. The input, along with the newly inserted value, is forwarded to the next state.

You can also insert the result into a child node of the input by setting ResultPath as follows:

Begin the execution by using this input:

The Lambda function's result is inserted into the input as a child of the strings node.

The state output now includes the original JSON input with the result as a child node.
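Both ways of including the result with the input can be sketched in Python. The `apply_result_path` helper below is a hypothetical approximation that handles dot-notation reference paths only, and the input values are made up for the example.

```python
import copy

def apply_result_path(state_input, result, result_path):
    """Insert result into a copy of state_input at the given reference path."""
    keys = [key for key in result_path[1:].split(".") if key]
    if not keys:
        return result  # "$" replaces the whole input
    output = copy.deepcopy(state_input)
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})  # walk down, creating nodes if needed
    node[keys[-1]] = result
    return output

lambda_result = "Hello, AWS Step Functions!"

# Case 1: add the result as a new top-level node
first_input = {"comment": "This is a test", "who": "AWS Step Functions"}
print(apply_result_path(first_input, lambda_result, "$.taskresult"))

# Case 2: add the result as a child of an existing node
second_input = {
    "comment": "An input comment.",
    "strings": {"string1": "foo", "string2": "bar"},
    "who": "AWS Step Functions",
}
print(apply_result_path(second_input, lambda_result, "$.strings.lambdaresult"))
```

In both cases the original input is preserved and only the node named by ResultPath is added.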

Utilize ResultPath to Update a Node Within the Input with the Result

The diagram below showcases how ResultPath can update the existing JSON nodes' value within the input with task result's values:

Utilizing the example of Lambda function and the state machine described within the "Tutorial for Creating a Step Functions State Machine That Uses Lambda," you can pass this input:

The Lambda function result is:

Instead of inserting the result as a new node within JSON and preserving the input, you can entirely overwrite an existing node.

For example, just as omitting ResultPath or setting "ResultPath": "$" overwrites the entire input, you can specify an individual node to overwrite with the result.

Since the comment node already exists in the state input, setting ResultPath to "$.comment" replaces that node in the input with the Lambda function's result. Without any further filtering by OutputPath, the following is passed to the output:

The comment node's value, "This is an input and output test of a Task state.", will be replaced with the Lambda function's result: "Hello, AWS Step Functions!" within the state output.
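Overwriting an existing node works the same way as inserting a new one, as this short Python sketch suggests. The `overwrite_node` helper is hypothetical and handles only a top-level node such as `$.comment`; the extra `details` and `who` fields in the input are assumed for the example.

```python
def overwrite_node(state_input, result, result_path):
    """Replace one top-level node of the input with the task result."""
    if not result_path.startswith("$."):
        raise ValueError("this sketch expects a path like $.comment")
    output = dict(state_input)         # shallow copy keeps the other nodes
    output[result_path[2:]] = result   # existing value is overwritten
    return output

# Hypothetical state input; only the comment text comes from the walkthrough
state_input = {
    "comment": "This is an input and output test of a Task state.",
    "details": "Default example",
    "who": "AWS Step Functions",
}
lambda_result = "Hello, AWS Step Functions!"

state_output = overwrite_node(state_input, lambda_result, "$.comment")
print(state_output["comment"])  # Hello, AWS Step Functions!
print(sorted(state_output))     # ['comment', 'details', 'who']
```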

Utilize ResultPath to Include Both Input and Error in a Catch

The tutorial "Use Step Functions State Machine for Handling Error Conditions" explains how you can use a state machine to catch errors. Sometimes, you might want to preserve the original input along with the error. Use ResultPath in a Catch to include the error with the original input instead of replacing it.

If the previous Catch statement catches an error, it includes the result in an error node within the state input. Consider the following input as an example:

This is what the state output looks like when an error is caught:
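As a loose analogy in Python, a Catch with a ResultPath behaves like an exception handler that attaches the error details to a copy of the input. The `run_with_catch` helper, the failing task, and the input are assumptions made for this sketch; the `Error` and `Cause` field names mirror the error output that Step Functions produces.

```python
def failing_task(event):
    """Hypothetical task that always fails."""
    raise ValueError("Something went wrong")

def run_with_catch(state_input, task, error_key="error"):
    """Mimic a Catch whose ResultPath is "$.<error_key>"."""
    try:
        return task(state_input)
    except Exception as exc:
        output = dict(state_input)  # keep the original input
        output[error_key] = {"Error": type(exc).__name__, "Cause": str(exc)}
        return output

state_input = {"foo": "bar"}
print(run_with_catch(state_input, failing_task))
# {'foo': 'bar', 'error': {'Error': 'ValueError', 'Cause': 'Something went wrong'}}
```

Because the error lands in its own node, the next state can still read every field of the original input.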


OutputPath lets you select a portion of the state output to pass to the next state. This way, you can filter out unwanted information and pass on only the JSON segments you care about.

If you don't specify an OutputPath, the default value is $. This passes the entire JSON node (determined by the state input, the task result, and ResultPath) to the next state.

InputPath, OutputPath and ResultPath Examples

Any state except a Fail state can include InputPath, OutputPath, or ResultPath. All of them allow you to utilize a path to filter the JSON as it moves throughout your workflow.

For example, you can modify the state machine, so it includes InputPath, OutputPath, and ResultPath:

Utilize this input to start an execution:

Assume that the comment and the extra nodes can be discarded, and that you want to include the Lambda function's output while preserving the information in the data node.

In the updated state machine, the Task state is modified to process the input to the task.

The InputPath line in the state machine definition limits the task input to the lambda node of the state input. The Lambda function receives only the JSON object {"who": "AWS Step Functions"} as input.

The ResultPath tells the state machine to insert the Lambda function's result into a node named lambdaresult, as a child of the data node in the original state machine input. Without any further processing by OutputPath, the state's output would now include the Lambda function's result together with the original input.

Because the goal is to keep only the data node while including the Lambda function's result, OutputPath filters this combined JSON before passing it to the state output.

It selects only the data node from the original input (including the lambdaresult child inserted by ResultPath) to pass on to the output. The state output is filtered down to the following:

Furthermore, in this particular Task state:

  • InputPath sends only the lambda node from the state input to the Lambda function.
  • ResultPath inserts the result as a child of the data node in the original input.
  • OutputPath filters the state input (which now includes the Lambda function's result) so that only the data node passes to the state output.
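The three steps summarized above can be chained in a short Python sketch. The `run_task_state` helper and the `hello` function are hypothetical stand-ins for the Task state and its Lambda function, and the input values are assumed for illustration; only dot-notation reference paths are handled.

```python
import copy

def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.data."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def set_path(doc, path, value):
    """Return a copy of doc with value placed at the reference path."""
    keys = [key for key in path[1:].split(".") if key]
    if not keys:
        return value
    output = copy.deepcopy(doc)
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return output

def run_task_state(state_input, task, input_path="$", result_path="$", output_path="$"):
    task_input = get_path(state_input, input_path)       # 1. InputPath
    result = task(task_input)                            # 2. run the task
    combined = set_path(state_input, result_path, result)  # 3. ResultPath
    return get_path(combined, output_path)               # 4. OutputPath

def hello(event):
    """Stand-in for the Lambda function."""
    return f"Hello, {event['who']}!"

state_input = {
    "comment": "An input comment.",
    "data": {"val1": 23, "val2": 17},
    "extra": "foo",
    "lambda": {"who": "AWS Step Functions"},
}

print(run_task_state(
    state_input, hello,
    input_path="$.lambda",
    result_path="$.data.lambdaresult",
    output_path="$.data",
))
# {'val1': 23, 'val2': 17, 'lambdaresult': 'Hello, AWS Step Functions!'}
```

The comment and extra nodes never reach the output, while the data node arrives enriched with the task result.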

Context Object

The context object is an internal JSON structure that's available during an execution. It contains information about your state machine and the execution, which gives your workflows access to information about their specific execution. You can access the context object from these fields:

  • InputPath
  • OutputPath
  • ResultSelector
  • Parameters
  • ItemsPath (within Map states)
  • Variable (within Choice states)
  • Variable-to-variable comparison operators

Context Object Formats

The context object has information about execution, state, task, and the state machine. Moreover, this JSON object includes nodes for every type of data and is usually formatted like this:

During an execution, the context object is populated with the data that is relevant to the Parameters field from which it's accessed. The value of the Task field is null if the Parameters field is outside of a task state.

Content from a running execution also includes these specifics in this format:

How to Access the Context Object?

To access the context object, first specify the parameter name with .$ appended to the end, just as you do when selecting state input with a path. Then, to access context object data instead of the input, prefix the path with $$. (two dollar signs followed by a dot). This tells AWS Step Functions to use the path to select a node in the context object.

This Task state example uses a path to retrieve the execution's Amazon Resource Name (ARN) and pass it to an Amazon SQS message.
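The lookup rule can be sketched in Python as follows. The `apply_parameters` helper, the context object contents, the queue URL, and the account number are all placeholders assumed for this example; the only behavior taken from the text is that a path starting with `$$.` reads from the context object, while a path starting with `$` reads from the state input.

```python
def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.Execution.Id."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def apply_parameters(parameters, state_input, context):
    """Resolve .$ keys against the context object ($$.) or the input ($.)."""
    resolved = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            if value.startswith("$$"):
                # Drop one "$" so the rest reads as a path into the context
                resolved[key[:-2]] = get_path(context, value[1:])
            else:
                resolved[key[:-2]] = get_path(state_input, value)
        elif isinstance(value, dict):
            resolved[key] = apply_parameters(value, state_input, context)
        else:
            resolved[key] = value
    return resolved

# Hypothetical context object and state input
context = {
    "Execution": {
        "Id": "arn:aws:states:us-east-1:123456789012:execution:MyStateMachine:my-run",
        "Name": "my-run",
    },
    "State": {"Name": "SendMessage"},
}
state_input = {"order": "A-100"}

# Hypothetical Parameters for a task that sends an SQS message
parameters = {
    "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/myQueue",
    "MessageBody": {
        "Order.$": "$.order",
        "ExecutionArn.$": "$$.Execution.Id",
    },
}

message = apply_parameters(parameters, state_input, context)
print(message["MessageBody"]["ExecutionArn"])
```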

Context Object Data for Map States Processing

Two more items are available in the context object when processing a Map state: Index and Value. For each iteration, Index contains the index number of the array item currently being processed, while Value contains the array item itself. Within a Map state, the context object includes:

These items are available only in a Map state, and they can be specified in the Parameters field, before the Iterator section. Additionally, you have to define parameters from the context object in the Parameters block of the main Map state, not inside the states included in the Iterator section.

It's possible to inject information from the context object if you're given a state machine with a simple Map state:

By executing the previous state machine with the provided input, both Value and Index will be inserted in the output.

The execution output is:
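A Python sketch of this behavior is shown below. The `run_map_with_context` helper, the parameter names, and the input items are assumptions made for illustration; the iterator is modeled as a Pass state that returns its input unchanged.

```python
def get_path(doc, path):
    """Resolve a dot-notation reference path such as $.Map.Item.Index."""
    node = doc
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

def run_map_with_context(items, parameters):
    """Build each iteration's input from the Map-specific context items."""
    outputs = []
    for index, value in enumerate(items):
        context = {"Map": {"Item": {"Index": index, "Value": value}}}
        iteration_input = {
            # Every key here ends in .$ and points into the context ($$.)
            key[:-2]: get_path(context, path[1:])
            for key, path in parameters.items()
        }
        outputs.append(iteration_input)  # Pass state: output equals input
    return outputs

# Hypothetical Map state input and Parameters block
map_input = [{"who": "bob"}, {"who": "meg"}, {"who": "joe"}]
parameters = {
    "ContextIndex.$": "$$.Map.Item.Index",
    "ContextValue.$": "$$.Map.Item.Value",
}

for entry in run_map_with_context(map_input, parameters):
    print(entry)
# {'ContextIndex': 0, 'ContextValue': {'who': 'bob'}}
# {'ContextIndex': 1, 'ContextValue': {'who': 'meg'}}
# {'ContextIndex': 2, 'ContextValue': {'who': 'joe'}}
```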

Wrapping up

An accurate description of Step Functions would be "state-as-a-service." Without it, we wouldn't be able to maintain the state of each execution that involves multiple Lambda functions or activities.

It's vital to keep on top of your Step Functions' performance because workflows can go wrong, which can severely affect your end users. AWS Step Functions publishes events and metrics to CloudTrail and CloudWatch. Dashbird monitors these, imports them into a single dashboard, combines them with metrics from other AWS services, and translates them into easy-to-understand, actionable data.

Dashbird's Insights engine detects errors related to state machine definitions or task execution failures in real time and notifies you immediately, via Slack or email, when something within your workflows breaks or is about to go wrong. The Insights engine is based on AWS Well-Architected best practices and constantly runs your whole serverless infrastructure's data against its rules to help you make sure your app is optimized and reliable at any scale.

Prevent serverless errors with AI-driven insights

You can give Dashbird a try -- it's free!

  • No code changes
  • No credit card required
  • Simple 2-minute set up
  • Get access to all premium features
  • Start receiving automated alerts and securely working with your data immediately
  • Find and debug known and unknown errors in seconds
  • Get customized actionable insights to improve and Well-Architect your system to be able to take on more complexity over time
  • Simple, clean, and easy to understand interface
  • One of the most budget-friendly monitoring and troubleshooting solutions in the market
  • Supportive and friendly all around 🙂 -- see what Dashbird users are saying
