
Kohei (Max) MATSUSHITA for AWS Heroes


How to set up cross-account access to Amazon Kinesis Data Streams

Overview - Amazon Kinesis Data Streams and separated AWS accounts

Amazon Kinesis Data Streams (hereafter referred to as KDS) is a managed data processing service designed to collect large volumes of data in real time and pass it on to subsequent AWS services. It is particularly well suited to streaming data where order matters, such as logs, which makes it a commonly used service for IoT data collection. For example, it can be specified as the data export destination for Amazon Monitron, which enables predictive maintenance of industrial equipment through machine learning.

KDS sits between data producers and data consumers, as shown in the typical architecture below.

Typical Architecture using Amazon Kinesis Data Streams

When making full use of the AWS Cloud, it is not uncommon to operate separate AWS accounts to keep workloads isolated and to manage costs. In such cases, there may be a need to process streaming data collected by KDS in another AWS account.

This post walks through the procedure for sharing a data stream with another AWS account and consuming it from AWS Lambda (with the data stream configured as the Lambda function's trigger), using the resource-based policies feature of Amazon Kinesis Data Streams released in November 2023. A data stream, in this context, is akin to a pipeline through which data flows.

The Architecture and Setup

The architecture is as follows.

The architecture for cross-account access

Key Points for Setup

Here are the key points for setup:

  • The execution role of the Lambda function on the data processing side (Account B) requires the AWSLambdaKinesisExecutionRole managed policy.
  • The KDS data stream on the data producer side (Account A) needs a resource-based policy that specifies the ARN of that Lambda execution role as the Principal, and that allows the actions kinesis:DescribeStream, kinesis:DescribeStreamSummary, kinesis:GetRecords, kinesis:GetShardIterator, and kinesis:ListShards.

Regarding the second point in particular, note that kinesis:DescribeStream is easy to miss when using the dialog in the management console; it has to be added manually in the JSON editor (observed as of February 2024).
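
For reference, the same policy can also be applied with the AWS CLI, which sidesteps the console caveat above. This is a minimal sketch; the account IDs (999900009999 for Account A, 888800008888 for Account B), the Region, and the role name are the example values that appear later in this post, so replace them with your own.

# Run in Account A (data producer side). Attaches the resource-based policy to the
# stream, allowing the Lambda execution role in Account B to read from it.
aws kinesis put-resource-policy \
  --resource-arn arn:aws:kinesis:us-east-1:999900009999:stream/kds-sharing-example1 \
  --policy '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::888800008888:role/service-role/kds-reader1-role-q6zcv9kq"
      },
      "Action": [
        "kinesis:DescribeStream",
        "kinesis:DescribeStreamSummary",
        "kinesis:GetRecords",
        "kinesis:GetShardIterator",
        "kinesis:ListShards"
      ],
      "Resource": "arn:aws:kinesis:us-east-1:999900009999:stream/kds-sharing-example1"
    }]
  }'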

The following official documents might also be helpful:

Steps

  • The steps alternate between the two accounts: data producer side (Account A) → data processing side (Account B) → data producer side → data processing side. Be careful not to mix up which account you are working in.
  • Everything must be in the same Region; sharing does not work across Regions (e.g., a data stream in us-west-2 cannot trigger a Lambda function in ap-northeast-1). If you wish to send data to a different Region, consider the architecture using Amazon EventBridge mentioned in the epilogue.

Step 1) On the data producer side - Account A

1-1. Create a data stream in Amazon Kinesis (e.g., kds-sharing-example1)

See here for the creation method (Step 1: Create Data Stream).

Create Data Stream

For testing or small data volumes, the "Provisioned" capacity mode with "1" provisioned shard is sufficient.

NOTE: While this guide assumes the creation of a new data stream, an existing data stream can also be used.
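
If you prefer the AWS CLI, a minimal sketch of creating the same provisioned stream looks like the following (the stream name is the example used throughout this post).

# Run in Account A. Creates a provisioned-mode data stream with a single shard.
aws kinesis create-stream \
  --stream-name kds-sharing-example1 \
  --shard-count 1

# Wait until the new stream becomes ACTIVE before using it.
aws kinesis wait stream-exists --stream-name kds-sharing-example1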

Step 2) On the data processing side - Account B

2-1. Create a Lambda function in AWS Lambda (e.g., kds-reader1).

See here for the creation method (Create a Lambda function with the console).

Create a Lambda function

The source code is as follows (Python 3). It simply prints the incoming event to Amazon CloudWatch Logs, which is enough to verify the setup. After replacing the function code as shown below, click "Deploy".

def lambda_handler(event, context):
    print(event)
    return True

NOTE: The Lambda console provides a test event template with Amazon Kinesis Data Streams sample data, which can be used to test the function.

Test data for Lambda function

2-2. Attach the AWSLambdaKinesisExecutionRole policy to the execution role of kds-reader1.

After opening the details of kds-reader1, go to "Configuration" > "Permissions" and click the role name assigned as the execution role to open the role's settings (in the figure below, click kds-reader1-role-mp67l1v2).

Configuration > Permissions on Lambda Function

Go to "Add Permission" > "Attach Policy" for the permission policy to display the list of policies to attach.

Attach Policy on AWS IAM role

Select AWSLambdaKinesisExecutionRole from the list of "Other Permission Policies" and then click "Add Permission".

AWSLambdaKinesisExecutionRole

Policy attachment is now complete.
Confirm that AWSLambdaKinesisExecutionRole appears in the list of attached policies, as shown below.

list of allowed policies

NOTE: This guide assumes the execution role is the IAM role that is automatically created and attached when the Lambda function is created. An existing IAM role can also be used.
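
The same attachment can also be done with the AWS CLI. This is a minimal sketch, assuming the auto-generated role name used in this post.

# Run in Account B. Attaches the AWS managed policy to the Lambda execution role.
aws iam attach-role-policy \
  --role-name kds-reader1-role-q6zcv9kq \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaKinesisExecutionRole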

2-3. Note down the ARN of the IAM role.

The ARN of this IAM role (kds-reader1-role-q6zcv9kq in this example) will be used in the next step.
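
If you are working from the CLI, the ARN can be retrieved as follows (again assuming the example role name).

# Run in Account B. Prints the execution role's ARN so it can be noted down.
aws iam get-role \
  --role-name kds-reader1-role-q6zcv9kq \
  --query 'Role.Arn' --output text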

Step 3) Back on the data producer side - Account A

3-1. Configure a resource-based policy for kds-sharing-example1 in Amazon Kinesis

After viewing the details of kds-sharing-example1, go to "Data stream sharing" > "Create Policy".

Create Policy

In Policy Details, select the Visual Editor, check "Data stream sharing throughput read Access," enter the ARN of the IAM role you wrote down earlier in "Specify Principal(s)," and click "Create Policy."

Visual Editor

NOTE: The principal should be the ARN of the IAM role; specifying an ARN other than the IAM role's ARN (e.g., the ARN of a Lambda function or data stream) will result in an error and the policy cannot be created.

When the resource-based policy appears, click "Edit" to open the JSON editor. Add "kinesis:DescribeStream" to the list of Actions, as shown below. Finally, click "Save changes".

Add a privilege

NOTE: If this permission was already added at the visual editor stage, the editing step above is not necessary.

Configuration of the resource-based policy is complete.
Confirm that the ARN of the IAM role is set as the Principal and that the five permissions are listed under Action, as shown below.

Resource-based policy

3-2. Note down the ARN of kds-sharing-example1.

The ARN of the kds-sharing-example1 data stream will be used in the next step.
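
The stream ARN can also be looked up with the AWS CLI, for example:

# Run in Account A. Prints the data stream's ARN so it can be noted down.
aws kinesis describe-stream-summary \
  --stream-name kds-sharing-example1 \
  --query 'StreamDescriptionSummary.StreamARN' --output text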

Step 4) Back on the data processing side - Account B

4-1. Set up a trigger for kds-reader1 in AWS Lambda

After viewing the details of kds-reader1, go to "Configuration" > "Triggers" > "Add trigger".

Add a Trigger

In the Trigger configuration, select Kinesis for "Select a source". Then enter the ARN of kds-sharing-example1 in the "Kinesis stream" field that appears. Leave the other settings as they are and click "Add".

Trigger Configuration

NOTE 1: When the text box gains focus, the message "No item" is displayed. This is not a problem and can be ignored.

NOTE 2: If you get an API error when clicking "Add", check the following two things: (1) the resource-based policy on the data stream side (Account A), in particular that kinesis:DescribeStream is included, and (2) the permissions of the Lambda function's execution role, in particular that the AWSLambdaKinesisExecutionRole policy is attached.
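
Both checks can also be made from the CLI. This is a minimal sketch using the example names from this post.

# (1) Run in Account A. Shows the resource-based policy attached to the stream;
#     confirm that kinesis:DescribeStream appears in the Action list.
aws kinesis get-resource-policy \
  --resource-arn arn:aws:kinesis:us-east-1:999900009999:stream/kds-sharing-example1

# (2) Run in Account B. Lists the policies attached to the execution role;
#     confirm that AWSLambdaKinesisExecutionRole is among them.
aws iam list-attached-role-policies \
  --role-name kds-reader1-role-q6zcv9kq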

Trigger configuration is complete.
You can see that Kinesis has been added to the trigger (input source) of kds-reader1 as shown below.

Configuration Function
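
As a CLI alternative, the same trigger can be created as an event source mapping. This is a minimal sketch, assuming the example stream ARN and Region from this post.

# Run in Account B. Creates the cross-account Kinesis trigger for kds-reader1.
aws lambda create-event-source-mapping \
  --function-name kds-reader1 \
  --event-source-arn arn:aws:kinesis:us-east-1:999900009999:stream/kds-sharing-example1 \
  --starting-position LATEST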

How to check

To verify the setup, send data to the data stream (kds-sharing-example1 in this example) on the data producer side (Account A) and check the Amazon CloudWatch Logs output on the data processing side (Account B).

From AWS CloudShell on the data producer side (Account A), send data to the data stream with the AWS CLI.

aws --cli-binary-format raw-in-base64-out \
  kinesis put-record --stream-name kds-sharing-example1 \
  --partition-key DUMMY1 \
  --data '{"this_is": "test record"}'

If the result of the command execution is as follows, the data transmission has succeeded.

{
    "ShardId": "shardId-000000000000",
    "SequenceNumber": "49649718468451075013017298672854645152715037125279481858"
}

If a log entry like the following appears in the kds-reader1 log group in CloudWatch Logs on the data processing side (Account B), the setup was successful.

{'Records': [{'kinesis': {'kinesisSchemaVersion': '1.0', 'partitionKey': 'DUMMY1', 'sequenceNumber': '49649683864473953433843496278632352517587667908684677122', 'data': 'eyJ0aGlzX2lzIjogInRlc3QgcmVjb3JkIn0=', 'approximateArrivalTimestamp': 1709100911.667}, 'eventSource': 'aws:kinesis', 'eventVersion': '1.0', 'eventID': 'shardId-000000000000:49649683864473953433843496278632352517587667908684677122', 'eventName': 'aws:kinesis:record', 'invokeIdentityArn': 'arn:aws:iam::888800008888:role/service-role/kds-reader1-role-q6zcv9kq', 'awsRegion': 'us-east-1', 'eventSourceARN': 'arn:aws:kinesis:us-east-1:999900009999:stream/kds-sharing-example1'}]}
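
Note that the data field in the log is Base64-encoded. Decoding it returns the test record that was sent, which can be confirmed with a quick command such as:

# Decodes the Base64 payload from the Kinesis record back into the original JSON.
echo 'eyJ0aGlzX2lzIjogInRlc3QgcmVjb3JkIn0=' | base64 --decode
# => {"this_is": "test record"}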

Epilogue - Architecture with Amazon EventBridge

This article introduced how to share data streams using Amazon Kinesis Data Streams resource-based policies. This allows, for example, the Amazon Monitron data mentioned at the beginning of this article to be used by other AWS accounts, giving you more flexibility in how you operate your AWS accounts.

Another possible architecture for using Kinesis data streams from other AWS accounts is to send the data through an Amazon EventBridge event bus.

The architecture using Amazon EventBridge

The advantage of this approach is that it can be configured without code, using managed services. It also works across different Regions, and although Amazon EventBridge incurs additional charges, the architecture is well worth considering.

Here are some URLs to help you create this architecture.

I personally believe that being able to build a configuration that matches your skills and the kind of operations you want to achieve is the best part of composing AWS building blocks.

Beyond this configuration, it is a good idea to try whatever configuration fits the situation at the time, together with new features that may come out in the future!

