In this blog post, we show how you can control access to documents in Amazon OpenSearch Service. We designed and implemented the solution around our specific use cases, but it can be adapted to other use cases.
The technologies we will use in this solution are:
- Amazon OpenSearch Service.
- AWS CloudFormation.
- AWS Lambda.
- Python.
- Azure AD (optional).
Solution Overview
There were two Elasticsearch domains, used mainly to store application logs. One domain was dedicated to production logs and the other to logs from other environments such as development and testing. The setup lacked authentication and authorization: anyone with VPN access could open Kibana and search any document stored in the Elasticsearch domains. The main drivers for upgrading to Amazon OpenSearch Service were to control access to the stored data with an RBAC model and to use newer OpenSearch features such as warm and cold storage, which eventually reduce cost.
The requirements for this solution are:
- Authenticate users using Azure AD.
- Allow each team to access the data that belongs to its applications or infrastructure.
- Have one OpenSearch domain for the log data of all environments.
This post does not explain how to ship data from your applications or infrastructure to OpenSearch; that will be covered in another blog post series.
Before we go into the details, I recommend reading about fine-grained access control.
Authentication Implementation
With the three requirements defined, we implement them one by one. For the first requirement, we worked with the Azure AD team to design the authentication process:
- Each team is assigned two Azure AD groups: Dev and Lead.
- The Azure AD team creates an application in Azure AD.
- OpenSearch is configured to authenticate through SAML 2.0 with Azure AD.
- Users access OpenSearch Dashboards (previously Kibana) from myapplications.microsoft.com.
- Authentication starts by checking whether the user is allowed to access the Azure AD application (assigned by the Azure AD team). The user is then authenticated against Azure AD and, if successful, a SAML assertion is returned to OpenSearch.
Note: Before configuring SAML in OpenSearch, log in with your admin user and map your Azure AD group to the predefined role all_access; otherwise you will not be able to perform most actions.
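The mapping in the note above can also be expressed as a request body for the role-mapping REST endpoint (the same `rolesmapping` endpoint used later in this post). This is a minimal sketch: the helper name and the group identifier are hypothetical, and the value to use is whatever your Azure AD application sends in the SAML roles attribute.

```python
import json


def build_role_mapping(backend_roles, users=None):
    """Build the JSON body for a role-mapping PUT request."""
    # Drop empty entries and duplicates while keeping the original order.
    cleaned = list(dict.fromkeys(r.strip() for r in backend_roles if r.strip()))
    body = {"backend_roles": cleaned}
    if users:
        body["users"] = list(users)
    return body


# Hypothetical Azure AD group identifier, for illustration only.
mapping = build_role_mapping(["aad-admins-group-id"], users=["admin"])
print(json.dumps(mapping))
```

A PUT on a role mapping replaces the existing mapping, so the body should also list any users or backend roles that are already mapped and must keep their access.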
On the OpenSearch side, you only need to configure SAML authentication; the blog post linked here shows how to configure SAML for OpenSearch.
Authorization Implementation
First, we need to explain some RBAC concepts in OpenSearch. Permissions are managed in OpenSearch using users, roles, and mappings.
- User: an entity that represents someone accessing OpenSearch. A user can be internal (stored in and managed by OpenSearch) or external; in our case, users exist in Azure AD and are managed by that external identity provider.
- Role: an entity that holds the permissions a user can exercise. Permissions can be cluster-level, index-level, document-level, and field-level. OpenSearch comes with predefined roles, which we will use.
- Mapping: an entity that maps users to roles. There are two models: map a role to backend roles (e.g. Azure AD groups or IAM roles), or map it directly to users.
Back to the second requirement: allow teams to access the data that belongs to their applications or infrastructure. To achieve that, we made the following decisions:
- Each team has an OpenSearch role. The role has only the permissions the team needs to query data, with no permission to manipulate the cluster or the indices. Since the applications and infrastructure run in AWS and each team has a set of AWS accounts, we apply document-level security on the role based on AccountId.
- Each team has a dedicated tenant, which lets the team members share their work.
To keep things simple, we first do the above as manual steps, then automate the process using AWS CloudFormation and Lambda.
CREATE ROLE
- In OpenSearch Dashboards, choose Security, Roles, and Create role.
- Provide a name for the role, e.g. Team_A.
- Provide cluster permissions. Our use case needed the following permissions:
'kibana_all_write',
'cluster_composite_ops_ro',
'indices:data/write/bulk*',
'cluster:admin/opendistro/reports/definition/list',
'cluster:admin/opendistro/reports/instance/list',
'cluster:admin/opendistro/reports/instance/get',
'cluster:admin/opendistro/reports/definition/create',
'cluster:admin/opendistro/reports/definition/update',
'cluster:admin/opendistro/reports/definition/on_demand',
'cluster:admin/opendistro/reports/definition/delete',
'cluster:admin/opendistro/reports/definition/get',
'cluster:admin/opendistro/reports/menu/download',
'cluster:admin/opendistro/alerting/alerts/get',
'cluster:admin/opendistro/alerting/alerts/ack',
'cluster:admin/opendistro/alerting/monitor/write',
'cluster:admin/opendistro/alerting/monitor/delete',
'cluster:admin/opendistro/alerting/monitor/execute',
'cluster:admin/opendistro/alerting/monitor/get',
'cluster:admin/opendistro/alerting/monitor/search',
'cluster:admin/opendistro/alerting/destination/get',
'cluster:admin/opendistro/alerting/destination/write',
'cluster:admin/opendistro/alerting/destination/delete',
'cluster:admin/opendistro/alerting/destination/email_account/delete',
'cluster:admin/opendistro/alerting/destination/email_account/get',
'cluster:admin/opendistro/ad/detector/search',
'cluster:admin/opendistro/ad/detector/delete',
'cluster:admin/opendistro/ad/detector/info',
'cluster:admin/opendistro/ad/detector/jobmanagement',
'cluster:admin/opendistro/ad/detector/preview',
'cluster:admin/opendistro/ad/detector/run',
'cluster:admin/opendistro/ad/detector/stats',
'cluster:admin/opendistro/ad/detector/write',
'cluster:admin/opendistro/ad/result/search'
- Under Index Permission, provide the index patterns e.g. logs-*
- Under Allowed Actions, provide read.
- Under Document-Level Security, provide your query e.g.
{
  "terms": {
    "accountId": [
      123,
      456
    ]
  }
}
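When the role is created through the REST API instead of the UI, the document-level security query is supplied as a JSON string in the role's `dls` field. The sketch below builds that string from a list of account IDs. The function name is hypothetical, and treating the IDs as strings is an assumption that depends on how `accountId` is mapped in your indices.

```python
import json


def build_dls_query(account_ids, field="accountId"):
    """Return the DLS terms query as a JSON string."""
    # Normalize the IDs and drop empty entries.
    ids = [str(a).strip() for a in account_ids if str(a).strip()]
    if not ids:
        # Refuse to build a role without a filter rather than guess the intent.
        raise ValueError("at least one account ID is required")
    return json.dumps({"terms": {field: ids}})


print(build_dls_query(["123", "456"]))
```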
CREATE TENANT
- Open OpenSearch Dashboards.
- Choose Security, Tenants, and Create tenant.
- Give the tenant a name and description.
- Choose Create.
Role Mapping
After creating a tenant, give a role access to it using OpenSearch Dashboards:
- Read-write (kibana_all_write) permissions let the role view and modify objects in the tenant.
- Read-only (kibana_all_read) permissions let the role view objects, but not modify them.
- Open OpenSearch Dashboards.
- Choose Security, Roles, and a role.
- For Tenant permissions, add tenants, press Enter, and give the role read and/or write permissions to it.
- Choose the Mapped users tab and Manage mapping.
- Specify users or external identities (also known as backend roles). In our case these are the Azure AD groups.
- Choose Map.
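The two permission levels described above translate into a single entry in a role's `tenant_permissions` list when the role is defined through the REST API. A minimal sketch follows; the helper name and the tenant name are hypothetical, while the action names are the ones listed above.

```python
def tenant_permission(tenant_name, read_write=False):
    """Build one tenant_permissions entry for a role definition."""
    action = "kibana_all_write" if read_write else "kibana_all_read"
    return {"tenant_patterns": [tenant_name], "allowed_actions": [action]}


print(tenant_permission("team_a_tenant", read_write=True))
```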
So far so good, but a manual process is error-prone: whenever something changes, you need to go through all the roles and apply the change. For an enterprise this is hard to manage, so let's automate the process.
We will have one AWS Lambda function that is responsible for making REST API calls to OpenSearch, and an AWS CloudFormation stack that creates or updates that function.
A second CloudFormation stack uses the Lambda function as a custom resource. It takes input values from the user to create the team role and tenant and to do the required mapping.
Let's start with the Lambda definition. We will use Python for this function.
To let the Lambda be called from another CloudFormation stack, we use CfnResource from the crhelper package. Here are all the import statements:
import boto3
import requests
from requests_aws4auth import AWS4Auth
import json
from munch import DefaultMunch
import kibanarole
import os
from crhelper import CfnResource
We read some environment variables, which are set by the CloudFormation stack that creates or updates the AWS Lambda:
# Your OpenSearch URL, including https:// and a trailing /
esHost = os.getenv("OpenSearchEndpoint")
region = os.getenv("AwsRegion")  # e.g. us-west-1
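Because the paths defined next are simply appended to the endpoint, a missing scheme or trailing slash produces a broken URL. As an optional safeguard, a sketch like the following could normalize the value first; the function name and the example host are hypothetical and not part of the original Lambda.

```python
def normalize_endpoint(raw):
    """Return the endpoint with a scheme and exactly one trailing slash."""
    endpoint = (raw or "").strip()
    if not endpoint:
        raise ValueError("OpenSearchEndpoint is not set")
    # Default to HTTPS when no scheme was given.
    if not endpoint.startswith(("http://", "https://")):
        endpoint = "https://" + endpoint
    return endpoint.rstrip("/") + "/"


print(normalize_endpoint("search-example.us-west-1.es.amazonaws.com"))
```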
Some global variables like the REST API endpoints for OpenSearch
GET_ROLES_PATH = '_opendistro/_security/api/roles'
GET_TENANTS_PATH = '_opendistro/_security/api/tenants'
CREATE_ROLE_PATH = '_opendistro/_security/api/roles/__ROLE_NAME__'
CREAT_TENANT_PATH = '_opendistro/_security/api/tenants/__TENANT_NAME__'
CREATE_ROLE_MAPPINGS_PATH ='_opendistro/_security/api/rolesmapping/__ROLE_NAME__'
DELETE_TENANT_PATH = '_opendistro/_security/api/tenants/__TENANT_NAME__'
DELETE_ROLE_PATH = '_opendistro/_security/api/roles/__ROLE_NAME__'
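# SQUAD_ACTION_GROUP_NAME (the name of the shared action group) must be defined before the following lines
GET_ACTION_GROUP_PATH = '_plugins/_security/api/actiongroups/' + SQUAD_ACTION_GROUP_NAME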
CREATE_ACTION_GROUP_PATH = '_plugins/_security/api/actiongroups/' + SQUAD_ACTION_GROUP_NAME
PATCH_ACTION_GROUP_PATH = '_plugins/_security/api/actiongroups/' + SQUAD_ACTION_GROUP_NAME
DELETE_ACTION_GROUP_PATH = '_plugins/_security/api/actiongroups/' + SQUAD_ACTION_GROUP_NAME
CREATE_INDEX_PATTERN_PATH = 'saved_objects/index-pattern/__PATTERN_NAME__'
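The `__ROLE_NAME__` and `__TENANT_NAME__` placeholders are filled in with `str.replace` later in the post. If names may contain spaces or slashes, they should be URL-encoded first. The sketch below shows one way to do that; `build_url` and the example host are hypothetical.

```python
from urllib.parse import quote

CREATE_ROLE_PATH = "_opendistro/_security/api/roles/__ROLE_NAME__"


def build_url(host, path_template, placeholder, name):
    """Fill a path template and join it to the host."""
    # quote(..., safe="") also escapes "/" so a name cannot add path segments.
    path = path_template.replace(placeholder, quote(name, safe=""))
    return host.rstrip("/") + "/" + path


print(build_url("https://example.com/", CREATE_ROLE_PATH, "__ROLE_NAME__", "team_a_role"))
```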
AWS authentication using boto3:
service = "es"  # service name used to sign requests to the OpenSearch domain
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
headers = {"Content-Type": "application/json"}
Initialize crhelper and mark the main Python function to be called from CloudFormation:
helper = CfnResource()

@helper.create
@helper.update
def apply_kibana_security(event, _):
    # Logic here
    pass
The Lambda entry point:
def handler(event, context):
    # Calls the function marked with @helper.create or @helper.update
    helper(event, context)
Now let's implement the apply_kibana_security function. First, we read the parameters passed from CloudFormation using event['ResourceProperties']['PARAM_NAME']:
@helper.create
@helper.update
def apply_kibana_security(event, _):
    # Parameter that decides whether to create a new team or update an existing one
    create_or_update = event['ResourceProperties']['CreateOrUpdate']
    team_name = event['ResourceProperties']['TeamName']
    # List of Azure AD groups or any other backend roles
    backend_roles = event['ResourceProperties']['BackendRoles']
    # List of account IDs used in document-level security. Could be any field in your documents.
    account_ids = event['ResourceProperties']['AccountIds']
    # Make sure the shared action group exists (helper not shown in this post)
    create_or_update_action_group()
    if create_or_update == "Create":
        print("Creating team ...")
        response_message = create_team_in_kibana(team_name, account_ids, backend_roles)
        print("response_message: " + response_message)
        # Return the response message to CloudFormation
        helper.Data['Results'] = response_message
    elif create_or_update == "Update":
        response_message = update_team_in_kibana(team_name, account_ids, backend_roles)
        helper.Data['Results'] = response_message
    else:
        helper.Data['Results'] = "Not supported option"

def create_team_in_kibana(team_name, account_ids, backend_roles):
    tenant_name = team_name + "_tenant"
    tenant_description = "This tenant is assigned to " + team_name + " squad."
    role_name = team_name + "_role"
    # Step one: get the list of tenants
    tenant_list = get_kibana_tenants()
    # Check that the tenant does not exist yet
    if tenant_name in tenant_list:
        return "Tenant " + tenant_name + " already exists."
    # Step two: get the list of roles
    role_list = get_kibana_roles()
    # Check that the role does not exist yet
    if role_name in role_list:
        return "Role " + role_name + " already exists."
    # Step three: create the tenant
    response = create_tenant(tenant_name, tenant_description)
    print(response.text)
    if response.status_code == 201:
        # create tenant default index patterns
        # create_index_patterns(tenant_name)
        # Step four: create the role
        # (the third argument is the is_cp_admin flag, which create_kibana_role does not use in this post)
        response = create_kibana_role(role_name, tenant_name, False, account_ids)
        print(response.text)
        if response.status_code == 201:
            # Step five: create the role mapping
            response = create_kibana_role_mappings(role_name, backend_roles)
            print(response.text)
            if response.status_code == 201:
                return "Team has been created successfully."
            else:
                # Roll back the role creation by deleting the role
                response = delete_role(role_name)
                print(response.text)
                if response.status_code == 200:
                    print("Role creation rolled back successfully")
                    return "Can't create role mappings for team " + team_name
                else:
                    return "An error occurred, can't roll back"
        else:
            # Roll back the tenant creation by deleting the tenant
            print("Rolling back tenant creation")
            response = delete_tenant(tenant_name)
            print(response.text)
            if response.status_code == 200:
                print("Tenant creation rolled back successfully")
                return "Can't create role for squad " + team_name
            else:
                return "An error occurred, can't roll back"
    else:
        return "Can't create tenant for squad " + team_name
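The nested if/else blocks above implement a create-then-roll-back pattern. As a sketch of the same idea in a flatter form (not the code used in this solution), each step can be paired with an undo action. `run_with_rollback` and the fake steps below are hypothetical, and unlike the function above this sketch undoes every completed step when a later one fails.

```python
def run_with_rollback(steps):
    """Run (name, do, undo) steps in order; undo completed steps in reverse on failure."""
    done = []
    for name, do, undo in steps:
        if do():
            done.append((name, undo))
            continue
        # The current step failed: undo everything that succeeded, newest first.
        rolled_back = []
        for done_name, done_undo in reversed(done):
            done_undo()
            rolled_back.append(done_name)
        return False, "failed at " + name + "; rolled back: " + ", ".join(rolled_back)
    return True, "all steps succeeded"


# Fake steps standing in for the tenant, role, and mapping API calls.
log = []
steps = [
    ("tenant", lambda: True, lambda: log.append("delete tenant")),
    ("role", lambda: True, lambda: log.append("delete role")),
    ("mapping", lambda: False, lambda: None),
]
ok, message = run_with_rollback(steps)
print(ok, message, log)
```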
The following are examples of single operations against OpenSearch.
Get List of Roles
def get_kibana_roles():
    url = esHost + GET_ROLES_PATH
    response = requests.get(url, auth=awsauth, headers=headers)
    print(response)
    # Fail loudly on an error response instead of silently returning None
    response.raise_for_status()
    roles = json.loads(response.text)
    roles_list = []
    for key, value in roles.items():
        role_object = DefaultMunch.fromDict(value)
        # Keep only the custom (non-reserved) roles
        if role_object.reserved is False:
            roles_list.append(key)
    return roles_list
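To make the filtering step concrete, here is a small sketch that applies the same rule to an in-memory payload. The sample payload is made up for illustration and only mimics the shape the code above relies on: role names mapped to a body with a `reserved` flag.

```python
def non_reserved_names(payload):
    """Return the names of entries whose reserved flag is explicitly false."""
    return [name for name, body in payload.items() if body.get("reserved") is False]


# Illustrative payload, not a real API response.
sample = {
    "all_access": {"reserved": True},
    "team_a_role": {"reserved": False},
    "team_b_role": {"reserved": False},
}
print(non_reserved_names(sample))
```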
Create a Role and map it to the tenant in one step
def create_kibana_role(role_name, tenant_name, is_cp_admin, account_ids):
    role_json_template = kibanarole.create_kibana_role_template(tenant_name, account_ids, SQUAD_ACTION_GROUP_NAME)
    url = esHost + CREATE_ROLE_PATH.replace("__ROLE_NAME__", role_name)
    response = requests.put(url, auth=awsauth, data=role_json_template, headers=headers)
    print(response)
    return response
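
# In kibanarole.py. Needs `import json` at the top of the module.
# SQUAD_DLS_TEMPLATE, INDEX_PERMISSION_PATTERN and SQUAD_ALLOWED_ACTIONS are module-level constants not shown here.
def create_kibana_role_template(tenant_name, account_ids, action_group_name):
    # account_ids arrives as a comma-separated string, e.g. "123,456"
    accounts = account_ids.split(",")
    # Quote each account ID and join them for the DLS terms list
    account_json = ",".join('"' + account.strip() + '"' for account in accounts)
    dls = SQUAD_DLS_TEMPLATE.replace("__ACCOUNT_IDS__", account_json)
    KIBANA_ROLE_JSON_TEMPLATE = {
        'cluster_permissions': [
            action_group_name
        ],
        'index_permissions': [{
            'index_patterns': [
                INDEX_PERMISSION_PATTERN
            ],
            'dls': dls,
            'allowed_actions': [
                SQUAD_ALLOWED_ACTIONS
            ]
        }],
        'tenant_permissions': [{
            'tenant_patterns': [
                tenant_name
            ],
            'allowed_actions': [
                'kibana_all_write'
            ]
        }]
    }
    # Return a JSON string so it can be sent directly as the request body
    return json.dumps(KIBANA_ROLE_JSON_TEMPLATE)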
Create Tenant
def create_tenant(tenant_name, tenant_description):
    tenant_json_template = kibanarole.create_tenant_template(tenant_description)
    # print(tenant_json_template)
    url = esHost + CREAT_TENANT_PATH.replace("__TENANT_NAME__", tenant_name)
    # print(url)
    response = requests.put(url, auth=awsauth, json=tenant_json_template, headers=headers)
    print(response.text)
    return response

def create_tenant_template(tenant_description):
    tenant_template = {
        'description': tenant_description
    }
    return tenant_template
Create Role Mapping
def create_kibana_role_mappings(role_name, backend_roles):
    url = esHost + CREATE_ROLE_MAPPINGS_PATH.replace("__ROLE_NAME__", role_name)
    # print(url)
    role_mapping_json_template = kibanarole.create_role_mappings_template(backend_roles)
    print(json.dumps(role_mapping_json_template))
    response = requests.put(url, auth=awsauth, data=json.dumps(role_mapping_json_template), headers=headers)
    print(response)
    return response

def create_role_mappings_template(backend_roles):
    print(backend_roles)
    # CloudFormation passes BackendRoles as a comma-separated string, but the API expects a list
    if isinstance(backend_roles, str):
        backend_roles = [role.strip() for role in backend_roles.split(",") if role.strip()]
    role_mapping_template = {
        'backend_roles': list(backend_roles)
    }
    return role_mapping_template
Now that the Lambda logic is written, let's create the CloudFormation template that deploys the Lambda. I named it CreateLambda.yml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: 'SAM template to create an AWS Lambda which will be invoked by CloudFormation to apply OpenSearch Dashboards security.'
Parameters:
  OpenSearchDomainArn:
    Description: "The ARN of the OpenSearch domain whose APIs the Lambda will call."
    Type: String
    Default: ""
  OpenSearchDomainEndpoint:
    Description: "The OpenSearch domain endpoint to apply security on."
    Type: String
    Default: ""
  AWSRegion:
    Description: "The AWS region where the OpenSearch domain resides."
    Type: String
    Default: "us-west-1"
    AllowedValues:
      - "us-west-1"
  VpcSubnetIds:
    Description: "The subnet IDs where the OpenSearch domain resides."
    Type: CommaDelimitedList
    Default: ""
  SecurityGroupIds:
    Description: "The security group IDs applied to the Lambda function."
    Type: CommaDelimitedList
    Default: ""
Resources:
  KibanaSecurityLambda:
    Type: AWS::Serverless::Function
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      CodeUri: lambda/
      Handler: kibana-security.handler
      Timeout: 300
      Runtime: python3.7 # deprecated Lambda runtime; use a currently supported Python version
      Environment:
        Variables:
          OpenSearchEndpoint: !Ref OpenSearchDomainEndpoint
          AwsRegion: !Ref AWSRegion
      VpcConfig:
        SecurityGroupIds: !Ref SecurityGroupIds
        SubnetIds: !Ref VpcSubnetIds
      Policies:
        - AWSLambdaVPCAccessExecutionRole # Allow Lambda to create a VPC ENI to communicate with the OpenSearch domain
        - Statement:
            - Sid: ElsaticDomainAccess
              Effect: Allow
              Action:
                - es:* # es:ESHTTP*
              Resource: !Join
                - ''
                - - !Ref OpenSearchDomainArn
                  - '/*'
Outputs:
  KibanaSecurityLambdaArn:
    Description: "The Lambda Function Arn."
    Value: !GetAtt KibanaSecurityLambda.Arn
This is a very important step: to allow the AWS Lambda to make calls to OpenSearch, we need to map the Lambda's role ARN as a backend role. The easiest way is to map the role ARN to the all_access role in OpenSearch, but that is not recommended because it grants the Lambda full access to cluster, index, and document operations. The recommended way is to create a custom role in OpenSearch with only the operations the Lambda needs to achieve your use cases.
Now the second CloudFormation template, which takes user inputs and calls the Lambda. I called it apply-kibana-security.yml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: 'SAM template to apply OpenSearch Dashboards security. Calls the custom resource Lambda function.'
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: "Basic Configurations"
        Parameters:
          - KibanaSecurityLambdaArn
          - CreateOrUpdate
      - Label:
          default: "Squad Configurations"
        Parameters:
          - SquadName
          - AccountIds
          - BackendRoles
Parameters:
  SquadName:
    Description: "The team name, separate names with underscores e.g. team_a."
    Type: String
  AccountIds:
    Description: "List of AWS accounts that belong to the team, comma separated e.g. 123,456"
    Type: String
    Default: ""
  BackendRoles:
    Description: "List of Azure AD groups, comma separated e.g. team_a_dev,team_a_lead"
    Type: String
    Default: ""
  CreateOrUpdate:
    Description: "Specify whether to create new team security in OpenSearch or update an existing team."
    Type: String
    Default: Create
    AllowedValues:
      - "Create"
      - "Update"
  KibanaSecurityLambdaArn:
    Description: "The AWS Lambda which will be imported as a custom resource and invoked from this stack."
    Type: String
    Default: ""
Resources:
  ApplyKibanaSecurity:
    Type: "Custom::SecurityApplier"
    Properties:
      ServiceToken: !Ref KibanaSecurityLambdaArn
      CreateOrUpdate: !Ref CreateOrUpdate
      # The Lambda reads this property as event['ResourceProperties']['TeamName']
      TeamName: !Ref SquadName
      BackendRoles: !Ref BackendRoles
      AccountIds: !Ref AccountIds
Outputs:
  ResponseMessage:
    Description: "The final response message from the Lambda."
    Value: !GetAtt ApplyKibanaSecurity.Results
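If you deploy this template from a script rather than the console, the parameter values have to be passed as a list of key/value pairs. The sketch below only builds that list; the helper name and the Lambda ARN are hypothetical, and the other values are the examples from the parameter descriptions above.

```python
def to_cfn_parameters(values):
    """Convert a dict into the list-of-dicts shape CloudFormation APIs expect."""
    return [{"ParameterKey": key, "ParameterValue": str(value)} for key, value in values.items()]


params = to_cfn_parameters({
    "SquadName": "team_a",
    "AccountIds": "123,456",
    "BackendRoles": "team_a_dev,team_a_lead",
    "CreateOrUpdate": "Create",
    # Hypothetical ARN, for illustration only.
    "KibanaSecurityLambdaArn": "arn:aws:lambda:us-west-1:123456789012:function:example",
})
print(params[0])
```

The resulting list has the shape of the `Parameters` argument accepted by the boto3 CloudFormation client, for example in `create_change_set`.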
Alright, now everything is in place. Go ahead and test the solution, and let's move on to the third requirement.
From two domains to one OpenSearch domain - One domain to rule them all :D
This requirement is a little bit tricky. Production log data has a different retention period, and filtering now depends on specific fields, for example when you want to look at data coming from the dev environment only. We had two approaches:
- Put the data for all environments in one index and let users filter on the environment field; but then we can't have different retention periods.
- Keep the production data in separate indices from the other environments, and create index patterns for each environment.
We picked option two, but that required changes in Logstash and the other processors that ship data to OpenSearch. We will cover this in the post about shipping data to OpenSearch.
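To illustrate option two, here is a sketch of how index names could separate production from the other environments. The naming scheme, the `logs` prefix, and the function name are hypothetical and not necessarily what we configured in Logstash.

```python
from datetime import date


def index_name(environment, day, prefix="logs"):
    """Return a daily index name that keeps production apart from other environments."""
    group = "prod" if environment.lower() in ("prod", "production") else "nonprod"
    return f"{prefix}-{group}-{day:%Y.%m.%d}"


print(index_name("production", date(2024, 1, 5)))
print(index_name("dev", date(2024, 1, 5)))
```

With names like these, an index pattern such as logs-prod-* covers production only, and each group of indices can get its own retention period.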
We also used a nice OpenSearch feature to keep the data in hot storage for x days (hot means you can write to and read from the index) and then move it to warm storage, where you can only read from the index. This created some problems with Logstash, but we resolved them using some logic and mutate filters in Logstash.
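One way to express the hot-to-warm move is an Index State Management (ISM) policy. The sketch below assumes ISM and UltraWarm are available on the domain and uses the `warm_migration` action. It is an illustration rather than the exact policy from our setup, and the function name and the 7-day value are hypothetical.

```python
import json


def hot_to_warm_policy(hot_days):
    """Build an ISM policy body that migrates an index to warm storage after hot_days."""
    if hot_days < 1:
        raise ValueError("hot_days must be at least 1")
    return {
        "policy": {
            "description": f"Move indices to warm storage after {hot_days} days",
            "default_state": "hot",
            "states": [
                {
                    # Indices stay writable here until they are old enough.
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {"state_name": "warm", "conditions": {"min_index_age": f"{hot_days}d"}}
                    ],
                },
                {
                    # Entering this state triggers the migration to warm storage.
                    "name": "warm",
                    "actions": [{"warm_migration": {}}],
                    "transitions": [],
                },
            ],
        }
    }


policy = hot_to_warm_policy(7)
print(json.dumps(policy, indent=2))
```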
This part is a big topic and we will have another post for it.
That's it. I hope you find this useful.