In this post, we explore how to ship your applications' logs to OpenSearch using Filebeat. The applications may run as containers on AWS Elastic Container Service (ECS), as Kubernetes Pods, or directly on AWS EC2 instances.
We will use Filebeat, a popular lightweight shipper for logs. Whether you're collecting from security devices, cloud, containers, hosts, or OT, Filebeat helps you keep the simple things simple by offering a lightweight way to forward and centralize logs and files.
The stack we will use is also known as ELK (Elasticsearch, Logstash, and Kibana). The same stack works with OpenSearch as well. In another post we will discuss using FluentD instead of Logstash; the Filebeat setup remains the same.
Let's start with an overview of the solution, as shown in the following diagram:
From the above diagram, we have the following:
- The ECS cluster is composed of one or more EC2 instances.
- The applications run as containers on the EC2 instances.
- The containers are configured so that their stdout and stderr logs are written to a location on the host, which is made available through volumes.
- Each EC2 instance runs one instance of Filebeat as a container.
- Filebeat reads the logs and forwards them to Logstash/FluentD.
- Logstash/FluentD collects and transforms the logs, and sends them to OpenSearch.
Optionally, you can configure Filebeat to send the data directly to OpenSearch if you don't need to transform the log data.
Configure and Build Filebeat Container Image
Now we will build a container image that sets up and configures a Filebeat instance as a container. This Filebeat instance collects logs according to the filebeat.yml configuration file.
The Filebeat configuration file:
#=========================== Input for Harvesters ===============================
filebeat.inputs:
  - type: log
    fields:
      type: syslogMessages
    scan_frequency: 5s
    close_inactive: 1m
    backoff_factor: 1
    backoff: 1s
    paths:
      - /host/var/log/messages
    tail_files: true
    fields_under_root: true
  - type: log
    fields:
      type: ecsAgent
    scan_frequency: 5s
    backoff_factor: 1
    close_inactive: 10s
    backoff: 1s
    paths:
      - /host/var/log/ecs/ecs-agent.log.*
    fields_under_root: true
  - type: container
    enabled: true
    fields_under_root: true
    overwrite_keys: true
    fields:
      type: docker
    paths:
      - /host/var/lib/docker/containers/*/*.log
    stream: all
#================================ General =====================================
filebeat.shutdown_timeout: 5s
logging.metrics.enabled: true
logging.metrics.period: 60s
logging.level: info
fields_under_root: true
fields:
  accountId: "'${ACCOUNTID}'"
  instanceId: ${INSTANCEID}
  instanceName: ${INSTANCENAME}
  region: ${REGION}
  az: ${AZ}
  environment: ${ENV}
#================================ Outputs =====================================
output.logstash:
  hosts: ["__OUTPUT.LOGSTASH__"]
  compression_level: 1
  worker: 1
  bulk_max_size: 1024
  ttl: 18000
  ssl.certificate_authorities: ["/etc/filebeat/logstash.pem"]
  max_retries: -1
# Uncomment to enable console output for debugging, Disable the above logstash output by commenting
# output.console:
#   pretty: true
The above configuration file does the following:
- Under filebeat.inputs, we tell Filebeat to collect logs from three locations: the first is for the host (EC2) logs, the second is for the ECS agent logs, and the third is for logs from any container running on the host.
- Then, we enable the Filebeat metrics and enrich the log entries with additional data such as the AWS account ID, the EC2 instance ID and name, the AWS Region, the Availability Zone, and the environment.
- Lastly, we tell Filebeat where to forward the data; here it is configured to send to a Logstash endpoint. All Filebeat configuration options are listed in the Filebeat reference documentation.
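The hosts entry in the configuration uses the token __OUTPUT.LOGSTASH__, and the post does not show how that token gets its real value. One possible approach, assuming the token is replaced before the image is built, is a simple sed substitution. The endpoint logstash.example.internal:5044 and the helper name render_output are hypothetical and only used for illustration:

```shell
#!/bin/sh
# Hypothetical endpoint; replace with your own Logstash host and port.
LOGSTASH_ENDPOINT="logstash.example.internal:5044"

# Hypothetical helper: replace the literal placeholder token in a config
# read from stdin and write the result to stdout.
render_output() {
  sed "s/__OUTPUT\.LOGSTASH__/$LOGSTASH_ENDPOINT/"
}

# Demonstrate on the single line that carries the placeholder.
printf 'hosts: ["__OUTPUT.LOGSTASH__"]\n' | render_output
```

In a build pipeline, the same function could be fed the whole file, for example `render_output < filebeat.yml > filebeat.rendered.yml`, with the rendered file copied into the image instead.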
The next step is to create a container image and push it to AWS ECR. The Dockerfile is:
FROM docker.elastic.co/beats/filebeat:7.16.3
USER root
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
COPY filebeat.yml /etc/filebeat/filebeat.yml
COPY logstash.pem /etc/filebeat/logstash.pem
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD [ "filebeat", "-e", "-c", "/etc/filebeat/filebeat.yml", "-httpprof", "0.0.0.0:9700"]
As you may have noticed, the Dockerfile copies three files:
- docker-entrypoint.sh: the entrypoint script for the Filebeat container; we will see it in the next step.
- filebeat.yml: the Filebeat configuration file we created above.
- logstash.pem: the Logstash public key certificate file.
The docker-entrypoint.sh file:
#!/bin/bash
set -e
# Add filebeat as command if needed
if [ "${1:0:1}" = '-' ]; then
  set -- filebeat "$@"
fi
exec "$@"
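To see how the entrypoint treats different arguments, here is a small sketch of the same decision. It uses a POSIX case statement instead of the bash substring test, and it prints the resulting command rather than running it. The function name resolve_command is hypothetical and exists only for this illustration:

```shell
#!/bin/sh
# Hypothetical helper mirroring the entrypoint's decision: if the first
# argument starts with "-", treat all arguments as filebeat flags.
resolve_command() {
  case "${1:-}" in
    -*) set -- filebeat "$@" ;;
  esac
  # Print the command that the entrypoint would exec.
  printf '%s\n' "$*"
}

resolve_command -e -c /etc/filebeat/filebeat.yml   # flags only
resolve_command filebeat -e                        # explicit filebeat command
resolve_command sh                                 # any other command
```

With flags only, filebeat is prepended; any other command, such as a shell for debugging, is passed through unchanged.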
Alright, now you have everything you need to build the container image. Use the docker build command and push the image to your repository.
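As a rough sketch of the build and push step, the following dry run prints the commands instead of executing them. The account ID 123456789, the region eu-west-1, and the repository logging/filebeat are taken from the filebeat.sh script later in this post; the helper ecr_image_uri and the latest tag for the local build are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical helper: compose an ECR image URI of the form
# <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ecr_image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

IMAGE=$(ecr_image_uri 123456789 eu-west-1 logging/filebeat latest)

# Dry run: print the commands rather than running them.
echo "docker build -t logging/filebeat:latest ."
echo "docker tag logging/filebeat:latest $IMAGE"
echo "docker push $IMAGE"
```

Before pushing, the Docker client must be logged in to the registry, and the repository must already exist in ECR.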
Now comes the tricky part: running Filebeat as a container on each EC2 instance. We have an ECS cluster that runs on EC2 instances. The cluster instances are managed by an Auto Scaling group, which is responsible for keeping the desired number of instances running. In the LaunchConfiguration for the Auto Scaling group, we will execute some commands to start a Filebeat container on each EC2 instance.
We will create a shell script that does this job, named filebeat.sh. This file is uploaded to an S3 bucket as part of the build pipeline and then downloaded during EC2 startup by the LaunchConfiguration.
#!/bin/bash
ACCOUNTID=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep -oP '(?<="accountId" : ")[^"]*(?=")')
REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep -oP '(?<="region" : ")[^"]*(?=")')
AZ=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep -oP '(?<="availabilityZone" : ")[^"]*(?=")')
INSTANCEID=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep -oP '(?<="instanceId" : ")[^"]*(?=")')
INSTANCENAME=$(aws ec2 describe-tags --filters "Name=key,Values=Name" "Name=resource-id,Values=$INSTANCEID" --region "$REGION" --query 'Tags[0].Value' --output text)
ENV=$1
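#download filebeat docker
AwsAccountId="123456789"
ECR="$AwsAccountId.dkr.ecr.eu-west-1.amazonaws.com"
# `aws ecr get-login` exists only in AWS CLI v1; get-login-password works in recent v1 releases and in v2
aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin "$ECR"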
docker run --restart=always \
  --detach \
  --memory=200m \
  --memory-reservation=55m \
  --memory-swap=300m \
  --name='filebeat' \
  --env "ACCOUNTID=$ACCOUNTID" \
  --env "ENV=$ENV" \
  --env "REGION=$REGION" \
  --env "AZ=$AZ" \
  --env "INSTANCEID=$INSTANCEID" \
  --env "INSTANCENAME=$INSTANCENAME" \
  --publish 9700:9700 \
  --volume /var/lib/docker/containers/:/host/var/lib/docker/containers/ \
  --volume /var/log:/host/var/log/ \
  --volume /opt/filebeat/:/host/opt/filebeat/ \
  --log-driver=json-file \
  --log-opt labels=loggingstack.program,loggingstack.program-version \
  --log-opt max-size=50m \
  --log-opt max-file=5 \
  $ECR/logging/filebeat:latest
The shell script performs the following three steps:
- Extract the additional fields we configured in filebeat.yml, such as the AWS account ID, from the EC2 instance metadata endpoint.
- Log in to the ECR registry that holds the Filebeat container image. This step varies depending on the container image registry.
- Use the docker run command to create a Filebeat container. The command passes the environment variables, the image, and some volumes.
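The first step calls the metadata endpoint once per field and does not check whether a value came back. A possible variation, shown here as a sketch, is to fetch the identity document once, extract each field from it, and stop early if anything is empty. The sample document below uses placeholder values and assumes the `"key" : "value"` layout that the grep patterns in filebeat.sh rely on; the helpers doc_field and require_vars are hypothetical:

```shell
#!/bin/sh
# Sample identity document with placeholder values. On an EC2 instance this
# would come from a single call, e.g.:
#   DOC=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document)
DOC='{
  "accountId" : "123456789",
  "availabilityZone" : "eu-west-1a",
  "instanceId" : "i-0123456789abcdef0",
  "region" : "eu-west-1"
}'

# Hypothetical helper: print the value of one top-level string field.
doc_field() {
  printf '%s\n' "$DOC" | sed -n 's/.*"'"$1"'" : "\([^"]*\)".*/\1/p'
}

# Hypothetical helper: fail if any of the named variables is empty or unset.
require_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing value for $name" >&2
      return 1
    fi
  done
}

ACCOUNTID=$(doc_field accountId)
REGION=$(doc_field region)
AZ=$(doc_field availabilityZone)
INSTANCEID=$(doc_field instanceId)

require_vars ACCOUNTID REGION AZ INSTANCEID && echo "metadata ok: $INSTANCEID in $AZ"
```

Checking the values before docker run avoids starting a Filebeat container that would tag every log entry with empty fields.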
Notice that we give the Filebeat container access to the other containers' logs by using:
--volume /var/lib/docker/containers/:/host/var/lib/docker/containers/
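Each host directory is mounted under /host inside the Filebeat container, which is why every path in filebeat.yml starts with /host. The following sketch makes that mapping explicit for the three --volume flags used in filebeat.sh. The helper to_container_path and the container ID abc123 are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: translate a host path to the path Filebeat sees inside
# the container, based on the three --volume flags used in filebeat.sh.
to_container_path() {
  case "$1" in
    /var/lib/docker/containers/*|/var/log/*|/opt/filebeat/*)
      printf '/host%s\n' "$1" ;;
    *)
      echo "not mounted into the Filebeat container: $1" >&2
      return 1 ;;
  esac
}

to_container_path /var/log/messages
to_container_path /var/lib/docker/containers/abc123/abc123-json.log
```

A log file outside these three directories is not visible to Filebeat unless another volume is added to the docker run command.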
The last step is to make the LaunchConfiguration execute the shell script. We used AWS CloudFormation to create the ECS cluster. If you are creating the cluster from the AWS Console, just copy the content under UserData. The LaunchConfiguration is as follows:
ECSLaunchConfiguration:
  Type: 'AWS::AutoScaling::LaunchConfiguration'
  Properties:
    IamInstanceProfile: !GetAtt ECSInstanceProfile.Arn
    ImageId: !FindInMap [AmazonMachineImages, !Ref 'AWS::Region', Id]
    InstanceType: !Ref ECSInstanceType
    KeyName: !Ref KeyPairName
    SecurityGroups:
      - !Ref ECSSecurityGroup
    UserData:
      Fn::Base64: !Sub |
        #!/bin/bash -x
        aws s3 cp [S3_PATH] filebeat.sh
        chmod +x filebeat.sh
        ./filebeat.sh ${EnvType}
Don't forget to replace [S3_PATH] with your own S3 path. The ${EnvType} value is passed from a CloudFormation stack parameter.
The next post will cover configuring Logstash to complete the scenario.
Thanks.