This blog post shows you how to install Apache Kafka on an Amazon EC2 instance and connect to it from your local machine to work with real-time data. We will walk through each step, from setting up Kafka in the cloud to managing data streams with a producer and a consumer. By the end, you will be able to use a dedicated GitHub repository to run a stock market analysis. If you want to learn more about real-time analytics and data streaming, this guide is for you.
So let's begin.
Steps to Set Up Kafka
Create an EC2 Instance
Edit Security Group
Connect to EC2 Instance
1. Go to the Downloads folder (or wherever your .pem key file is saved)
2. Fetch the SSH connect command
- Sign in to the AWS Management Console.
- Navigate to the EC2 service.
- In the EC2 Dashboard, click on "Instances" in the left sidebar.
- Select the instance you want to connect to by clicking on its checkbox.
- Click the "Connect" button at the top of the page.
- In the "Connect to instance" page that opens, select the "SSH client" tab.
- Scroll down to the section titled "Example". Here, you'll see the exact SSH command to use for connecting to your instance.
- You can copy this command directly from the dashboard. It will look something like this:
ssh -i "your-key-name.pem" ec2-user@ec2-12-34-56-78.compute-1.amazonaws.com
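SSH usually refuses a private key whose file permissions are too open, so it can help to tighten them before connecting. The following is a minimal sketch: the key name, user, and host are the placeholders from the example command above, and the helper names `prepare_key` and `ssh_command` are made up for illustration.

```shell
#!/usr/bin/env bash
# Placeholders taken from the example command; replace with your own values.
KEY_FILE="${KEY_FILE:-your-key-name.pem}"
EC2_USER="${EC2_USER:-ec2-user}"
EC2_HOST="${EC2_HOST:-ec2-12-34-56-78.compute-1.amazonaws.com}"

# Make the key readable by its owner only (hypothetical helper).
prepare_key() {
  chmod 400 "$1"
}

# Print the ssh command for a given key, user, and host (hypothetical helper).
ssh_command() {
  printf 'ssh -i "%s" %s@%s\n' "$1" "$2" "$3"
}

# Usage, run from the folder that holds the key:
#   prepare_key "$KEY_FILE"
#   ssh_command "$KEY_FILE" "$EC2_USER" "$EC2_HOST"   # prints the command to run
```

The command shown in the AWS console remains the authoritative one; this only shows the permission step that typically goes with it.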
You are now connected to the remote EC2 instance.
Run this command to download Kafka:
wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
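Older Kafka releases are eventually moved off the main Apache download server, so the URL above may stop working over time. The sketch below builds the same URL from version variables and falls back to the Apache archive. The archive path follows the usual Apache layout and is an assumption here, as are the helper names.

```shell
#!/usr/bin/env bash
# Versions used in this post; override via environment variables if needed.
KAFKA_VERSION="${KAFKA_VERSION:-3.7.0}"
SCALA_VERSION="${SCALA_VERSION:-2.13}"

# Name of the release tarball, e.g. kafka_2.13-3.7.0.tgz
kafka_tarball() {
  printf 'kafka_%s-%s.tgz\n' "$SCALA_VERSION" "$KAFKA_VERSION"
}

# Candidate URLs: the one from this post first, then the archive (assumed path).
kafka_urls() {
  local tarball
  tarball="$(kafka_tarball)"
  printf 'https://downloads.apache.org/kafka/%s/%s\n' "$KAFKA_VERSION" "$tarball"
  printf 'https://archive.apache.org/dist/kafka/%s/%s\n' "$KAFKA_VERSION" "$tarball"
}

# Try each URL in order until one download succeeds.
download_kafka() {
  local url
  for url in $(kafka_urls); do
    if wget -q "$url"; then
      echo "Downloaded from $url"
      return 0
    fi
  done
  echo "Could not download $(kafka_tarball)" >&2
  return 1
}

# Usage on the EC2 instance:
#   download_kafka
```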
Install Amazon Corretto (OpenJDK):
Amazon Corretto is a no-cost, multiplatform, production-ready distribution of the Open Java Development Kit (OpenJDK).
sudo yum install -y java-1.8.0-amazon-corretto-devel
Extract the downloaded archive and navigate to the Kafka folder:
tar -xzf kafka_2.13-3.7.0.tgz
cd kafka_2.13-3.7.0/
Now open two terminals, each connected to the EC2 instance in the same way as before.
In the first terminal, run the following command to start ZooKeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties
In the second terminal, run the following commands to start the Kafka server (the first one caps the Java heap so the broker fits on a small instance):
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
cd kafka_2.13-3.7.0/
bin/kafka-server-start.sh config/server.properties
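If you would rather not keep two terminals open, both start scripts accept a `-daemon` flag that runs them in the background. The sketch below wraps the same two commands, plus the matching stop scripts that ship with Kafka. `KAFKA_HOME`, `WAIT_SECONDS`, and the function names are assumptions for illustration, and the pause between the two services is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Assumed location of the extracted Kafka folder; adjust to your setup.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka_2.13-3.7.0}"

# Start ZooKeeper first, then the Kafka server, both in the background.
start_stack() {
  "$KAFKA_HOME/bin/zookeeper-server-start.sh" -daemon \
    "$KAFKA_HOME/config/zookeeper.properties" || return 1
  # Give ZooKeeper a moment to come up (arbitrary pause).
  sleep "${WAIT_SECONDS:-5}"
  # Same heap limits as in the post, applied to the Kafka server only.
  KAFKA_HEAP_OPTS="-Xmx256M -Xms128M" \
    "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon \
    "$KAFKA_HOME/config/server.properties"
}

# Stop in reverse order: the Kafka server first, then ZooKeeper.
stop_stack() {
  "$KAFKA_HOME/bin/kafka-server-stop.sh"
  sleep "${WAIT_SECONDS:-5}"
  "$KAFKA_HOME/bin/zookeeper-server-stop.sh"
}

# Usage:
#   start_stack
#   stop_stack
```

In daemon mode the output goes to the logs folder inside the Kafka directory rather than to the terminal, so the two-terminal approach is still the easier one for watching the startup messages.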
Hurray, you can see your server starting!
Now stop both services (press CTRL+C in each terminal) before moving on to an important step.
To let our local system connect to Kafka on the remote EC2 instance, we need to make a small configuration change.
Run the following command in your terminal, from inside the Kafka folder:
sudo nano config/server.properties
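Find the line that contains this property:
advertised.listeners
- By default this line is commented out, so Kafka advertises the instance's own host name, which your local machine cannot reach.
- Uncomment it (remove the leading #) and set the host to the public IPv4 address of your instance, so the line reads advertised.listeners=PLAINTEXT://{Public IP of your EC2 Instance}:9092
- Press CTRL+X, then Y, and then Enter to save and exit.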
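The same change can also be made without opening an editor. The sketch below is one possible way to do it with `sed`, assuming the default `server.properties` layout where the property sits at the start of a line (commented out or not). The function name is hypothetical and the IP in the usage line is a documentation placeholder.

```shell
#!/usr/bin/env bash
# Set advertised.listeners in a Kafka server.properties file.
# Arguments: <path to server.properties> <public IPv4 address>
set_advertised_listener() {
  local file="$1" ip="$2"
  local line="advertised.listeners=PLAINTEXT://${ip}:9092"
  if grep -qE '^#?advertised\.listeners=' "$file"; then
    # Replace the existing (possibly commented-out) line; keep a .bak backup.
    sed -i.bak -E "s|^#?advertised\.listeners=.*|${line}|" "$file"
  else
    # No such line in the file yet: append one.
    printf '%s\n' "$line" >> "$file"
  fi
}

# Usage, from inside the Kafka folder (placeholder IP):
#   set_advertised_listener config/server.properties 203.0.113.10
```

A backup copy of the original file is left next to it as `server.properties.bak`, which makes it easy to undo the change.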
And we are done! Start ZooKeeper and the Kafka server again, exactly as before, so that the new setting takes effect.
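Once both services are running again, it can be worth confirming from your local machine that the broker port is reachable before going further. The sketch below uses bash's built-in `/dev/tcp` and the `timeout` utility, so it assumes a Linux-like local shell. The function name is hypothetical and the IP in the usage line is a documentation placeholder.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage from your local machine (placeholder IP):
#   port_open 203.0.113.10 9092 && echo "reachable" || echo "not reachable"
```

If the port is not reachable, a likely cause is that the instance's security group has no inbound rule for port 9092 from your IP address.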
Now we can check that Kafka is working by creating a topic:
cd kafka_2.13-3.7.0
bin/kafka-topics.sh --create --topic sample_topic --bootstrap-server {Public IP of your EC2 Instance:9092} --replication-factor 1 --partitions 1
Start Producer:
bin/kafka-console-producer.sh --topic sample_topic --bootstrap-server {Public IP of your EC2 Instance:9092}
Start Consumer:
Duplicate the session and run the following in the new console:
cd kafka_2.13-3.7.0/
bin/kafka-console-consumer.sh --topic sample_topic --bootstrap-server {Public IP of your EC2 Instance:9092}
Type some messages on the producer side and you will see the same messages arrive on the consumer side.
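The manual check above can also be scripted as a quick smoke test. The sketch below pipes one message into the console producer and then reads the topic back with the console consumer's `--from-beginning` and `--timeout-ms` options. `KAFKA_HOME`, the function name, and the 10-second timeout are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Assumed location of the extracted Kafka folder; adjust to your setup.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka_2.13-3.7.0}"

# Send one uniquely tagged message and check that it can be read back.
# Arguments: <bootstrap server, e.g. IP:9092> <topic>
smoke_test() {
  local bootstrap="$1" topic="$2"
  local msg
  msg="smoke-test-$(date +%s)"
  # Produce: the console producer reads messages from standard input.
  echo "$msg" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
    --topic "$topic" --bootstrap-server "$bootstrap" || return 1
  # Consume from the start of the topic; exit after 10 s without new messages.
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --topic "$topic" --bootstrap-server "$bootstrap" \
    --from-beginning --timeout-ms 10000 2>/dev/null | grep -qxF "$msg"
}

# Usage (replace the placeholder with your instance's public IP):
#   smoke_test "{Public IP of your EC2 Instance}:9092" sample_topic && echo "Kafka round trip OK"
```

Without `--from-beginning`, the console consumer only shows messages that arrive after it starts, which is why the interactive consumer has to be running before you type into the producer.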
Now we can build our project locally and, by supplying the correct connection details, connect it to Kafka.
I have worked on a project that performs stock market analysis and plots the results in real time. You can find the code below.