DEV Community

Andrei Fer


Virus scan MinIO buckets using ClamAV, Fission and Kafka

All organizations want to protect their systems and need a sound strategy for keeping out malware and other potential threats. Before introducing files and binaries into your organization's systems, it is very important to scan them and to respond immediately according to a predefined strategy.

Short introduction of Fission

Fission is an open-source, Kubernetes-native serverless framework that lets developers run code functions easily. Kubernetes provides powerful orchestration capabilities for managing and scheduling containers, and Fission builds on them, so it can stay flexible and focus on the function-as-a-service (FaaS) features. Fission supports many programming languages, such as Python, Node.js, PHP, Go and C#. It also lets you create message queue triggers for Apache Kafka, AWS SQS, AWS Kinesis, GCP Pub/Sub and NATS Streaming. In this way, Fission works as an open-source alternative to AWS Lambda.

Fission has three core concepts: Function, Environment, and Trigger. Their relationships are presented in the following figure:

Trigger, Function, Environment

  1. Function: the code to be executed, written in a specific language. In our case, this is Python code.
  2. Environment: the language-specific runtime environment used to run the user's function.
  3. Trigger: associates functions with event sources.

Implementation in Action

This walkthrough shows how to build a pipeline that scans your MinIO files using asynchronous triggers. When a new file is pushed to one of our MinIO buckets (1), a notification is sent to Kafka (2), and the trigger starts the function (3), which scans the file for viruses (4).

Process of scanning files

For this walkthrough, you should have the following prerequisites:

• MinIO Server and 2 buckets: test-bucket, infected-objects
• Apache Kafka and 3 topics: bucketevents, bucketevents-response, bucketevents-error
• Fission installed in a Kubernetes environment, plus the Fission CLI client (see https://fission.io/docs/installation/)

To implement the pipeline described above, follow these steps:

Step 1 - Enable MinIO to send notifications to Kafka when new objects are put in the bucket

MinIO supports updating Kafka endpoints on a running MinIO server process using the MinIO client (mc), the mc admin config set command, and the notify_kafka configuration key.

mc admin config set ALIAS/ notify_kafka:IDENTIFIER brokers="<ENDPOINT>" topic="<string>"

Replace IDENTIFIER with a unique descriptive string for the Kafka service endpoint. Replace ENDPOINT with a comma-separated list of Kafka brokers (e.g. 192.168.10.10:9092,192.168.10.11:9092,192.168.10.12:9092). The topic name in our case is bucketevents. You must restart the MinIO server process to apply any new or updated configuration settings.

Use the mc event add command to add a new bucket notification (when you put new objects) with the configured Kafka service as a target:

mc event add ALIAS/BUCKET arn:minio:sqs::IDENTIFIER:kafka --event put
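
Once the notification is in place, each upload produces a JSON message on the bucketevents topic. The sketch below shows one way to pull the bucket and object name out of such a message. It assumes the S3-style layout MinIO uses for bucket notifications (a `Records` list with `s3.bucket.name` and a URL-encoded `s3.object.key`); the helper name `extract_objects` and the sample payload are illustrative and are not taken from the original code.

```python
import json
from urllib.parse import unquote_plus


def extract_objects(message: bytes) -> list:
    """Return (bucket, key) pairs found in one MinIO bucket notification."""
    event = json.loads(message)
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys are URL-encoded in the notification payload.
            objects.append((bucket, unquote_plus(key)))
    return objects


# Trimmed-down sample payload (assumed shape, for illustration only).
sample = json.dumps({
    "EventName": "s3:ObjectCreated:Put",
    "Key": "test-bucket/quarterly report.pdf",
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "test-bucket"},
            "object": {"key": "quarterly+report.pdf", "size": 1024},
        },
    }],
}).encode()

print(extract_objects(sample))
```

Decoding the key matters for object names that contain spaces or other special characters; without it, the later download from MinIO would look for the wrong name.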

Step 2 - Create Python code which uses ClamAV

I've written the Python code for this step, and it is available on GitHub. You have to change the MinIO credentials so that the code can connect to your server. Your objects are copied into the container and scanned using clamdscan from ClamAV. If a file is infected, it is moved to "/tmp/infected-files/" and placed in a special bucket (infected-objects) for deeper investigation.
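
The scan-and-quarantine step could look roughly like the sketch below. It is not the code from the GitHub repository: the function names are hypothetical, `client` is assumed to be an object exposing `fget_object` and `fput_object` the way the MinIO Python SDK's `Minio` client does, and the clamd daemon is assumed to be running inside the container. The exit-code handling follows clamdscan's convention (0 for clean, 1 for infected, anything else treated as an error).

```python
import os
import subprocess

INFECTED_DIR = "/tmp/infected-files/"
INFECTED_BUCKET = "infected-objects"


def clamdscan(path: str) -> str:
    """Scan one file with clamdscan and return 'clean', 'infected' or 'error'."""
    os.makedirs(INFECTED_DIR, exist_ok=True)
    # --move relocates an infected file into INFECTED_DIR;
    # --fdpass passes the file descriptor to clamd, which helps when
    # clamd runs as a different user than the function.
    result = subprocess.run(
        ["clamdscan", "--no-summary", "--fdpass", f"--move={INFECTED_DIR}", path],
        capture_output=True,
        text=True,
    )
    return {0: "clean", 1: "infected"}.get(result.returncode, "error")


def scan_object(client, bucket: str, key: str, scan=clamdscan) -> str:
    """Download an object, scan it, and quarantine it if it is infected."""
    local_path = os.path.join("/tmp", os.path.basename(key))
    client.fget_object(bucket, key, local_path)
    verdict = scan(local_path)
    if verdict == "infected":
        # clamdscan has already moved the file into INFECTED_DIR.
        moved_path = os.path.join(INFECTED_DIR, os.path.basename(key))
        client.fput_object(INFECTED_BUCKET, key, moved_path)
    return verdict
```

Passing the scanner in as the `scan` argument keeps the MinIO logic testable on a machine where ClamAV is not installed.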

By default, Fission uses a stock image for each language environment (in our case Python), but you can build your own image from a Dockerfile. You can then push the new image to your Docker registry and, when you create a new environment, specify your custom image and, if necessary, an image pull secret. Our custom image installs ClamAV and everything else needed for scanning files.

Now, let's create a Python environment that uses the custom image:

fission environment create --name python --image YOUR_DOCKER_IMAGE --imagepullsecret regcred

Let's assume the Python code is saved in a file named minio-scan.py. Use this file to create your Fission function:

fission function create --name=minio-scan --env python --code minio-scan.py

Step 3 - Create the Kafka message queue trigger

You can use the following command to create a Kafka-based message queue trigger named mqt-test. This trigger subscribes to messages on the input topic (bucketevents) and triggers function execution as soon as it receives a message. The function's result is written to the output topic (bucketevents-response) if execution succeeds, or to the error topic (bucketevents-error) if it fails.

fission mqt create --name mqt-test --function minio-scan --mqtype kafka \
  --mqtkind keda --topic bucketevents --resptopic bucketevents-response \
  --errortopic bucketevents-error --metadata bootstrapServers=ENDPOINT:PORT \
  --metadata consumerGroup=fission-test --metadata topic=bucketevents

Replace ENDPOINT:PORT with a comma-separated list of Kafka brokers.
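
To make the success and failure paths concrete, here is a minimal sketch of a function entry point. It assumes the Fission Python environment's convention of a `main()` function that reads the incoming request through Flask, and that the trigger delivers each Kafka message as the HTTP request body. `handle_message` is a hypothetical helper, the `Key` field is assumed from MinIO's notification format, and treating a non-2xx status as the failure that routes the result to the error topic is also an assumption.

```python
import json


def handle_message(body: bytes):
    """Turn one Kafka message into a (response_body, http_status) pair."""
    try:
        event = json.loads(body)
        key = event["Key"]  # "bucket/object" (assumed field name)
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed message: report a client error instead of crashing.
        return json.dumps({"error": str(exc)}), 400
    # The actual virus scan would run here before building the response.
    return json.dumps({"object": key, "status": "received"}), 200


def main():
    # Imported here so the module can also be loaded where Flask is absent.
    from flask import request
    return handle_message(request.get_data())
```

Keeping the message handling in a plain function that takes bytes means it can be exercised locally with sample payloads before the function is deployed.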

Step 4 - Check the logs and Fission behaviour

Your function is deployed as a pod in the fission-function namespace of your Kubernetes cluster. Using kubectl logs -f -n fission-function -c python $POD_NAME, you can follow the logs in real time to see what happens there. You should also look at the Kafka topics (response or error) to see the output of your function.

Conclusion

Fission is a versatile framework that supports many programming languages and helps us build FaaS on Kubernetes. In this scenario, I used some basic features of Fission to scan new S3 objects, but I recommend reading more about Fission on the Fission blog, which has many interesting articles.
