DEV Community



Building a distributed network scanner (Cthulhu-net)


This experimental work was conducted to gain a hands-on appreciation of vertical versus horizontal scaling in distributed systems.

Vertical scaling can be defined simply as increasing the capacity (CPU, RAM, disk size, bandwidth) of a single computing device so that it can handle growth in the load or scale of the work it must perform.

Horizontal scaling, on the other hand, addresses growth in load or scale by spreading the work evenly over a pool of medium-capacity devices, so that each device's individual output contributes to the overall output of the system.

The domain in which these concepts are explored in this work is that of network scanners.

Traditional network scanners, which run on a single host, pay a heavy performance cost when used to scan very large IP address spaces. Their performance may improve only as the bandwidth available to the host running the scanner increases (vertical scaling).

This exploratory project aims to spread the pool of IP addresses to scan over a group of medium-capacity devices (horizontal scaling) in order to reduce scan time. An added advantage is that, if the scanner nodes (we call them bots in our system) are geographically separated, the scan results will reflect each node's unique view of the network.

Overall architecture

Fig 1.0: Overall system architecture

The BotPool is a set of general-purpose machines (virtual machines) equipped with a Go-based agent dedicated to receiving scan tasks from the server, running them, and feeding the scan results back to the server.

C2 Server:
The Command and Control (C2) Server is the core server, responsible for scheduling and orchestrating scan tasks for the bots of the BotPool.
It receives scan jobs from the Operators, breaks them into smaller trackable tasks for the BotPool, and stores the scan results for subsequent visual reporting.

Backend Services:
The responsibilities of the C2 server are broken down into independent domains of responsibility, each handled by a backend service. These services collaborate to fulfil the responsibilities that external actors expect of the overall system.

C2 Operator Space:
The Operator space consists mainly of a very simple command-line utility from which the Operator can push new scan jobs to the system, and a web-based Grafana interface for visualizing statistics.

Fig 1.1: Operator CLI

Fig 1.2: Operator Grafana Dashboard

Intended workflow

We define the following entities within a successful work session of the system.

Fig 2.0: System Entities description

The following sequence of actions describes a successful session in our system.

1. Discovery (Tracking phases)

Fig 2.1: Discovery Phase

During this phase, live worker bots advertise their presence to the main server so that they are registered before any work is assigned.

2. Job Queueing By Operator

Fig 2.2: Job Queuing and task breakdown

The system operator provides a scan job against a designated network ( in our example.
The server breaks this subnet down into smaller chunks and queues them for tracking.
These smaller chunks are referred to as Tasks in our system; a Job is made up of multiple Tasks.
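The breakdown of a Job into Tasks can be sketched as splitting a network into equally sized sub-networks. The `Task` type, the `SplitJob` function, the IPv4-only restriction, and the `192.168.0.0/22` network used below are all assumptions chosen for illustration; the article does not specify how the server sizes its chunks.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Task is a hypothetical unit of work: one sub-network of the Job's network.
type Task struct {
	JobID  string
	Prefix netip.Prefix
}

// SplitJob breaks an IPv4 network into Tasks whose prefix length is taskBits.
func SplitJob(jobID, cidr string, taskBits int) ([]Task, error) {
	network, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	network = network.Masked() // normalise to the network address
	if !network.Addr().Is4() {
		return nil, fmt.Errorf("only IPv4 is handled in this sketch: %s", cidr)
	}
	if taskBits < network.Bits() || taskBits > 32 {
		return nil, fmt.Errorf("task prefix /%d does not fit inside %s", taskBits, network)
	}
	if taskBits-network.Bits() > 16 {
		return nil, fmt.Errorf("refusing to create more than 65536 tasks")
	}

	count := 1 << (taskBits - network.Bits()) // number of Tasks
	step := uint32(1) << (32 - taskBits)      // addresses per Task
	b := network.Addr().As4()
	start := uint32(b[0])<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3])

	tasks := make([]Task, 0, count)
	for i := 0; i < count; i++ {
		v := start + uint32(i)*step
		addr := netip.AddrFrom4([4]byte{byte(v >> 24), byte(v >> 16), byte(v >> 8), byte(v)})
		tasks = append(tasks, Task{JobID: jobID, Prefix: netip.PrefixFrom(addr, taskBits)})
	}
	return tasks, nil
}

func main() {
	// Hypothetical Job: a /22 split into four /24 Tasks.
	tasks, err := SplitJob("job-1", "192.168.0.0/22", 24)
	if err != nil {
		panic(err)
	}
	for _, t := range tasks {
		fmt.Println(t.JobID, t.Prefix)
	}
}
```

Fixed-size chunks keep each Task individually trackable: the server can mark a single sub-network as pending, assigned, or done without touching the rest of the Job.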

3. Pool Feeding

Fig 2.3: Pool feeding

Tasks (scan chunks) are progressively dequeued from the Job Queue and assigned to the live bots that previously announced themselves to the server.
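Pool feeding can be simulated in a single process, with a channel standing in for the Job Queue and one goroutine standing in for each bot. This is a minimal sketch, not the project's code: `FeedPool`, `Result`, and `ScanFunc` are hypothetical names, and in the real system the bots are remote machines rather than goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// Result pairs a Task with the bot that ran it and that bot's output.
type Result struct {
	Bot, Task, Output string
}

// ScanFunc stands in for whatever a bot does to execute a Task.
type ScanFunc func(bot, task string) string

// FeedPool hands tasks to bots as each bot becomes free and collects
// every result. The order of results is not deterministic.
func FeedPool(tasks, bots []string, scan ScanFunc) []Result {
	// The buffered channel plays the role of the Job Queue.
	queue := make(chan string, len(tasks))
	for _, t := range tasks {
		queue <- t
	}
	close(queue)

	// Buffered so that bots never block while reporting.
	results := make(chan Result, len(tasks))
	var wg sync.WaitGroup
	for _, bot := range bots {
		wg.Add(1)
		go func(bot string) {
			defer wg.Done()
			// Each bot dequeues its next Task only after finishing the last one.
			for task := range queue {
				results <- Result{Bot: bot, Task: task, Output: scan(bot, task)}
			}
		}(bot)
	}
	wg.Wait()
	close(results)

	var out []Result
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	tasks := []string{"192.168.0.0/24", "192.168.1.0/24", "192.168.2.0/24"}
	bots := []string{"bot-1", "bot-2"}
	fake := func(bot, task string) string { return "scanned " + task }
	for _, r := range FeedPool(tasks, bots, fake) {
		fmt.Printf("%s ran %s: %s\n", r.Bot, r.Task, r.Output)
	}
}
```

Because each bot pulls a new Task only when it is idle, faster bots naturally take on more of the Job, which is one simple way to keep the pool evenly loaded.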

Fig 2.4: Reports from pool

Results from executing the Tasks are sent back to the server by the bots, and the server in turn makes them available to the Operator.

