Priyank Sevak

Posted on Sep 26 • Edited on Dec 5

DevOps: Linux Performance Monitoring

#webdev #devops #devjournal #beginners

In my previous post, DevOps: Understanding Process Monitoring on Linux, I discussed how a Linux process works and why it's important to monitor processes.

Let's dig deeper into how we can keep an eye on how a Linux server is "performing".

TL;DR:

This article dives into Linux process monitoring tools and techniques, helping you keep an eye on your server's performance. It covers command-line tools like top, htop, vmstat, and sar for in-depth monitoring, along with system utilities like System Monitor for a graphical overview. The article also demonstrates a sample script using top and uptime to monitor CPU, memory, and system uptime, laying the groundwork for integrating push notifications.

Performance Monitoring

1.`/proc`

In my previous post, I explained that "everything is a file" on a Linux system. So where are these process files stored?

Open a terminal and type ls /proc; you will see one numbered directory per PID. /proc is a virtual filesystem that the kernel populates on the fly, and it is where process information is exposed. The files in /proc describe (including but not limited to):

The current state of the Linux kernel.
Information about the system hardware.
Currently running processes.

Try running the commands below to find out more about what /proc contains:

cat /proc/cpuinfo

cat /proc/devices #lists character and block device drivers with their major numbers

cat /proc/cmdline #kernel boot parameters, useful in boot failures

Some files under /proc, mainly those in /proc/sys, are writable and can be used to communicate configuration changes directly to the kernel.
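To make the "one directory per PID" idea concrete, here is a minimal sketch that walks the numeric entries in /proc and prints each PID with its command name. The list_procs function name is my own, and the sketch assumes a Linux system where /proc/PID/comm is readable.

```shell
# Print "PID command" for every numeric directory in /proc.
# list_procs is a hypothetical helper name used only for this example.
list_procs() {
  for dir in /proc/[0-9]*; do
    pid=${dir#/proc/}
    # A process may exit while we loop, so only read comm if it is still there.
    if [ -r "$dir/comm" ]; then
      printf '%s %s\n' "$pid" "$(cat "$dir/comm" 2>/dev/null)"
    fi
  done
}

# Show the last few entries
list_procs | tail -n 5
```

This is roughly the kind of lookup that tools like ps do for you, with a lot more detail and formatting on top.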

Most Linux distributions ship the procps package, which contains useful tools such as ps, top, vmstat, free, and uptime. These tools read from /proc and help us with performance and process monitoring. (iostat and sar come from the separate sysstat package.)

In addition to the previously discussed top and ps, there are alternatives to top that provide additional features or a more graphical interface.

`htop`

htop offers a more visually appealing interface with color-coded bars for CPU and memory utilization.

It can display processes in a tree view (press F5), making it easier to understand the relationship between processes.

`atop`

atop can be configured and run on remote systems, making it suitable for large-scale monitoring environments.

atop also supports long-term monitoring and analysis. It logs system data to a file, which lets you review historical trends and identify performance issues over time.

2.Where's my task manager?

Isn't it easy to find out what's going on in my system and process on Windows by just hitting "Ctrl+Alt+Del" and going to "Task Manager"? Why doesn't Linux provide something like that?

If you are in a GNOME environment, you can find a similar tool by searching your apps for "System Monitor". System Monitor has 4 tabs:

System: Shows basic system info.
Process: Lists all the running processes. You can sort them and perform operations such as stopping, ending, or killing a process.
Resources: Lists current CPU usage, memory and swap usage, network usage, and disk usage.
File System: Lists all currently mounted file systems and additional info such as mount point, file system type, and disk space usage.

3.Virtual Memory statistics: `vmstat`

Despite its name, vmstat covers more than virtual memory: it reports information about processes, memory, paging, block I/O, traps, disks, and CPU activity.

The first report that vmstat prints lists averages since the last reboot. Subsequent reports cover the sampling period set by the delay argument (in seconds), e.g. vmstat 5.
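Because the first report is a since-boot average, a script that wants current values should read a later sample. Below is a small sketch that takes three one-second samples and keeps the idle percentage from the last one. The vmstat_idle name is hypothetical; it finds the id column by its header name rather than assuming a fixed position.

```shell
# Print the CPU idle percentage from the last line of vmstat output on stdin.
# vmstat_idle is a hypothetical helper; it locates the "id" column by name.
vmstat_idle() {
  awk '
    { for (i = 1; i <= NF; i++) if ($i == "id") { col = i; next } }
    col { idle = $col }
    END { if (col) print idle }
  '
}

# Three samples, one second apart; only the last one is reported.
if command -v vmstat >/dev/null 2>&1; then
  echo "CPU idle in the last sample: $(vmstat 1 3 | vmstat_idle)%"
fi
```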

Some useful options with vmstat:

vmstat -s #displays a table of event counters and memory statistics

Running vmstat -s gives you info regarding:

Memory: Total memory, currently used memory, active/inactive memory, free, buffer, and cache memory, etc.
CPU ticks: User ticks (non-nice and nice), system ticks, idle and I/O wait ticks, IRQ and softirq ticks, etc.
Memory paging: Total pages paged in and paged out, and total pages swapped in and swapped out.
Event counters: Total interrupts, CPU context switches, boot time, and forks since the last boot.
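The event counters tie back to /proc: the kernel exposes them in /proc/stat. As a rough sketch, the helper below pulls a single counter out of that file by name. The stat_counter name and its optional file argument are my own additions for illustration.

```shell
# Print the value of one counter line from /proc/stat (or another file in the same format).
# Usage: stat_counter NAME [FILE]   (stat_counter is a hypothetical helper)
stat_counter() {
  awk -v key="$1" '$1 == key { print $2 }' "${2:-/proc/stat}"
}

echo "Context switches since boot: $(stat_counter ctxt)"
echo "Forks since boot:            $(stat_counter processes)"
echo "Boot time (epoch seconds):   $(stat_counter btime)"
```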

4.System Activity Reporter: `sar`

Go to your terminal and run the command below:

ls /var/log/sysstat

You will see a bunch of files named either saDD or saYYYYMMDD, where YYYY, MM, and DD stand for year, month, and day. These are the "standard system activity daily data files". On some distributions, such as RHEL-based ones, they live in /var/log/sa instead.

These files are written by the sysstat data collector that sar relies on. sar collects and reports information about system activity; by default it reports what has been recorded so far for the current day. It is possible to save the readings to a different file with the command below:

sar -o [filename] [interval] [count] #save the readings to a different file in binary form

sar -1 # shows sar output from the previous day
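As a rough sketch of reading a daily file back, the snippet below builds the path of a given day's data file and asks sar for CPU (-u) and memory (-r) reports from it with -f. The sa_file helper is hypothetical, and it assumes the saDD naming scheme and the /var/log/sysstat directory; adjust both if your system differs.

```shell
# Print the path of the sysstat data file for a two-digit day of the month.
# Defaults to today. sa_file is a hypothetical helper for this example.
sa_file() {
  printf '/var/log/sysstat/sa%s\n' "${1:-$(date +%d)}"
}

file=$(sa_file)
if command -v sar >/dev/null 2>&1 && [ -r "$file" ]; then
  sar -u -f "$file"   # CPU utilization recorded in that file
  sar -r -f "$file"   # memory utilization recorded in that file
else
  echo "sar or $file is not available on this system" >&2
fi
```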

Real world example:

Problem statement:

You want to keep a check on the current performance of your Linux server. You want to be notified when CPU usage, memory usage, or system load goes over a certain threshold, so you can prevent unintentional system overutilization.

Assumptions & lab setup

I will be using the stress command below. stress is a separate tool, available in most distributions' package repositories, that imposes a configurable load on a system. It can generate various types of load, including CPU, I/O, memory, and disk:

stress --cpu 8 --io 4 --vm 4 --vm-bytes 1024M --timeout 10s

I have specified:

8 CPU workers, each spinning on sqrt().
4 I/O workers, each spinning on sync().
4 virtual memory workers, each allocating 1024 MB.
A timeout of 10 seconds, after which stress exits.

Solution & Explanation:

#!/bin/bash

# Set the interval for monitoring (in seconds)
interval=5

while true; do
  # CPU usage: 100 minus the idle percentage reported by top.
  # -b (batch mode) makes top's output safe to pipe. Commas are turned into
  # spaces so that a value such as "ni,100.0 id" still splits into fields.
  cpu_usage=$(LC_ALL=C top -bn1 | grep 'Cpu(s):' | tr ',' ' ' | awk '{print 100 - $8}')

  # Load averages for the past 1, 5 and 15 minutes
  avg_load=$(LC_ALL=C uptime | awk -F'load average: ' '{print $2}')

  # Get memory usage (in MiB) from the "Mem:" row of free
  mem_total=$(free -m | awk '/^Mem:/ {print $2}')
  mem_used=$(free -m | awk '/^Mem:/ {print $3}')
  mem_usage=$(( mem_used * 100 / mem_total ))

  # Get network statistics
  echo "Network Packets:"
  # Iterate over each interface (ifconfig comes from the net-tools package)
  for interface in $(ifconfig | grep 'flags' | awk '{print $1}' | cut -d':' -f1); do
    # Get the RX packets and bytes line
    packets_transferred=$(ifconfig "$interface" | grep 'RX packets')

    # Print the interface name and its RX counters
    echo "$interface : $packets_transferred"
  done

  # Get system uptime: strip everything up to "up " and keep the first two
  # comma-separated fields. The $( ) is required to capture the output.
  uptime_hours=$(uptime | awk -F, '{sub(".*up ",x,$1);print $1,$2}')

  echo "CPU Usage: $cpu_usage%"
  echo "Average Load: $avg_load"
  echo "Memory Usage: $mem_usage%"
  echo "System Uptime: $uptime_hours"
  echo

  sleep "$interval"
done

Code explanation:

Getting the CPU usage:

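top -bn1 | grep 'Cpu(s):' | tr ',' ' ' | awk '{print 100 - $8}'

This runs top once in batch mode (-b, which makes the output safe to pipe) and greps the CPU summary line. top also shows details about tasks, swap, and physical memory, but only the CPU line is needed here. Once the commas are replaced with spaces, the eighth field is the idle percentage (id), so 100 minus that value is the CPU usage.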
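top derives its CPU percentages from the tick counters on the cpu line of /proc/stat. As a sketch of the same idea without top, the snippet below takes two snapshots of that line one second apart and computes the busy share of the elapsed ticks. The cpu_usage_between name is hypothetical, and counting idle plus iowait as "not busy" is a choice made for this example.

```shell
# Compute CPU usage (%) between two "cpu ..." lines taken from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
cpu_usage_between() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      for (i = 2; i <= 9; i++) total += $i    # user .. steal
      idle = $5 + $6                          # idle + iowait
      if (NR == 1) { t0 = total; i0 = idle } else { t1 = total; i1 = idle }
    }
    END {
      dt = t1 - t0
      if (dt > 0) printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
      else print "0.0"
    }'
}

prev=$(grep '^cpu ' /proc/stat)
sleep 1
cur=$(grep '^cpu ' /proc/stat)
echo "CPU usage over the last second: $(cpu_usage_between "$prev" "$cur")%"
```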

Getting the Average system load:

In addition to the vmstat and sar commands, we can use the uptime command to get concise details about the system. uptime outputs the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.

avg_load=$(uptime | awk -F'load average: ' '{print $2}')

I am splitting the output on the text "load average: " and keeping what follows, so the result does not depend on how long the uptime part of the line is.
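A load average is easier to judge relative to the number of CPU cores. Here is a minimal sketch that reads the same three numbers from /proc/loadavg and divides the 1-minute value by the core count reported by nproc. The load_per_core name is my own.

```shell
# Divide a load average by the number of CPU cores (two decimal places).
# load_per_core is a hypothetical helper for this example.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# /proc/loadavg starts with the 1, 5 and 15 minute load averages.
read -r one five fifteen _ < /proc/loadavg
cores=$(nproc)
echo "Load averages: $one $five $fifteen"
echo "1-minute load per core: $(load_per_core "$one" "$cores")"
```

As a rule of thumb, a per-core value that stays above 1 means more work is queued than the CPUs can run at once.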

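Later in the script, I manipulate the same uptime output to get how long the system has been up. The sub() call uses a regular expression to strip everything up to "up ", which accommodates different uptime formats, such as "15 days, 12:02" or just "12:02". Note the $( ) around the pipeline: without it, the variable does not capture the output.

uptime_hours=$(uptime | awk -F, '{sub(".*up ",x,$1);print $1,$2}')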
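If parsing the uptime text feels fragile, the raw number is also available: the first field of /proc/uptime is the uptime in seconds. Below is a small sketch that formats it; the format_uptime name and the "Nd Nh Nm" output format are my own choices.

```shell
# Convert a number of seconds into "Nd Nh Nm".
# format_uptime is a hypothetical helper for this example.
format_uptime() {
  secs=${1%%.*}                       # drop the fractional part
  days=$(( secs / 86400 ))
  hours=$(( secs % 86400 / 3600 ))
  mins=$(( secs % 3600 / 60 ))
  echo "${days}d ${hours}h ${mins}m"
}

# The first field of /proc/uptime is the uptime in seconds.
read -r up_seconds _ < /proc/uptime
echo "System Uptime: $(format_uptime "$up_seconds")"
```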

Getting the Memory usage:

free is another useful command for getting details about the total, used, and free memory on the system. You can think of it as a more concise version of vmstat -s.

From the Mem: row of free -m, I take the total and used columns (in MiB) and compute used * 100 / total to get memory usage as an integer percentage.
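free gets its numbers from /proc/meminfo, so the percentage can also be computed from that file directly. The sketch below treats "in use" as MemTotal minus MemAvailable, which is an assumption of this example and may not match the used column of your version of free. The mem_used_percent name is hypothetical.

```shell
# Print memory in use as an integer percentage, from /proc/meminfo-style text on stdin.
# "In use" here means MemTotal minus MemAvailable (an assumption of this sketch).
mem_used_percent() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }
  '
}

echo "Memory Usage: $(mem_used_percent < /proc/meminfo)%"
```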

The output

For now, the script prints the CPU usage, average load, memory usage, network packets, and system uptime every 5 seconds. We can extend the bash code to send alerts through notification tools such as Pushover or sendmail.
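The threshold check from the problem statement could be sketched as follows. All names here (over_threshold, notify, check_metric) and the limits are hypothetical, and notify only prints to stderr; you would replace its body with a call to your notification tool of choice.

```shell
# Return success (exit status 0) when value > limit; awk handles decimal values.
over_threshold() {
  awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v + 0 > limit + 0) }'
}

# Hypothetical notifier: prints to stderr. Replace the body with a call to
# your notification tool of choice.
notify() {
  echo "ALERT: $*" >&2
}

# Usage: check_metric NAME VALUE LIMIT
check_metric() {
  if over_threshold "$2" "$3"; then
    notify "$1 is at $2% (limit $3%)"
  fi
}

# Example thresholds (assumed values, tune them for your server)
cpu_limit=90
mem_limit=80

# Sample values for illustration; in the monitoring loop you would pass
# the measured CPU and memory percentages instead.
check_metric "CPU usage" 93.5 "$cpu_limit"     # alerts
check_metric "Memory usage" 42 "$mem_limit"    # stays quiet
```

Using awk for the comparison matters because the CPU figure is a decimal, which the shell's own integer arithmetic cannot compare.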

Additional Performance Monitoring Tools:

I would also like to list some additional tools that can help you monitor the performance of your Linux server:

1.stacer:
A GUI for CPU, memory, and other system information.

2.saidar:
Similar to atop or htop.

3.cpu-x:

My personal favorite, as it gives very precise details on CPU, memory, and disk usage and feels familiar to use.

You can also benchmark and stress test the CPU directly inside cpu-x, much like the stress command that I ran in the beginning.

Conclusion:

This article explored various Linux performance monitoring tools. From command-line utilities offering detailed insights to GUI tools providing a visual representation, you're equipped to choose the tools that best suit your needs. The provided script example demonstrates the practical application of these tools and opens doors for further customization with push notifications.

Feel free to ask any questions or share your preferred monitoring tools and techniques! Let's keep the discussion going!
