DEV Community


top-threads: real-time runqueue latency per thread

Cristian Spinetta (cspinetta)

I developed a Python script to watch the run queue latency of the threads of a given process:

GitHub: cspinetta / top-threads

A tiny command line tool that provides a dynamic real-time view of the active threads for a given process with stats of CPU, disk and scheduling.

I've been using it for some time to solve problems in production, and I think it's time to write a post about it, since someone else might find it useful.


In some cases it's interesting to know how much time threads spend waiting to get onto a CPU, as opposed to running on one. This kind of information is easy to obtain with tracing tools like perf and BCC, but these tools require root privileges, which are often not available on production servers, so you can't use them for troubleshooting there.

What came up?

I ended up developing a script in Python that allows me to visualize in real-time the run queue latency per task by reading the /proc/{pid}/schedstat file.
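As a rough illustration of the idea, the sketch below parses schedstat content and derives the run queue latency between two samples. It is not the script's actual code: the function names are hypothetical, and reading the per-thread files under /proc/{pid}/task/{tid}/schedstat is an assumption about how per-thread values can be collected. The three fields are time on CPU, time waiting on a run queue (both in nanoseconds on current kernels) and the number of timeslices.

```python
from pathlib import Path
from typing import Dict, NamedTuple


class SchedStat(NamedTuple):
    run_ns: int       # time spent on the CPU, in nanoseconds
    wait_ns: int      # time spent waiting on a run queue, in nanoseconds
    timeslices: int   # number of timeslices run on the current CPU


def parse_schedstat(text: str) -> SchedStat:
    """Parse the three whitespace-separated counters of a schedstat file."""
    run_ns, wait_ns, timeslices = (int(field) for field in text.split()[:3])
    return SchedStat(run_ns, wait_ns, timeslices)


def read_thread_schedstats(pid: int) -> Dict[int, SchedStat]:
    """Read schedstat for every thread of `pid` (Linux only, hypothetical helper)."""
    stats = {}
    for task_dir in Path(f"/proc/{pid}/task").iterdir():
        try:
            text = (task_dir / "schedstat").read_text()
            stats[int(task_dir.name)] = parse_schedstat(text)
        except (OSError, ValueError):
            continue  # the thread probably exited between listing and reading
    return stats


def runqueue_latency_ms(before: SchedStat, after: SchedStat) -> float:
    """Run queue wait accumulated between two samples, in milliseconds."""
    return (after.wait_ns - before.wait_ns) / 1e6
```

Calling read_thread_schedstats(pid) twice, one second apart, and passing each thread's pair of samples to runqueue_latency_ms gives the time that thread spent waiting for a CPU during that second.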

In fact, it shows some extra stats:

  • CPU usage: total, %usr, %system, %guest and %wait
  • Disk usage: kB read per second and kB written per second
  • Scheduler stats:
    • time spent on the CPU
    • time spent waiting on a run queue (run queue latency)
    • number of timeslices run on the current CPU
  • Java details: if the target is a Java process that jstack can attach to, some extra details are shown, such as thread names and stack traces.
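The disk figures in the list above are rates, so they come from the difference between two samples of a cumulative counter. Here is a minimal sketch of that calculation, assuming the counters are taken from /proc/{pid}/task/{tid}/io. That source is an assumption, since the post does not say where the script reads them, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def parse_io(text: str) -> Dict[str, int]:
    """Parse 'key: value' lines such as those in /proc/<pid>/task/<tid>/io."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not look like 'key: value'
            counters[key.strip()] = int(value)
    return counters


def disk_rates_kb(before: Dict[str, int], after: Dict[str, int],
                  interval_s: float) -> Tuple[float, float]:
    """Return (kB read per second, kB written per second) between two samples."""
    read_kb = (after["read_bytes"] - before["read_bytes"]) / 1024 / interval_s
    write_kb = (after["write_bytes"] - before["write_bytes"]) / 1024 / interval_s
    return read_kb, write_kb
```

The same before/after pattern applies to the CPU and scheduler columns: every value shown for an interval is the current counter minus the previous one.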

It requires:

It only works on Linux, since it reads /proc/{pid}/schedstat to get scheduling stats.

How to use it?

I kept the code in a single file, so it's as easy as downloading the script:

wget -O '' \
  && chmod +x

Usage examples:

# watch <pid>'s threads with default values
./ -p <pid>

# print output in the terminal
./ -p <pid> --display terminal

# sort by run queue latency
./ -p <pid> --sort rq

# for a Java process, change the number of stack frames to display
./ -p <pid> --max-stack-depth 10

# enable debug log for troubleshooting
./ -p <pid> --debug

Some things to keep in mind:

  • The first output shows stats accumulated since the process started.
  • --display refresh (the default) provides a view similar to top or watch, while --display terminal prints the output of each iteration to the terminal, like pidstat.

What is a good use case for this tool?

I often use this script when I have to analyze a performance problem at the thread level and want to inspect CPU or disk usage as it changes.

Some questions this tool helps me to answer:

  • Which thread is eating the entire CPU?
  • How long are the threads waiting to get onto the CPU?
  • Which threads are using the disk right now?

Example in pictures

  • With --display refresh (the default):

[Screenshot: Top Java Threads Refresh]

  • With --display terminal:

[Screenshot: Top Java Threads in Terminal]


usage: [-h] -p PID [-n [NUMBER]]
                      [--max-stack-depth [STACK_SIZE]]
                      [--sort [{cpu,rq,disk,disk-rd,disk-wr}]]
                      [--display [{terminal,refresh}]] [--no-jstack] [--debug]

Tool for analysing active Threads

optional arguments:
  -h, --help            show this help message and exit
  -p PID                Process ID
  -n [NUMBER]           Number of threads to show per sample. Default: 10
  --max-stack-depth [STACK_SIZE], -m [STACK_SIZE]
                        Max number of stack frames (only when jstack can be
                        used). Default: 1
  --sort [{cpu,rq,disk,disk-rd,disk-wr}], -s [{cpu,rq,disk,disk-rd,disk-wr}]
                        Field used for sorting. Default: cpu
  --display [{terminal,refresh}], -d [{terminal,refresh}]
                        Select the way to display the info: terminal or
                        refresh. Default: refresh
  --no-jstack           Turn off usage of jstack to retrieve thread info like
                        name and stack
  --debug               Turn on logs for debugging purposes
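For readers who want to script against the same options, the help text above can be mirrored with argparse. This is only a sketch reconstructed from that help output, not the script's own parser; the dest names (pid, number, stack_size) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options listed in the help text."""
    parser = argparse.ArgumentParser(description="Tool for analysing active Threads")
    parser.add_argument("-p", dest="pid", metavar="PID", type=int, required=True,
                        help="Process ID")
    parser.add_argument("-n", dest="number", metavar="NUMBER", type=int, nargs="?",
                        default=10, help="Number of threads to show per sample")
    parser.add_argument("--max-stack-depth", "-m", dest="stack_size",
                        metavar="STACK_SIZE", type=int, nargs="?", default=1,
                        help="Max number of stack frames")
    parser.add_argument("--sort", "-s", nargs="?", default="cpu",
                        choices=["cpu", "rq", "disk", "disk-rd", "disk-wr"],
                        help="Field used for sorting")
    parser.add_argument("--display", "-d", nargs="?", default="refresh",
                        choices=["terminal", "refresh"],
                        help="Select the way to display the info")
    parser.add_argument("--no-jstack", action="store_true",
                        help="Turn off usage of jstack")
    parser.add_argument("--debug", action="store_true",
                        help="Turn on logs for debugging purposes")
    return parser
```

Parsing ["-p", "1234", "--sort", "rq"] with this parser yields pid 1234 sorted by run queue latency, with every other option left at the default stated in the help text.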

