perf on Linux is a great way to understand your CPU consumption. On modern kernels it is mostly harmless, especially when run for a short time, but try it in a test environment first if you can.
Here is an example where I gather 99 samples per second, for 10 seconds, from the PostgreSQL and YugabyteDB processes and threads:
```shell
ssh firstname.lastname@example.org 'sudo bash -c "
 perf record -o - --call-graph fp -F99 -e cpu-cycles \
  -p $(pgrep -d, postgres\|yb-tserver\|yb-master) sleep 10 |
 perf script -F +pid
"' > $( date +%Y%m%d%H%M%S | tee /dev/stderr ).perf
```
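A side note on that one-liner: the output filename comes from a small shell trick. The timestamp from `date` goes through `tee /dev/stderr`, so the same value is shown on the terminal and captured by the command substitution that names the file. A minimal local sketch (no ssh, no perf needed):

```shell
# The timestamp is printed to stderr (visible on the terminal) and, at the
# same time, captured by the command substitution to build the filename.
outfile="$( date +%Y%m%d%H%M%S | tee /dev/stderr ).perf"
echo "$outfile"
```

This is convenient when you run several captures in a row: each file gets a unique, sortable name, and you see it immediately.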
This generates a file that the Firefox Profiler can read.
Go to https://profiler.firefox.com/ and load the generated file.
In the upper right, you can upload the profile to share it. I did, which means you can see mine here: https://share.firefox.dev/4576kKY
I gathered the perf samples for 10 seconds while running this on YugabyteDB:
```sql
drop table if exists demo;
create table demo ( id bigint );
insert into demo select generate_series(1,300000);
\watch
```
Maybe you already see what's wrong here: I didn't define a primary key. But how does that impact my CPU usage?
In YugabyteDB, the `postgres` process runs the query layer, with additional threads (like `pggate`, to talk to the storage layer):
This shows the call tree, which I can navigate as with `perf top` or `perf report`, but with a graphical interface.
My query was not active during the whole 10 seconds of sampling: it ran for a few seconds, and then `\watch` was waiting.
Here, for example, I see that I spend some time reading from a virtual filesystem. I focus on this part:
The stack gives a lot of information:

```
_extract_crng
urandom_read_nowarn.isra.34
urandom_read
__vfs_read
vfs_read
ksys_read
__arm64_sys_read
el0_svc_common
el0_svc_handler
el0_svc
__read
boost::uuids::detail::random_provider_base::get_random_bytes
yb::GenerateObjectId
yb::pggate::PgApiImpl::TupleIdBuilder::Build
YBCPgBuildYBTupleId
YBCGetYBTupleIdFromTuple
```
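If you prefer the command line to the graphical view, the same `perf script -F +pid` text can be aggregated with standard tools. A sketch on hand-written sample lines (illustrative and simplified, not from the real capture — each sample is a header line followed by its stack, innermost frame first):

```shell
# Hand-written, simplified "perf script"-style output (illustrative only):
# one header line per sample, then the frames, innermost (on-CPU) first.
cat > sample.perf.txt <<'EOF'
postgres 12345/12345 60958.001: 10101010 cpu-cycles:
        ffffab1200 _extract_crng+0x33 ([kernel.kallsyms])
        ffffab1300 urandom_read+0x1f ([kernel.kallsyms])
        0000aaaa00 __read+0x10 (/usr/lib64/libc.so.6)

postgres 12345/12345 60958.002: 10101010 cpu-cycles:
        0000bbbb00 memcpy+0x40 (/usr/lib64/libc.so.6)
        0000aaaa00 __read+0x10 (/usr/lib64/libc.so.6)
EOF
# Count samples per innermost frame: the first indented line after each
# header is the function that was actually on CPU when the sample fired.
awk '/^[^ \t]/ { want=1; next }
     want && NF { sub(/\+.*/, "", $2); print $2; want=0 }' sample.perf.txt |
  sort | uniq -c | sort -rn
```

This gives a quick flat profile by on-CPU function; the Firefox Profiler does the same aggregation for you, plus the inverted and call-tree views.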
Of course, I can guess from the function names, but YugabyteDB is open source and I can search for them. What happens here is that I didn't declare a primary key for my table, so an internal one (`ybctid`) is generated, because secondary indexes need a key to address the table row. This ID generation reads from `/dev/urandom`. I made this simple example to show that low-level traces can give a clue about high-level data model problems.
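To make the cost pattern concrete, here is a rough shell analogy (an assumption for illustration, not YugabyteDB's actual code): generating one random identifier per inserted row means one read of random bytes from `/dev/urandom` per row.

```shell
# Rough analogy only: each row inserted without a user-defined primary key
# needs a generated id, which translates into a read of random bytes per row.
# (This is not YugabyteDB code; it just mimics the /dev/urandom reads.)
for row in 1 2 3; do
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
done
```

With 300,000 rows per insert statement, those reads add up, which is exactly what the kernel `urandom_read` frames in the stack show. Declaring a primary key avoids this work entirely.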