With Vector taking care of mining and shipping our logs, we need a place to store all these logs.
I built this logging service for simple and quick debugging and alerting for dev team. My goals are: fast and cheap. The log service can fail and it's okie to fail and get restarted again quickly.
I came up with the following table schema for my log (I want it to be as generic as possible to store logs from different sources)
CREATE TABLE IF NOT EXISTS log ( instance_id String, node_id String, project_id String, app_name String, app_version String, client_ip String, log_level String, log_message String, log_code Int32, log_context String, log_created_timestamp UInt32, log_time DateTime, log_date Date DEFAULT toDate(log_time) ) ENGINE = MergeTree() PARTITION BY toYYYYMM(log_date) TTL log_date + INTERVAL 1 Week ORDER BY (project_id, log_created_timestamp)
There are several things I would like to point out:
- I don't care about old logs, for now, so I decided that logs older than 1 week can be deleted.
- I want logs to always be listed by created date (newest logs first).
- I want to filter by certain criteria (to group logs by application, client, etc)
- I'm a Clickhouse newbie, and I'm still figuring things out. I secretly hope that someone will look at the table schema and point out improvements that I can do.
In the beginning, I logged error logs as well as access logs, and the number of records quickly reach billions after just a few days. The good thing is that the log service still runs surprisingly well on a rather small server instance (CPU: 2 vCore RAM: 4096 MB Storage: 80 GB SSD). The bad thing is that listing keeps timing out. I later found out that I didn't set up correct indexes for my table (you really really must setup indexes for the columns you want to filter and order by).
Next time I will show how I handle the logs using PHP.