See original post: https://camillovisini.com/coding/migrating-plausible-analytics
Plausible Analytics is a simple, open-source web analytics tool designed to provide essential website metrics without compromising on user privacy. Unlike traditional analytics platforms, Plausible focuses on simplicity and ease of use, offering clear insights without tracking personal data or using cookies.
At Visini AG, we self-host Plausible Analytics for ourselves and our clients. This guide explains how we migrated Plausible from a multi-service server with virtual machines (VMs) to a dedicated machine that hosts Plausible exclusively via Docker, allowing for easier maintenance and version upgrades. In both the old and the new setup, Plausible and its associated services (Postgres, ClickHouse) run with Docker Compose.
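For reference, a stripped-down Compose file for such a setup might look like the following sketch. This is not our exact configuration; image tags, ports, and credentials are assumptions modelled on the official plausible/hosting examples, so adjust them for your own deployment.

# Sketch of a Plausible Compose setup (illustrative, not our exact file)
services:
  plausible:
    image: ghcr.io/plausible/community-edition:v2.1.1
    command: sh -c "/entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh run"
    depends_on:
      - plausible_db
      - plausible_events_db
    ports:
      - "8000:8000"
    environment:
      - BASE_URL=https://plausible.example.com
      - SECRET_KEY_BASE=change-me
      - DATABASE_URL=postgres://postgres:postgres@plausible_db:5432/plausible_db
      - CLICKHOUSE_DATABASE_URL=http://plausible_events_db:8123/plausible_events_db
  plausible_db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_PASSWORD=postgres
    volumes:
      - db-data:/var/lib/postgresql/data
  plausible_events_db:
    image: clickhouse/clickhouse-server:24.3-alpine
    volumes:
      - event-data:/var/lib/clickhouse
volumes:
  db-data:
  event-data: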
Create a Backup of Plausible Data
Backing up Postgres and ClickHouse data ensures we preserve customer site configurations and historical analytics, which can be restored on the new machine. Data stored in Postgres includes settings and metadata, such as the sites and their configurations. ClickHouse stores the actual analytics data, such as page views and unique visitors.
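The commands in this guide reference example container IDs throughout. To find the actual IDs on your machine, list the running containers; the Postgres, ClickHouse, and Plausible containers are usually identifiable by their image names:

# List running containers with their IDs, images, and names
sudo docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"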
Postgres Backup
First, let's create a dump of the Postgres database from inside the Postgres container. Let's check which databases exist in the container via the following command (replace the example container ID a52ab8083b6b with the one of your Postgres container):
sudo docker exec -t a52ab8083b6b psql -U postgres -l
This should provide an output similar to the following:
     Name     |  Owner   | Encoding |  Collate   |   Ctype    |
--------------+----------+----------+------------+------------+
 plausible_db | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres     | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
Looks like we need to create a dump of plausible_db in order to save the configuration data. Let's create a dump of the entire database, like so:
sudo docker exec -t a52ab8083b6b pg_dump -U postgres plausible_db > /home/ubuntu/plausible_db.sql
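Before moving the dump around, a quick sanity check confirms that it actually contains data (the path matches the output redirection above):

# The dump should be non-empty and start with a pg_dump header
ls -lh /home/ubuntu/plausible_db.sql
head -n 5 /home/ubuntu/plausible_db.sql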
Now let's copy the dump from the virtual machine to the VM host, and then to the local machine:
# First we copy the dump from the VM to the VM host
scp ubuntu@192.168.122.101:/home/ubuntu/plausible_db.sql /home/ubuntu
# Now from the VM host to the local machine
scp ubuntu@vm-host.com:/home/ubuntu/plausible_db.sql ~/Downloads
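As a shortcut, recent OpenSSH versions let scp tunnel through a jump host, collapsing the two hops into one command; the only assumption here is that your local scp supports the -J flag:

# Optional one-hop copy through the VM host (requires scp -J support)
scp -J ubuntu@vm-host.com ubuntu@192.168.122.101:/home/ubuntu/plausible_db.sql ~/Downloads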
After completing the Postgres backup, the next step is to back up ClickHouse, which stores the analytics data.
ClickHouse Backup
Next, we need to create a dump of the ClickHouse data. The relevant tables are ingest_counters, sessions_v2, and events_v2.
Note: In case you've imported data from Universal Analytics (UA), please also migrate the imported_* tables accordingly (not covered in the steps below). See a list of all tables via the following command (replace the example container ID aed6425a6303 with the one of your ClickHouse container):
sudo docker exec -it aed6425a6303 clickhouse client --database plausible_events_db
Running the command show tables should provide an output similar to the following:
-- ┌─name───────────────────────┐
-- │ events │
-- │ events_v2 │
-- │ imported_browsers │
-- │ imported_devices │
-- │ imported_entry_pages │
-- │ imported_exit_pages │
-- │ imported_locations │
-- │ imported_operating_systems │
-- │ imported_pages │
-- │ imported_sources │
-- │ imported_visitors │
-- │ ingest_counters │
-- │ schema_migrations │
-- │ sessions │
-- │ sessions_v2 │
-- └────────────────────────────┘
Let's focus on the ingest_counters, sessions_v2, and events_v2 tables. We can create dumps of these tables by running the following commands (replace the example container ID aed6425a6303 with the one of your ClickHouse container):
# Clean up any old dumps
sudo docker exec -it aed6425a6303 bash -c "rm -f /var/lib/clickhouse/user_files/*.native"
# Create dumps of the relevant tables
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.ingest_counters INTO OUTFILE '/var/lib/clickhouse/user_files/ingest_counters.native' FORMAT Native"
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.sessions_v2 INTO OUTFILE '/var/lib/clickhouse/user_files/sessions_v2.native' FORMAT Native"
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.events_v2 INTO OUTFILE '/var/lib/clickhouse/user_files/events_v2.native' FORMAT Native"
# Copy the dumps to the host
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/ingest_counters.native ./ingest_counters.native
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/sessions_v2.native ./sessions_v2.native
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/events_v2.native ./events_v2.native
Now let's copy these files from the virtual machine to the VM host, and then to the local machine:
# First we copy the dumps from the VM to the VM host
scp ubuntu@192.168.122.101:/home/ubuntu/ingest_counters.native /home/ubuntu
scp ubuntu@192.168.122.101:/home/ubuntu/sessions_v2.native /home/ubuntu
scp ubuntu@192.168.122.101:/home/ubuntu/events_v2.native /home/ubuntu
# Now from the VM host to the local machine
scp ubuntu@vm-host.com:/home/ubuntu/ingest_counters.native ~/Downloads
scp ubuntu@vm-host.com:/home/ubuntu/sessions_v2.native ~/Downloads
scp ubuntu@vm-host.com:/home/ubuntu/events_v2.native ~/Downloads
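Since the dumps travel across two hops, it's worth verifying their integrity: compute checksums on the VM and again on the local machine, and compare the results, which must be identical:

# Run on the VM and again locally; the hashes must match
sha256sum ingest_counters.native sessions_v2.native events_v2.native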
With the backups of the Postgres and ClickHouse data complete, we can now proceed to import them into the new server.
Import Backup Data into New Server
Now that we have the dumps of the Postgres and ClickHouse data, we can import them into the new machine, which should be running Plausible Analytics inside Docker. We first import the Postgres data, followed by the ClickHouse data.
Import Postgres Backup
First, let's copy the dump of the Postgres database to the new server:
scp ~/Downloads/plausible_db.sql root@new-server-for-plausible.com:/root
Then we can import the dump into the Postgres container. We first stop the Plausible container to avoid conflicts, then drop the existing database and create a fresh one before importing. Use the following commands (replace the example container ID 5ab0dabcbaa4 with the one of your Postgres container, and 1455b8caae1c with the one of your Plausible container):
docker cp plausible_db.sql 5ab0dabcbaa4:/plausible_db.sql
docker stop 1455b8caae1c # Stop plausible container
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -c "DROP DATABASE plausible_db"
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -c "CREATE DATABASE plausible_db"
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -d plausible_db -f /plausible_db.sql
docker start 1455b8caae1c # Start plausible container
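To verify the import, we can list the configured sites. Assuming the standard Plausible schema, where site configurations live in the sites table, a quick query should return all known domains:

sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -d plausible_db -c "SELECT domain FROM sites;"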
Now the Postgres data has been imported into the new server. We can proceed to import the ClickHouse data.
Import ClickHouse Backup
Next, we need to import the ClickHouse data. We can copy the ClickHouse dumps to the new server:
scp ~/Downloads/ingest_counters.native root@new-server-for-plausible.com:/root
scp ~/Downloads/sessions_v2.native root@new-server-for-plausible.com:/root
scp ~/Downloads/events_v2.native root@new-server-for-plausible.com:/root
Then we can import the dumps into the ClickHouse container: we truncate the existing tables and then import the data from the dumps. Use the following commands (replace the example container ID e92e926fb935 with the one of your ClickHouse container):
# Copy the dumps to the ClickHouse container
docker cp ingest_counters.native e92e926fb935:/root/ingest_counters.native
docker cp sessions_v2.native e92e926fb935:/root/sessions_v2.native
docker cp events_v2.native e92e926fb935:/root/events_v2.native
# Connect to the container and truncate the tables
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.ingest_counters"
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.sessions_v2"
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.events_v2"
# Import the data from the dumps
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.ingest_counters FORMAT Native' < /root/ingest_counters.native"
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.sessions_v2 FORMAT Native' < /root/sessions_v2.native"
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.events_v2 FORMAT Native' < /root/events_v2.native"
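A quick row count per table confirms that the data arrived; the numbers should match those on the old server:

# Compare these counts with the ones on the old server
docker exec -it e92e926fb935 clickhouse-client --query "SELECT count() FROM plausible_events_db.events_v2"
docker exec -it e92e926fb935 clickhouse-client --query "SELECT count() FROM plausible_events_db.sessions_v2"
docker exec -it e92e926fb935 clickhouse-client --query "SELECT count() FROM plausible_events_db.ingest_counters"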
Now the ClickHouse data has been imported into the new server. We have successfully migrated Plausible Analytics to the new server.
Disable ClickHouse-internal Logging
In order to keep the ClickHouse-internal data (such as the trace_log and metric_log tables) from growing indefinitely, we can disable unnecessary logging.
Mount the following configuration file (clickhouse-config.xml) to the ClickHouse container:
<clickhouse>
<logger>
<level>warning</level>
<console>true</console>
</logger>
<!-- Stop all the unnecessary logging -->
<query_thread_log remove="remove"/>
<query_log remove="remove"/>
<text_log remove="remove"/>
<trace_log remove="remove"/>
<metric_log remove="remove"/>
<asynchronous_metric_log remove="remove"/>
<session_log remove="remove"/>
<part_log remove="remove"/>
</clickhouse>
We can additionally disable logging of queries and query threads via clickhouse-user-config.xml, which should also be mounted to the ClickHouse container:
<clickhouse>
<profiles>
<default>
<log_queries>0</log_queries>
<log_query_threads>0</log_query_threads>
</default>
</profiles>
</clickhouse>
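In a Compose setup, mounting both files can look like the following sketch. The target directories config.d/ and users.d/ are where the official ClickHouse image picks up configuration overrides; the logging.xml target filenames are arbitrary:

# Volume mounts for the ClickHouse service (host paths illustrative)
services:
  plausible_events_db:
    image: clickhouse/clickhouse-server:24.3-alpine
    volumes:
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
      - ./clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro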
Clean up ClickHouse Data (Optional)
ClickHouse-internal data may also be cleaned up manually. This might be required periodically, especially if logging is not disabled as described above and the data grows to an unmanageable size (which happened in our case on the old server). We can connect to the container and run clickhouse-client to check the size of all tables, via the following command:
sudo docker exec -it aed6425a6303 clickhouse-client
Then we can run this query to get the size of all tables:
SELECT
    database,
    table,
    formatReadableSize(sum(bytes)) AS size
FROM system.parts
GROUP BY
    database,
    table
ORDER BY
    sum(bytes) DESC;
This should provide an output similar to the following:
┌─database────────────┬─table───────────────────┬─size───────┐
│ system              │ trace_log               │ 4.21 GiB   │
│ system              │ asynchronous_metric_log │ 2.89 GiB   │
│ system              │ metric_log              │ 2.45 GiB   │
│ system              │ query_log               │ 44.91 MiB  │
│ system              │ part_log                │ 27.36 MiB  │
│ plausible_events_db │ sessions_v2             │ 1.18 MiB   │
│ plausible_events_db │ events_v2               │ 1.17 MiB   │
│ plausible_events_db │ ingest_counters         │ 195.83 KiB │
│ system              │ asynchronous_insert_log │ 76.57 KiB  │
└─────────────────────┴─────────────────────────┴────────────┘
Wow! Almost 10 GB of logs accumulated over the past 12 months. We can delete the data from these tables by running the following queries:
aed6425a6303 :) TRUNCATE TABLE system.trace_log;
aed6425a6303 :) TRUNCATE TABLE system.asynchronous_metric_log;
aed6425a6303 :) TRUNCATE TABLE system.metric_log;
Conclusion
In this guide, we've covered the process of migrating Plausible Analytics to a new server. We created dumps of the Postgres and ClickHouse data, and then imported them into the new server. We also disabled ClickHouse-internal logging and cleaned up old data to ensure the system runs smoothly and doesn't accumulate unnecessary data.
These steps will help you migrate Plausible Analytics seamlessly to a new server, ensuring no loss of configurations or historical data.