In high-frequency transaction systems, such as OpenRTB (Real-Time Bidding), managing user balances efficiently becomes a critical challenge. Transactions occur frequently, often in the form of microtransactions, which means the system must handle large volumes of small amounts in real time. In such systems, a user's balance can be thought of as a dynamic ledger (like an order book), constantly updated as users make deposits, withdrawals, or engage with various event-based activities.
ClickHouse, a columnar database designed for high-performance analytics, can be an excellent choice for handling user balances in such environments. Its ability to manage large datasets with high read and write throughput makes it particularly suited for use cases where rapid aggregation and querying are essential. This article explores how ClickHouse can be used to manage user balances in a system with an abundance of microtransactions, focusing on materialized views, Kafka batch processing, and four specific types of events that affect user balances: deposit
, withdrawal
, event_spent
, and event_earn
.
The Problem: High-Volume Microtransactions
Microtransactions in systems like OpenRTB occur frequently and in small amounts. For instance, when a user places a bid, wins an auction, or receives a payout, these actions trigger transactions that must be reflected in the user's balance almost instantaneously. As the volume of these transactions increases, traditional relational databases may struggle to handle the load due to their inability to efficiently process massive amounts of small data points in real time.
The goal is to ensure that each transaction—whether a deposit, withdrawal, or microtransaction related to an event—updates the user's balance accurately and quickly, without putting excessive load on the database.
ClickHouse and Materialized Views
ClickHouse excels at handling large datasets and providing fast query performance for analytical workloads. One of the most powerful features in ClickHouse for such scenarios is the use of materialized views. A materialized view in ClickHouse allows for precomputed query results to be stored and updated automatically when the underlying data changes. This enables real-time aggregation of user balances based on transaction data, eliminating the need to recompute balances from scratch on every query.
Table Structure
Let's define a raw transactions table to store all user transaction events. Each transaction will include a type (deposit, withdrawal, or event-related), an amount, a user ID, and a timestamp.
CREATE TABLE transactions
(
user_id UInt64,
amount Float64,
transaction_type String,
timestamp DateTime
) ENGINE = MergeTree()
ORDER BY (user_id, timestamp);
-
user_id
: The identifier for the user involved in the transaction. -
amount
: The amount involved in the transaction, which could be either positive (deposit) or negative (withdrawal or event-related). -
transaction_type
: The type of transaction. Possible values aredeposit
,withdrawal
,event_spent
, andevent_earn
. -
timestamp
: The time when the transaction occurred.
In this setup, transactions can be inserted in real time via a Kafka engine that ingests batches of microtransactions.
Kafka Integration for Batch Processing
ClickHouse supports Kafka as an engine, which can be used to ingest real-time transaction data in batches. Transactions related to events (such as bids or ad impressions) can be published to Kafka, and ClickHouse can periodically batch these messages to update the transactions table.
CREATE TABLE transactions_kafka
(
user_id UInt64,
amount Float64,
transaction_type String,
timestamp DateTime
) ENGINE = Kafka()
SETTINGS kafka_broker_list = 'kafka_broker1:9092', kafka_topic = 'transactions_topic', kafka_group_name = 'transaction_consumer_group';
This table will read microtransaction events from Kafka and insert them into the transactions
table. Kafka acts as a buffer, allowing the system to process and insert transactions in batches, which reduces the overhead of processing individual transactions one by one.
Materialized View for Aggregating User Balances
To calculate the current balance for each user efficiently, we can create a materialized view that aggregates the data based on the transaction types. The balance will reflect the cumulative effects of deposits, withdrawals, and event-based transactions (event_spent
and event_earn
).
CREATE MATERIALIZED VIEW user_balances
ENGINE = MergeTree()
ORDER BY user_id
AS
SELECT
user_id,
sumIf(amount, transaction_type = 'deposit') - sumIf(amount, transaction_type = 'withdrawal')
+ sumIf(amount, transaction_type = 'event_earn') - sumIf(amount, transaction_type = 'event_spent') AS balance
FROM transactions
GROUP BY user_id;
In this materialized view:
-
deposit
adds to the balance. -
withdrawal
subtracts from the balance. -
event_earn
adds to the balance (e.g., a user earning credits through an event). -
event_spent
subtracts from the balance (e.g., a user spending credits in an event).
The sumIf
function is used to filter transactions based on their type, ensuring that each transaction type impacts the balance in the correct way.
Benefits of Materialized Views in This Context
Real-time Aggregation: Materialized views allow for real-time aggregation of transactions. As new data is ingested (either via Kafka or direct insertion), the view is automatically updated, ensuring that the balance is always current.
Efficient Query Performance: Instead of recalculating the user balance every time from scratch, the system queries the precomputed balances stored in the materialized view, significantly speeding up the retrieval process.
Low Latency: By leveraging Kafka to ingest transactions in batches and using materialized views for aggregation, the system can handle high-frequency transactions with minimal delay, ensuring that the user balance is updated quickly in response to events.
Scalability: ClickHouse is designed to scale horizontally, so as the number of users or transactions grows, the system can handle the increased load by simply adding more resources.
Example Scenario: OpenRTB Microtransactions
In the context of OpenRTB, where bids, impressions, and payments occur frequently, the ability to update and query user balances efficiently is crucial. For example:
- A user deposits funds into their account (
deposit
). - The user places a bid for an ad spot (
event_spent
). - If the bid wins, the system will subtract the amount from the user's balance (
event_spent
), and they may earn credits or rewards based on impressions or clicks (event_earn
). - Finally, the user might withdraw their remaining balance (
withdrawal
).
ClickHouse's materialized views and Kafka batch processing ensure that these transactions are processed quickly, with balances updated in real time as users interact with the platform.
Conclusion
ClickHouse, when combined with materialized views and Kafka for batch processing, provides a powerful solution for managing user balances in systems with a high volume of microtransactions. This architecture ensures fast updates and queries, while also handling the complexities of different transaction types (deposit, withdrawal, event_spent, event_earn) and maintaining accuracy in real-time systems like OpenRTB. By using ClickHouse’s features effectively, such systems can scale to handle massive amounts of data while keeping performance high and costs low.
Top comments (0)