DEV Community

Discussion on: Building ClickHouse Dashboard and crunching WallStreetBets data πŸ’ΈπŸ€‘

igorlukanin profile image
Igor Lukanin Author

Hey Sergio @neskodi πŸ‘‹

Thanks for taking the time to dig deep into this content piece. Also, thanks for the feedback, that's very helpful. Appreciate it!

Daily "open" price should be the "open" of the first price within that day - but you are showing "average open for the day" - NOT OK
Daily "close" price should be the "close" of the last price within that day - but you are showing "average close for the day" - NOT OK

You're right, that's exactly how these measure are defined. In a general case, it's not okay. No doubt, the open and close prices should refer to the very first and the very last transactions in a given day.

However, there's a catch. The dataset I'm using in this example contains a single entry per ticker per day. It means that the "average open for the day" is exactly equal to the "daily open", right? If the dataset were more granular, it would be definitely an issue.

A much more valuable information for most readers though (myself included) would be the correct way to set up Cube.js to generate an accurate candlestick chart

The trick is to aggregate all open or close values (in Postgres, you can use ARRAY_AGG; in ClickHouse, there's groupArray), sort them by date, and pick the first element.

Here's a working example:

cube(`Stocks`, {
  sql: `
    SELECT 'INTC' AS Ticker, '2021-11-02T10:00:00Z'::timestamp AS Date,  1 AS Open, 10 AS Close, 15 AS High,  5 AS Low, 1000 AS Volume UNION
    SELECT 'INTC' AS Ticker, '2021-11-02T11:00:00Z'::timestamp AS Date, 10 AS Open, 20 AS Close, 25 AS High, 10 AS Low, 1500 AS Volume

  measures: {
    open: { sql: `(ARRAY_AGG(${CUBE}.Open ORDER BY ${CUBE}.Date ASC))[1]`, type: `number`, format: `currency` },
    close: { sql: `(ARRAY_AGG(${CUBE}.Close ORDER BY ${CUBE}.Date DESC))[1]`, type: `number`, format: `currency` },
    high: { sql: `${CUBE}.High`, type: `max`, format: `currency` },
    low: { sql: `${CUBE}.Low`, type: `min`, format: `currency` },

  dimensions: {
    ticker: { sql: `${CUBE}.Ticker`, type: `string` },
    date: { sql: `${CUBE}.Date`, type: `time` },
Enter fullscreen mode Exit fullscreen mode

The query yields correct results:

I hope it helps :–)

neskodi profile image
Sergio Neskodi

Thanks Igor! Totally makes sense now. I didn't know that your source dataset contained only one candle per day. In this case, taking the average is fully reasonable.

I imagined something like my dataset where I have one-minute candles and I must aggregate them by day.

Also, thank you for taking time to expand your answer with examples of aggregation. Very useful indeed!