Hey Sergio @neskodi 👋

Thanks for taking the time to dig deep into this content piece. Also, thanks for the feedback, that's very helpful. Appreciate it!
> Daily "open" price should be the "open" of the first price within that day - but you are showing "average open for the day" - NOT OK
>
> Daily "close" price should be the "close" of the last price within that day - but you are showing "average close for the day" - NOT OK
You're right, that's exactly how these measures are defined. In the general case, it's not okay. No doubt, the open and close prices should refer to the very first and the very last transactions in a given day.
However, there's a catch. The dataset I'm using in this example contains a single entry per ticker per day. It means that the "average open for the day" is exactly equal to the "daily open", right? If the dataset were more granular, it would definitely be an issue.
> A much more valuable information for most readers though (myself included) would be the correct way to set up Cube.js to generate an accurate candlestick chart
The trick is to aggregate all open or close values into an array (in Postgres, you can use `ARRAY_AGG`; in ClickHouse, there's `groupArray`), sort them by date, and pick the first element (for the close, sort in descending order so that the first element is the day's last value).
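To build intuition for what this aggregation does, here's the same first/last-by-date logic as a standalone plain-JavaScript sketch (not Cube.js code; the `toDaily` helper and the sample candles are made up for illustration):

```javascript
// Minute-level candles for one ticker and one day, possibly out of order.
const candles = [
  { date: '2021-11-02T11:00:00Z', open: 10, close: 20, high: 25, low: 10, volume: 1500 },
  { date: '2021-11-02T10:00:00Z', open: 1,  close: 10, high: 15, low: 5,  volume: 1000 },
];

// Collapse the rows into a single daily candle:
// open = open of the first row by date, close = close of the last row by date.
function toDaily(rows) {
  const sorted = [...rows].sort((a, b) => a.date.localeCompare(b.date));
  return {
    open: sorted[0].open,                    // first element, like ARRAY_AGG(... ORDER BY Date ASC)[1]
    close: sorted[sorted.length - 1].close,  // last element, like ARRAY_AGG(... ORDER BY Date DESC)[1]
    high: Math.max(...rows.map((r) => r.high)),
    low: Math.min(...rows.map((r) => r.low)),
    volume: rows.reduce((sum, r) => sum + r.volume, 0),
  };
}

console.log(toDaily(candles));
// { open: 1, close: 20, high: 25, low: 5, volume: 2500 }
```

The SQL measures below do exactly this, only inside the database instead of in application code.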
Here's a working example:
```js
cube(`Stocks`, {
  sql: `
    SELECT 'INTC' AS Ticker, '2021-11-02T10:00:00Z'::timestamp AS Date, 1 AS Open, 10 AS Close, 15 AS High, 5 AS Low, 1000 AS Volume UNION
    SELECT 'INTC' AS Ticker, '2021-11-02T11:00:00Z'::timestamp AS Date, 10 AS Open, 20 AS Close, 25 AS High, 10 AS Low, 1500 AS Volume
  `,

  measures: {
    // Daily open: the first Open value when rows are sorted by date ascending
    open: {
      sql: `(ARRAY_AGG(${CUBE}.Open ORDER BY ${CUBE}.Date ASC))[1]`,
      type: `number`,
      format: `currency`,
    },

    // Daily close: the last Close value (first element of the descending-sorted array)
    close: {
      sql: `(ARRAY_AGG(${CUBE}.Close ORDER BY ${CUBE}.Date DESC))[1]`,
      type: `number`,
      format: `currency`,
    },

    high: {
      sql: `${CUBE}.High`,
      type: `max`,
      format: `currency`,
    },

    low: {
      sql: `${CUBE}.Low`,
      type: `min`,
      format: `currency`,
    },
  },

  dimensions: {
    ticker: {
      sql: `${CUBE}.Ticker`,
      type: `string`,
    },

    date: {
      sql: `${CUBE}.Date`,
      type: `time`,
    },
  },
});
```

The query yields correct results.

I hope it helps :-)
Thanks Igor! Totally makes sense now. I didn't know that your source dataset contained only one candle per day. In that case, taking the average is perfectly reasonable.
I imagined something like my dataset, where I have one-minute candles and must aggregate them by day.
Also, thank you for taking the time to expand your answer with examples of aggregation. Very useful indeed!