Welcome to the R.A.G, a guide about Amazon's Redshift Database written for the Analyst's out there in the world who use it.
Is WLM killing your query?
Do tables seem to not play ball?
There is a LOT to consider when trying to solve the above.
Here's the process for you to work through:
Do this for a single query at a time, not your whole script. But start by getting Redshift to tell you how it's going to execute your query. Make sure to look for actions with high costs, sequential scans or nested loops. If you can avoid them, or break your query into smaller tasks this will help you a lot.
Run the below query and make note of:
- distyle_and_key: How is the table distributed across nodes?
- row_skew_ratio: This is the effectiveness of the dist key, the closer to 1 the better.
- first_sortkey: How is it sorted on the node?
- no_sort_keys: How many sort keys?
- sortkey_skew_ratio: This is the effectiveness of the sort key, the closer to 1 the better.
- percent_unsorted: How long has this table been since vacuum
- stats_needed: If yes, the table needs 'analyst statistics' before the leader node knows how to handle it properly.
select i.schema as schema_location, i.table as table_name, i.encoded as are_columns_encoded, i.diststyle as distyle_and_key, i.sortkey1 as first_sortkey, i.sortkey1_enc as sortkey_compression, i.sortkey_num as no_sort_keys, i.skew_sortkey1 as sortkey_skew_ratio, i.size as size_in_blocks_mb, i.tbl_rows as total_rows, i.skew_rows as row_skew_ratio, i.pct_used as percent_space_used, i.unsorted as percent_unsorted, i.stats_off as stats_needed from svv_table_info i where i.table = 'table_name' limit 50
The table contains an EVENT and a SOLUTION table, this may provide some key information on how to make your query run faster.
select l.event, l.solution, q.querytxt from stl_alert_event_log l join stl_query q on q.query = l.query where l.userid in (select usesysid from pg_user where usename ilike '%name%') --change to your name order by l.event_time desc limit 10
select c.name as process_queue, w.slot_count, datediff(seconds,w.queue_start_time,w.queue_end_time) as q_wait_time_seconds, w.exec_start_time, w.exec_end_time, datediff(seconds,w.exec_start_time,w.exec_end_time) as exec_time_seconds, w.final_state, w.est_peak_mem, q.querytxt from stl_wlm_query w join stv_wlm_service_class_config c on c.service_class = w.service_class join stl_query q on q.query = w.query where w.userid in (select usesysid from pg_user where usename ilike '%user%') --change to your name order by w.xid desc;
header image drawn by me