Amazon Redshift MCQ
Explanation: DISTSTYLE ALL places a full copy of the table on every node in the Redshift cluster; the copy is stored on the first slice of each node.
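As a rough sketch of what this looks like in DDL, the statement below declares a small dimension table with DISTSTYLE ALL. The table name venue_all and its column types are assumptions made for illustration, loosely modelled on the venue table queried in this section; they are not taken from the question itself.

```sql
-- Hypothetical small dimension table; name and column types are assumed.
create table venue_all (
    venueid   smallint not null,
    venuename varchar(100),
    venuecity varchar(30)
)
diststyle all;  -- every node keeps a full copy of the table

-- Check the distribution style recorded for the table.
-- SVV_TABLE_INFO omits empty tables, so load some rows first.
select "table", diststyle
from svv_table_info
where "table" = 'venue_all';
```

DISTSTYLE ALL tends to suit small, rarely updated tables, because the storage cost is multiplied by the number of nodes and every load or update has to be applied to each copy.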
explain
select eventid, eventname, event.venueid, venuename
from event, venue
where event.venueid = venue.venueid;
XN Hash Join DS_DIST_OUTER (cost=2.52..58653620.93 rows=8712 width=43)
  Hash Cond: ("outer".venueid = "inner".venueid)
  -> XN Seq Scan on event (cost=0.00..87.98 rows=8798 width=23)
  -> XN Hash (cost=2.02..2.02 rows=202 width=22)
        -> XN Seq Scan on venue (cost=0.00..2.02 rows=202 width=22)
(519 rows)
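The DS_DIST_OUTER label on the join step indicates that rows of the outer table (event here) are redistributed across nodes at query time. One way to probe this, sketched under the assumption that venue is small enough to replicate, is to switch venue to DISTSTYLE ALL and re-run the EXPLAIN. Whether this is the change the question intends is not stated in the source.

```sql
-- Assumption: venue is a small table that can be copied to every node.
alter table venue alter diststyle all;

-- Re-run the plan and inspect the label on the Hash Join step.
explain
select eventid, eventname, event.venueid, venuename
from event, venue
where event.venueid = venue.venueid;
```

If venue remains the inner table of the join, the step would be expected to show DS_DIST_ALL_NONE, which means no redistribution is required because the inner table uses DISTSTYLE ALL.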
There are some BI dashboards which query this data and show some key metrics such as total claim value and the number of claims. These dashboards are updated every hour through SQL queries. There is also a group of data scientists who query the database intermittently to analyse risks of some claims. Recently, the data scientists have complained of slow queries.
What will be the most cost-effective solution to increase the performance of your Redshift cluster?