DEV Community

[Comment from a deleted post]
 
Helen Anderson

Redshift is used as the BI platform for data modelling by the Data Services team.

We move the data to Aurora so any data analyst in the business can use our models without interrupting what's happening on Redshift and without us interrupting what they do.

urbanm0nk

The ease of provisioning in the cloud can sometimes lead us to spin up instances we don't really need and thus incur avoidable charges.

I don't know how many concurrent connections you expect, or exactly what you mean when you say "We move the data to Aurora so any data analyst in the business can use our models without interrupting what's happening on Redshift and without us interrupting what they do". But note that Redshift is designed to scale and to handle massively parallel query execution.

 
Helen Anderson

Fair comment on the ease of spinning up new services.

The reason we keep things separate isn't just down to scale. We keep a level of abstraction between Redshift, which is used for our team's models and landing source tables, and the analyst DB, where any analyst in the business can run queries.

Our visualisation tool uses Redshift as a data source to provide operational reports for all parts of the business. In the past, when our team shared the same platform as the analysts, their queries would hit during update windows and they would (accidentally) lock or drop tables. The analyst teams also didn't have the same kind of release processes as our team, so it was hard to keep track of who was doing what. We were using the platform as a production space for operational reports and couldn't risk our service being interrupted.

We didn't want to police 60 users all over the world, and they didn't want to be restricted by our team's release processes if they were using the space as a sandbox.

So we keep things separate.

 
urbanm0nk

Makes sense