Facing high Snowflake costs?
Check out our practitioner's five-point checklist to reduce Snowflake costs and optimize your data warehouse usage.
Snowflake is an incredibly powerful and scalable data warehouse, but without proper management, costs can quickly spiral out of control. In this checklist, we explore five key strategies for optimizing Snowflake costs so that your data warehouse remains efficient and cost-effective.
Auto-Suspend is a Snowflake feature that allows your warehouse to automatically suspend itself after a specified period of inactivity. This helps to reduce costs by preventing unnecessary credit usage when the warehouse is not in use.
The default auto-suspend value is 10 minutes, which is often too long for many workloads. For example, if a query takes 20 seconds, the warehouse keeps running (and accruing credits) for a further 10 minutes of inactivity before suspending, leading to unnecessary costs.
To query the auto-suspend settings for all warehouses, use the following query:
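One way to do this is with SHOW WAREHOUSES, whose output includes an auto_suspend column; the RESULT_SCAN step below is optional and simply narrows the output to the relevant columns (the quoted lowercase column names match the SHOW output):

```sql
-- List all warehouses; the output includes an "auto_suspend" column (in seconds)
SHOW WAREHOUSES;

-- Optionally, filter the result of the SHOW command to just the relevant columns
SELECT "name", "auto_suspend"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```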
The auto_suspend column specifies how long a running warehouse can remain inactive, in seconds, before automatically suspending and stopping credit usage. A null value means that the warehouse will never automatically suspend. You'll want to avoid having any warehouse with a null auto-suspend. Typically, 60, meaning an auto-suspend of one minute, is a good place to start.
To modify the auto-suspend setting for a specific warehouse to 1 minute (60 seconds), you can run the following:
ALTER WAREHOUSE "<warehouse_name>" SET AUTO_SUSPEND = 60;
Remote disk spillage occurs when a virtual warehouse runs out of memory and begins spilling intermediate results to remote storage. This can lead to increased query times and higher Snowflake costs due to increased I/O operations and additional credit usage.
To monitor remote disk spillage, you can use Snowflake's QUERY_HISTORY view in the ACCOUNT_USAGE schema. Here's an example query to identify the warehouses with remote disk spillage over the last 30 days:
SELECT
  WAREHOUSE_NAME,
  SUM(BYTES_SPILLED_TO_REMOTE_STORAGE) AS TOTAL_BYTES_SPILLED_TO_REMOTE_STORAGE
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME
ORDER BY TOTAL_BYTES_SPILLED_TO_REMOTE_STORAGE DESC;
To fix remote disk spillage, you can increase the size of your warehouse. By doing so, you allocate more memory to the warehouse and reduce the need for remote storage, ultimately lowering costs. Each size increase doubles the warehouse's compute and memory, which often speeds up spilling queries significantly, though the actual speedup depends on the workload. Make sure to monitor and adjust warehouse sizes as needed to prevent spillage.
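For example, to move a warehouse up one size (the warehouse name is a placeholder):

```sql
-- Resize the warehouse to MEDIUM to provide more memory per node
-- and reduce spilling to remote storage
ALTER WAREHOUSE "<warehouse_name>" SET WAREHOUSE_SIZE = 'MEDIUM';
```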
Sometimes, you require high-concurrency sub-second queries. These are critical for customer-facing dashboards, data APIs, and usage metering, where fast response times are essential.
A serving layer on top of Snowflake syncs data to high-speed storage, optimizes it, and serves it via an API without consuming Snowflake credits. This reduces Snowflake costs while providing high-performance analytics to your end-user applications.
It is a better alternative to traditional embedded analytics because it reduces the load on Snowflake. Furthermore, by providing an API, it offers greater flexibility and control.
Propel's data API platform provides engineering teams with a unified platform for delivering high-performance customer-facing analytics. Its serving layer solution offers an analytics backend with a GraphQL API and React UI component library, requiring no infrastructure scaling or management. To learn more about Propel, read the docs.
Inefficient queries are the source of a lot of waste. In this section, we show you how to identify them and share some tips to optimize them.
Inefficient queries can significantly contribute to higher Snowflake costs. To identify them, you can monitor the QUERY_HISTORY view and look for queries with long execution times or high resource consumption.
You can use the following query to identify your slowest queries in the last 30 days in Snowflake:
SELECT
  QUERY_TEXT,
  SUM(TOTAL_ELAPSED_TIME) AS TOTAL_ELAPSED_TIME,
  SUM(BYTES_SCANNED) AS BYTES_SCANNED
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY QUERY_TEXT
ORDER BY TOTAL_ELAPSED_TIME DESC;
To make queries more efficient, consider the following tips:
- Use selective filtering and avoid SELECT *.
- Optimize JOIN operations.
- Load data sorted on frequently filtered columns so Snowflake can prune micro-partitions effectively.
- Limit the number of rows returned using LIMIT.
- Use materialized views for pre-computed results.
- Leverage clustering keys to improve query performance.
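As an illustration of the last tip (the table and column names are hypothetical), a clustering key can be added to a large table that is frequently filtered by date:

```sql
-- Cluster a large events table by event date so queries filtering on
-- EVENT_DATE scan fewer micro-partitions
ALTER TABLE EVENTS CLUSTER BY (EVENT_DATE);
```

Clustering keys are most useful on very large tables with stable query patterns; reclustering itself consumes credits, so verify the pruning benefit outweighs the maintenance cost.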
Resource Monitors in Snowflake allow you to track and manage your credit usage, helping to prevent unexpected costs. You can set thresholds that, when exceeded, trigger actions such as suspending a warehouse or sending notifications.
To create a Resource Monitor that suspends a warehouse when credit usage exceeds a specified amount, you can use the following SQL command:
CREATE RESOURCE MONITOR "monitor_name" WITH CREDIT_QUOTA = <quota> TRIGGERS ON 100 PERCENT DO SUSPEND;
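A Resource Monitor only takes effect once it is assigned at the account or warehouse level, so after creating it you would typically attach it to a warehouse (the names are placeholders):

```sql
-- Attach the monitor to a specific warehouse; that warehouse's credit
-- usage now counts against the monitor's quota
ALTER WAREHOUSE "<warehouse_name>" SET RESOURCE_MONITOR = "monitor_name";
```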
Optimizing Snowflake costs is crucial for maintaining an efficient and cost-effective data warehouse. By following this five-point checklist, you can adjust auto-suspend settings, prevent remote disk spillage, leverage a serving layer for high-concurrency queries, optimize inefficient queries, and set up Resource Monitors to keep credit usage in check. Start implementing these strategies today and make the most of your Snowflake investment.
- How to build a Snowflake API
- Introducing the new Metric Report API: Powerful reports for any app with a single GraphQL request
- Propel UI Kit: Data visualization and dashboard React components
If you don’t have a Propel account yet, you can try Propel for free and start building data apps.