How to improve Snowflake cost per query
Snowflake is a powerful platform that offers a number of tools for users working in the cloud. However, the expenses of running virtual warehouses, which constitute the majority of operating expenses, can add up. Learning to optimize Snowflake cost per query is a key component of cloud cost optimization.
What is Snowflake cost per query?
Cost per query (CPQ) is a metric that's convenient for measuring users' cost efficiency in Snowflake. It indicates the price of running individual queries in the data warehouse. As it's most commonly expressed, CPQ shows the cost of a single query execution, which can be multiplied by the number of queries executed to develop the total cost of operations.
Snowflake charges users by the second (after the first minute of automatic billing) for as long as the data warehouses are running, which is billed via Snowflake credits. This unique pricing model drives decision making about how users structure their queries and operate in the service.
For example, running queries in a medium-sized data warehouse might cost around four credits an hour. At $3 per credit, 60 minutes’ worth of runtime on this warehouse costs $12. A query that takes 10 minutes to run would cost about $2.
While this might not sound expensive, these queries quickly add up. On a single day in 2021, for example, Snowflake processed over 1 billion queries in a 24-hour period.
Snowflake itemizes costs per data warehouse, but not down to the level of the individual query. To identify exact costs, it can be helpful to work out what the cost per query (CPQ) is and how it affects the price for services.
Calculating Cost Per Query (CPQ)
One of the reasons CPQ is so helpful is that it shows users the cost impact of factors such as data transfer, storage use and overall resource consumption.These estimates combine the costs of several different sources. Factors that affect CPQ include:
-
Compute resources. Running queries consumes resources, such as memory and processing time. Structuring queries that consume more of these resources tends to drive up the CPQ.
-
Storage allocation. Accessing and managing data adds to the cost of Snowflake operations. Users pay for the allocation of storage space, and the data read and written impact costs.
-
Transfer costs. Transferring data is part of most query executions, and it adds to the cost per query. The amount involved can be negligible, but queries involving large datasets can run up the cost by a fair margin. This is also true for queries that require data to be moved across locations in large volumes.
-
Complexity of the query. This is a derived cost because the complexity of a query is not billed in itself but because it drives up the demand on other billed services, such as CPU time. More complex queries, such as those with a high number of operations and joins, place higher demands on platform resources and typically incur higher costs per query.
-
Execution time. As already noted, Snowflake costs are billed in per-minute blocks. Time spent running queries adds billable minutes to the total, which quickly adds up for heavy users. Long-running queries that take a lot of time to execute can cost more than faster-executing queries that don't require as much time.
Snowflake users can track total costs from the platform’s dashboard, which also allows time-derived breakdowns of query costs.
Methods for calculating and monitoring Snowflake cost per query
Many providers offer two pricing models, which teams can use to estimate costs per query based on usage:
-
On-demand pricing
-
Capacity pricing
The on-demand pricing model charges by the bytes processed per query, while the capacity pricing model charges by the number of slots reserved for processing, which is just a measure of how much CPU space is needed.
Various methods can be used to calculate and monitor Snowflake cost per query, including the on-demand and capacity pricing models mentioned. Snowflake's billing and usage reports offer insights into credits consumed by each operation, simplifying cost estimation. Additionally, Snowflake provides detailed query profiles, breaking down processing steps and their associated costs. For more granular analysis, users can install resource monitors to track credit usage at both the account and warehouse levels, enabling better cost management and optimization.
Leveraging Snowflake's billing and usage reports
The cost of querying is presented to the user as credits consumed by a given operation. Knowing the price of a credit and multiplying this value by the number of credits consumed by a query gives the cost for that operation.
It gets a little more complicated for devs who would like to see more detail than this. For instance, many Snowflake queries execute in at least two steps, each of which carries its own cost in credits. Finding this information requires a little navigation through the Snowflake UI.
Looking up costs per query in Snowflake
Here’s how to find a CPQ report in Snowflake’s Classic Console:
-
Choose the Query Profile report, which can be accessed by clicking the Query ID button from any page where it appears. This provides a breakdown for details relating to the query, including the query’s main components, processing plan and statistics for each. These appear as a data-rich graphic.
-
From the operator tree, it’s possible to pull up information specific to various query components, such as join and aggregate steps.
-
This pulls up a window that displays, among other data, processing times for each phase, as well as the percentage of total time occupied by each step. These are listed in descending order, so the most time-consuming operations are listed first.
Here’s how to find a CPQ report in Snowsight:
-
Start by navigating to the "Dashboards" section in Snowsight, which is accessible from the left-hand menu after logging in. This area is where all available reports and dashboards are stored.
-
Use the search bar at the top of the Dashboards section to locate the CPQ (Cost Per Quote) report. Simply type "CPQ" or "Cost Per Quote" into the search field, and Snowsight will list all related reports.
-
Click on the CPQ report from the search results. This opens a detailed dashboard that provides a comprehensive breakdown of the cost per quote metrics, including key components, trends, and statistical data.
-
Within the CPQ report, you can explore various widgets and data visualizations. Each section of the report offers insights into specific aspects of the CPQ process, such as cost trends, quote volumes, and efficiency metrics.
-
Clicking on individual data points or segments within the report will reveal more granular details, such as specific cost breakdowns, processing steps, and time metrics associated with each quote. These details help you identify the most time-consuming or costly operations.
-
For further analysis, use the filtering options available within the report to narrow down the data by date range, customer segment, or other relevant criteria. This allows for a more focused review of the CPQ performance.
-
If needed, you can export the report or share it with colleagues directly from the Snowsight interface by using the "Export" or "Share" buttons..
Best practices for optimizing Snowflake cost per query
Because Snowflake prices queries by the minute, it’s important to proactively monitor and optimize usage to manage costs effectively. A few best practices for optimizing the cost per query (CPQ) in Snowflake include:
-
Improving read efficiency
-
Improving processing efficiency
-
Optimizing warehouse configurations
-
Optimizing cost with data warehouse management tools
Query optimization techniques
Writing efficient queries from the start is one of the best ways to optimize Snowflake costs overall.
There are three main areas users can address here:
-
Improve query read efficiency
-
Improve query processing efficiency
-
Optimize their warehouse configuration for better performance and lower CPQ.
Improve read efficiency
TableScans are the time Snowflake spends reading data from the warehouse. This can quickly turn into a major resource burn, as unstructured queries are prone to indiscriminate database reading that burns through both bandwidth and processing. Almost any small improvement here can go a long way toward keeping the CPQ down across the board. Things to try include:
-
Reducing the number of columns accessed
-
Pruning queries and using table clustering
-
Clustering columns in each query’s join predicates
-
Use of pre-aggregated tables
Improve processing efficiency
Having read the data, everything a query pulls now has to be processed. As before, any small improvement in this area is helpful in reducing costs. Processing tricks to try in Snowflake include:
-
Run simpler and fewer operations per query.
-
Filter to reduce the total volume of data.
-
Minimize the calls to CTEs, which put extra strain on the system.
-
Minimize sorts, which also place demands on processing.
-
Choose window functions, as opposed to self-joins.
-
Avoid joins with an OR condition, which expands the dataset being processed.
-
Don’t query complex views.
-
Use caching and share access across teams.
Optimize warehouse configurations
There’s also room for improvement in the way data warehouses are configured for queries. Best practices here include:
-
Rightsizing and scheduling warehouses appropriately based on need
-
Adding to the cluster count beyond the default limit of eight, if necessary
-
Change the scaling policy
Cost optimization and data warehouse management tools
Tools like Query Advisor within Capital One Slingshot can also help optimize queries for better performance. Query Advisor centralizes insights and recommendations, helping admins quickly understand Snowflake query cost and runtime, while identifying inefficiencies or errors that could impact performance.
Slingshot offers Snowflake users helpful tools for resource optimization and performance.The platform monitors and provides detailed insights on usage to identify inefficiencies and support more efficient resource allocation. Additionally, Slingshot allows Snowflake users to enable or disable Query Acceleration Service (QAS), which enhances the efficiency of a warehouse by expediting specific segments of its query workload. QAS can boost overall performance by reducing the impact of outlier queries that demand more resources than typical queries.
Using Slingshot, teams can set up alerts and thresholds to identify CPQ spikes as they occur, as well as real-time monitoring and analysis to help manage the fluctuations.
Optimizing Snowflake cost per query for efficient cloud cost management
Mastering cost optimization for Snowflake queries is essential for efficient cloud cost management. Of all the metrics dev teams monitor, the cost per query for cloud services is arguably one of the most important to projects’ bottom lines. By implementing best practices and methods for calculating and monitoring Snowflake cost per query, your business can optimize cloud usage and costs.
Effective cost management in Snowflake is a lot easier with tools like Slingshot, which monitors and reports on usage patterns and helps users improve query efficiency.
Request a personalized demo with our team to see how Slingshot can help you maximize your Snowflake investment.