Optimization techniques to maximize the value of your data

Presented at Gartner Data & Analytics Summit 2025 by Rahul Mode, Sr. Director, Solutions Architecture, Capital One Software.

With enterprises in every sector prioritizing AI for business use, the need for data is one of the biggest drivers of AI costs. AI models rely on processing large volumes of data to learn and make accurate predictions. AI and ML applications also require a great deal of computing power and massive amounts of data storage, which can drive up costs. Through optimization, businesses can ensure their data spend is efficient while maintaining the level of performance necessary to support AI workloads, helping to maximize the value of their data.

Data landscape evolving with AI

Driven by technological advances like the internet, connected devices (the Internet of Things) and cloud computing, an incredible amount of data exists in the world today, and the volume continues to grow rapidly. According to Statista, the amount of data has more than doubled in the last five years, and by the end of 2025 there will be 181 zettabytes of data globally.

Within this quickly expanding data landscape, the AI boom adds a layer of complexity to achieving data quality and efficiency. Organizations are seeking to efficiently manage their data to get it ready for AI applications on top of their other data needs. Companies need large volumes of data to power AI, along with ways to ensure data quality for accuracy. But they are struggling with how to properly collect, store and process data for AI, which can lead to slowdowns and increased costs.

At the same time, an IDC report predicts AI infrastructure spending will surpass $200 billion in the next five years, following years of double-digit growth. That spending includes the costs of managing large datasets in the cloud for training AI and of storing data for the inference phase, when AI makes predictions and generates outputs based on its training data.

The combination of rapid data growth and the race to leverage AI requires companies to prove the value of their data spend within an environment of rising costs and multiple data priorities.

3 challenges for maximizing the value of data

At Capital One Software, we hear from customers every day about the struggle to manage data spend without stifling speed and innovation. 

These common problems can be broken down into the following:

  • Monitoring & forecasting: A customer may be nearing contract renewal with a data cloud provider, but doesn’t have a way to predict growth over the next few years. 

  • Environment optimization: A customer may want to implement sensible governance, without compromising speed and innovation.

  • Compute optimization: A customer may need to improve the efficiency of their warehouses and want to optimize resources without impacting performance.

A lack of effective strategies for data cost management can hinder progress with AI initiatives. We have found that organizations looking to tap into the full potential of their data for AI success must take a strategic approach that focuses on optimizing their data environment, computing resources, and monitoring and alerting capabilities. 

Why optimization is important

The main goal of optimization is not cost management, but rather efficiency. Optimization ensures businesses are striking the right balance between cost and performance to achieve maximum efficiency. Whether a process is efficient depends on the use case. For example, an intern running ad-hoc queries to gain insights from data does not need a strict Service Level Agreement (SLA). A regulatory report, however, must be published at a specific time every day. Organizations should focus on metrics that help them understand their efficiency rather than concentrating solely on the cost of usage.


Environment optimization

A data environment includes the infrastructure, processes and tools used to store, collect, process and analyze data. In an optimized environment, data costs are controlled through tools like compute scaling policies and data retention policies, data is stored and processed efficiently, and data governance frameworks are in place.

In order to give structure to cloud data cost management, organizations must establish governance and standards. Governance defines the rules of ownership, making certain that each data warehouse has an owner accountable for usage and costs. Standards define rules for optimal use of your data platform, such as the right configuration to be applied for each warehouse.

At Capital One, we federate data ownership at the team level while maintaining the necessary oversight. We ensure each warehouse has an owner from the business unit and this owner is accountable for the usage of the warehouse. The owner and finance teams work together to set up a budget for the warehouse. For any usage increase, we have an approval process that ensures all three teams (i.e., business organization, finance and data engineering) work together to get the best value for the increased cost. 


Through standards, organizations can ensure they have the right setup for their compute and storage infrastructures and data loading processes. For example, environment-specific warehouse configurations bring down costs while using resources appropriately. A rule may restrict warehouse sizes to small in QA environments and extra small in the development environment, which lowers overall costs. Adjusting the default query timeout ensures long-running queries do not run up unchecked costs, especially if the query was initiated in error.
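As a minimal sketch of what such a standard might look like in practice, the snippet below applies environment-specific warehouse sizes and query timeouts through the Snowflake Python connector. The warehouse names, sizes, timeout values and credentials are illustrative assumptions, not prescribed settings.

```python
# Hypothetical example: enforcing environment-specific warehouse standards
# with the Snowflake Python connector. Names and values are illustrative.
import snowflake.connector

# Per-environment standards: smaller warehouses and tighter query timeouts
# outside production.
STANDARDS = {
    "DEV_WH":  {"size": "XSMALL", "timeout_secs": 600},
    "QA_WH":   {"size": "SMALL",  "timeout_secs": 1800},
    "PROD_WH": {"size": "LARGE",  "timeout_secs": 3600},
}

conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="SYSADMIN"
)
cur = conn.cursor()

for warehouse, cfg in STANDARDS.items():
    # WAREHOUSE_SIZE and STATEMENT_TIMEOUT_IN_SECONDS are standard Snowflake
    # warehouse parameters; queries are cancelled once the timeout is reached.
    cur.execute(
        f"ALTER WAREHOUSE {warehouse} SET "
        f"WAREHOUSE_SIZE = '{cfg['size']}' "
        f"STATEMENT_TIMEOUT_IN_SECONDS = {cfg['timeout_secs']}"
    )

cur.close()
conn.close()
```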

For data storage, defining appropriate retention policies for features like Time Travel ensures data is stored only as long as necessary, avoiding wasted cost and resources. At Capital One, we may restrict Time Travel to one to three days for a sandbox environment and 14 to 21 days for critical use cases.
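One simple way to express that kind of retention rule, again using the Snowflake Python connector and hypothetical database names, is to set Time Travel retention per environment:

```python
# Hypothetical example: shorter Time Travel retention for a sandbox,
# longer retention for a business-critical database.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="SYSADMIN"
)
cur = conn.cursor()

# DATA_RETENTION_TIME_IN_DAYS controls how long Time Travel history is kept.
cur.execute("ALTER DATABASE SANDBOX_DB SET DATA_RETENTION_TIME_IN_DAYS = 1")
cur.execute("ALTER DATABASE CRITICAL_DB SET DATA_RETENTION_TIME_IN_DAYS = 14")

cur.close()
conn.close()
```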

With sensible control policies, organizations can ensure they are making the most of their computing resources while data is stored and processed securely and efficiently.

Compute optimization

The efficient use of computing resources for large-scale workloads such as AI or ML is key to AI success. When optimizing, start with the fundamentals and make use of the features built into the data cloud environment. Take Snowflake, for example. The query acceleration service (QAS) improves performance by accelerating outlier queries while keeping the warehouse rightsized. Setting up auto-suspend policies allows organizations to avoid charges for idle compute resources. Additionally, the economy scaling policy for warehouses limits spinning up new clusters to save credits in lower environments where there are no strict SLAs.
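As an illustration, the sketch below turns those features on for a single hypothetical warehouse. The warehouse name, credentials and values are assumptions, and a scaling policy only takes effect on a multi-cluster warehouse.

```python
# Hypothetical example: enabling query acceleration, auto-suspend and an
# economy scaling policy on one warehouse.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="SYSADMIN"
)
cur = conn.cursor()

cur.execute("""
    ALTER WAREHOUSE ANALYTICS_WH SET
        ENABLE_QUERY_ACCELERATION = TRUE  -- offload outlier queries to QAS
        AUTO_SUSPEND = 60                 -- suspend after 60 seconds idle
        MIN_CLUSTER_COUNT = 1             -- multi-cluster settings are needed
        MAX_CLUSTER_COUNT = 3             -- for a scaling policy to apply
        SCALING_POLICY = 'ECONOMY'        -- favor credit savings over spin-up speed
""")

cur.close()
conn.close()
```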

Advanced optimization techniques may be necessary for a given data environment. Following the principle that one warehouse size does not fit all, organizations should maintain different-sized warehouses and run different types of workloads on separate warehouses. Each team, whether developers, machine learning specialists or data analysts, has its own set of requirements and should rightsize warehouses to meet them.


To save on costs while maintaining performance, companies should rightsize warehouses to workload patterns, such as query volume on a given day or at a given time. Because workloads are constantly changing, warehouses should be scheduled to change size along with them. A third-party tool like Capital One Slingshot allows warehouses to be changed dynamically.
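A simplified stand-in for that kind of dynamic resizing is sketched below: a script, run on a schedule (for example from cron), that resizes a warehouse around an assumed daily workload pattern. The schedule, warehouse name and credentials are illustrative.

```python
# Hypothetical example: resizing a warehouse around a known daily workload
# pattern (heavy batch loads overnight, lighter ad-hoc use during the day).
from datetime import datetime, timezone

import snowflake.connector

# Hour-of-day (UTC) ranges mapped to target warehouse sizes.
SCHEDULE = [
    (0, 6, "LARGE"),    # overnight batch window
    (6, 20, "MEDIUM"),  # business hours
    (20, 24, "SMALL"),  # evening lull
]

def target_size(hour: int) -> str:
    """Return the warehouse size that matches the current hour."""
    for start, end, size in SCHEDULE:
        if start <= hour < end:
            return size
    return "SMALL"

conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="SYSADMIN"
)
cur = conn.cursor()
size = target_size(datetime.now(timezone.utc).hour)
cur.execute(f"ALTER WAREHOUSE BATCH_WH SET WAREHOUSE_SIZE = '{size}'")
cur.close()
conn.close()
```

Run hourly, a job like this keeps the warehouse no larger than the workload requires.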

Monitoring and alerting

User needs and workloads are constantly in flux and require continuous monitoring to avoid unwanted costs. Alerts on business thresholds, such as monthly spend and query execution times, can help keep data resources within set limits. Alerts can notify teams when compute or storage costs are about to go over agreed upon limits, helping teams stay within budgets and compliant with policies and regulations. 
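One built-in way to enforce such limits in Snowflake is a resource monitor. The sketch below notifies at 80% of a monthly credit budget and suspends the warehouse at 100%; the quota, warehouse name and credentials are illustrative assumptions.

```python
# Hypothetical example: a monthly credit budget with notify and suspend
# triggers, attached to one warehouse.
import snowflake.connector

# Creating resource monitors requires the ACCOUNTADMIN role.
conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="ACCOUNTADMIN"
)
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE RESOURCE MONITOR ANALYTICS_WH_BUDGET
        WITH CREDIT_QUOTA = 500
        FREQUENCY = MONTHLY
        START_TIMESTAMP = IMMEDIATELY
        TRIGGERS ON 80 PERCENT DO NOTIFY
                 ON 100 PERCENT DO SUSPEND
""")
cur.execute(
    "ALTER WAREHOUSE ANALYTICS_WH SET RESOURCE_MONITOR = ANALYTICS_WH_BUDGET"
)

cur.close()
conn.close()
```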

While business thresholds are important, setting thresholds based on leading indicators provides more effective and granular control. Tracking early warning signs of inefficiency is a proactive approach to optimizing performance. Leading indicators include query queuing (too many queries waiting), breaches of query execution time (performance lagging) and data spillage (wasted resources). Any changes, including rightsizing the warehouse, should be based on SLA requirements.
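As a rough sketch of tracking those leading indicators, the query below pulls queuing, execution time and spillage figures from Snowflake's ACCOUNT_USAGE.QUERY_HISTORY view for the last day. The thresholds are illustrative, and the actual alerting channel is left out.

```python
# Hypothetical example: surfacing warehouses with queued, slow or spilling
# queries over the past 24 hours. Times in QUERY_HISTORY are in milliseconds.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="admin_user", password="***", role="ACCOUNTADMIN"
)
cur = conn.cursor()

cur.execute("""
    SELECT warehouse_name,
           COUNT_IF(queued_overload_time > 60000)  AS queued_queries,
           COUNT_IF(execution_time > 600000)       AS slow_queries,
           SUM(bytes_spilled_to_local_storage
               + bytes_spilled_to_remote_storage)  AS bytes_spilled
    FROM snowflake.account_usage.query_history
    WHERE start_time > DATEADD('day', -1, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
""")

for warehouse, queued, slow, spilled in cur:
    if queued or slow or (spilled or 0) > 0:
        # In practice this would feed an alerting channel rather than stdout.
        print(f"Review {warehouse}: queued={queued}, slow={slow}, spilled={spilled} bytes")

cur.close()
conn.close()
```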

Rather than reacting to overruns in cost and usage, monitoring user behavior and workloads helps businesses proactively optimize performance while managing costs. 

Key takeaways

From our own data journey that has laid the foundation for taking advantage of AI across our business, we have found that a data environment which optimizes costs while maximizing performance is critical for success. Final takeaways include:

  1. A strong governance strategy drives responsible growth. Establishing solid data governance ensures that as data volumes grow and AI priorities multiply, clear policies and controls prevent unnecessary spending while improving cost efficiency.

  2. Optimization and performance are not mutually exclusive. Optimizing data environments is all about saving on costs while maintaining the highest levels of performance, such as speed and data quality for effective AI.

  3. Measure what matters most. Tracking the right metrics is key to ensuring your business is addressing its highest priorities and driving value. Instead of focusing solely on costs, organizations should pay attention to insights that highlight inefficiencies and help the business make proactive adjustments for better performance. 

Optimizing data costs for better AI

To prepare for AI implementation, enterprises must include value-based spend conversations about data. With the right optimization techniques in place, companies can use data resources efficiently in the cloud while providing enough compute power and storage for AI initiatives.


Rahul Mode, Sr. Director, Solutions Architecture, Capital One Software

Rahul Mode is the Sr. Director of Solutions Architecture at Capital One Software, an enterprise B2B software business of Capital One. In his role, Rahul leads a team of solutions architects focused on ensuring that customers derive maximum value from Capital One Slingshot. With more than 15 years of experience, Rahul's expertise spans cloud computing, digital payments and IoT, with a strong background in presales and enterprise customer success.
