5 data management best practices
A Forrester survey of data management decision-makers showed that new data management models are necessary to operate in the cloud.
Building and maintaining data centers is costly in the face of growing volumes of data. There is also significant overhead and required staffing that can be outside of a company’s core competency. With exponential data growth, data management decision-makers will need to implement best practices to manage data in the cloud and get the most value from it. However, while operating in the cloud brings many benefits, there will be challenges along the way that require responsible data management.
A commissioned study conducted by Forrester Consulting on behalf of Capital One shared that new data management models are necessary to operate in the cloud. The survey of 157 data management decision-makers in North America showed that while many organizations have yet to manage most of their data on the cloud, these organizations will need to adopt new best practices to manage their data and overcome the common challenges they will face on this journey.
1. Forecasting and managing data costs
Forecasting and controlling costs is a critical aspect of managing a data ecosystem and one of the top challenges reported by decision-makers. Most cloud costs come down to storage and compute, and in the on-prem data warehouse world, organizations know how much to budget for such capabilities. It is all capital expenditure (CapEx): you budget for a fixed amount of server capacity, maintenance and support staff. But in the cloud, everything becomes an operating expense (OpEx). Organizations pay cloud service providers like Snowflake and Amazon Web Services (AWS) for the data and compute services they use, and consumption tends to vary quite a bit between billing cycles.
While paying only for what you use is good business, the increase in data volume, variable usage and near infinite elasticity can lead to unpredictable costs.
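To make the storage-plus-compute cost model concrete, here is a minimal sketch in Python. It assumes a simplified bill made up of only a storage charge and a compute charge, and it forecasts the next billing cycle as the historical average plus or minus a multiple of the standard deviation. The function names, rates and the choice of forecasting method are illustrative assumptions, not how any particular provider bills or how any specific tool forecasts.

```python
from statistics import mean, stdev


def cycle_cost(storage_tb, storage_rate, compute_hours, compute_rate):
    """Cost of one billing cycle under a simplified two-part model.

    All four parameters are hypothetical inputs: terabytes stored, price per
    terabyte, compute hours consumed and price per compute hour.
    """
    return storage_tb * storage_rate + compute_hours * compute_rate


def forecast_range(cycle_costs, k=2.0):
    """Return (low, expected, high) for the next cycle from past cycle totals.

    The expected value is the mean of past cycles. The band is k sample
    standard deviations wide on each side, floored at zero. This is a naive
    baseline and ignores trend and seasonality.
    """
    if len(cycle_costs) < 2:
        raise ValueError("need at least two billing cycles")
    expected = mean(cycle_costs)
    spread = k * stdev(cycle_costs)
    return (max(0.0, expected - spread), expected, expected + spread)


if __name__ == "__main__":
    # Made-up usage figures: storage stays flat while compute varies.
    history = [
        cycle_cost(10, 23.0, hours, 2.0)
        for hours in (100, 180, 90, 240)
    ]
    low, expected, high = forecast_range(history)
    print(f"next cycle: {low:.0f} to {high:.0f} (expected {expected:.0f})")
```

The wider the band relative to the expected value, the more variable consumption has been between billing cycles, which is the unpredictability described above.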
The Forrester survey revealed that 82% of data management decision-makers view forecasting and controlling costs as a crucial best practice for managing an effective data ecosystem, making cloud cost optimization strategies more important than ever.
2. Setting up data cataloging to improve data quality
Another challenge organizations face when shifting to the cloud is the inability to see, understand and utilize all their data. Nearly 80% of decision-makers surveyed by Forrester cited a lack of data cataloging (an organized inventory of data) as a top challenge.
Without data cataloging (that is, without a clear record of what data you have, how it is being used and who owns it), a host of other issues can follow. If you can’t see who published the data and when, it is almost impossible to determine if the information is current, relevant or redundant, and it becomes difficult to use the data confidently.
This lack of visibility often leads to data quality issues, another challenge identified by 80% of respondents in the Forrester survey. Adopting thorough data cataloging best practices can significantly improve data quality and support informed decision-making.
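As a rough illustration of what a catalog records, the Python sketch below models a catalog entry that captures a data set's owner and publication date, then flags entries that may no longer be current. The `CatalogEntry` fields, the `find_stale` helper and the 90-day threshold are hypothetical choices made for this example, not a description of any particular catalog product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CatalogEntry:
    """One row of a hypothetical data catalog."""
    name: str
    owner: str
    published_on: date
    description: str = ""


def is_stale(entry, today, max_age_days=90):
    """True if the entry was published more than max_age_days before today."""
    return today - entry.published_on > timedelta(days=max_age_days)


def find_stale(entries, today, max_age_days=90):
    """Names of entries whose publication date is older than the threshold."""
    return [e.name for e in entries if is_stale(e, today, max_age_days)]


if __name__ == "__main__":
    # Made-up entries for illustration only.
    catalog = [
        CatalogEntry("sales_summary", "analytics-team", date(2024, 5, 1)),
        CatalogEntry("legacy_orders", "unknown", date(2021, 1, 15)),
    ]
    print(find_stale(catalog, today=date(2024, 6, 1)))
```

With the owner and publication date recorded for every data set, the question of whether information is current or redundant becomes a query against the catalog instead of guesswork.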
3. Enhancing data observability
Close to 75% of decision-makers surveyed by Forrester reported the lack of data observability as a problem. Without proper observability, organizations struggle to navigate their data effectively.
If critical data resides on numerous on-premises and cloud-based servers but isn’t integrated or observable, leaders cannot draw on that information to make well-informed decisions, a problem that 76% of those surveyed by Forrester cited as a challenge. Implementing data observability best practices ensures that leaders can confidently use data to make sound decisions.
4. Developing clear data governance policies
Another common challenge organizations face with cloud migration is data governance. The idea here is that to meet governance needs, which can differ by industry, organizations must know where their customer and financial data resides, ensure it is protected and monitor who is accessing it at any given time.
Data governance can be a challenge for decision-makers because policies can be confusing, according to 82% of Forrester survey respondents. This highlights the importance of developing clear data governance policies to simplify compliance.
5. Establishing a cloud governance framework
Because cloud infrastructures are dispersed, it can be difficult to track and enforce adherence to data governance policies across them. According to the Forrester survey, 80% of data management decision-makers reported that they have difficulty governing data at scale and lack the ability to enforce role-based access and entitlements to specific data.
It is a best practice to establish a clear cloud governance framework to manage access and ensure scalability across large and complex cloud environments.
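Role-based access and entitlements can be pictured as a mapping from roles to the data sets each role may read. The Python sketch below is a minimal, assumed model of that idea. The role names, data set names and the `can_access` function are invented for illustration, and a real framework would also need auditing, policy versioning and integration with an identity provider.

```python
# Hypothetical mapping of roles to the data sets each role is entitled to read.
ROLE_ENTITLEMENTS = {
    "analyst": {"sales_summary"},
    "finance_admin": {"sales_summary", "customer_financials"},
}


def can_access(user_roles, dataset, entitlements=ROLE_ENTITLEMENTS):
    """True if any of the user's roles is entitled to the data set.

    Unknown roles grant nothing, so access is denied by default.
    """
    return any(dataset in entitlements.get(role, set()) for role in user_roles)


if __name__ == "__main__":
    print(can_access(["analyst"], "customer_financials"))
    print(can_access(["analyst", "finance_admin"], "customer_financials"))
```

Keeping entitlements in one central mapping, and denying access by default, is one way to make the same policy enforceable across a dispersed environment.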
Implementing data management best practices at Capital One
At Capital One, we’ve experienced many of these challenges on our own data journey. We exited our last on-premises data center in 2020 and became the first U.S. financial institution to go all-in on the public cloud. Doing this was no small undertaking. It involved shuttering eight data centers and moving our applications to the cloud. We built more than 80% of our applications to be cloud native so they would run more smoothly and efficiently.
We weren’t immune to the challenges data management decision-makers face, including cost predictability, observability and data governance. To tackle these challenges head-on, the first step we took was to federate responsibility for data management. With a federated model, the myriad data stakeholders across the organization can manage their own data while respecting individual and enterprise-wide data priorities and policies. We wanted a workflow-based model that automated as many back-end data management processes as possible. The goal was to make it easy to create a data set, manage access and build a data pipeline.
We initially tried to address these issues with individual point solutions. We filled in the gaps with our own tools, but this led to tool sprawl, where our stack became so extensive that it hindered our efforts. For us, the solution was obvious.
We built our own software that integrated some of these capabilities into a single data management solution, Capital One Slingshot. We built Slingshot to streamline our Snowflake provisioning processes while adhering to governance requirements, provide detailed visibility into cost drivers and help optimize future spend.
In today’s market, organizations need to make better use of data to maximize operational efficiency and deliver the kinds of experiences customers expect. But doing that, especially in the cloud, requires a solid and effective data management foundation. By using better software tools, companies can unleash the power of the cloud, achieve true business success and make the lives of data management professionals easier along the way.