Lessons from Capital One's cloud migration journey
Learn how we became the first U.S. bank to adopt a cloud-first strategy by doing the hard things first.
At Capital One, we’re getting used to living 100% in the cloud after reaching a pivotal milestone in 2020: exiting our data centers and going all-in on the cloud. As the VP of Software Engineering at Capital One, I want to share my thoughts on our journey to the cloud and the takeaways that matter for other organizations undergoing large-scale tech transformations.
For those who want a tl;dr: Always do the hard things first.
How Capital One’s journey to the cloud started
A bank with technology in its DNA
Let me start by telling you a little bit about Capital One. For those who don’t know, we are a 25-year-old, Fortune 100 company that’s still founder-led, with tens of millions of customers and 50,000 associates. Our founder-led mentality can be felt in everything we do, and has helped to instill, support and grow a highly entrepreneurial and innovation-driven culture. This is part of why we’re the first bank to report that we’ve exited our data centers: big, tech-driven thinking is in our company DNA and has been on display since our earliest days.
Capital One was founded on the belief that the banking industry would be revolutionized by technology, and that by pairing the tech prowess of leading tech companies with the risk management skills of a leading bank, we could build great customer experiences delivered in real time through software, data and AI/ML. It’s therefore important to recognize that our migration to the cloud has been part of a much broader digital transformation that’s been going on for close to a decade.
How our digital transformation started
With those underlying assumptions, starting in 2012, we sought to completely redefine who we are as a company—to build a technology company that does banking instead of a bank that just uses technology. This tech transformation had a number of different dimensions:
- We needed to become great at building software.
- We needed the top engineering talent to do it.
- We needed to reinvent our technology operating model to enable the agility and innovation necessary to set the pace in the market.
- We needed to move to an agile model for delivering software and insist on modern architectural standards like RESTful APIs, microservices and building on open source foundations.
- We needed to become cloud-first, building everything new on the cloud.
As a company, we went all in, from the CEO to the associates just out of school. By aligning so tightly, we did one of the hardest things any company going through a transformation can do: we made sure everyone had the same north star.
Solving technological challenges in the banking industry
We continue to work on unique, challenging technology problems that will ultimately benefit millions of customers, and we’re doing it in a way that hasn’t been done in the banking industry before.
For instance, we can now build experiences that are powered by data and a sophisticated approach to AI/ML, allowing us to interact with our customers in a more natural, seamless and accessible way. So we’re embedding machine learning across the entire enterprise, from call center operations to back office processes, fraud, security, digital customer personalization and much more.
In 2015, we became open-source first, and we’ve been developing open source projects that support enterprise machine learning needs, such as the recently released Data Profiler and rubicon-ml.
Benefits of adopting a cloud-first strategy
By exiting our legacy data centers and moving to the cloud, we have built the foundation for the bank of the future. Some of the benefits of migrating our data centers to the public cloud have been:
- Instantly provisioned infrastructure at near unlimited scale - Applications use as much or as little computing and storage as they need, and we pay only for what they use.
- Lower costs - Third party providers can realize economies of scale that we could not realize on our own.
- Increased scale - Our technologists are using real-time, streaming data at scale, machine learning and the power of the cloud to solve unique, challenging technology and data problems to deliver intelligent solutions that benefit millions of customers.
- Increased resiliency - If one AWS region goes down, we can stand up in another region. If a data center we run ourselves goes down, there is no such alternative.
- Plugging into the world’s innovation - New offerings are being introduced all the time, allowing us to leverage the newest tech, immediately and at a significant discount to doing the same in our own data centers.
- Shift to modern architecture - The adoption of RESTful APIs, microservices, open source and DevOps processes is greatly accelerated in the cloud. These elements are essential to modern software architecture.
- Agility and speed to market - We can now release more frequently, with teams going from quarterly or monthly releases to releasing code multiple times a day. We’ve also seen dramatic improvements in system availability and disaster recovery, including cutting both the number of transaction errors and critical incident resolution time in half. Also, we’ve reduced the time to build a new dev environment from months to weeks.
This has helped us to modernize how we work across a wide range of our tech organization and allows us to quickly react to new challenges and adopt new technology.
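To make the resiliency point above concrete, here is a minimal sketch of region failover in Python. It is an illustration under stated assumptions, not a description of how Capital One actually does it: the `Region` type, the `pick_region` helper, the region names and the `example.invalid` endpoints are all hypothetical, and the health check is simulated rather than a real AWS call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Set


@dataclass(frozen=True)
class Region:
    """A deployment target. Both fields are hypothetical placeholders."""
    name: str
    endpoint: str


def pick_region(
    regions: Sequence[Region],
    is_healthy: Callable[[Region], bool],
) -> Optional[Region]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in regions:
        if is_healthy(region):
            return region
    return None


def simulated_health(down: Set[str]) -> Callable[[Region], bool]:
    """Build a fake health check that reports the named regions as unavailable."""
    return lambda region: region.name not in down


# Priority-ordered list of regions (assumed names and endpoints).
REGIONS = [
    Region("primary", "https://primary.example.invalid"),
    Region("secondary", "https://secondary.example.invalid"),
]

if __name__ == "__main__":
    # Simulate an outage of the primary region and see where traffic would go.
    chosen = pick_region(REGIONS, simulated_health({"primary"}))
    print(chosen.name if chosen else "no healthy region")
```

In practice the health signal would come from monitoring or DNS health checks, and a real failover also depends on data being replicated to the second region, which this sketch leaves out.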
"Do the hard part first" should be the mantra of your cloud journey
Obviously, on a journey like this you learn a lot of lessons—some small and some large. We came away from ours with one big takeaway—do the hard part first. No, really.
There is no silver bullet on this journey. You have to define how you want your environment to operate, build controls and establish a governance function. You have to align to agile principles, verify your goals and pressure test the most important things first so you can continually confirm you’re achieving them. You have to know, whenever possible, where you’ll take on tech debt, and build an actionable plan to maintain what you build, a topic I covered in my blog post on 4 best practices for repeatable software delivery. You have to center the fact that CI/CD is critical to everything: without a robust CI/CD pipeline, your products will not scale. I hate the phrase minimum viable product because MVPs typically ignore the hard things. When you build your MVP, remember to build in the infrastructure too.
Many companies will underestimate the comprehensive all-in nature of this kind of transformation. While I haven’t been at Capital One for the entire journey, I can tell you this was a significant, multi-year effort that required every part of the business to take part.
In complexity there is no silver bullet, only silver buckshot.
A constant search for the proverbial ‘silver bullet’ or ‘easy button’ can take you and your organization in circles by distracting you from the hard things you should be doing (and doing first). By eschewing fads, especially ones presented as cut-and-paste solutions with a checklist of rules promising a shortcut to success, you can maximize your silver buckshot and get the most out of your own cloud journey.
People are the real superheroes behind any tech transformation
Doing the hard part first might be the big takeaway, but I want to emphasize that at the end of the day you cannot achieve this without investing in the people who make this work happen. One reason fads and quick fixes don’t work for deep transformations is that they focus on processes rather than people. They may have the kernel of a good idea, yet they fail to acknowledge that they rely on the abilities, cooperation and enthusiasm of employees, who inconveniently happen to be human beings!
Only if leaders recognize and embrace this human factor can tangible, long-lasting improvements be achieved during a cloud transformation. This is why elements like team composition and mission matter. Teams need to own their delivery to production and have roles that are understood and empowered to get the job done. And I would be remiss as leader of Capital One’s Tech Diversity, Inclusion and Belonging Council if I didn’t advocate for this: they absolutely must center the fact that inclusivity is the greatest indicator of a high-functioning team and that diversity of thought makes us stronger.
At Capital One, part of our cloud journey and greater digital transformation has focused on making our employees feel empowered in their roles and building a culture focused on continuous learning. Our technology department has ambitious goals around AWS certifications, and we’re giving associates time to study for their exams through “Invest in Yourself Days”. In 2017, we launched a Tech College inside Capital One to support associates in continually sharpening their skills. Our Tech College focus areas include software engineering, security, cloud, mobile, data, agile and machine learning, as well as soft skills such as understanding bias, creating a culture of inclusivity, and fostering a community of idea and information sharing.
In addition, Tech College supports four different development programs:
- Technology Development Program: A 24-month rotational program providing recent tech graduates with a variety of learning opportunities.
- Capital One Developer Academy: A six-month computer programming immersive experience for non-computer science graduates.
- Technology Leadership Development Program: A year-long program to develop future leaders, enable career growth and enhance technical ability.
- Elements of People Leadership: A six-week cohort program for experienced people leaders aimed at developing strong leadership skills.
These investments in people have worked hand in hand with the investments in technology our cloud journey required. We couldn’t have had one without the other.
The bank of the future is now
When I first spoke on this at Lesbians Who Tech, we were eight years into our technology transformation. We didn’t start on day one knowing exactly how this would all unfold over the years—we just knew our goal was to get to a destination where we could more quickly and nimbly build new capabilities for our customers. So we comprehensively reimagined our talent, our technology infrastructure and how we work until it felt like we’d arrived there (not to mention that we could stay there).
We tried a lot of paths, and still do, which gave us the nimbleness needed to respond to everything 2020 threw our way. Our ability to swiftly respond to the COVID-19 pandemic and ensure our associates were able to remain connected and focused on helping our customers was directly enabled by our technology transformation.
Being in the cloud helped us to quickly respond to the impact of COVID-19 shutdowns in four major ways:
- Being in the cloud was critical for enabling our back office workforce to remain connected and productive throughout this transition. We are able to actively monitor the performance of our collaboration tools such as Slack, GSuite and Zoom and dynamically scale based on need. This means our teams, including our engineering teams, could stay better connected while transitioning to remote work—a story which our HR Tech Team told in more detail on our blog.
- From an AWS perspective, our ability to control and monitor our infrastructure through APIs allowed us to manage it from anywhere, including while working remotely. We didn’t have to worry about staffing our data centers, or how we were going to get the equipment and people needed to add infrastructure in support of the crisis.
- Our work in the cloud enabled us to prioritize both the health and safety of our contact center agents and still provide critical support to our customers—thousands of contact center agents are now working remotely with Amazon Connect. There is even an AWS case study on how we made this happen.
- The cloud enables digital banking, and digital banking became even more relevant during lockdowns. By accessing accounts online or using the Capital One Mobile app, our customers could make payments, view transactions, check balances and more, all while social distancing.
This nimbleness has far-ranging implications outside of our pandemic response. By building resiliency and reliability into both our technology and our organizational structures we can better weather all manner of situations, now and beyond.