Bleeding Edge Dependency Testing with edgetest

Capital One is open sourcing edgetest for bleeding edge dependency testing

Capital One Tech

April 13, 2022

By Faisal Dosani and Akshay Gupta

edgetest

Dependency management in any language can be a challenge, and Python is no exception. Tools like pip and conda use dependency resolvers to try to honor the requirements given to them, but version conflicts often prevent installation; this problem became more apparent when pip introduced a new resolver in October 2020. New versions of an upstream package can break your code, and tracking down the culprit can be even more challenging if you have a long list of transitive dependencies.

edgetest is an open source plugin-based Python package designed to help developers test their code against new versions of their existing dependencies. edgetest helps alleviate some of the burden of dependency management by:

creating a virtual environment;
installing your local package into the environment;
upgrading specified dependency package(s); and
running your test command (e.g. pytest).
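The four steps above can be sketched in plain Python. This is only an illustration of the workflow using the standard library's venv module and pip, not edgetest's actual implementation; the function names build_commands and run_all are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(env_dir, package_dir=".", upgrade=("pandas",), test_command=("pytest",)):
    """Return the four workflow steps, in order, as argument lists."""
    env_dir = Path(env_dir)
    # The environment's interpreter lives in a platform-specific folder.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    python = str(env_dir / bin_dir / "python")
    return [
        # 1. create a virtual environment
        [sys.executable, "-m", "venv", str(env_dir)],
        # 2. install the local package into the environment
        [python, "-m", "pip", "install", str(package_dir)],
        # 3. upgrade the specified dependency package(s)
        [python, "-m", "pip", "install", "--upgrade", *upgrade],
        # 4. run the test command inside the environment
        [python, "-m", *test_command],
    ]


def run_all(commands):
    """Run each command in turn; stop and return False on the first failure."""
    for command in commands:
        if subprocess.run(command).returncode != 0:
            return False
    return True
```

Calling run_all(build_commands(".edge-env", ".", ["pandas"])) would perform the steps for a single dependency. Keeping the command construction separate from execution makes the plan easy to inspect before anything is installed.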

Maintenance cost and environment management have become a part of “running the engine” with the pip resolver. edgetest can help reduce the maintenance cost of packages by automating bleeding edge dependency testing. For example, if you depend on pandas>=0.25.1,<=1.0.0, edgetest will test your project against the most current pandas version (1.4.1 as of this writing). With an effective test suite, you will know whether you can safely upgrade to pandas>=0.25.1,<=1.4.1. edgetest reports whether it is safe to upgrade based on the test results. It does this for each dependency individually, then upgrades all upstream packages together to identify any potential interactions.
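To make the pandas example concrete, here is a minimal sketch of how an upper pin could be bumped once the tests pass. It is an assumption-laden illustration rather than edgetest's own logic: the helper names bump_upper_pin and proposed_requirement are hypothetical, and the regular expression only handles a single "<=" specifier.

```python
import re


def bump_upper_pin(requirement, new_version):
    """Replace the version after '<=' with new_version; other specifiers are untouched."""
    bumped, count = re.subn(
        r"(<=\s*)[^,;\s]+",
        lambda match: match.group(1) + new_version,
        requirement,
        count=1,
    )
    if count == 0:
        raise ValueError(f"no '<=' upper pin found in {requirement!r}")
    return bumped


def proposed_requirement(requirement, new_version, tests_passed):
    """Only move the pin when the test suite passed against new_version."""
    return bump_upper_pin(requirement, new_version) if tests_passed else requirement
```

With the requirement from the text, proposed_requirement("pandas>=0.25.1,<=1.0.0", "1.4.1", tests_passed=True) yields pandas>=0.25.1,<=1.4.1, while a failing test run leaves the original pin in place.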

Why We Built edgetest

After pip introduced a dependency resolver in October 2020, we decided to take a more prescriptive approach to dependency pinning for internal projects at Capital One. Specifically, this involved adding both lower and upper pins to any direct dependencies for all packages. However, this decision added a new form of maintenance cost: updating the pins. We needed an automated way to help remediate security vulnerabilities identified in packages and continue to support the latest version of dependencies in a way that scaled. edgetest was a solution to this problem given the number of Python packages our team supported during that time. Machine learning packages can often have complex dependency structures and experimentation with new features is critical. While implementations of models should always pin packages to ensure deterministic behavior and auditability, we don't want the tools themselves to be unnecessarily restrictive, but to allow for some flexibility in their implementation.

We can now schedule CI/CD jobs that automatically run edgetest against many internal libraries, running their unit tests and bumping dependency pins, which gives us some level of trust in the latest versions. Note that robust unit testing is critical to getting the most out of edgetest.

Is This Different from GitHub’s Dependabot?

edgetest is not tied to a particular version control system like GitHub. It also prevents accidental updates by only upgrading package dependencies when unit tests are passing. Finally, some users want to focus on a subset of their dependency tree for updates. Sometimes these are dependencies that release often (e.g. boto3), and sometimes these packages are core to their library’s functionality. edgetest offers multiple configuration options to help users create a test-and-upgrade system that works for their use case.

edgetest in Action

For example, let’s imagine a simple toy_package like so:

    .
    ├── LICENSE
    ├── README.md
    ├── setup.cfg
    ├── setup.py
    ├── tests
    │   ├── __init__.py
    │   └── test_main.py
    └── toy_package
        └── __init__.py

To configure edgetest we can include the following within our setup.cfg:

    …
    install_requires =
        pandas<=1.2.0
        numpy<=1.21.0
    …
    [edgetest]
    python_version = 3.9
    extras =
        tests

Then run edgetest from the command line:

    edgetest -c setup.cfg

This tells edgetest to use Python 3.9 and to install the tests extra into each conda environment. edgetest will create three environments: pandas, numpy, and all-requirements. In the first two, it upgrades only the package the environment is named after; in the all-requirements environment, it upgrades both pandas and numpy. Next, the test command (pytest is the default) is run in each environment and the results are reported back to the user:
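The mapping from requirements to environments described here can be sketched with the standard library's configparser. This is a hedged illustration, not edgetest's parser: the function name plan_environments is hypothetical, and the [options] section header is an assumption based on the usual setuptools layout, since the excerpt elides it.

```python
import configparser
import re

# Assumed setup.cfg contents; the [options] header is not shown in the article.
SETUP_CFG = """
[options]
install_requires =
    pandas<=1.2.0
    numpy<=1.21.0

[edgetest]
python_version = 3.9
extras =
    tests
"""


def plan_environments(cfg_text):
    """Map each environment name to the list of packages it would upgrade."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    raw = parser["options"]["install_requires"]
    requirements = [line.strip() for line in raw.splitlines() if line.strip()]
    # Keep only the package name: cut at the first specifier, extra, or marker.
    names = [re.split(r"[<>=!~\[;\s]", req, maxsplit=1)[0] for req in requirements]
    plan = {name: [name] for name in names}   # one environment per dependency
    plan["all-requirements"] = list(names)    # plus one that upgrades everything
    return plan
```

For the configuration above, plan_environments(SETUP_CFG) produces the three environments named in the text: pandas, numpy, and all-requirements.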

    ================  =============  =================  ===============
    Environment       Passing tests  Upgraded packages  Package version
    ================  =============  =================  ===============
    pandas            True           pandas             1.4.1
    numpy             True           numpy              1.22.2
    all-requirements  True           numpy              1.22.2
    all-requirements  True           pandas             1.4.1
    ================  =============  =================  ===============

Alternatively, you can pass the --export flag to have edgetest write the changes to setup.cfg for you.

What’s Next

One of the biggest benefits of edgetest is the ability to automate. A GitHub Action built on edgetest is available for users of that CI/CD platform and lets you automate dependency testing alongside your build, test, and deploy steps. Using the GitHub Action is relatively easy. Below is an example of using the action in your workflow YAML:

    - id: run-edgetest
      uses: fdosani/run-edgetest-action@v1.2
      with:
        edgetest-flags: '-c setup.cfg --export'
        base-branch: 'develop'
        skip-pr: 'false'
        python-version: 3.9

run-edgetest-action can be found on the GitHub Marketplace, where users can search for tools that add functionality and improve workflows. More details can be found in the README of run-edgetest-action.

Conclusion

In conclusion, edgetest is an example of Capital One’s commitment to an “open source first” approach to software development. This technology helps developers test their code against new versions of their existing dependencies and reduces the maintenance cost of packages by automating bleeding edge dependency testing. It creates a virtual environment, installs your library, upgrades specified dependencies, and runs test commands. Afterwards, edgetest will report whether or not it is safe to upgrade based on the test results. We encourage readers to check out edgetest on GitHub for more details on the project and how to contribute to edgetest.
