Bleeding Edge Dependency Testing with edgetest
Capital One is open sourcing edgetest for bleeding edge dependency testing
By Faisal Dosani and Akshay Gupta
Dependency management in any language can be a challenge, and Python is no exception. Tools like pip and conda use dependency resolvers to try to honor the requirements given to them, but version conflicts often prevent installation. This problem became more apparent when pip introduced a new resolver in October 2020. New versions of an upstream package can break your code, and tracking down the culprit can be even more challenging if you have a long list of transitive dependencies.
edgetest is an open source plugin-based Python package designed to help developers test their code against new versions of their existing dependencies. edgetest helps alleviate some of the burden of dependency management by:
- creating a virtual environment;
- installing your local package into the environment;
- upgrading specified dependency package(s); and
- running your test command (e.g. pytest).
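To make the four steps above concrete, here is a minimal sketch of what they might look like if scripted by hand with the standard library. The function names (build_steps, run_steps) and the directory layout are hypothetical and are not edgetest's internal API; the sketch only illustrates the order of operations.

```python
import subprocess
import sys
from pathlib import Path


def build_steps(env_dir, package_dir, upgrade, test_command="pytest"):
    """Return the commands for one test environment, in the order listed above.

    Hypothetical helper for illustration; not part of edgetest.
    """
    env_dir = Path(env_dir)
    # Location of the interpreter inside the new environment (Windows vs. POSIX).
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    python = str(env_dir / bin_dir / "python")
    return [
        # 1. Create a virtual environment.
        [sys.executable, "-m", "venv", str(env_dir)],
        # 2. Install the local package into it.
        [python, "-m", "pip", "install", str(package_dir)],
        # 3. Upgrade the specified dependencies to their latest releases.
        [python, "-m", "pip", "install", "--upgrade", *upgrade],
        # 4. Run the test command with the environment's interpreter.
        [python, "-m", *test_command.split()],
    ]


def run_steps(steps):
    """Run each command in turn; return True only if every one succeeds."""
    for cmd in steps:
        if subprocess.run(cmd).returncode != 0:
            return False
    return True
```

Calling run_steps(build_steps("envs/pandas", ".", ["pandas"])) would perform the steps for a single dependency, assuming the test runner is installed in the environment (for example through a tests extra).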
Maintenance cost and environment management have become a part of “running the engine” with the pip resolver. edgetest can help reduce the maintenance cost of packages by automating bleeding edge dependency testing. For example, if you depend on pandas>=0.25.1,<=1.0.0, edgetest will test your project against the most current pandas version (1.4.1 as of writing). With an effective test suite, you will know whether you can safely upgrade to pandas>=0.25.1,<=1.4.1. edgetest will report whether it is safe to upgrade based on the test results. It will do this for each dependency individually before upgrading all upstream packages together to identify any potential interactions.
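The pin update in the pandas example can be sketched in a few lines. This is an illustrative approximation using a regular expression, not how edgetest itself rewrites requirements; bump_upper_pin and propose_pin are hypothetical names, and the sketch always writes the new bound as an inclusive "<=".

```python
import re

# Matches an upper bound such as "<=1.0.0" or "<2.0" inside a requirement string.
_UPPER_PIN = re.compile(r"(<=?)\s*[0-9][0-9A-Za-z.*+!-]*")


def bump_upper_pin(requirement, new_version):
    """Replace the first upper bound in `requirement` with `<=new_version`.

    Requirements without an upper bound are returned unchanged.
    """
    return _UPPER_PIN.sub(lambda m: f"<={new_version}", requirement, count=1)


def propose_pin(requirement, new_version, tests_passed):
    """Only move the pin when the test suite passed against `new_version`."""
    if tests_passed:
        return bump_upper_pin(requirement, new_version)
    return requirement


print(propose_pin("pandas>=0.25.1,<=1.0.0", "1.4.1", tests_passed=True))
```

The tests_passed flag mirrors the point made above: the pin only moves when the test suite gives you evidence that the newer version is safe.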
Why We Built edgetest
After pip introduced a dependency resolver in October 2020, we decided to take a more prescriptive approach to dependency pinning for internal projects at Capital One. Specifically, this involved adding both lower and upper pins to any direct dependencies for all packages. However, this decision added a new form of maintenance cost: updating the pins. We needed an automated way to help remediate security vulnerabilities identified in packages and continue to support the latest version of dependencies in a way that scaled. edgetest was a solution to this problem given the number of Python packages our team supported during that time. Machine learning packages can often have complex dependency structures and experimentation with new features is critical. While implementations of models should always pin packages to ensure deterministic behavior and auditability, we don't want the tools themselves to be unnecessarily restrictive, but to allow for some flexibility in their implementation.
We can now have scheduled CI/CD jobs which automatically run edgetest against many internal libraries to run unit tests and bump dependency pins, ensuring some level of trust in the latest versions. We should note that having robust unit testing is really critical to getting the most out of edgetest.
Is This Different from GitHub’s Dependabot?
edgetest is not tied to a particular version control system like GitHub. It also prevents accidental updates by only upgrading package dependencies when unit tests are passing. Finally, some users want to focus on a subset of their dependency tree for updates. Sometimes these are dependencies that release often (e.g. boto3), and sometimes these packages are core to their library’s functionality. edgetest offers multiple configuration options to help users create a test-and-upgrade system that works for their use case.
edgetest in Action
For example, let’s imagine a simple toy_package like so:
.
├── LICENSE
├── README.md
├── setup.cfg
├── setup.py
├── tests
│   ├── __init__.py
│   └── test_main.py
└── toy_package
    └── __init__.py
To configure edgetest we can include the following within our setup.cfg:
…
install_requires =
    pandas<=1.2.0
    numpy<=1.21.0
…
[edgetest]
python_version = 3.9
extras =
    tests
And run with the following command line statement: edgetest -c setup.cfg
This will tell edgetest to use Python 3.9 and also to install the tests extra into each conda environment. edgetest will create three environments: pandas, numpy, and all-requirements. In the first two, it will only upgrade the package that the environment is named after, and the all-requirements environment will upgrade both pandas and numpy. Next, the test command (pytest is the default) is run in each environment and the results are reported back to the user:
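As a rough illustration of how three environments follow from two requirements, the sketch below reads a setup.cfg-style string with the standard library's configparser and derives one environment per dependency plus a combined one. The plan_environments function is hypothetical, and placing install_requires under an [options] section is an assumption based on the usual setuptools layout, since the surrounding lines of the file are elided in this article. It is not edgetest's own parsing logic.

```python
import configparser
import re

# Assumed layout: install_requires lives under [options], as in a typical setup.cfg.
SETUP_CFG = """
[options]
install_requires =
    pandas<=1.2.0
    numpy<=1.21.0

[edgetest]
python_version = 3.9
extras =
    tests
"""


def plan_environments(cfg_text):
    """Map each environment name to the packages that would be upgraded in it."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser["options"]["install_requires"]
    requirements = [line.strip() for line in raw.splitlines() if line.strip()]
    # The package name is everything before the first version operator, marker, or extra.
    names = [re.split(r"[<>=!~;\[ ]", req, maxsplit=1)[0] for req in requirements]
    plan = {name: [name] for name in names}   # one environment per dependency
    plan["all-requirements"] = list(names)    # plus one that upgrades everything
    return plan


for env, packages in plan_environments(SETUP_CFG).items():
    print(env, "->", packages)
```

Testing each dependency in isolation first makes it easier to attribute a failure to a single upgrade, while the combined environment catches problems that only appear when several upgrades interact.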
================  =============  =================  ===============
Environment       Passing tests  Upgraded packages  Package version
================  =============  =================  ===============
pandas            True           pandas             1.4.1
numpy             True           numpy              1.22.2
all-requirements  True           numpy              1.22.2
all-requirements  True           pandas             1.4.1
================  =============  =================  ===============
Alternatively, you can provide the --export flag to have edgetest write the changes to setup.cfg for you.
What’s Next
One of the biggest benefits of edgetest is the ability to automate. A GitHub Action built on edgetest is available for users of the CI/CD platform and allows you to automate dependency management as part of your build, test, and deploy workflow. Using the GitHub Action is relatively easy. Below is an example of using the action in your YAML:
- id: run-edgetest
  uses: fdosani/run-edgetest-action@v1.2
  with:
    edgetest-flags: '-c setup.cfg --export'
    base-branch: 'develop'
    skip-pr: 'false'
    python-version: 3.9
The run-edgetest-action can be found on the GitHub Marketplace, where users can search for tools that add functionality and improve workflows. More details can be found in the README of the run-edgetest-action.
Conclusion
edgetest is an example of Capital One’s commitment to an “open source first” approach to software development. It helps developers test their code against new versions of their existing dependencies and reduces the maintenance cost of packages by automating bleeding edge dependency testing. It creates a virtual environment, installs your library, upgrades specified dependencies, and runs test commands. Afterwards, edgetest reports whether it is safe to upgrade based on the test results. We encourage readers to check out edgetest on GitHub for more details on the project and how to contribute.