One of the beautiful things about Python is its comprehensive ecosystem of libraries, typically called packages. They are reusable chunks of code that are great for making development speedy and code tidy. You can head to the Python Package Index (PyPI), use a few Stack Overflow posts for advice, and pick out the package that suits your needs.
But before importing and using the chosen package, the package’s code must be present in your environment. Additionally, when releasing your application, users must have access to these packages. The packages used by your project are known as dependencies because your code can’t function without them. And dependencies often make matters more complicated by pulling in their own dependencies.
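To make the idea of dependencies pulling in their own dependencies concrete, here is a minimal sketch that walks the requirements declared by packages already installed in the current environment. It relies only on the standard library's importlib.metadata; the helper names are hypothetical, and the name parsing is deliberately simplified (it ignores version specifiers and all environment markers except extras), so treat the result as an approximation.

```python
import re
from importlib import metadata

# Matches the distribution name at the start of a requirement string,
# e.g. "urllib3" in "urllib3<3,>=1.21.1".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def declared_requirements(package: str) -> list[str]:
    """Return the requirement strings an installed package declares."""
    try:
        # requires() returns None when nothing is declared.
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []


def transitive_dependencies(package: str) -> set[str]:
    """Collect the names of direct and indirect dependencies."""
    seen: set[str] = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for requirement in declared_requirements(current):
            _, _, marker = requirement.partition(";")
            if "extra" in marker:
                continue  # optional extras are not installed by default
            match = _NAME.match(requirement)
            if not match:
                continue
            name = match.group(1).lower()
            if name not in seen:
                seen.add(name)
                stack.append(name)
    return seen
```

Calling transitive_dependencies("requests") in an environment where requests is installed would list packages you never asked for directly; for a package that is not installed, the function returns an empty set.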
Developers rely on package managers to keep their dependencies up to date whenever newer versions arrive with new features or patched security vulnerabilities. But projects may pin a package to a particular version because the code relies on it not changing.
Pinning a package to a specific version can become a management nightmare. For example, you may end up with two packages in your project that require different versions of the same dependency.
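The conflict described above can be sketched in a few lines. The package names and versions below are hypothetical, and find_pin_conflicts is an illustrative helper rather than part of any real tool; it simply reports every dependency that is pinned to more than one version.

```python
from collections import defaultdict

# Hypothetical project: two direct dependencies pin the same
# transitive dependency ("shared-lib") to different versions.
REQUIREMENTS = {
    "package-a": {"shared-lib": "1.4.0"},
    "package-b": {"shared-lib": "2.0.1", "other-lib": "0.9.0"},
}


def find_pin_conflicts(requirements):
    """Return {dependency: {version: [requirers]}} for conflicting pins."""
    pins = defaultdict(lambda: defaultdict(list))
    for requirer, deps in requirements.items():
        for dep, version in deps.items():
            pins[dep][version].append(requirer)
    # A dependency conflicts when more than one distinct version is pinned.
    return {
        dep: {version: sorted(users) for version, users in versions.items()}
        for dep, versions in pins.items()
        if len(versions) > 1
    }


conflicts = find_pin_conflicts(REQUIREMENTS)
```

Here conflicts contains only "shared-lib", because "other-lib" is pinned by a single package. Only one version of a distribution can be installed in a given Python environment, which is why such a clash has to be resolved rather than ignored.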
Additionally, many Python libraries depend on non-Python code, such as C or Fortran, to execute (typically) compute-intensive routines such as math calculations. The difficulty is that code in these languages requires compiling before use.
This is where dependency management tools are helpful, especially Python package management tools that can manage both Python and non-Python dependencies.
Ideally, you want your installations to be easy, secure, and fast. A critical part of continuous deployment is keeping applications immutable in all environments, from functional testing to production. The package management solution you work with must aid this.
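One way to check that an environment has not drifted from what you intended to deploy is to compare the installed versions against a set of pins. The following is a minimal sketch using the standard library's importlib.metadata; check_pins and the pin dictionary are hypothetical, and a real lock file carries more information (hashes, markers) than this comparison covers.

```python
from importlib import metadata
from typing import Optional


def check_pins(pins: dict[str, str]) -> dict[str, tuple[str, Optional[str]]]:
    """Return {name: (pinned, installed)} for every pin that is not met.

    An installed value of None means the package is missing entirely.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Running a check like this in testing and again in production gives an early signal when the two environments differ, assuming both are meant to match the same pins.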
There are many Python dependency management tools available, each with its own pros and cons. This article looks at some of the more popular options and shows some of them in action.
After reading this post, you should be able to make an informed choice about which package management tool is best for your project’s needs and fits the way you like to work.
Many members of the Python software development community work with Pip almost exclusively. Pip ships with Python 2.7.9 and later (and with Python 3.4 and later), and is the default go-to packaging tool.
However, its dependency management capabilities are pretty basic, and it is generally considered slow.
By default, Pip installs everything globally. That might seem like a good idea, but as has been abundantly shown over the past two decades, you will run into trouble over time if you work on different projects or choose to upgrade your dependencies. One solution is to use Pip within a virtual environment created with a tool such as venv or virtualenv.
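As a rough illustration, a per-project virtual environment can be created from Python itself with the standard library's venv module. The create_project_env helper and the ".venv" directory name are assumptions made for this sketch; with_pip=False keeps the example fast and offline, whereas a real project would normally pass with_pip=True so that Pip is available inside the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"
    # with_pip=False: skip bootstrapping pip to keep this sketch quick.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_project_env(Path(tmp))
        # pyvenv.cfg marks the directory as a virtual environment.
        print((env / "pyvenv.cfg").read_text())
```

Packages installed while the environment is active stay inside its directory, so two projects can depend on different versions of the same package without interfering with each other.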
Pipenv solves some of the issues with Pip by wrapping and extending it to work with virtual environments.
On the command line, Pipenv is both colorful and user-friendly. Installing a package with Pipenv sets up a virtual environment for you automatically.
This virtual environment is mapped to the project directory, which makes virtual environment management easy if you work on multiple projects. Pipenv also creates a Pipfile.lock file that pins the versions of your dependencies to help ensure your application’s environment is reproducible when installed on other systems.
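Pipfile.lock is a JSON document, so its pins are easy to inspect. The sketch below parses a trimmed, illustrative lock file (the package versions are examples and the hashes are elided) and lists the exact versions recorded in a section; pinned_versions is a hypothetical helper, not a Pipenv API.

```python
import json

# Trimmed, illustrative Pipfile.lock content; real files contain full hashes.
SAMPLE_LOCK = """
{
  "_meta": {"hash": {"sha256": "<elided>"}, "pipfile-spec": 6},
  "default": {
    "requests": {"version": "==2.31.0", "hashes": ["sha256:<elided>"]},
    "urllib3": {"version": "==2.0.7", "hashes": ["sha256:<elided>"]}
  },
  "develop": {
    "pytest": {"version": "==7.4.3", "hashes": ["sha256:<elided>"]}
  }
}
"""


def pinned_versions(lock_text: str, section: str = "default") -> dict[str, str]:
    """Return {package: exact version} for one section of a Pipfile.lock."""
    lock = json.loads(lock_text)
    pins = {}
    for name, entry in lock.get(section, {}).items():
        version = entry.get("version")  # e.g. VCS entries may have no version
        if version:
            pins[name] = version.removeprefix("==")
    return pins
```

The "default" section holds runtime dependencies and "develop" holds development-only ones, so the same helper can report either set.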
Worth noting is the security check feature. Running pipenv check in the environment highlights security vulnerabilities.
The pros of Pipenv include:
- Automatic loading of environment variables from .env files
- Management of different Python versions when used with pyenv
- Environment reproducibility by using Pipfile.lock
The cons of Pipenv include:
- Historically infrequent releases
- Slow installations
- The pipenv lock command does not capture dependencies of Python wheels provided in a Pipfile. This can result in non-deterministic builds in which each run may not pull in the same dependencies, causing inconsistencies between runtime environments.
Pipenv is an excellent general-purpose tool, and the Python Packaging Authority (PyPA) recommends it.
Poetry is a dependency manager with a loyal user base. It provides similar functionality to Pipenv in that it offers automatic virtual environments on setup. It creates a pyproject.toml file, which is a Python standard that you can use instead of setup.py when creating your packages for distribution on PyPI or elsewhere.
It’s also handy for keeping track of project dependencies even if you aren’t building a package for the community.
Running poetry init guides you through the creation process.
Poetry has a good reputation for coping with complex dependency trees, and because of efficient use of caching it can result in a snappier experience than Pipenv.
However, Poetry’s solution to transitive dependencies isn’t quite right. Additionally, a quick run of poetry check on a pyproject.toml file that contains a vulnerable package doesn’t flag any security vulnerabilities, because the command validates the structure of the file rather than auditing the packages it lists.
On the plus side, Poetry’s error messages are highly human-readable, and offer real solutions to problems. Also, separating development dependencies from production is easy, and publishing to PyPI is as simple as poetry publish.
The pros of Poetry include:
- You can think of it as Pipenv, but with quicker installations
- Easy publishing — an uncommon, but important, use case
- Handles dependency conflicts with a dependency solver
The cons of Poetry include:
- Not universally pre-packaged with Python
- Confines virtual environments to the project directory
- Reports of problems with the dependency solver
You’ll notice that the dependency solver pops up as a pro and a con. That’s because, while it’s better to have a solver than not (dependency hell is no fun), it can be problematic as highlighted in an article titled Dependency Solving Is Still Hard, but We Are Getting Better at It. It’s one of those problems without a definitive solution — even with the most powerful computers.
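To give a feel for what a dependency solver does, the following is a toy backtracking resolver over a hypothetical package index. It is not how Poetry or any other tool implements resolution; it only illustrates the search: pick the newest allowed version, add that version's own requirements, and back up when a later requirement contradicts an earlier choice.

```python
# Hypothetical index: name -> version -> {dependency: allowed versions}.
INDEX = {
    "app": {"1.0": {"web": {"2.0", "2.1"}, "db": {"1.0"}}},
    "web": {"2.0": {"util": {"1.0"}}, "2.1": {"util": {"2.0"}}},
    "db": {"1.0": {"util": {"1.0"}}},
    "util": {"1.0": {}, "2.0": {}},
}


def solve(requirements, index, chosen=None):
    """Return {name: version} satisfying all requirements, or None.

    requirements is a list of (name, allowed_versions) pairs.
    """
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already decided: the earlier choice must satisfy this requirement.
        return solve(rest, index, chosen) if chosen[name] in allowed else None
    # Try newest first. Plain string ordering is enough for this sample,
    # but real version comparison is more involved.
    candidates = sorted((v for v in index[name] if v in allowed), reverse=True)
    for version in candidates:
        pending = rest + list(index[name][version].items())
        result = solve(pending, index, {**chosen, name: version})
        if result is not None:
            return result
    return None  # every candidate led to a conflict, so backtrack


solution = solve([("app", {"1.0"})], INDEX)
```

In this sample the solver first tries web 2.1, discovers that it needs util 2.0 while db needs util 1.0, and falls back to web 2.0. With many packages and many versions, the number of combinations to explore can grow very quickly, which is where the difficulty comes from.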
Conda is a package and environment management tool provided by Anaconda. You can use it to manage packages for any software stack, but it tends to be the go-to for scientific and machine learning Pythonistas.
It’s available in Anaconda’s flagship distribution, which includes a version of Python, conda, and hundreds of prebuilt, popular packages. It’s also available in the lightweight Miniconda distribution, which includes just a version of Python and conda.
While conda is language-agnostic, neither PyPI nor PyPA officially supports it. Additionally, its governance model is not as well defined as Pip’s.
You can use conda to create a new virtual environment and then add/manage your dependencies. You can activate your virtual environment from anywhere in your file system, making it an excellent choice to distribute code in repositories.
Conda can also be useful for installing native Python extension packages that aren’t currently available as wheels in PyPI. However, new package releases sometimes reach conda later than PyPI. For example, new releases of Pandas for conda have been known to be delayed.
Create a new Python environment by running (for example):
conda create -n TestEnv python=3.9
You must activate the environment before use by running:
conda activate TestEnv
You can then install packages using either conda or Pip:
conda install <package name>
pip install <package name>
The conda-forge channel lets the community build packages for conda. However, not every package on PyPI has been built for conda, so you may also need Pip. Be aware that mixing conda-installed and Pip-installed packages can be hazardous. Make sure you follow best practices and install as many packages as possible with conda before installing the rest with Pip.
If you do have a dependency clash, conda usually reports some information to help you solve the issue.
While you’re still on your own to figure out the solution, at least you have a starting point.
The pros of conda include:
- Suitable for managing multiple environments, for example projects that require legacy Python 2 code alongside projects that require Python 3
- Quick and easy when you need the same environment for multiple repositories
- Language agnostic
The cons of conda include:
- With Anaconda’s focus on data science, some common Python packages may not be available for conda, forcing you to mix in Pip installations. Mixing and matching can bring problems (to be fair, all package managers are going to struggle controlling packages they did not install).
- Installs dependencies with security vulnerabilities without warning
Hatch is a feature-rich project manager with a built-in dependency manager. Its efforts to make many Python project add-ons redundant are admirable. For example, it includes features like integrated testing and tools to manage code coverage. Like Poetry, it uses a pyproject.toml file.
Most of Hatch’s command-line tools are like Poetry’s in terms of functionality, but Hatch offers more.
Hatch doesn’t warn you about security vulnerabilities, but it does warn you about conflicts. It then proceeds to install a version of the requested package that can resolve the conflict.
The pros of Hatch include:
- Works in conjunction with conda to help install native code dependencies
- Good option for reducing the number of tools in your project
The cons of Hatch include:
- Lots of features, which can mean a steep learning curve
The ActiveState Platform is a universal package and environment management tool for Python, Perl and Tcl that prioritizes security. Like Anaconda, the ActiveState Platform comes with its own Python ecosystem, offering an alternative to traditional Python dependency management tools.
You can use the Web GUI to configure a Python environment in the cloud. The GUI provides you with clear visuals, and creates a central source of truth for your environment that you can easily share with your team via a single command.
Alternatively, you can use the command line interface, the State Tool, to install and manage custom Python environments that feature dependencies (as well as any linked C/Fortran libraries) built on demand from source code.
As a result, there’s no need to set up and manage complex build environments locally.
Whether you use the GUI or the CLI, ActiveState also provides a security audit of package dependencies, including transitive dependencies, to prevent you from introducing security vulnerabilities further down the chain.
You can get email notifications when vulnerabilities are found in your project. You can then choose a version of the vulnerable package to upgrade or downgrade to, and automatically rebuild a non-vulnerable environment.
In the above screenshot, you can see how the ActiveState Platform identifies packages with vulnerabilities and provides a link to the details.
The pros of the ActiveState Platform include:
- User-friendly Web GUI
- Consistent, reproducible environments
- Clear security checks throughout the dependency tree
- Built for modern DevSecOps with advanced dependency resolution features
- Builds packages from source code on demand, enhancing security
The cons of the ActiveState Platform include:
- Minimal support for macOS at this time
- Documentation needs improvement
- Managing Python environments prior to v3.9 can be very slow
Conclusions – Dependency Management that Eliminates Dependency Hell & Vulnerabilities
Package management is a hard problem in and of itself, but it’s only gotten more difficult since it’s been extended to encompass environment management. Despite the complexity, it’s essential to get it right.
Unfortunately, with Python, it’s historically been all too easy to get it wrong. This is one of the key reasons that Python features so many different package managers.
During development, sinking time into dependency hell in order to sort out problems with your environment is time wasted. As a result, many developers opt for a sophisticated solution that does as much heavy lifting as possible, leaving them free to focus on coding.
In the modern tech world, the secure way to code should be the easiest way to code. But introducing a dependency to your project inevitably brings along other dependencies, creating a domino effect that can make it challenging to spot security vulnerabilities.
Even when you find them, unless they’re critical vulnerabilities, the time and effort to resolve them means they’re rarely addressed, exposing your development and test environments to cyberattack.
If you want to eliminate dependency hell and create more secure code in dev and test without slowing down your sprint, I’d recommend a dependency manager that addresses the limitations of all the others. Take a look at the ActiveState Platform.
- Read our white paper on ‘Python Package Management Guide for Enterprise Developers’
- Sign up for a free account and try the ActiveState Platform for yourself