Managing Python Dependencies – Everything You Need To Know
When managing Python environments, one of the key concerns is dependency management. Dependencies are all of the software components required by your project in order for it to work as intended and avoid runtime errors.
You can count on PyPI (the Python Package Index) to provide packages that can help you get started on everything from data manipulation to machine learning to web development, and more. This can be a huge time saver, but resolving the dependencies among all the packages you need can also be a huge time sink. To be able to go from writing code to delivering complete Python applications in a production environment, you’ll need to master dependencies. That means being able to:
- Add, remove and update dependencies
- Resolve dependency conflicts
- Work with dependency management tools such as Virtualenv, Pyenv and Conda, as well as newer solutions like the ActiveState Platform
‘Dependency hell’ is an apt term for the vexation that can be experienced when trying to resolve dependency conflicts. Dependency conflicts occur when different Python packages have the same dependency, but depend on different and incompatible versions of that shared package. Because only a single version of a dependency is permitted in any project’s environment, finding a compatible solution can be difficult.
Even in a project that has been isolated in a virtual environment, transitive dependency conflicts can arise. Transitive dependencies are indirect dependencies, otherwise known as dependencies of dependencies. For example, if package A has dependency B and dependency B has dependency C, then package A transitively depends on dependency C. An example of a transitive dependency conflict would be when multiple packages depend on different versions of dependency C.
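The transitive conflict described above can be illustrated with a small sketch. The package names (app, A, B, C, D) and their pinned versions are hypothetical, and real resolvers work with version ranges rather than exact pins, so treat this as a toy model rather than how any particular tool works:

```python
from collections import defaultdict

# Hypothetical pinned requirements: package -> {dependency: exact version}.
REQUIREMENTS = {
    "app": {"A": "1.0", "D": "2.0"},
    "A": {"B": "1.2"},
    "B": {"C": "1.0"},
    "D": {"C": "2.0"},
    "C": {},
}


def collect_pins(root, requirements):
    """Walk the graph from root and record which package pins which version."""
    pins = defaultdict(dict)  # dependency -> {version: [requiring packages]}
    seen = set()
    stack = [root]
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        for dep, version in requirements.get(package, {}).items():
            pins[dep].setdefault(version, []).append(package)
            stack.append(dep)
    return pins


def find_conflicts(root, requirements):
    """Return the dependencies that are pinned to more than one version."""
    pins = collect_pins(root, requirements)
    return {dep: versions for dep, versions in pins.items() if len(versions) > 1}


conflicts = find_conflicts("app", REQUIREMENTS)
for dep, versions in conflicts.items():
    for version, requirers in sorted(versions.items()):
        print(f"{dep}=={version} required by {', '.join(requirers)}")
```

Here app never mentions C directly, yet it inherits two incompatible pins for C through B and D.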
The rest of this document contains tools and strategies to help you avoid Python Dependency Hell.
Approaches To Managing Python Dependencies
Overall, dependency management is a combination of best practices and a thorough understanding of a toolchain that implements them. Tools for that purpose are discussed later on this page.
Best Practices for managing dependencies:
- Familiarize yourself with the Python Packaging and Pip User Guides, as well as the PEPs (Python Enhancement Proposals) related to dependency specifications and packaging.
- If possible, install dependencies from a requirements.txt file rather than individually. Requirements.txt files specify valid versions for each dependency, which should all work together. If you install each dependency one at a time (especially if you don’t specify a version), the package manager will retrieve the latest version of each dependency, which will increase the possibility of a dependency conflict.
- Always create a separate virtual environment for each project. This will isolate the dependencies for each project from each other, as well as isolate them from globally installed dependencies, reducing the chance for conflict.
- Create a lock file (such as Pipfile.lock) so that dependencies remain pinned to the exact versions in use, which makes builds reproducible.
- (Optional) Use Docker containers, which provide OS-level virtualization, or VMs (Virtual Machines), which virtualize the underlying hardware, to isolate applications from one another even further than a virtual environment does.
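As a rough illustration of the pinning advice above, the following sketch flags lines in a requirements file that are not pinned to one exact version. The sample file contents are hypothetical, and the pattern is deliberately simple: it ignores option lines, hashes, and URL requirements, so it is a starting point rather than a full parser.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
# Wildcard pins like "name==1.*" are treated as unpinned.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to one exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements.txt contents, used only for this example.
SAMPLE = """\
# example requirements file
tabulate==0.8.7
structlog>=20.1.0
requests
"""

print(unpinned_requirements(SAMPLE))
```

A check like this could run in continuous integration so that unpinned entries are caught before they reach production.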
Managing and Adding Dependencies To a Python Project
There are many different dependency management tools and methods for managing and adding dependencies to a Python project, from pip to Conda to the ActiveState Platform. Here’s a summary of some of the most popular dependency management tools.
Pip is the de facto standard tool for installing Python packages and managing their dependencies. When you use pip to install packages, it automatically retrieves the package and all its dependencies from the Python Package Index (PyPI) and installs them locally on your system:
$ pip install <packagename>
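Older versions of pip made no attempt to resolve dependency conflicts. Since version 20.3, pip ships with a dependency resolver that will refuse to install a set of requirements it cannot satisfy, but it cannot make incompatible requirements work together. For example, if package A requires a different version of a dependency than package B requires, and the two version ranges do not overlap, both packages cannot be installed in the same environment.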
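One way to find out whether an environment already contains incompatible packages is pip’s own check subcommand. The sketch below wraps it from Python; the helper names pip_command and check_environment are hypothetical, and running pip through sys.executable is simply one way to make sure the command targets the interpreter you are currently using:

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


def check_environment():
    """Run `pip check`; a non-zero return code signals broken requirements."""
    result = subprocess.run(pip_command("check"), capture_output=True, text=True)
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    code, report = check_environment()
    print(f"pip check exited with code {code}")
    print(report or "pip check produced no output")
```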
Pip can install from either Source Distributions (sdist) or Wheel (.whl) files. A wheel is a pre-built distribution format that contains the package itself, along with metadata that declares (but does not bundle) its dependencies. If both sdist and wheel formats are available on PyPI, pip will prefer a compatible wheel. If pip does not find a compatible wheel, it will download the sdist, build a wheel from it locally, and install that.
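Whether a wheel is compatible is encoded in its filename, which follows the pattern name-version(-build)-pythontag-abitag-platformtag.whl. The sketch below splits a filename into those parts. The function and class names are hypothetical, the filenames are illustrative, and pip’s real compatibility check compares these tags against the running interpreter, which this example does not attempt:

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None  # the build tag is optional
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


# "py3-none-any" marks a pure-Python wheel that runs on any platform.
print(parse_wheel_filename("tabulate-0.8.7-py3-none-any.whl"))
```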
For more information on pip, refer to Dependency Management with Pip, Python’s Package Manager.
Setuptools is a library for building and packaging Python projects, and a long-standing standard for declaring the dependencies a project requires to run. Dependencies are declared in the install_requires argument of the setup() call in setup.py, and are automatically installed along with the packages that require them. Unfortunately, this method alone will not prevent conflicts from arising if packages need to share dependencies in the same environment.
Simply put, neither pip nor setuptools can make conflicting requirements coexist in one environment. As a result, more advanced tools and methods for managing dependencies are required, such as virtual environments and lock files.
For more information on setuptools, refer to How to Package Python Dependencies for Publication.
Pipdeptree is a tool for displaying installed Python packages as a dependency tree. This can be useful when trying to visualize a dependency conflict. However, Pipdeptree does not install packages and dependencies or provide conflict resolution.
Use Pipdeptree whenever you need help making the hierarchy of packages and dependencies in an environment more understandable. It can display packages that have been installed globally as well as those in a virtual environment.
For more information on working with Pipdeptree, refer to How To Check For Python Dependencies.
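To get a feel for where this kind of information comes from, the sketch below lists each installed distribution and the requirements it declares, using only the standard library’s importlib.metadata module (Python 3.8+). It is not how Pipdeptree itself is implemented: it shows one level of dependencies only, includes requirements that apply only to optional extras, and uses a simplified pattern to extract names.

```python
import re
from importlib import metadata

# A requirement string starts with the project name, e.g. "requests>=2.0".
NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(requirement):
    """Extract the project name from a requirement string, or None."""
    match = NAME.match(requirement)
    return match.group(1) if match else None


def direct_dependencies():
    """Map each installed distribution to the names it declares as requirements."""
    tree = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:  # skip distributions with damaged metadata
            continue
        names = {requirement_name(r) for r in (dist.requires or [])}
        names.discard(None)
        tree[name] = sorted(names)
    return tree


if __name__ == "__main__":
    for package, deps in sorted(direct_dependencies().items()):
        print(package)
        for dep in deps:
            print(f"  - {dep}")
```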
Pipreqs is a tool that can generate a requirements.txt file containing the list of a project’s dependencies and their versions based on imports that it detects in the source code.
To install pipreqs, enter:
$ pip install pipreqs
To generate a requirements.txt file for a project, and save it to the project location, enter:
$ pipreqs /<projectlocation>
# requirements.txt
pkginfo==126.96.36.199
tabulate==0.8.7
structlog==20.1.0
...
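The idea of detecting dependencies from imports can be sketched with the standard library’s ast module. This is an illustration of the general technique, not Pipreqs’s actual implementation. A real tool also has to filter out standard-library modules (such as os below) and map import names to PyPI package names, which sometimes differ. The sample source is hypothetical.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import, which is local to the project
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


# Hypothetical module source, used only for this example.
SAMPLE = """
import os.path
import structlog
from tabulate import tabulate
from . import helpers
"""

print(imported_modules(SAMPLE))
```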
Virtualenv and Venv
Virtualenv is a low-level tool, dating back to Python 2, for isolating multiple projects in virtual environments that minimize the possibility of dependency conflicts.
In Python 3, virtual environments can be created with either Virtualenv or Venv. The two are similar, but Venv (unlike Virtualenv) is part of the standard library in Python 3.3+ and does not have to be installed separately.
Note that the pyvenv script that shipped with early Python 3 releases was deprecated in Python 3.6 in favor of running python3 -m venv. Virtualenv itself has not been deprecated. Venv is a lower-level tool than Pipenv, described in the next section, and can be useful when Pipenv does not meet your needs.
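Venv is usually run from the command line, but the same module can be called from Python. The sketch below creates an environment in a temporary directory, purely for demonstration; the helper name create_environment and the .venv directory name are assumptions, and in a real project the environment would normally live alongside the code.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir):
    """Create a virtual environment at env_dir and return its config file path."""
    # with_pip=False keeps the sketch fast and offline; pass True to bootstrap pip.
    venv.create(env_dir, with_pip=False)
    return Path(env_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = create_environment(Path(tmp) / ".venv")
        # pyvenv.cfg records which base interpreter the environment points to.
        print(config.read_text())
```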
Pipenv is a dependency manager that integrates Pip and Virtualenv in a single application. It can be used to create a virtual environment for each project, and automatically manage the dependencies within each of them.
Pyenv is a Python version manager for changing the global Python version, installing multiple Python versions, setting project-specific Python versions, and (through the pyenv-virtualenv plugin) creating and managing virtual environments.
Note: Pyenv is implemented as a set of shell scripts and will not work on Windows outside of the Windows Subsystem for Linux.
For more information on Virtualenv, Venv, Pipenv and Pyenv, refer to How To Manage Python Dependencies With Virtual Environments.
Alternative Python Package Managers
The Python ecosystem features a number of third-party package managers that have recognized the limitations of current solutions when it comes to dependency resolution. All of these alternatives are compatible with PyPI, but attempt to resolve dependency conflicts before they occur (i.e., at time of installation).
The ActiveState Platform lets you automatically build a Python runtime environment from source for your project. The runtime includes a specific version of Python, as well as all the packages and their dependencies (including linked C libraries) required for your project.
The Platform’s Solver component automatically resolves all dependencies where possible, and flags unresolvable conflicts so you can manually work around them (e.g., by selecting a different version of the offending packages or dependencies).
Once your project’s runtime is built and packaged for Linux, Windows, or macOS, the State Tool CLI can be used to automatically install it into a virtual environment so it remains isolated from other projects and global installations. In addition, the ActiveState Platform gives you a way to bake security right into your language builds, identifying security vulnerabilities, out-of-date packages, and restrictive licenses each time your application is run.
For more information on the ActiveState Platform, refer to the ActiveState Platform web page.
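Poetry is a Python packaging and dependency management tool that provides dependency resolution out of the box. Project dependencies are declared in a pyproject.toml file, which is updated whenever the poetry add command is run. The poetry install command then resolves those dependencies, installs them, and records the exact versions in a poetry.lock file.
Note that although Poetry installs packages from PyPI by default, a Poetry project is described by pyproject.toml rather than setup.py, so installing one with pip requires a version of pip that supports pyproject.toml-based builds (PEP 517). Poetry does things differently.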
For more information on Poetry, refer to How To Use Python Dependency Management Tools.
Conda is a package, dependency, and environment management tool for Anaconda Python. Some of its basic features are similar to Pip, Virtualenv, and Venv. However, it is a separate, enhanced tool designed to work in Conda environments only.
Conda is a command-line tool included in the Anaconda Python distribution and can be run from the Anaconda Prompt in Windows or in a Linux terminal. It is usually quicker and more practical to use Conda than the Anaconda Navigator GUI (Graphical User Interface), which is similar in function.
Conda not only provides virtual environments that isolate or sandbox each project to prevent dependency conflicts between them; it also analyzes each package for compatible dependencies and potential conflicts during installation. If there is a conflict, Conda will report that the installation cannot be completed.
For more information on Conda, refer to How to Manage Python Dependencies with Conda.
There are several different approaches to dealing with dependencies in Python. Many ecosystems have tools for pinning package versions and only doing controlled upgrades, but those tools have their own downsides. What if you are building with more than one language? Wouldn’t it be nice to have a user-friendly system for managing your dependencies?
One such solution for managing dependencies is the ActiveState Platform, which can automatically resolve all the dependencies for your project, and compile them (packages, dependencies and sub-dependencies) into a runtime for the operating system you use. Each time a package or dependency is updated, the ActiveState Platform can rebuild the runtime while ensuring compatibility between all dependencies and sub-dependencies. Give it a try today!