How to Package Python Projects in 2021

Traditionally, if you want to build a Python project from source code, you’ll need to follow a prescriptive recipe that requires specific:

  • Build tools 
  • Packagers
  • Installers

All of them are pinned to specific versions. This makes your Python project a very brittle system, since a change to any one component can cause a build failure. For anyone building multiple Python packages from source, it means juggling multiple build environments, each one slightly different so that it can build one specific package.

The simplest way to build from source code is the ActiveState Platform, which automatically builds not only Python code but also linked C and Fortran libraries for Windows, Linux and Mac. Try it for free.

Enter pyproject.toml, introduced by PEP 518, which was accepted five years ago, in 2016. It provides Python developers with a method to package their projects in a tool-agnostic manner, avoiding the brittleness associated with dedicated toolchains.

TOML, or Tom’s Obvious Minimal Language, is a configuration file format in the same family as INI and YAML. It’s been around for years, and is used by the tooling of many other programming languages. pyproject.toml is simply a TOML file whose name and layout are standardized for Python projects. Because this file is where all the magic happens, let’s take a closer look at it.

Working with pyproject.toml

A pyproject.toml file contains consistent and predictable build information about your project that is meant to be readable by both developers and machines. The metadata it provides shows WHAT you need to build the project, not HOW to build it.

A pyproject.toml file is broken into multiple sections (called tables in TOML):

  • build-system – specifies all the requirements of the build backend, along with a backend entry point (API) at which you can point your favourite build tool.
  • project – a newer section (defined by PEP 621) that’s not yet supported by all build tools, but supplies core metadata like the name and version of the project.
  • tool.X – optional sections that hold settings for specific tools, where X is the name of the tool.
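
As a rough illustration of those three sections, the sketch below parses a small, made-up pyproject.toml from a string. The project name "demo", its version, and the choice of black as the example tool are assumptions for illustration only; tomllib ships with Python 3.11 and later, and the fallback assumes the tomli backport is installed on older versions.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed

# Hypothetical pyproject.toml content, one table per section described above.
PYPROJECT = """
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "demo"
version = "0.1.0"

[tool.black]
line-length = 88
"""

config = tomllib.loads(PYPROJECT)

# WHAT is needed to build, and which backend API to call.
build_requires = config["build-system"]["requires"]
backend = config["build-system"]["build-backend"]

# Core metadata about the project itself.
project_name = config["project"]["name"]

# Every tool gets its own sub-table under "tool".
tool_names = sorted(config["tool"])

print(build_requires, backend, project_name, tool_names)
```

Because the file is plain data, any tool can read it this way without running project code.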

For any Python build task, you’ll need three types of tools to transform the source code into an importable package:

  • Build Backend – turns source code into a distributable format (i.e., a wheel file). This is chosen by the project’s author, and specified under the build-system section in the pyproject.toml file.
  • Build Front End – handles user interaction by providing an interface, such as the command line, displaying error messages, etc., as well as installing build-time dependencies.
  • Package Manager (aka Integration Front End) – not involved in the build process, but rather performs key functions like installing the built distribution into your Python environment, as well as resolving/retrieving runtime dependencies for you.

The key here is that all three components of your Python projects are defined by APIs, allowing for a separation of concerns so that each part can operate independently. In this way, you can point whatever front end tool you want at the backend API, as long as it complies with the backend’s requirements. 
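
To make the idea of "pointing a front end at the backend API" more concrete, here is a minimal sketch of how a front end might resolve a build-backend string and call one hook on it. The hook name and signature (build_wheel) follow PEP 517; the load_backend helper and the in-memory toy_backend module are hypothetical stand-ins so the example runs without any third-party package, and a real backend would actually write a wheel file.

```python
import importlib
import sys
import types


def load_backend(spec):
    """Resolve a 'build-backend' string such as 'pkg.mod:obj.attr' (hypothetical helper)."""
    module_path, _, object_path = spec.partition(":")
    backend = importlib.import_module(module_path)
    # Anything after the colon is an attribute path inside the module.
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


# Hypothetical stand-in backend, registered in memory for this sketch only.
toy = types.ModuleType("toy_backend")


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    # A real backend would write the .whl into wheel_directory; this toy
    # only returns the file name that the hook is expected to report.
    return "demo-0.1.0-py3-none-any.whl"


toy.build_wheel = build_wheel
sys.modules["toy_backend"] = toy

# The front end needs only the string from pyproject.toml to find the backend.
backend = load_backend("toy_backend")
wheel_name = backend.build_wheel("dist")
print(wheel_name)
```

The front end never needs to know which backend it is talking to, only that the hook exists, which is what keeps the two sides independent.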

Unfortunately, despite this wonderful flexibility, the front end tool that Pythonistas almost always use is pip. But pip is not a packaging tool. This has had some interesting ramifications on the way we work with Python, not all of them to the good.

Going Beyond Pip

Pip has evolved alongside the Python language, significantly decreasing the complexity of packaging and installing Python projects. But that simplification has been holding back Python packaging for years. To understand why, we first need a little bit of a Python packaging history lesson, starting with:

  • distutils – this is the original Python packaging library that was first introduced in Python 1.6 and later deprecated in Python 3.10, but the setup.py standard it created still remains.
  • setuptools – reuses setup.py, but incorporates requirements. Further, it provides an “easy_install” script, which acts as a package manager capable of both downloading and resolving dependencies from the Python Package Index (PyPI). These package management capabilities have been deprecated in favor of pip, leaving setuptools where it is today: Python’s foremost build backend.
  • pip – provides Python with a package manager that uses setup.py and setuptools, but hides them behind a simple, user-centric interface. 

And that’s where we are today, with an easy-to-use front end tool that even lets you install GitHub Python projects from a link. But pip is not a reliable build tool since:

  • It assumes you have setuptools installed.
  • It’s missing key functionality that you would want from a true build front end, such as the ability to create sdist distributions.
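
For comparison, a dedicated build front end can produce both an sdist and a wheel. The sketch below only assembles the command line for PyPA's build tool, which this article does not otherwise cover and which is assumed to be installed separately (pip install build); the build_command helper and its defaults are hypothetical.

```python
import sys


def build_command(source_dir=".", sdist=True, wheel=True, outdir="dist"):
    """Assemble a 'python -m build' invocation (hypothetical helper)."""
    cmd = [sys.executable, "-m", "build", "--outdir", outdir]
    if sdist:
        cmd.append("--sdist")  # source distribution (.tar.gz)
    if wheel:
        cmd.append("--wheel")  # built distribution (.whl)
    cmd.append(source_dir)
    return cmd


cmd = build_command()
print(" ".join(cmd[1:]))
```

Once the build package is available, the list can be handed to subprocess.run(cmd, check=True) from the project root.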

Compared to the standard Python build tools, pyproject.toml adds:

  • Flexible Builds – freely change how you build your project without disrupting your users.
  • Dependency Declaration – allows you to declare all of the dependencies that are required to start a build (such as Cython or wheel).
  • Consumer Expectations – users and tools expect to be able to easily understand your project from a few key pieces of information.

For example, as wheels have become more popular, using a pyproject.toml file makes it easy to build and install wheels from your project since it’s tool agnostic. When packaging traditional Python projects, you’re stuck using:

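  • setuptools, which has traditionally needed the separate wheel package in order to build wheel files
  • pip, which installs wheel files, and so has to build one first whenever it is given source code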

In fact, if you try to install source code with pip in an environment that lacks the wheel package, you may get an obscure error message about “bdist_wheel” that can only be resolved (after searching on StackOverflow or a similar help forum) by installing wheel from PyPI.

Conclusions: Add pyproject.toml to your Python projects!

Software packaging has come a long way over the years, but Python hasn’t quite managed to catch up yet. Standards like TOML have already been adopted by many other programming languages, and can offer Python authors the ability to shed some outdated packaging practices and gain a number of new advantages, including:

  • Dependable build environments, since you can declare all the dependencies required.
  • The ability to experiment with different build backends, since you can explicitly declare the backend.
  • Support for the latest packaging tools, since pyproject.toml is tool agnostic.

And if you can find no better reason to adopt it, just remember that pyproject.toml is now a Python standard, meaning some projects won’t even work unless it’s included. In fact, more and more tools are adopting pyproject.toml as the location where they store their settings, even logging tools.

To get started with modernizing your packaging, all you need to do is:

  • Add pyproject.toml to your project. For example, the following configuration won’t change how your project gets built today, but it will explicitly tell users what you want to happen when a build is started:
    [build-system]
    requires = ["setuptools", "wheel"]
    build-backend = "setuptools.build_meta:__legacy__"
  • Stop invoking setup.py, or even remove it altogether and use setup.cfg instead. In fact, executing setup.py as a script for any reason is all but deprecated.

At the end of a migration from setup.py, you’ll have explicit, readable, and dependable build information as metadata (not executable code), and no errors when your users attempt to build a Wheel file. 
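
One way to see the difference between metadata and executable code: a declarative setup.cfg can be read by any tool without running the project. The sketch below is a minimal illustration using the standard library's configparser; the project name "demo", its version, and the two listed requirements are made up, while the [metadata] and [options] keys are the ones setuptools reads.

```python
import configparser

# Hypothetical setup.cfg content for an imaginary project.
SETUP_CFG = """\
[metadata]
name = demo
version = 0.1.0

[options]
packages = find:
install_requires =
    requests
    click
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

name = parser["metadata"]["name"]
version = parser["metadata"]["version"]

# Multi-line values arrive as one string; split them into a clean list.
requirements = [
    line.strip()
    for line in parser["options"]["install_requires"].splitlines()
    if line.strip()
]

print(name, version, requirements)
```

Nothing here imports or executes the project, which is not possible when the same information lives inside a setup.py script.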

Even if your Python projects build just fine right now, they’re only doing so based on a lot of assumptions, and assumptions are no way to build dependable software. Instead, explicitly choose a build backend, and let everyone know what you chose by putting it in a pyproject.toml file.
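
The assumptions in question can be sketched roughly as follows. This hypothetical helper mimics how a front end might fall back to legacy setuptools behaviour when no build-system table is declared; the default values follow PEP 517 and PEP 518, but real front ends may add version pins or differ in detail, and the flit_core example is only one possible explicit choice.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed

# Assumed legacy defaults, following PEP 517 and PEP 518; real front ends
# may pin minimum versions on top of these.
LEGACY_DEFAULTS = {
    "requires": ["setuptools", "wheel"],
    "build-backend": "setuptools.build_meta:__legacy__",
}


def effective_build_system(pyproject_text=None):
    """Return the build settings a front end would act on (hypothetical helper)."""
    data = tomllib.loads(pyproject_text) if pyproject_text else {}
    declared = data.get("build-system", {})
    settings = {**LEGACY_DEFAULTS, **declared}
    # Record whether the author actually chose a backend.
    settings["explicit"] = "build-backend" in declared
    return settings


# No pyproject.toml at all: everything is assumed.
implicit = effective_build_system()

# An explicit choice of a different backend.
explicit = effective_build_system(
    """
[build-system]
requires = ["flit_core"]
build-backend = "flit_core.buildapi"
"""
)

print(implicit["build-backend"], explicit["build-backend"])
```

In the first case every value is a guess made on the author's behalf; in the second, the author's choice is written down where every tool can read it.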


Need more information about packaging your Python projects? Read these:

Python Package Management Guide for Enterprise Developers

How to Package Python Dependencies for Publication

How to Build a Runtime Environment from Source
