Python is a general-purpose programming language with a vast ecosystem of libraries and tools for executing a wide variety of tasks or exposing a range of services. As an interpreted language, Python code is executed at runtime using a pre-compiled Python interpreter with a specific version (v2.7.6 or v3.8.2, for example) and relevant environment variables.
The dependencies are resolved based on the current PYTHONPATH, or else dynamically added in code. Where a dependency includes C code libraries (typically to speed up computational processes), the C code needs to be pre-compiled. For this reason, most organizations choose to work with pre-compiled Python distributions such as those from ActiveState or Anaconda.
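To make the two resolution paths concrete, here is a minimal sketch that prints the interpreter's search path and appends a directory at runtime. The helper name `effective_search_path` and the `./vendor` directory are hypothetical, chosen only for illustration.

```python
import os
import sys


def effective_search_path(extra_dirs=()):
    """Return sys.path after appending any extra directories (no duplicates)."""
    for directory in extra_dirs:
        absolute = os.path.abspath(directory)
        if absolute not in sys.path:
            # This is the "dynamically added in code" case.
            sys.path.append(absolute)
    return list(sys.path)


# Directories listed in PYTHONPATH (if it is set) are placed on sys.path at startup.
print("PYTHONPATH =", os.environ.get("PYTHONPATH", "<not set>"))

# "./vendor" is a hypothetical directory used only for this example.
for entry in effective_search_path(["./vendor"]):
    print(entry)
```

Running the same script under two different interpreters is a quick way to see how much the result depends on the environment it starts in.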
However, many security-conscious organizations cannot use pre-built runtime environments. Instead, they typically dedicate resources to compiling the runtime and its bundled dependencies from source for each project. Put differently, they require a secure, verified and reproducible runtime environment where they can run Python executables. Moreover, they often need to build their runtime for multiple platforms in order to support all their teams, from developers that may work on Windows and macOS, to test and production teams that deploy on Linux.
In this article, I’ll compare the experience of building a Python runtime environment from source using both the traditional, do-it-yourself method and a new automated build method provided by the ActiveState Platform. I’ll conclude by contrasting the two experiences and providing some overall impressions.
Let’s start with the traditional method.
How to Build a Python Runtime Environment from Source
Traditionally, building a custom Python runtime environment from source is a bit hit-and-miss. Because most Python source code is “pure Python,” building Python from source typically represents the easiest part. But our use case has two complications:
- The need to build our runtime for multiple platforms means the most complex part will be finding a good installer package that has adequate cross-platform support.
- The need to add packages with native C libraries to the runtime, like SciPy and NumPy, means we’ll need to configure a C compiler as well.
We can commence our process by downloading and installing Python from source. We’ll use this runtime as our base rather than getting a precompiled runtime. The following instructions will build a Python runtime for macOS only. However, the steps can be recorded in a script for easier reproducibility.
1. First, navigate to python.org’s Downloads page and find the release version you want to provide.
2. Create a new folder in the filesystem for downloading the sources:
$ mkdir python-install && cd python-install
$ curl -o python-3.8.2.tgz https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz
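Since the goal is a verified runtime, it is worth checking the downloaded archive before building it. The sketch below computes a SHA-256 digest and compares it with an expected value. Both helper names are hypothetical, and the expected digest is an assumption: it has to come from a source you trust, and is not something this article provides.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected value obtained out of band."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A call such as `matches("python-3.8.2.tgz", expected)` would then gate the rest of the build, where `expected` is the digest you recorded for that release.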
3. Extract the contents and build:
$ tar -xzvf python-3.8.2.tgz && cd Python-3.8.2
$ mkdir bin
$ ./configure --prefix=$(pwd)/bin --with-openssl=$(brew --prefix openssl)
$ make -j 8
$ make test
$ make install
4. Once the compilation succeeds (it may take some time, so grab a coffee), we can test the Python binary inside the bin folder:
$ cd bin/bin
➜ ./python3
Python 3.8.2 (default, Mar 9 2020, 16:31:30)
[Clang 11.0.0 (clang-1126.96.36.199)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
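The configure and make steps above can be recorded in a script for reproducibility. Here is a minimal sketch that only prints the commands unless `dry_run` is turned off. The function names are hypothetical, and `/usr/local/opt/openssl` is an assumed stand-in for whatever `brew --prefix openssl` returns on your machine.

```python
import os
import shlex
import subprocess


def build_commands(source_dir, openssl_prefix, jobs=8):
    """Return the configure/make commands from the steps above as argument lists."""
    prefix = os.path.join(os.path.abspath(source_dir), "bin")
    return [
        ["./configure", f"--prefix={prefix}", f"--with-openssl={openssl_prefix}"],
        ["make", "-j", str(jobs)],
        ["make", "test"],
        ["make", "install"],
    ]


def run_build(source_dir, openssl_prefix, jobs=8, dry_run=True):
    """Print each command; execute it inside source_dir only when dry_run is False."""
    for command in build_commands(source_dir, openssl_prefix, jobs):
        print("$", " ".join(shlex.quote(part) for part in command))
        if not dry_run:
            # check=True stops the build at the first failing step.
            subprocess.run(command, cwd=source_dir, check=True)


if __name__ == "__main__":
    # Assumed OpenSSL location; replace with the output of `brew --prefix openssl`.
    run_build("Python-3.8.2", "/usr/local/opt/openssl", dry_run=True)
```

Keeping the commands as data makes it easy to review them before running anything, and to log exactly what was executed for a given build.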
5. Now we need to add some libraries that have native C code, including SciPy and NumPy. Rather than install them via pip, which by default fetches pre-built binary wheels, we need to install them from source so we can control the compilation process.
To compile NumPy and SciPy we’ll need to install some extra libraries and compilers. For example, we’ll need the following:
- A C compiler (like GCC) for both NumPy and SciPy.
- A Fortran compiler (like gfortran) for SciPy only.
- OpenBLAS if you want optimized BLAS support.
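Before cloning anything, a quick check that the compilers are on the PATH can save a failed build. This is a minimal sketch: the candidate executable names are assumptions about a typical setup, and OpenBLAS is not checked because it is a library rather than an executable.

```python
import shutil

# Candidate executable names are assumptions; adjust them to your toolchain.
REQUIRED_TOOLS = {
    "C compiler": ("gcc", "clang", "cc"),
    "Fortran compiler": ("gfortran",),
}


def first_available(candidates):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def find_tools(required=REQUIRED_TOOLS):
    """Map each required tool label to the path of an available executable (or None)."""
    return {label: first_available(names) for label, names in required.items()}


for label, path in find_tools().items():
    print(f"{label}: {path or 'NOT FOUND'}")
```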
6. Once you have the compilers you want installed and configured, the next step is to clone the repositories:
$ git clone https://github.com/numpy/numpy.git
$ git clone https://github.com/scipy/scipy.git
7. cd into each project and run the build and install commands.
Note that you may need to configure a custom prefix for the installation base of the CPython install folder (shown here as the <path_to_cpython> placeholder):
$ python setup.py build
$ python setup.py install --prefix=<path_to_cpython>
This process may require some experimentation, depending on the available shared libraries and the OS target. Also be aware that it may take some time to finish.
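Once the install commands finish, a small smoke test run with the newly built interpreter confirms that the compiled packages actually import. The helper below is a hypothetical sketch that reports a version string or the import error for each module name.

```python
import importlib


def report_modules(names):
    """Try to import each module and report its version or the import failure."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError as error:
            report[name] = f"import failed: {error}"
        else:
            report[name] = getattr(module, "__version__", "unknown version")
    return report


for name, status in report_modules(["numpy", "scipy"]).items():
    print(f"{name}: {status}")
```

If NumPy imports cleanly, calling `numpy.show_config()` additionally prints the BLAS and LAPACK libraries it was built against, which is a useful check when OpenBLAS was part of the plan.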
8. Once we have installed the required versions of NumPy and SciPy, we can freeze all the dependencies in a requirements.txt:
./pip3 freeze > requirements.txt
➜ cat requirements.txt
numpy==1.18.1
scipy==1.4.1
If you plan on automating this process, you can commit an existing requirements.txt file that you can use to install via pip, for example:
./pip3 install -r requirements.txt
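In an automated pipeline it helps to confirm that what got installed matches the pinned file. The following sketch parses simple `name==version` pins and compares them with the installed distributions. The function names are hypothetical, and it assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata  # available in the standard library from Python 3.8


def parse_pins(text):
    """Parse 'name==version' lines, ignoring comments and blank lines."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def mismatches(pins):
    """Return {name: (wanted, installed)} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the distribution is not installed at all
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


pins = parse_pins("numpy==1.18.1\nscipy==1.4.1\n")
print(pins)
print(mismatches(pins))
```

An empty result from `mismatches` means the environment matches the pins. This sketch deliberately handles only exact `==` pins, which is what `pip freeze` produces for the packages shown above.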
9. The last step is bundling the final artifacts in an installer or a compressed archive. There are a few options here based on the target platform:
- For Windows, we can use an installer authoring tool such as Advanced Installer to bundle the whole Python folder into an MSI package.
- For macOS we can either go the route of using pkgbuild and productbuild tools, or we can just archive them in a file.
- For Linux, a simple tarball will suffice.
In every case, we’ll need to have a post-install script that creates symlinks and references the specific Python path. We’ll also need to compress the output folder we used to build the Python runtime (in this case, the bin folder).
$ cd ../.. && tar -czvf python-runtime-3.8.2.tar.gz bin
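The archiving step can also be done from a script using the standard library, which avoids depending on the tar flags available on each platform. This is a minimal sketch; `archive_runtime` is a hypothetical helper that stores the folder under its own name and returns the archived member names for inspection.

```python
import os
import tarfile


def archive_runtime(runtime_dir, archive_path):
    """Write runtime_dir into a gzip-compressed tarball and return its member names."""
    root_name = os.path.basename(os.path.normpath(runtime_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps paths inside the archive relative to the runtime folder.
        archive.add(runtime_dir, arcname=root_name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

For example, `archive_runtime("Python-3.8.2/bin", "python-runtime-3.8.2.tar.gz")` would bundle the bin folder used as the install prefix earlier.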
10. Now, when we extract the python-runtime-3.8.2.tar.gz in a folder on a different system, we’ll need to export the bin folder to be available in the environment:
$ export PYTHONHOME=$(pwd)/python-runtime-3.8.2/bin
$ export PATH=$PYTHONHOME/bin:$PATH
We may also need to add aliases for the executables:
$ alias python=$(pwd)/python-runtime-3.8.2/bin/bin/python3
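The same environment setup can be expressed in Python, which is handy when a launcher script or test harness needs to start the bundled interpreter. The sketch below builds an environment dictionary with `PYTHONHOME` set and the runtime's executables first on `PATH`. The function name is hypothetical, and it assumes the archive was extracted into a folder named python-runtime-3.8.2 containing the bin prefix.

```python
import os


def runtime_environment(runtime_root, base_env=None):
    """Return a copy of the environment pointing at the extracted runtime."""
    env = dict(os.environ if base_env is None else base_env)
    home = os.path.abspath(os.path.join(runtime_root, "bin"))
    env["PYTHONHOME"] = home
    # os.pathsep is ':' on macOS/Linux and ';' on Windows.
    env["PATH"] = os.path.join(home, "bin") + os.pathsep + env.get("PATH", "")
    return env


env = runtime_environment("python-runtime-3.8.2", base_env={"PATH": "/usr/bin"})
print(env["PYTHONHOME"])
print(env["PATH"])
```

Passing the result as `env=` to `subprocess.run`, together with the full path to the bundled `python3`, launches the runtime without touching the caller's own shell configuration.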
And that should do it. We can now distribute the runtime to our team members so they can create virtual environments for their projects.
As you can imagine, the whole process is not ideal, and requires quite a bit of careful planning and scripting to accomplish. Is there a better way?
How to Build a Python Runtime Using the ActiveState Platform
Let’s see how we can accomplish the same use case with the ActiveState Platform:
- Create a version of Python from source.
- Add libraries with native C code, like SciPy and NumPy.
- Distribute reproducible builds for Windows, macOS and Linux.
1. First, we’ll need to create a free account on the ActiveState Platform:
You can also register and log in via GitHub. Once logged in, you should see your dashboard page:
2. The highlighted button will allow us to build a new Custom Runtime that contains the version of Python we need, as well as the two packages we want. Click the Build a Custom Runtime button to display the project creation wizard:
3. Enter a project name, select a language and version (Python 3.6.6) and the platforms to build for (Linux, Windows 10 and macOS), and then click the Create Project button to display the project configuration page:
4. Now we can customize our build by adding/removing packages or changing platforms. Let’s add our two required packages, SciPy and NumPy. To do so, click on the Choose Packages button to pop up the Add Packages modal and search for NumPy:
5. Click the Add button to add the latest version of NumPy to the build. Search for SciPy and add it as well, then click the Done button.
6. Note that the build has been updated not only with the two packages we added, but also with all their dependencies (we’ve gone from 11 dependencies to 26). The ActiveState Platform includes a dependency solver, which will automatically resolve any dependency conflicts for you.
We can now commit the changes we’ve made, but before we do, note that we also have an option to add a commit message by clicking in the text box labeled “Why are you making these changes?” Any changes made follow a git commit/change approach. You can view a list of commits, along with all the additions and subtractions for each commit, under the History tab.
Click the Commit Changes button to trigger a new build to create the custom installers.
7. The Commit Changes button now becomes a View Status button. By clicking on the View Status button, we can navigate to the builds page where we can observe the progress:
As soon as the build instance is launched, Python starts building from source, followed by the other packages and dependencies in our build.
Once the build succeeds, you will be able to download an installer for each platform. The installers (a Windows MSI, Mac PKG and Linux tarball) contain all the previously specified dependencies and libraries ready to be used.
And that’s it! All accomplished just by using a graphical UI. No scripts to create, no compilers to configure – no manual steps at all!
You can watch this 5-minute video to see these steps in action.
We can also test our runtime by downloading each installer, or using the ActiveState Platform’s CLI client, the State Tool, to automatically download and install the runtime in a virtual environment for us. In either case, it will create the necessary links and update the paths to set up Python. Next we can use the REPL to test our runtime:
➜ python3
ActivePython 188.8.131.5206 (ActiveState Software Inc.) based on
Python 3.6.6 (default, Dec 19 2018, 08:04:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> print("I like ", np.pi)
I like  3.141592653589793
As you can see, all the imports we included in the distro are available for use, but now we have streamlined the whole process of building and packaging the Python runtime.
In this post, I’ve run through two ways of building a custom Python runtime from source that included a specific list of dependencies:
- The traditional manual method
- ActiveState’s automated, GUI-driven method
While both ways ended up in the same place, we can arguably say that with ActiveState we encountered fewer problems in general. In fact, after wrestling with compilers and trying to set up cross-platform builds, I’m reminded why most developers prefer to just download a pre-built distribution.
But if you can’t use a pre-built solution — if you need to build from source code — making use of the UI was a much more pleasant experience. And with the Git-like approach we could see exactly what was changed, when, and even revert to a previous build if needed.
- Try out the ActiveState Platform for yourself: Sign up for a free account.
- Check out the ActiveState vs DIY infographic for a graphical take on how to build a runtime environment.