Precompiling the Hard-to-Compile Packages
In this release, we’ve focused on bringing the difficult-to-build data science packages to all ActivePython users on Windows, macOS, and Linux (contact us if you are looking for AIX or other platforms). In particular, this release features packages for machine learning, data engineering, and data science collaboration, along with an increased focus on providing the strongest security packages available in the ecosystem. As the saying goes…but wait, there’s more! Since machine learning requires serious performance, we’ve added some outstanding enhancements thanks to the performance experts at Intel.
Machine learning is becoming an indispensable tool as businesses look to unlock the value inside their data, and Python is becoming a BIG part of that. So in this release we’ve provided the University of Montreal’s Theano project as well as Google’s TensorFlow. Some of these packages are very tricky to build, and I noticed at the hands-on TensorFlow session at OSCON that those using Windows were struggling. Check out our TensorFlow build on Linux and Windows for an easy way to get it running.
Note: we’ve added these new machine learning packages on top of scikit-learn, which we added in our last release. To round out our machine learning offerings, we’ve included the Keras framework, which provides an easier way to build models with either TensorFlow or Theano. Just as a caveat (as the Python community is still migrating from 2 to 3), some of these libraries are available only on Python 2.7 or only on Python 3.5, and they aren’t on every platform, so please check this page featuring our machine learning Python packages.
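Since availability varies by Python version and platform, a quick way to see what your own interpreter can import is a small check like the one below. This is only a sketch: the `ML_PACKAGES` table and the `available_packages` helper are names made up for this example, and it assumes Python 3.4 or later (where `importlib.util.find_spec` exists), so it will not run as-is on 2.7.

```python
import importlib.util
import sys

# Display name -> import name for the packages mentioned above.
# (scikit-learn is imported as "sklearn".) This table is illustrative only.
ML_PACKAGES = {
    "scikit-learn": "sklearn",
    "Theano": "theano",
    "TensorFlow": "tensorflow",
    "Keras": "keras",
}


def available_packages(packages=ML_PACKAGES):
    """Map each display name to True if its module can be located."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in packages.items()
    }


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    for name, found in sorted(available_packages().items()):
        print("%-12s %s" % (name, "available" if found else "not installed"))
```

The check only looks the modules up without importing them, so it stays fast even when a heavyweight package such as TensorFlow is present.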
To feed data-hungry machine learning algorithms, smooth-running data pipelines are key. Data engineering is an important part of any data science strategy, and data engineering frameworks such as Luigi, Dask, or Apache Airflow can help form a strong foundation. In addition, we already ship connectors and drivers for all popular database platforms, whether they are traditional relational databases, big data platforms, NoSQL stores, or the variants in between.
We realize that security is an important component of web services and web applications, so we’ve added a number of packages that we feel are critical to developers. For starters, we always provide an updated version of OpenSSL with every new release. In addition, we provide a helper library for PyOpenSSL called service_identity, a little-known but very important package that is used to avoid man-in-the-middle attacks during authentication. We’ve also provided a drop-in replacement for the abandoned PyCrypto package, called PyCryptodome, which will require no changes on your part if your application is already using PyCrypto. We’ve also added the bcrypt library to ActivePython, another great security library, which is used for hashing passwords before they are stored.
The Intel Math Kernel Library
ActivePython now leverages Intel’s Math Kernel Library, so when you’re running ActivePython on Intel hardware, you will see faster processing times for SciPy, NumPy, and any other packages that use them for their underlying computations. This also means that Theano will enjoy these speed improvements, which is of considerable value when training your models, as will Dask with its parallelization and integration with SciPy and NumPy…plus many more!
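If you’re curious which math libraries your own NumPy build is linked against, and roughly how fast a dense matrix product runs on your machine, something like the following sketch can help. The `time_matmul` helper and its default sizes are assumptions picked for illustration; this is not an official benchmark, and timings will vary with hardware and load.

```python
import time

import numpy as np


def time_matmul(n=500, repeats=3):
    """Return the best wall-clock time in seconds for an n x n matrix product."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)  # dense matrix product, dispatched to the linked BLAS
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Prints the BLAS/LAPACK build information NumPy was compiled with.
    np.show_config()
    print("best of 3 runs: %.4f s" % time_matmul())
```

Running the same script under two different Python distributions on the same machine gives a rough feel for the difference the underlying math library makes.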
We hope you enjoy this release of ActivePython and, as always, don’t be shy about reaching out directly to me with any feedback or requests (yes, we have that package too!) for our next version.
Download ActivePython 2.7.13 or 3.5.3 to try out the latest machine learning and data science packages.