It’s always fun to update applications on your phone or laptop – you get to use all the new features! But it’s not so fun when you’re the one who has to update your own application.
In my case, I had a bunch of machine learning models written in Python 2 that I had to migrate over to Python 3. Since Python 2 is no longer supported by the Python Software Foundation, it was critical that I migrate, but the process was still a chore.
Why Migrate from Python 2 to 3?
The most pressing reason to migrate is that the Python Software Foundation no longer issues security fixes, bug fixes, or performance updates for Python 2.
However, migrating over to Python 3 can be an annoying task! It’s not a simple “push button” upgrade because Python 3 is not backward compatible with Python 2. You will have to make code-level changes once you update, but if you intend to keep supporting your application, you don’t really have a choice.
Also keep in mind that, while most of the open source libraries you used to build your application on Python 2 also support Python 3, simply installing and importing them under Python 3 without testing can result in faulty behavior – even in production environments – and cause unnecessary disruption for your users.
Finally, Python 3 is so much more modern and compatible with the world we live in now. It has great support for Unicode characters, handles division more intuitively, and finally fixes that wretched print statement! (Don’t come at me with pitchforks now.)
So now that you’ve decided to migrate to Python 3, how do you do it?
Python 3 Migration Guidelines
The strategy you use for migration largely depends on the age of your application and the version of Python 2 that it uses. Common strategies include doing a direct migration and a single push to production; using a bridge version such as v2.7; or migrating each service one by one (in the case of microservices).
As a rule of thumb, if you’re already using Python v2.7 in your code base, you’re in a good position. If you’re using Python v2.6 or earlier, you’re better off first migrating to v2.7, ensuring everything is working, and then moving to Python 3.
Another popular strategy is to write any new code to be compatible with Python 3. To ensure that this does not break when interfacing to Python 2 code, you can add the following statements at the top of any new modules you create:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
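To see what two of these imports change in practice, here is a minimal sketch. The `average` helper is a hypothetical example, not something from my code base. The module behaves the same way on Python 2.7 and Python 3:

```python
from __future__ import absolute_import, division, print_function


def average(values):
    # With `division` imported, `/` is true division on Python 2 as well,
    # so integer inputs no longer truncate the result.
    return sum(values) / len(values)


# With `print_function` imported, print is a function on both versions.
print("average:", average([1, 2]))
print("floor division is still available:", 3 // 2)
```

On Python 3 these imports are no-ops, so they can stay in place until you drop Python 2 entirely.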
It’s also a good idea to utilize staging servers, and push a build there to ensure everything works before moving to production.
When I migrated my code base, I followed some general guidelines that included:
- Learn What’s New in Python 3
This one’s obvious, but it has to be mentioned. I made the grave mistake of following a “tutorial” for my migration. I quickly realized that every code base is unique and you might run into issues that aren’t exactly covered in a tutorial. Make sure you read through the change logs of Python 3, and the version that you’re migrating to before you start. A great place that has everything covered is the Python docs page.
- Test Coverage
I’m sure you’ve written quite a few automated tests for your application. Though you can technically migrate without tests, you will most certainly shorten your migration process with them.
But this always brings up the question, how many tests is enough? A good way to measure your test coverage is to use the Coverage library. It works with v2.7, v3.x and bleeding edge versions of Python, as well. Coverage is a simple tool that analyses how many executable lines of code you have vs how many lines of code your tests execute. It then gives you a test coverage percentage. Obviously, the higher the number the better.
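To make the idea of a coverage percentage concrete, here is a toy sketch that uses only the standard library. It is not the Coverage library's API, and the names `classify` and `line_coverage` are hypothetical; it just counts which lines of one function a set of calls actually runs:

```python
import dis
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def line_coverage(func, calls):
    """Return the percentage of func's body lines run by the given calls."""
    code = func.__code__
    # Lines that have bytecode, excluding the `def` line itself.
    executable = {
        line for _, line in dis.findlinestarts(code)
        if line is not None and line != code.co_firstlineno
    }
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every other function
        if event == "line":
            executed.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in calls:
            func(*args)
    finally:
        sys.settrace(previous)
    return 100.0 * len(executed & executable) / len(executable)


partial = line_coverage(classify, [(5,)])
full = line_coverage(classify, [(-1,), (0,), (5,)])
print("one test: %.0f%%, three tests: %.0f%%" % (partial, full))
```

A single call only exercises one branch, so the percentage stays low until the other branches get their own test inputs. For a real project, use the Coverage library itself rather than a hand-rolled tracer like this one.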
When you start the migration make sure you do it in parts, and use tests to validate each step. You’ll be able to catch bugs much better this way. Another tool to help you with the process is Tox. With Tox, you can run tests in different versions of Python. This helps you quickly identify breaking points in your application.
- Identify Breaking Syntax and Libraries
Python 3 brings with it many syntax changes. The best known is that print changed from a statement to a function. Ideally, your codebase shouldn’t have print statements, but if it does, consider migrating them first.
Basically, print statements in Python 2 did not need parentheses around their arguments, but in Python 3 print is a function, so now they do:
# Python 2 syntax
print "Hello World!"
# Python 3 syntax
print("Hello World!")
While most popular Python libraries/packages have been ported over to Python 3, you can never be sure. Any mismatch can cause your application to fail migration. To detect all unsupported libraries, use the caniusepython3 package.
You can also use the Pylint library to weed out all syntactic issues.
- Automatic Conversion Tools
Breathe: you don’t have to do everything manually in this migration because help is on the way!
There are a number of tools that can automate much of your migration effort’s heavy lifting. The following libraries can comb through your code base and make changes to improve its Python 3 compatibility. While they don’t hit all the requirements for a complete migration, they do chew up a major portion of the work you need to do.
For large applications that have dependencies all over the place (making it impossible for you to test in silos), consider the six library. It is a compatibility library rather than a converter: it provides helpers that let you write code that runs on both Python 2 and 3, which puts your codebase in a limbo state that works on either version. From there, you can use other tools to migrate completely over to Python 3.
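The following is a rough, standard-library-only sketch of what code that runs on both versions looks like. The names mirror helpers that six provides (`six.PY2`, `six.text_type`, `six.iteritems`), but this is a hand-written stand-in for illustration, and `describe` is a hypothetical function:

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name exists only on Python 2)
else:
    text_type = str


def iteritems(mapping):
    """Iterate over (key, value) pairs without building a list on either version."""
    if PY2:
        return mapping.iteritems()
    return iter(mapping.items())


def describe(mapping):
    return sorted("%s=%s" % (key, value) for key, value in iteritems(mapping))


print(describe({"a": 1, "b": 2}))
print(isinstance(u"caf\u00e9", text_type))
```

In a real project you would import these helpers from six instead of redefining them, so the version checks live in one well-tested place.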
Otherwise, you can use any of the below libraries to automatically migrate syntax level changes:
- 2to3 – reads Python 2.x source code and applies a series of fixers to transform it into valid Python 3.x code.
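- Modernize – attempts to generate a codebase that is compatible with both Python 2 and Python 3. Note that the code it generates has a runtime dependency on six.
- Python-future – its futurize script passes Python 2 code through all the appropriate fixers to turn it into valid Python 3 code, and then adds __future__ and future package imports so that the code still runs on Python 2.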
While they all essentially do the same thing, they take different approaches. Refer to the documentation for each to determine which might be best for your code base.
Once you’re done, make sure to run your tests once again in order to ensure nothing is broken. And remember that these tools are not foolproof: sometimes they require you to do a little manual digging.
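To give a feel for the kind of syntax-level rewrites these tools apply, here is a hand-written before-and-after. It is an illustration, not actual tool output, and the `report` function is hypothetical:

```python
# Python 2 original (shown as comments, since it is not valid Python 3):
#
#     def report(scores):
#         for name, score in scores.iteritems():
#             print "%s: %d" % (name, score)
#         try:
#             return sum(scores.values()) / len(scores)
#         except ZeroDivisionError, err:
#             print "no scores:", err
#             return 0
#
# Python 3 version after syntax-level rewrites:

def report(scores):
    for name, score in scores.items():          # iteritems() -> items()
        print("%s: %d" % (name, score))         # print statement -> function
    try:
        return sum(scores.values()) / len(scores)
    except ZeroDivisionError as err:            # "except E, e" -> "except E as e"
        print("no scores:", err)
        return 0


print(report({"ann": 3, "bob": 4}))
print(report({}))
```

Notice that a purely syntactic rewrite leaves the `/` alone, so the average silently changes from 3 on Python 2 to 3.5 on Python 3. That is exactly the sort of thing your tests and a manual review need to catch.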
- Manual Checks
Unfortunately, there’s only so much that automatic tools can do for you. There are a few manual checks that you’ll have to perform to ensure smooth delivery:
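- Division
In Python 3, the / operator always performs true division and returns a float, even when both operands are integers. This means that anywhere you have divided two integers and expected a whole number as the output, that will no longer be the case. Use the // operator where you need the old integer behavior.
# Python 2 syntax
9 / 2  # evaluates to 4
# Python 3 syntax
9 / 2  # evaluates to 4.5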
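A short sketch of the two division operators on Python 3, with a hypothetical `pages_needed` helper showing where the explicit floor division belongs:

```python
def pages_needed(items, per_page):
    # Floor division keeps the Python 2 integer behaviour explicit.
    full_pages = items // per_page
    remainder = items % per_page
    return full_pages + (1 if remainder else 0)


print(9 / 2)    # 4.5 (true division)
print(9 // 2)   # 4 (floor division)
print(-9 // 2)  # -5 (floors toward negative infinity, it does not truncate)
print(pages_needed(9, 2))  # 5
```

When auditing a code base, any `/` whose result feeds an index, a range, or a count is a likely candidate for `//`.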
- CSV Parsing
In Python 2, CSV files should be opened in binary mode for parsing. In Python 3, however, text mode is required, and you should pass newline='' to open() so that the csv module can handle line endings itself.
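Here is a minimal Python 3 sketch of the text-mode pattern. The file name and sample rows are made up for the example, and the explicit encoding is my own choice:

```python
import csv
import os
import tempfile

rows = [["name", "note"], ["ann", "first line\nsecond line"], ["bob", "plain"]]

path = os.path.join(tempfile.mkdtemp(), "example.csv")

# Text mode plus newline="" lets the csv module handle line endings itself.
with open(path, "w", newline="", encoding="utf-8") as handle:
    csv.writer(handle).writerows(rows)

with open(path, "r", newline="", encoding="utf-8") as handle:
    parsed = list(csv.reader(handle))

print(parsed == rows)
```

The field containing an embedded newline survives the round trip, which is the case that tends to break when newline is left at its default.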
- Iterable objects, not lists
Many Python 3 built-in functions, such as map(), filter(), and zip(), now return an iterator instead of a list. This is because iterators produce their values lazily and therefore use less memory than lists.
Though this should not cause breaking changes in most scenarios, if your codebase does require a list to be returned, consider manually converting an iterator to a list first.
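A small sketch of where this bites on Python 3: an iterator cannot be indexed and can only be consumed once, so code that relied on list behavior needs an explicit list() call:

```python
numbers = [1, 2, 3]
squares = map(lambda n: n * n, numbers)

try:
    squares[0]
    indexable = True
except TypeError:
    indexable = False  # map objects do not support indexing

# Convert explicitly when list behaviour is required.
squares_list = list(map(lambda n: n * n, numbers))

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing left to yield

print(indexable, squares_list, first_pass, second_pass)
```

The silent empty result on the second pass is the more dangerous failure, because it raises no error.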
- Relative Imports
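Implicit relative imports are no longer supported in Python 3. Any code inside a package that imported a sibling module with a bare "import module" in Python 2 now has to be changed to use either an absolute import or an explicit relative import (for example, "from . import module").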
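A minimal sketch showing which import forms work inside a package on Python 3. It builds a throwaway package in a temporary directory; the names `demo_pkg` and `sibling_helpers` are hypothetical:

```python
import importlib
import os
import sys
import tempfile

pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, "demo_pkg")
os.mkdir(pkg_dir)

files = {
    "__init__.py": "",
    "sibling_helpers.py": "def greet():\n    return 'hello'\n",
    # Python 2 style: implicit relative import of a sibling module.
    "implicit.py": "import sibling_helpers\n",
    # Two forms that Python 3 accepts.
    "explicit_relative.py": "from . import sibling_helpers\n",
    "absolute.py": "from demo_pkg import sibling_helpers\n",
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as handle:
        handle.write(source)

sys.path.insert(0, pkg_root)
importlib.invalidate_caches()

explicit = importlib.import_module("demo_pkg.explicit_relative")
absolute = importlib.import_module("demo_pkg.absolute")

try:
    importlib.import_module("demo_pkg.implicit")
    implicit_works = True
except ImportError:
    implicit_works = False

print(explicit.sibling_helpers.greet(), absolute.sibling_helpers.greet(), implicit_works)
```

On Python 3 the implicit form fails because a bare `import sibling_helpers` is looked up as a top-level module, not as a sibling inside the package.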
When I prepared for my migration, I read a lot about Python 3. I was surprised to find that it has been in development since 2005! The first release was way back in 2008. Ever since then the Python community has been pushing for developers to make the jump. However, after the original 2015 end-of-life date for Python 2 came and went without enough adoption, the community decided to do a hard stop in 2020.
Well, here we are 12 years later and according to recent surveys and pypistats.org, many of us are still using Python 2. Hopefully, that’s only because you’re in the midst of a migration because, sooner or later, your Python 2 application will become riddled with bugs and security vulnerabilities.
But if you’re stuck on Python 2 because a key library or vendor hasn’t migrated yet, or perhaps you just can’t justify the opportunity cost of migration, there’s still hope. ActiveState continues to offer support for Python 2 beyond EOL, including security fixes to core language and standard libraries.
- Read more about ActiveState’s Python 2 EOL Support Offering.
- Get the Python 2-to-3 runtime, which contains all the conversion libraries from this post, including