Upgrading Perl to a Modern Version: the ActiveState Guide

Upgrading Perl to a Modern Version: the ActiveState Guide

kjljlk

Many organizations stopped upgrading Perl after the release of 5.8.8 or
5.10.1, about 8 to 11 years ago. At that point in history, major releases of
Perl could take many years to come to fruition. To make things more confusing,
point releases of a given version sometimes contained large changes (including
backwards incompatibilities) that we’d expect in major releases. For example,
5.10.1, released 20 months after 5.10, contained a number of backwards
incompatibilities and quite a few new features.

This made upgrading Perl scary and unpredictable. It also made working on the
Perl core challenging. What was an appropriate change for any release? When
would changes see the light of day?

To fix all that, Perl 5 pumpking Jesse Vincent instituted a new release
plan. Major releases (5.12, 5.14, etc.) would come out once a year in
April. In between, development version point releases (5.11.0, 5.11.1) would
be made once a month.

After a major release comes out (5.22.0, 5.24.0, etc.) there may be additional
point releases (5.22.1) as needed, but only for two years after the initial
release. After that period, support ends except for critical security fixes,
for which an additional year of support is provided.

Note that when I say “support” I mean that the community of people working on
Perl as a whole will work on bugs and security issues during this time
frame. This is not commercial support. If you want that, well, that’s one of
the reasons to become an ActiveState customer!

Jesse’s plan worked, and we’ve seen yearly releases in April (or May or June)
every year from 2010’s 5.12.0 to 2016’s 5.24.0.

The upshot of all this is that major Perl releases are much smaller and safer
than they used to be. Point releases are extremely safe, and we recommend
always upgrading when a new point release comes out.

But if you haven’t upgraded for ages, you’re still in a difficult
situation. How do you get from ancient Perl to modern Perl? What should you
look out for? And why should you even upgrade at all?

Features And Fixes

Let’s tackle that last question first. Why upgrade?

The simple answer is that Perl has seen quite a few new features in the past
few years, ranging from little tweaks (the s///r modifier) to “this really
changes all my code for the better” (subroutine signatures). Let’s take a look
at some of the highlights of the last six years, from 5.12 through 5.24.

Y2038 Handling

If you’re my age, you remember the Y2K efforts of the late 90s. While none of
the predicted catastrophes came to pass on January 1, 2000, that was in no
small part because of the countless software developers who worked to prevent
such catastrophes!

Well, there’s another time-related bug swiftly approaching,
the Year 2038 problem. Many
programs stored their dates and times as seconds since the Unix epoch (January
1, 1970 UTC). In C this is the time_t type. If the epoch value is stored in
a signed 32-bit integer, this means that you have 2,147,483,647 seconds
available. That sounds like a lot, but that epoch value corresponds to January
19, 2038 at 03:14:07 UTC. That’s not far from now at all!

And to make matters worse, it’s easy to do calculations today that will go
well past 2038. Imagine calculating the interest and principal payments for a
30-year mortgage starting today.

Fortunately, there are many efforts to ameliorate this problem, and 64-bit
platforms today use a 64-bit integer for time_t. The Perl core got a head
start on this in the 5.12 release by making sure that time is always
represented with something much larger than a 32-bit integer internally,
regardless of the platform’s integer size.

Regexes and Unicode

Perl 5.14 added a number of new modifiers for regexes and substitutions. The
/a modifier restricts the \s\d, and \w character classes to ASCII
only. This fixes a bug you might not have even known you had:

if ( $string =~ /^\d+/ ) { ... }

Did you know that by default, \d will match any Unicode number, including
৪. That’s a Bengali number four, not the Arabic 8! These sort of “look-alike”
Unicode characters have been the source of several security
vulnerabilities. The new regex modifiers make it easier to write secure code
in Perl.

Non-Destructive Substitution

Perl 5.14 added a new /r modifier for substitutions. With this modifier, the
substitution operator returns a new string rather than modifying the original:

my $new = $old =~ s/new/old/gr;

This was always possible using a much uglier idiom, but this operator is much
clearer.

The fc Operator

Comparing Unicode strings case-insensitively can be tricky. Unicode has a
concept called “foldcase” which addresses this. When you apply foldcasing to
two strings you can compare them with confidence. Perl 5.16 added an fc
operator to help you do just that.

Hash Randomization

The order of the items returned by keysvalues, and each was randomized
for each call starting in 5.18. This makes Perl hashes more robust against
algorithmic complexity attacks, which can be used to cause a denial of
service.

Note that we’ll cover this again later in the section on what you need to
watch out for!

Subroutine Signatures

This was added as an experimental feature in 5.20, and is one of my favorite
additions to Perl in recent years. If you enable the signatures feature you
can write code like this:

use feature 'signatures';
no warnings 'experimental::signatures';
sub run ( $self, $command, %opts = () ) { ... }

This simple feature goes a long way towards improving the readability of your
Perl code!

Hash and (New) Array Slices

Arrays have supported a slice syntax for a long time:

my ( $first, $second ) = @array[0, 1];

But now you can do this with hashes too:

my %subset = %hash{ 'foo', 'bar' }

Pretty neat! But you can also do this with arrays:

my @array = ( 'a'..'z' );
my %int_hash = %array[ 1, 2 ];
# %int_hash is ( 1 => 'b', 2 => 'c' )

Postfix Dereferencing

Postfix dereferencing is another new experimental feature in 5.20, and it has
since been marked as stable in 5.24. Perl has rightly been criticized for the
ugly syntax required to dereference a complex data structure. For example:

my @elems = @{ $foo->{bar}[0]{baz} };

That outer @{ ... } wrapper breaks the nice left to right dereferencing
syntax of the contained expression. With postfix dereferencing, we can make
the syntax consistent (if not beautiful):

my @elems = $foo->{bar}[0]{baz}->@*;

The difference is even more apparent when you compare the slice syntax:

my @subset = @{ $foo->{bar}[0]{baz} }[0, 2, 3];
@subset = $foo->{bar}[0]{baz}->@[0, 2, 3];

It’s not elegant, but on balance, I find this a bit more readable than the @{
... }
 wrapper version. This works for any sort of reference, including
scalars, arrays, hashes, subs, and globs.

Non-Capturing Regex Grouping

Are you annoyed at writing /(?:foo|bar)/ all the time just to avoid the
speed hit of capturing in regexes? I know I am! Perl 5.22 adds a new /n
regex modifier which disables all capturing, so you can write /(foo|bar)/n
instead.

Unicode

This isn’t specific to any particular release, but rather is part of every
major release. Perl keeps new releases in sync with the Unicode specification,
meaning that if you want to be able to use the full spectrum of Unicode
characters in your code, you need to upgrade. Perl 5.10.1 used version 5.1.0
of the Unicode character database. Perl 5.24 uses version 8.0 of that database.

And of course, there have been many Unicode bugs fixed since 5.10.1. The Perl
core’s Unicode introduction docs state that “Perl v5.14.0 is the first release
where Unicode support is (almost) seamlessly integrable without some gotchas”.

And So Much More

This is a very quick skim of the updates in major releases from 5.12 through
5.24. There’s a lot I didn’t even mention. You can read through the perldelta
docs yourself for more details:

Land Mines Along The Way

It’s not just features you get with an upgrade. You also get some backwards
incompatibilities and other potential breakage. Here are some things that you
need to watch out for.

Smartmatch is In Limbo

The smartmatch feature was first added in 5.10.0, and later revised in
5.10.1. It was then marked as experimental in 5.18.0. In 2016 there was a
discussion about whether to revise or kill the feature that has not yet been
resolved.

Smartmatch may return as stable in a future release, but if so it will be in a
simplified form. For now, I would suggest removing the use of smartmatch
(given/when and ~~) from any production code before upgrading your
Perl. Some alternatives include the list comprehension functions like any
from
List::Util,
Switch::Plain, and good old
if/elsif.

Importing from UNIVERSAL is Forbidden

Starting with Perl 5.12, importing things like isa and can from
UNIVERSAL was deprecated. In 5.22 this became a fatal error. You should
always call these subroutines as class or object methods.

Storable as a Data Interchange Format

If you’re using the Storable module to serialize and thaw data between Perl
processes, or worse, you are saving Storable-serialized data in files, a
database, or cookies, then you need to be very careful with your upgrade
process. This module does not guarantee binary compatibility across
releases. If you pass data from a newer version of Storable to an older
version, the older version will always die. This module is part of the core,
so upgrading your Perl will upgrade your Storable.

One possibility is to upgrade all of your systems to the most recent version
of Storable before you upgrade your Perl. Then as you upgrade a given system, you can
re-install the latest Storable, ensuring compatibility between systems.

A better, more permanent fix, is to not use Storable this way at all!
Storable is a poor choice for data interchange. Instead, consider switching
to Sereal, JSON, or YAML.

Hash Randomization

As I already mentioned, the order of the items returned by keysvalues,
and each was randomized starting in 5.18. This is good, but it has the
potential to break your code. In my experience, the most vulnerable code is
typically test code that looks like this:

use Test::More;
is_deeply( [ keys %hash ], [ 'foo', 'bar' ], ... );

This worked in older Perls because once you knew the stable order that keys
were returned in, you could hard code this in tests. Starting with Perl 5.18,
these tests will fail. Sometimes. This leads to all sorts of fun to debug
problems as you try to figure out why your test suite is failing on random
changes unrelated to the failing test.

The fix, of course, is to use sort whenever you get the keys or values
from a hash:

use Test::More;
is_deeply( [ sort keys %hash ], [ 'bar', 'foo' ], ... );

This could also come into play as an order of operations bug in non-test
code. Again, this will manifest as subtle intermittent failures.

Assigning to $0 Sets the Legacy Process Name

What does that mean? The short answer is that starting with Perl 5.14, setting
$0 calls prctl in addition to setting argv[0]. This means that some
programs, notably killall, will now see the updated name. If you rely on
being able to kill things based on the name a process has at startup, you should
check your code base for assignment to $0.

Devel::DProf is dead

This module was removed from the core in 5.14, and it is no longer
maintained. Use
Devel::NYTProf for your
profiling needs instead.

And More

As with features, there many other smaller changes. Most of these will
manifest as either warnings, compilation errors, or runtime errors.

For the warnings, it’s very important that you capture all output from your
code when you test with an upgraded Perl. These warnings go direct from the
Perl interpreter to stderr. It’s possible to intercept them, but not easy to
do this for all code consistently. Make sure that stderr isn’t going to
/dev/null! The same caveat applies to catching runtime errors.

The forbidden syntax is easier to spot, since this will cause the interpreter
to die during compilation.

Upgrading Best Practices

We recommend that you first start by trying ActivePerl 5.24.1 (or if you’re reading this in
the future, the latest stable release we offer). Upgrading to intermediate
releases one at a time is a lot of work, and a problem surfaced in 5.16 may be
fixed in 5.20 anyway.

If you have test suites (of course you have test suites) then the easiest
place to start is by running the suite under the latest Perl.

Once your tests pass and are warnings-free under the new Perl, you can then
upgrade a staging machine or two, then a few more, and so on.

Something you should be paying attention to both in the test suite and on
staging machines is the performance characteristics of your Perl code. While
recent versions of Perl include some new optimizations, they may also be
slower in some areas. These changes are hard to predict, so there’s no
substitute for monitoring CPU usage, disk usage, memory usage, etc.

Once you’ve squashed the problems surfaced in staging, you can take the same
incremental approach to production. Upgrade one or two machines and monitor
them closely. Then upgrade a few more, and so on.

If you get really stuck with 5.24.1, you might consider falling back to an
earlier release like 5.20 as a stopgap, but upgrading to any recent release
from 5.8 or 5.10 will be a major project, so you might as well aim for all the
new shiny bits while you’re at it.

Fortunately, once you’ve bitten the bullet and done the big leap, future
upgrades will be much easier. We recommend that you plan to upgrade your Perl
at least every other major release. Given the relatively small delta between
major releases these days, skipping one major release is fine. Skipping more
than one could tempt you to skip many, and suddenly you’re back here, planning
a big project to update from an ancient version of Perl.

The downside to skipping any releases is that you may miss out on
deprecation warnings telling you about something that will go away in the next
version.

You can make upgrading simple by doing it often. That way it becomes routine
rather than an infrequent and very painful chore.

Thanks

Thanks to the denizens of #p5p on irc.perl.org for reviewing an earlier draft
of this article and providing valuable feedback, including Joel Berger,
Christian HansenDominic Hargreaves, and Karl Williamson, among others.


Check out the latest version of ActivePerl–download the community edition.


Title image courtesy of Famartin on wkimedia.org under Creative Commons License.

 

Dave Rolsky

Dave Rolsky

Dave Rolsky has been a Perl developer since 1999, and has created or contributed to dozens of CPAN modules, including DateTime, Log::Dispatch, Params::Validate, and more. He is also a member of the Moose core development team, and in early 2009 completed a TPF grant to substantially rewrite and expand the Moose documentation. Way back when, he co-wrote Embedding Perl in HTML with Mason and RT Essentials, both published by O'Reilly. He spends a lot of his free time on animal advocacy, and otherwise vegetates with video games, books, and TV shows, like any proper nerd.