How to De-risk Unavailable Software Dependencies – Lessons Learned

  • Fact: 89% of organizations rely on open source software. (Source: Red Hat)
  • Fact: 98% of applications incorporate open source software as dependencies in their codebase. (Source: Synopsys)
  • Fact: when a single, key open source package is deleted, corrupted or otherwise becomes unusable, it can have a disproportionate impact on the software industry as a whole.

The poster child for this kind of disruption is the “left-pad incident” in the JavaScript ecosystem. The author of left-pad deleted his 11-line package from npm for his own reasons, even though tens of thousands of web apps and services relied on it. As a result, their installs and builds suddenly reported:

npm ERR! 404 ‘left-pad’ is no longer in the npm registry.

The effect was as if one person had suddenly broken the internet. While the outage was short-lived (npm restored left-pad), it exposed the house-of-cards precariousness of modern software development.

Source: XKCD on software dependencies – Post number 2347

Unfortunately, the left-pad incident is far from an isolated example. Cases like it illustrate the need for better backup and preservation of open source dependencies.

Python’s atomicwrites Package Goes Missing

The latest version of this story occurred last week, when the atomicwrites package suddenly went missing from the Python Package Index (PyPI). atomicwrites is a Python package for atomic file writes, which are useful when a file must appear consistent to readers even while you’re modifying it. It’s a key dependency of commonly used projects such as pytest, Home Assistant, and many more.
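To make "atomic file write" concrete, here is a minimal sketch of the general technique: write to a temporary file in the same directory, flush it to disk, then rename it over the target. This is not the atomicwrites implementation or API; the function name `atomic_write_text` is hypothetical and chosen only for illustration.

```python
import os
import tempfile


def atomic_write_text(path: str, data: str, encoding: str = "utf-8") -> None:
    """Write data to path so readers see either the old or the new content.

    Hypothetical helper for illustration; not part of atomicwrites.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Swap the finished file into place in a single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temp file; the original target is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens the target at any point sees a complete file, because the partially written data only ever lives under the temporary name.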

The incident occurred as a direct result of PyPI’s supply chain security initiative, which was in the process of rolling out two-factor authentication (2FA) for authors of what they deemed “critical packages.” Given the daily volume of downloads for atomicwrites, its author was included in the 2FA rollout:

atomicwrites Daily Downloads
Unfortunately, the author took exception to PyPI’s initiative, viewing it as an extra demand on open source maintainers who have already dedicated copious amounts of their free time. In response, the author deleted atomicwrites from PyPI and published a new version, thinking this would remove the “critical” status. While that worked, it also had the unintended consequence of removing all previously published versions of atomicwrites, which the author was unable to re-upload.

The result? In the author’s own words:

“I decided to deprecate this package. While I do regret to have deleted the package and did end up enabling 2FA, I think PyPI’s sudden change in rules and bizarre behavior wrt package deletion doesn’t make it worth my time to maintain Python software of this popularity for free.”

While the folks at PyPI have since restored all versions of atomicwrites, the incident is yet another example of just how vulnerable to deletions our “shared dependencies” approach to modern coding has become.

A Wayback Machine for Open Source Dependencies

All these incidents serve to highlight the need for better persistence and preservation of open source dependencies. What’s needed is something like a “wayback machine” for public repositories. While every software development organization could take this on individually by implementing dependency vendoring (whereby all the dependencies that a project requires are checked into the organization’s code repository and never deleted), it’s a big ask. Not only does it mean duplicated effort that quickly adds up to a massive time and resources requirement worldwide, but it also introduces the kind of complexity many organizations may not be prepared to deal with. (Read: Everything you need to know about dependency vendoring)
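As a rough illustration of one small piece of the vendoring workload, the sketch below records a SHA-256 hash for every archive checked into a vendor directory and later reports any that have gone missing or changed. The directory layout and the function names (`build_manifest`, `verify_manifest`) are assumptions made for this example, not part of any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(vendor_dir: Path) -> dict[str, str]:
    """Map each vendored file (path relative to vendor_dir) to its hash."""
    return {
        p.relative_to(vendor_dir).as_posix(): sha256_of(p)
        for p in sorted(vendor_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(vendor_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the recorded files that are now missing or have changed."""
    current = build_manifest(vendor_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

For Python projects, one way to populate such a directory is `pip download -r requirements.txt -d vendor/`, with later installs run as `pip install --no-index --find-links vendor/ -r requirements.txt`. Keeping that directory current across every project and platform is where the duplicated effort described above comes from.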

Instead, consider using the ActiveState Platform, which pulls in the source code for open source language dependencies from multiple public repositories, including PyPI (Python), CPAN (Perl), RubyGems (Ruby), Packagist (PHP), and more. And because we never delete any dependency, you can always count on their availability, even if:

  • A dependency becomes unusable or goes missing from its public repository
  • A transitive or OS-level dependency shifts

This is because the ActiveState Platform not only builds dependencies from source code, but also packages them into self-contained runtime environments for Windows, Mac and Linux operating systems.

In fact, we even retain packages that have been found to contain malware, exploits and other malicious code. Public repositories typically delete such packages as soon as they’re identified, which often leaves malware researchers hunting for copies to perform their forensic analyses. We currently mark compromised packages as “unavailable” to ensure users can’t accidentally include them in their runtime environment builds, but you can still access them using our Malware Archivist tool.

How to Ensure Software Dependency Availability

While it’s unlikely that ActiveState will ever capture all of the world’s open source code, what we do capture has made our customers’ software far more resilient to change in a much more cost-effective way than creating a solution by themselves. And if there’s one thing constant about technology in general and open source in particular, it is change.

Additionally, the ActiveState Platform catalog offers better metadata-based classification and categorization than a general internet search, making open source packages far more discoverable. This is valuable not only for ISVs, but also for scientific researchers who are in the midst of a reproducibility crisis: the results of many scientific studies over the past decade have proven difficult or even impossible to reproduce, at least partly because the software used to run the experiments is itself unreproducible.

If you’ve ever experienced a temporary “package not found” moment of panic, or even a permanent “can’t reproduce the build” situation, the ActiveState Platform may be the solution you’ve been looking for in order to avoid these occurrences in the future.

Next steps:

Want to understand if the ActiveState Platform can help make your software more resilient to changes in the open source ecosystem? Contact Sales, or sign up for a free ActiveState Platform account and try it out yourself. 

 
