Recently, GitHub reported that more than 35,000 files in GitHub repositories had been found to include a malicious URL, namely:
Many of these files were included in cloned repositories – clones of popular projects – which were re-released under a similar name. This is a classic example of typosquatting, but on a massive scale that has not been seen previously on GitHub. A typical infected file might look like:
Cloned repositories altered with malware contain backdoor (source: BleepingComputer)
The threat here is twofold:
- The malicious URL could be used to exfiltrate a user’s environment variables, such as API keys, tokens, AWS credentials, etc.
- A code execution backdoor (line 241 in the code example above) could allow remote attackers to execute arbitrary code on installed systems.
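To gauge how much a compromised clone could reach on a given machine, a quick audit of the current environment can help. The following is a minimal sketch, not part of the original report: the function name `exposed_secret_names` and the naming patterns it matches are assumptions chosen for illustration, and real credentials may use other names.

```python
import os
import re

# Assumed naming patterns; real secrets may be stored under other names.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)


def exposed_secret_names(environ=None):
    """Return sorted names of variables that look like credentials.

    Only names are returned, never values, so the output is safe to log.
    """
    environ = os.environ if environ is None else environ
    return sorted(name for name in environ if SECRET_NAME.search(name))


if __name__ == "__main__":
    for name in exposed_secret_names():
        print(f"Readable by any code run in this shell: {name}")
```

Anything this lists is readable by any code executed in the same session, including code from a cloned repository.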
Typosquatting occurs when a bad actor:
- Creates a malware-infected version of publicly available code, such as an open source package or, in this case, a GitHub repository.
- Gives the package/repo a name similar to that of an existing popular package/repo.
- Uploads it to the public repository in the hopes that developers will mistakenly download it rather than the valid package/repo.
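Because the attack depends on a name that is almost, but not quite, the one you meant, a simple similarity check can catch many cases. Here is a rough sketch using Python's standard `difflib`; the `KNOWN_NAMES` list, the `possible_typosquat` helper, and the 0.8 threshold are all illustrative assumptions rather than a recommended configuration.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of names the team actually intends to use.
KNOWN_NAMES = ["requests", "numpy", "pandas", "django"]


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def possible_typosquat(name: str, known=KNOWN_NAMES, threshold: float = 0.8):
    """Return the known name that `name` closely resembles, or None.

    An exact match is treated as legitimate; only near misses are flagged.
    """
    best = None
    best_score = 0.0
    for candidate in known:
        if name.lower() == candidate.lower():
            return None
        score = similarity(name, candidate)
        if score >= threshold and score > best_score:
            best, best_score = candidate, score
    return best
```

With these assumptions, `possible_typosquat("reqeusts")` returns `"requests"`, while the exact name `"requests"` returns `None`. A threshold this simple will produce false positives on legitimately similar names, so treat a hit as a prompt for review rather than a verdict.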
Unfortunately, typosquatting has proven to be highly effective, and is one of the most popular ways for bad actors to compromise organizations. Luckily, a number of best practices can help mitigate the risk of importing typosquatted code.
How to Mitigate the Risk of Typosquatting
We all make typing mistakes, which makes us all susceptible to typosquatting. But there are a number of solutions that organizations can put in place to help catch those mistakes before they result in the import of a malicious exploit:
- Typosquatting Filter – if you use Python, we’ve recently created a Typosquatting Detector project that you can implement to help identify and filter out potentially typosquatted Python packages before they infect your organization.
- Malware Detectors – there are a number of best-in-class solutions (both commercial and open source) that can help combat typosquatting. Learn more about the Top 10 Malware Scanners we recently reviewed.
- Securely Populate Repositories – if your organization uses an artifact repository such as JFrog Artifactory, Sonatype Nexus, Azure Artifacts or a similar tool, consider populating it with signed code from trusted vendors rather than with prebuilt packages from public repositories (which would potentially expose you to typosquatting).
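For the repository approach, one lightweight gate is to admit an artifact only if it matches a digest obtained from the vendor through a trusted channel. The sketch below compares SHA-256 checksums, which is a simpler stand-in for full signature verification; the helper names `sha256_of` and `matches_pinned_digest` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Stream the file so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_hex: str) -> bool:
    """True only if the artifact's SHA-256 equals the pinned value."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.strip().lower())
```

For Python packages specifically, pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) enforces the same idea at install time.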
Security-conscious organizations may want to implement more than one best practice, creating a defense-in-depth strategy that not only seeks to prevent typosquatted code from ever entering the organization, but also scans for malware on a regular basis throughout the software development lifecycle in order to catch any that may have inadvertently slipped in.
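As an illustration of the scanning half of that strategy, the following sketch walks a checkout and reports any file containing a known indicator string, such as a malicious URL. The indicator shown is a placeholder, not the URL from the incident, and `find_indicators` is a hypothetical helper; a dedicated malware scanner does far more than this.

```python
from pathlib import Path

# Placeholder indicator only; substitute the strings your threat intel provides.
INDICATORS = ["malicious.example/collect"]


def find_indicators(root: Path, indicators=INDICATORS):
    """Yield (file, line number, indicator) for each hit under `root`."""
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file; skip rather than abort the scan
        for number, line in enumerate(text.splitlines(), start=1):
            for indicator in indicators:
                if indicator in line:
                    yield path, number, indicator


if __name__ == "__main__":
    for path, number, indicator in find_indicators(Path(".")):
        print(f"{path}:{number}: contains {indicator}")
```

Run from a scheduled job or a CI step, a check like this can flag a known-bad string that slipped past the earlier defenses.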
Conclusions: Secure the Software Supply Chain
Securing the software supply chain is a difficult and expensive task due to the breadth and depth of the open source supply chain that most software development organizations rely on. Each link in the chain offers multiple points of entry for malicious actors, who will always look for the weakest one to exploit. This makes ensuring the integrity and security of publicly available code a complicated and costly problem for most organizations.
While no vendor currently provides a comprehensive, end-to-end supply chain solution, some, like ActiveState, have begun to offer turnkey solutions for use cases involving open source languages like Python, Perl, Tcl and Ruby. Such an out-of-the-box solution can save enterprises significant time, resources and money when compared to a multi-vendor approach.