How a 500ms delay exposed "a nightmare scenario" for the software supply chain

"This might be the best executed supply chain attack we've seen described in the open, and it's a nightmare scenario." There's no real plan to prevent the next one.


An audacious attempt to compromise the security of the servers that run enterprise tech was thwarted late last week thanks to a sequence of events that will be hard to duplicate at scale. The incident validated some of the best practices in open-source software and revealed some of its biggest weaknesses, and needs to be a wake-up call for governments, vendors, and tech buyers.

Thanks to some dedicated sleuthing by a Microsoft engineer, Linux maintainers were able to stop a two-year effort by someone posing as an eager-to-help developer to insert a backdoor into production Linux systems. The vehicle was the open-source XZ Utils data-compression tool, and a compromised version of that tool made its way into new, experimental builds of Linux but was detected before it could make its way downstream into commercial distributions.

On Friday morning, Microsoft's Andres Freund (who should never be allowed to buy a beer at any tech conference ever again) posted his discovery of a backdoor in XZ Utils, which he found while snooping through the code, somewhat miffed that SSH logins were taking 500ms longer than expected.

He notified maintainers of the Debian Linux operating system that some newer, experimental builds had incorporated the compromised tool, and after double-checking, Red Hat also issued an "urgent security alert" for beta versions of Fedora that had been compromised.

The exploit could have allowed whoever controlled it to execute code remotely on compromised Linux servers, which make up the vast majority of servers used in enterprise tech. As the weekend unfolded, security researchers discovered that the backdoor was the end product of a two-year campaign by someone who had volunteered to help write patches for the project.

"This might be the best executed supply chain attack we've seen described in the open, and it's a nightmare scenario: malicious, competent, authorized upstream in a widely used library," said open-source maintainer Filippo Valsorda, as noted by Ars Technica.

The attacker took advantage of a widely discussed problem in open-source software: there are countless software libraries and tools at the heart of the world's infrastructure software that are maintained by unpaid, overworked individuals.

Starting in late 2021, multiple people (or someone pretending to be multiple people) pressured the project's lead maintainer to accept patches written by the attacker that fixed legitimate issues, convincing the lead maintainer that the attacker was just another developer trying to help. After about a year, the attacker was formally added to the list of maintainers for XZ Utils and spent another year helping write new releases before adding the backdoor to the project two months ago.

A world so reliant on enterprise infrastructure software can no longer expect hero hackers or dedicated-but-weary maintainers to save the day by detecting and fixing all the problems in open-source code. But there are no quick, easy solutions here.

One of the biggest problems is the sheer number of open-source libraries and tools that are layered by the dozen to assemble modern software; simply putting together a list of all those projects is an immense undertaking, let alone inspecting them.

The Biden administration's push for government suppliers to maintain a software bill of materials helps, but involving the government too deeply in open-source software maintenance feels like a different kind of disaster waiting to unfold.

Some want maintainers to receive compensation, but even Tidelift's Luis Villa, who has been at the forefront of those efforts, warned that "paying maintainers is not a magic bullet."

"Planning for the scenario in which the worst case has happened and understanding the outcomes and recovery process is everyone’s homework now, and making sure you are prepared with tabletop exercises around zero days," wrote Docker's Justin Cormack.

(This post originally appeared in the Runtime newsletter on April 2nd. Sign up here to get more enterprise tech news three times a week.)
