In March, a software bug threatened to derail large swaths of the web. XZ utils, an open-source compression tool embedded in myriad software products and operating systems, was found to have been implanted with a backdoor.
The backdoor—a surreptitious entry point into the software—would have allowed a person with the requisite code to hijack the machines running it and issue commands as an administrator. Had the backdoor been widely distributed, it would have been a potential disaster for millions of people.
Luckily, before the malicious update could be pushed out into wider circulation, a software engineer from Microsoft noticed irregularities in the code and reported it. The project was subsequently commandeered by responsible parties and has since been fixed.
While disaster was narrowly averted, the episode has highlighted the ongoing liabilities in the open-source development model that are longstanding and not easily fixed. The XZ episode is far from the first time an open-source bug has threatened to derail large swaths of the web. It certainly won’t be the last. Understanding the vexing cybersecurity dilemmas posed by open-source software requires a tour through its byzantine and not altogether intuitive ecosystem. Here, for the uninitiated, is our attempt to give you that tour.
The Web Runs on FOSS
Today, the vast majority of codebases rely on open-source code. It is estimated that 70 to 90 percent of all software “stacks” are composed of it. In all likelihood, the vast majority of the apps on your phone have been designed with it, and, if you are one of the 2.5 billion people in the world who uses an Android, your device’s operating system is a modified version of the software that originated with the Linux kernel—the largest open source project in the world.
When people talk about software “supply chains”—the digital scaffolding that supports our favorite web products and services—much of that code is made of open-source components. Its ubiquity has led observers to refer to open source as the “critical infrastructure” of the internet—a Protean substance that is both indispensable and incredibly powerful.
Yet, as important as it is, open-source software remains a subject that isn’t widely understood by most people outside of the tech industry. Most people have never even heard of it.
Why Use Open Source?
For the uninitiated, a quick explanation might go something like this: Unlike “closed” or proprietary software, free and open source software, or FOSS, is publicly inspectable and can be used or modified by anyone. Its usage is determined by a variety of licensing agreements and, quite uniquely, the components are often maintained by volunteers—unpaid developers who spend their free time keeping the software up to date and in good working condition.
Open-source projects can start as pretty much anything. Often, they are small projects for digital tinkerers who simply want to build something new. Eventually, some projects get popular, and private companies will begin incorporating them into their commercial codebases. In many cases, when a corporate development team decides to create a new application, they will construct it using a wealth of smaller, already existing software components which are comprised of hundreds or even thousands of lines of code. These days, most of those components come from the open-source community.
It can be sort of difficult to picture how this odd relationship between commercial software and the open source ecosystem works. Luckily, several years ago the webcomic artist Randall Munroe created what is now a well-known meme that helps visualize this counterintuitive dynamic:
There are many reasons that companies turn to open source for their development needs. Aside from the fact that it’s free, FOSS also allows for software to be created with efficiency and speed. If a programmer doesn’t have to worry about the fundamental building blocks of an application’s code, it frees them up to focus on the software’s more marketable elements. In a competitive environment like Silicon Valley—where a speedy time to market is a critical advantage—open source is pretty much a DevOps imperative.
But with speed and agility comes vulnerability. If FOSS is a ubiquitous element of modern software, there are also structural problems with the ecosystem that put massive amounts of software at risk. Those problems can get pretty hairy pretty quickly—often with disastrous results.
Bugs From Hell
The XZ episode didn’t end in disaster, but it easily could have. One instance where web users weren’t so lucky was the notorious “log4shell” incident. Three years ago, in November 0f 2021, a code vulnerability was discovered in the then-popular open-source program log4j. A logging library, programs like log4j are regularly integrated into apps, where coders use them to record and assess a program’s internal processes. Log4j, which is maintained by the open-source organization Apache, was widely used at the time of the bug’s discovery and was embedded in millions of applications all over the world.
Unfortunately, log4j’s bug—dubbed “log4shell”—was quite bad. Like the XZ bug, it involved remote code execution. This meant that a hacker could quite easily inject their own “arbitrary” code into an impacted program, enabling them to hijack the machine running it. Due to log4j’s popularity, the scope of the bug was massive. Major, multi-billion dollar companies were affected. Hundreds of millions of devices were vulnerable. In the days after the flaw’s disclosure, experts estimated that the vulnerabilities were a ticking time bomb and that cybercriminals were already looking to exploit them.
The discovery of the bug sent corporate America into a full-blown panic and spooked the highest levels of the federal government. Some of the biggest companies in the world were at risk—making it a matter of national security. Several weeks after the bug’s discovery, Anne Neuberger, a top cybersecurity advisor to President Joe Biden, called a White House summit on open source security, inviting executives from Microsoft, Meta, Google, Amazon, IBM, and other big names, as well as influential open source organizations like Apache, the Linux Foundation, and Linux’s Open Source Security Foundation, or OSSF. The meeting was less concerned with how to remedy the hellish vulnerability than with figuring out how to stop this sort of thing from ever happening again.
Not long after the meeting, top executives at the Linux Foundation, including then-general manager of OSSF Brian Behlendorf, began formulating a so-called “mobilization plan” to better secure the entire FOSS ecosystem. The federal government, meanwhile, began developing its own strategies to further regulate the tech industry. Most notably, President Biden’s cybersecurity plan, which was published last year, has sought to prioritize a number of new safeguards to prevent the emergence of new, highly destructive bugs.
Yet as the dangers surrounding the XZ vulnerability show, FOSS is still an environment that, at its highest levels, is vulnerable to bugs that could have catastrophic, system-wide implications for the internet. Understanding the risks in FOSS, however, isn’t easy. It requires a detour into the unique ecosystem that produces so much of the world’s software.
Closed Source Doesn’t Mean More Secure
Before we go any further, it’d be helpful to make one thing clear: Just because a software program is “closed source” or proprietary doesn’t mean it’s more secure. Indeed, security experts and FOSS proponents contend that the opposite is true. We’ll revisit this issue again later but, for the time being, I’ll just direct your attention to a little company called Microsoft. This company, despite being a prominent, closed-source corporate giant, has had its product base hacked countless times—sometimes to disastrous effect. Many companies that keep their products closed have similar track records, and, unlike with open-source software, their security issues are often kept secret, since nobody but the company has access to the code.
The Maintainers
If you want to talk about the security risks in open-source software, you have to start by talking about the people behind the code. In the open-source ecosystem, those people are known as “maintainers” and, as you might expect, they are in charge of maintaining the quality of the software.
Explaining the role of the maintainer is a little complicated. Maintainers might aptly be compared to the construction workers who—in the real world—build our roads and bridges. Or, the engineers who design them. Or both. In short, a maintainer is the caretaker (and often the creator) of a particular open-source project but, in many cases, they work together with “contributors”—users of the software who want to make improvements to the code.
Maintainers host their open-source projects on public repositories, the most popular of which is Github. These repositories include interactive mechanisms that are ultimately controlled by the maintainer. For instance, when a contributor wants to add something to a project, they might submit a “pull request” on GitHub, which includes the new code they hope to add. The maintainer is then tasked with signing off on a “merge,” which will update the project to reflect the contributor’s changes. It’s through this collaborative process that open-source projects continually grow and transform.
As the master controller of these living, iterative projects, the maintainer’s job often requires an immense amount of work—everything from ongoing correspondence with users and contributors, to signing off on code commits, to creating “documentation” guides that show how everything inside the software actually works. Yet, for all of that work, a whole lot of maintainers are not paid particularly well. Most are not paid at all. Open source is supposed to be free, remember? In the world of FOSS, hard work is repaid with little more than the knowledge that your code is being put to good use.
The plight of the maintainer is a peculiar one and is very much tied up with open source’s complicated history, as well as its not altogether straightforward relationship with the corporations that use its code.
A Flash History of FOSS
It’s helpful to consider that, in the beginning, open source didn’t have much to do with corporatism or money. In fact, it was just the opposite.
FOSS grew out of an idealistic hacker movement from the 1980s called the “free software” movement. For all intents and purposes, that movement began with Richard Stallman—an eccentric computer scientist who looks a little like Jerry Garcia and has long espoused a bold form of cybernetic idealism. In 1983, while working at the MIT Artificial Intelligence Lab, Stallman established GNU, a repository of free software. The idea behind the collection was user control. Stallman balked at the idea that private companies could keep software behind a walled garden. He felt that software users needed the ability to control the programs they used—to see how they worked, as well as to change or modify them if they wished. As such, Stallman postulated the idea of “free” software—famously commenting that he meant free “like free speech, not like free beer.” That is to say, Stallman is not against developers getting paid, but their code should be open and visible to all for future improvement.
In 1991, a then-21-year-old Finnish computer programming student named Linus Torvalds spurred the next great innovation in open-source history. Reportedly out of boredom, Torvalds created a new operating system and named it after himself, calling it “Linux.” Pivotally, Torvalds created the Linux “kernel,” the vital component within any operating system that governs the interface between a computer’s hardware and its digital processes. It wasn’t clear at the time, but Linux would go on to become the largest, most popular open-source project in the world. Today, there are hundreds of Linux distributions (or “distros”) that use the kernel that Torvalds created.
In 1998, a small but influential group inside the free software community decided they wanted to break away from the movement’s…
Continuar leyendo: Open-source cybersecurity could derail the internet as we know it