
Fuzzing open source

 2 years ago
source link: https://lwn.net/Articles/710534/


Fuzz testing finds bugs, so it stands to reason that continuous fuzz testing will find more bugs and find them sooner. That is part of the premise behind the OSS-Fuzz program announced by Google on December 1. Since many of the bugs found by fuzzing have security implications, discovering more earlier can only be a good thing for the security of our systems.

OSS-Fuzz is meant to apply fuzzing power to free and open-source software projects, especially those that are part of the critical internet infrastructure. As might be guessed based on that, the Core Infrastructure Initiative has worked with Google to develop OSS-Fuzz. The underlying technology used comes partly from the ClusterFuzz project that has been successfully employed to find a large number of bugs in the Chrome browser.

The basic idea is to continuously fuzz test the latest source version of the projects that have signed up (and been approved) for the OSS-Fuzz beta test. Initially, OSS-Fuzz will be using the libFuzzer coverage-guided fuzzing library in conjunction with AddressSanitizer (ASan) to try to find various types of memory misuse that could be exploited by attackers. That process is parallelized across thousands of virtual machines, such that some four trillion test cases are run per week (or were at the time of the announcement; that number may have grown since then).

The announcement describes an early success story for the project:

Our initial trials with OSS-Fuzz have had good results. An example is the FreeType library, which is used on over a billion devices to display text (and which might even be rendering the characters you are reading now). It is important for FreeType to be stable and secure in an age when fonts are loaded over the Internet. Werner Lemberg, one of the FreeType developers, was an early adopter of OSS-Fuzz. Recently the FreeType fuzzer found a new heap buffer overflow only a few hours after the source change:
ERROR: AddressSanitizer: heap-buffer-overflow on address 0x615000000ffa
READ of size 2 at 0x615000000ffa thread T0
SCARINESS: 24 (2-byte-read-heap-buffer-overflow-far-from-bounds)
   #0 0x885e06 in tt_face_vary_cvt src/truetype/ttgxvar.c:1556:31
OSS-Fuzz automatically notified the maintainer, who fixed the bug; then OSS-Fuzz automatically confirmed the fix. All in one day!

Many bugs have been found: 150 at the time of the announcement, though there are 200 entries (with 177 verified) in the bug tracker as of this writing. Roughly half of those are marked as security bugs. All of the bugs found will be disclosed within 90 days of their discovery, in keeping with Google's own disclosure policy.

ClusterFuzz does more than just parallelize the fuzzing process. It manages test cases, both whittling them down to something small that still reproduces the problem and continuing to run them to detect when the problem is fixed and, after that, whether regressions bring it back. It also tries to determine which change (or set of changes) introduced a problem by doing a bisection.
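The whittling and bisection steps can be illustrated with a small sketch. This is not ClusterFuzz code; the names `minimize`, `first_bad`, `still_fails`, and `is_bad` are hypothetical, and the greedy chunk-removal strategy is just one simple way to shrink a reproducer.

```python
def minimize(data: bytes, still_fails) -> bytes:
    """Greedily remove chunks of the input while the failure still reproduces.

    `still_fails` is a hypothetical callback that returns True when the
    candidate input still triggers the original problem.
    """
    chunk = max(len(data) // 2, 1)
    while chunk >= 1 and data:
        i = 0
        shrunk = False
        while i < len(data):
            candidate = data[:i] + data[i + chunk:]
            if still_fails(candidate):
                data = candidate      # keep the smaller reproducer
                shrunk = True
            else:
                i += chunk            # this chunk is needed; move on
        if not shrunk:
            chunk //= 2               # try finer-grained removals
    return data


def first_bad(revisions, is_bad):
    """Binary-search an ordered history for the first bad revision.

    Assumes revisions[0] is good, revisions[-1] is bad, and the problem
    never disappears once it has been introduced.
    """
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]
```

For instance, if the failure is triggered by any input containing the byte 0xff, `minimize(b"abc\xffdef", lambda d: b"\xff" in d)` reduces the reproducer to that single byte.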

Projects interested in joining the OSS-Fuzz effort will need to add some fuzz targets to their code. These targets are functions that accept a byte array from the fuzzing engine and use that input in an "interesting" way using the project's API. These fuzz targets are not libFuzzer-specific and can be used by other fuzzers (there is talk of adding support for american fuzzy lop, for example). Those fuzz targets must be integrated with the build and test system for the project.
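As a rough illustration of the shape of a fuzz target, here is a Python analogue. Real libFuzzer targets are C or C++ functions named `LLVMFuzzerTestOneInput` that take a pointer to the data and its size; everything below, including the `parse_record` "API" and the record format, is a hypothetical stand-in for a project's own code.

```python
import struct
from typing import Tuple


def parse_record(data: bytes) -> Tuple[int, bytes]:
    """Hypothetical API under test: 2-byte big-endian length, then payload."""
    if len(data) < 2:
        raise ValueError("truncated header")
    (length,) = struct.unpack(">H", data[:2])
    payload = data[2:2 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return length, payload


def fuzz_target(data: bytes) -> None:
    """Accept arbitrary bytes from a fuzzing engine and exercise the API."""
    try:
        length, payload = parse_record(data)
    except ValueError:
        return  # malformed input rejected cleanly; not a bug
    # Any other exception, or a broken invariant, would be a finding.
    assert length == len(payload)
```

The target makes no assumptions about the engine calling it, which is what lets the same function be driven by different fuzzers.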

After that, a corpus of both good and bad inputs for the fuzz target should be created. That gives the fuzzing engine a starting point that has the proper structure for the input data (and the bad inputs will help show ways to "break" the format), which eliminates a whole bunch of wasted effort on inputs that won't get past the first tests in the code. In coverage-guided fuzzing, the binary is instrumented to provide information on the code that a given input has caused to be executed. Changing the input data to cause new paths through the code to be taken is the underlying mechanism for coverage-guided fuzzing.
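A toy version of that feedback loop might look like the following. It is only a sketch of the general idea, not how libFuzzer is implemented: the "instrumentation" is done by hand (the hypothetical `target` returns the set of branch labels it reached), and the mutations are deliberately simplistic.

```python
import random


def target(data: bytes) -> set:
    """Toy instrumented target: returns the branch labels this input reached."""
    hit = {"entry"}
    if len(data) >= 1 and data[0] == ord("F"):
        hit.add("F")
        if len(data) >= 2 and data[1] == ord("U"):
            hit.add("FU")
            if len(data) >= 3 and data[2] == ord("Z"):
                hit.add("FUZ")
                if len(data) >= 4 and data[3] == ord("Z"):
                    hit.add("FUZZ")  # stands in for the buggy path
    return hit


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Insert, overwrite, or delete one random byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(seeds, iterations, rng):
    """Keep any mutated input that reaches code no earlier input reached.

    `seeds` must be non-empty; it plays the role of the starting corpus.
    """
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= target(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = target(candidate)
        if not coverage <= seen:      # new branch reached
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

Starting from a seed such as `b"FU"` rather than from empty input shows why a good corpus matters: the first two checks are already satisfied, so mutations only have to discover the remaining bytes.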

To set up a new project, a directory needs to be created under projects in a GitHub clone of the oss-fuzz Git repository. It needs a Dockerfile to describe the container environment for building the project and its fuzz targets, a build script (build.sh) that will run in the container to generate a build of the project, and a configuration file (project.yaml) with some project metadata. A pull request can be made to the oss-fuzz project and if it is accepted, the project will be tossed into the ClusterFuzz hopper.

The FAQ lists two main criteria for a project's inclusion in the beta: exposure to remote input and the number of dependent users (people and projects). Right now, the goal is to add "established projects that have a critical impact on infrastructure and user security", though expanding the reach of OSS-Fuzz is in the plans. The current project list has over 50 entries and reads a bit like a "who's who" of open-source projects that fit the criteria, including libreoffice, curl, libarchive, pcre2, libpng, openssl, postgresql, sqlite3, tor, strongswan, and so on. There is also a build status page where the most recent build log for each project can be accessed.

Fuzzing takes a lot of resources, but it is an inherently parallelizable process, so it is a perfect match for Google and others with enormous clusters of computers. Though it has taken some time, the security-testing story for open-source projects has certainly gotten better over the years. The lessons of Heartbleed (and, seemingly to a lesser extent, some of the larger vulnerabilities of yesteryear) have not gone completely unheeded. Beyond that, it is good to see other projects that are applying fuzzing to the kernel. Fuzzing is no panacea, but it can certainly help find the next "zero day" before it actually becomes one.

