Hardening the "file" utility for Debian
source link: https://lwn.net/Articles/796108/
The file command would seem to be an ideal candidate for sandboxing; it routinely handles untrusted input. But an effort to add seccomp() filtering to file for Debian has run aground. The upstream file project has added support for sandboxing via seccomp() but it does not play well with other parts of the Debian world, package building in particular. This situation provides further evidence that seccomp() filtering is brittle and difficult to use.
The discussion began with a post to the debian-devel mailing list where Christoph Biedl announced that he had enabled the file sandbox feature for the unstable repository. He was asking that other Debian developers keep an eye out for problems. He noted that the feature has some drawbacks:
In addition, he had already encountered problems with file running in environments with non-standard libraries that were loaded using the LD_PRELOAD environment variable. Those libraries can (and do) make system calls that the regular file binary does not make; the system calls were disallowed by the seccomp() filter.
Debian package builds often use FakeRoot (or fakeroot) to run commands so that they appear to have root privileges for filesystem operations, without actually being granted any extra privileges. That is done so that tarballs and the like can be created containing files with owners other than the user ID running the Debian packaging tools, for example. Fakeroot maintains a mapping of the "changes" made to owners, groups, and permissions for files so that it can report those to other tools that access them. It does so by interposing a library ahead of the GNU C library (glibc) to intercept file operations.
To do its job, fakeroot spawns a daemon (faked) that maintains the state of the changes that programs make inside the fakeroot environment. The libfakeroot library that is loaded with LD_PRELOAD then communicates with the daemon via either System V (sysv) interprocess communication (IPC) calls or TCP/IP. Biedl referred to a bug report in his message, where Helmut Grohne had reported a problem with running file inside a fakeroot. The msgget() system call was the cause in that case; Biedl changed the Debian file whitelist to specifically allow that call before his announcement:
There is a workaround for such situations which is disabling seccomp, command line parameter --no-sandbox.
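To make the failure mode concrete, here is a minimal sketch of a whitelist-style seccomp() filter and of what happens when code inside the process makes a call that is not on the list. It is not the filter that file ships. The function names install_whitelist() and msgget_errno_under_filter() are invented for illustration, the three-entry whitelist is far smaller than any real one, and a production filter would also check the architecture field of the seccomp data. The sketch assumes Linux on an architecture with a direct msgget system call, such as x86-64.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allow one system call by number; otherwise fall through to the next rule. */
#define ALLOW(nr) \
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (nr), 0, 1), \
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)

/* Hypothetical helper: install a tiny whitelist. Anything not listed
 * fails with EPERM. (Simplification: the architecture is not checked.) */
static int install_whitelist(void)
{
    struct sock_filter rules[] = {
        /* Load the system call number from the seccomp data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        ALLOW(SYS_read),
        ALLOW(SYS_write),
        ALLOW(SYS_exit_group),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
    };
    struct sock_fprog prog = {
        .len = sizeof(rules) / sizeof(rules[0]),
        .filter = rules,
    };

    /* Required so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical helper: fork a child, filter it, and let it try msgget()
 * the way an injected library might. Returns the errno the child saw,
 * 0 if the call succeeded, or -1 if the setup failed. */
static int msgget_errno_under_filter(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_whitelist() != 0)
            _exit(255);
        /* Key 0 is IPC_PRIVATE; the filter rejects the call before it runs. */
        long r = syscall(SYS_msgget, 0, 0600);
        _exit(r == -1 ? errno : 0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling msgget_errno_under_filter() should return EPERM: the filtered process itself never needed msgget(), but anything loaded into it inherits the same whitelist.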
As it turns out, though, his fix was specific to the sysv IPC mechanism; in order to make it work with TCP/IP, more whitelisting of system calls will be needed, as Grohne pointed out. Furthermore, blocking mechanisms like IPC and networking is just what the filter should be doing; those are the kinds of calls you don't want to make if file is compromised, he said. Instead of playing whack-a-mole with system calls, he suggested checking for the presence of LD_PRELOAD libraries and turning off the sandbox for those cases.
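Grohne's suggestion could look roughly like the following sketch. The helper names preload_requested() and should_enable_sandbox() are hypothetical and do not come from file; the sketch only looks at the LD_PRELOAD environment variable, so it assumes that is the only injection path and does not cover a system-wide /etc/ld.so.preload.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when the dynamic loader was asked, through
 * the environment, to inject extra libraries into this process. */
static bool preload_requested(void)
{
    const char *preload = getenv("LD_PRELOAD");

    return preload != NULL && preload[0] != '\0';
}

/* Hypothetical policy: honor an explicit --no-sandbox request, and also
 * skip the filter when preloaded libraries may make unexpected calls. */
static bool should_enable_sandbox(bool no_sandbox_flag)
{
    if (no_sandbox_flag)
        return false;
    return !preload_requested();
}
```

A caller would evaluate should_enable_sandbox() once at startup, before installing any filter. Printing a warning when the result is false only because of LD_PRELOAD would make the downgrade visible rather than silent.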
That idea did not sit entirely well with Biedl, who was concerned with "silently disabling this security feature in a production system". He thought that perhaps disabling the filter for build environments might be a way forward. Meanwhile, on debian-devel, several people thanked Biedl for enabling the filter, seeing it as a good step toward helping to secure the system. Russ Allbery said:
But Biedl eventually had to deliver some bad news in the thread. He disabled the system-call filtering in file because of the problems it caused:
However, he did point out that Grohne had suggested some ideas for ways to make the sandboxing of file more workable. In the bug report, Grohne said:
Of course, getting there is essentially rewriting the seccomp feature in file. You cannot easily bolt it onto file in the way it currently is.
That is something that will need to be worked out with the upstream project and Biedl said that he plans to do so. There were several suggestions on how to approach the problem in the mailing list thread as well. Colin Watson commiserated with Biedl, reporting on the problems he encountered when adding seccomp() filtering:
At the moment my compromise solution is to reluctantly open up the minimum possible set of syscalls I could find that stopped people sending me bug reports that were in fact caused by something injected from outside my software, and to limit most of that to only those cases where I've detected the relevant LD_PRELOAD wrappers as being present.
The fragility of the seccomp() solution extends to glibc and kernel versions, as Vincent Bernat pointed out. Those kinds of problems could be detected through automated testing, Philipp Kern suggested. Biedl said that it is something he is working on.
In file, we have a strong candidate for hardening, as it parses and handles file data that often has unknown origins—textbook untrusted input, in other words. But actually using seccomp() filtering to reduce its attack surface has not been successful for Debian. In truth, hardening programs that are often used in conjunction with LD_PRELOAD is always going to be difficult to impossible. But even just changing the version of glibc (which can potentially change the system calls it makes) or which kernel the tool is running on can invalidate the carefully crafted whitelist.
The OpenBSD pledge() system call provides a different path. Developers can specify which system calls are allowed, but only in broad categories like stdio (file operations, mostly), inet (IPv4 and IPv6 calls), or proc (process calls, such as fork(), but not including execve(), which is governed by the exec category). By not tying the filtering directly to individual system calls, some of the problems that Linux seccomp() users have encountered can be avoided. It also doesn't hurt that the OpenBSD user space is released in lockstep with the kernel.
For its file utility, OpenBSD systematically reduces the privileges that the tool has with multiple pledge() calls. It starts by disallowing all but a handful of categories after processing the command-line arguments. It then forks a process that executes the child() function, which reduces privileges further, eventually to only have stdio and recvfd. The child reads messages from the parent, each of which includes a file descriptor for a file to be tested. In that way, the code that is most at risk for compromise is only able to perform fairly minimal operations.
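The parent/child split described above can be sketched as follows. This is not OpenBSD's code: send_fd(), recv_fd(), classify(), and the drop_to() macro are names invented here, the "parsing" is reduced to checking for the ELF magic number, and the parent's own pledge() call is left out. On systems other than OpenBSD, drop_to() is a no-op so that the descriptor-passing part can still be compiled and tried.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* pledge() exists only on OpenBSD; elsewhere this sketch uses a no-op. */
#ifdef __OpenBSD__
#define drop_to(promises) pledge((promises), NULL)
#else
#define drop_to(promises) ((void)(promises), 0)
#endif

/* Properly aligned buffer for one descriptor's worth of control data. */
typedef union {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
} fd_ctl;

/* Send an open descriptor over a UNIX socket using SCM_RIGHTS. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    fd_ctl ctl;
    memset(&ctl, 0, sizeof(ctl));
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctl.buf,
                          .msg_controllen = sizeof(ctl.buf) };
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);

    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns -1 if none arrived. */
static int recv_fd(int sock)
{
    char byte;
    int fd = -1;
    fd_ctl ctl;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctl.buf,
                          .msg_controllen = sizeof(ctl.buf) };

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}

/* Confined side: may only receive descriptors and do basic I/O on them. */
static void child(int sock)
{
    char magic[4] = { 0 };

    if (drop_to("stdio recvfd") != 0)
        _exit(2);
    int fd = recv_fd(sock);
    ssize_t n = fd < 0 ? -1 : read(fd, magic, sizeof(magic));
    /* Stand-in for the real format parsing: look for the ELF magic. */
    char verdict = (n == 4 && memcmp(magic, "\177ELF", 4) == 0) ? 'E' : '?';
    _exit(write(sock, &verdict, 1) == 1 ? 0 : 1);
}

/* Privileged side: opens the path itself and hands over only the
 * descriptor. Returns 'E' for ELF, '?' otherwise, -1 on error. */
static int classify(const char *path)
{
    int sv[2];
    char verdict = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(sv[0]);
        child(sv[1]);               /* never returns */
    }
    close(sv[1]);
    int fd = open(path, O_RDONLY);
    int ok = fd >= 0 && send_fd(sv[0], fd) == 0
             && read(sv[0], &verdict, 1) == 1;
    if (fd >= 0)
        close(fd);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return ok ? verdict : -1;
}
```

The point of the design is that the code examining untrusted bytes never needs to open anything by name; it only ever sees descriptors that the parent chose to hand over.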
For Linux, it may well be that seccomp() filtering just isn't suitable for retrofitting onto existing projects. Completely separating the "worrisome" code (file-format parsing for file, for example) from the unavoidable code (e.g. opening files) may provide a path, but also probably means the existing code will have to be rewritten or at least majorly thrashed. The calls that LD_PRELOAD libraries are targeting for interception will likely be in that unavoidable part. Perhaps that could even lead hardened subprocesses to simply use the older, simpler seccomp() mode, as suggested by Grohne. That seems preferable to playing a never-ending game of whack-a-mole.
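As a rough illustration of that last idea, the sketch below runs a stand-in "parser" in a child process that enters the original strict seccomp mode, in which only read(), write(), exit(), and sigreturn() are permitted. The names strict_worker() and sniff_in_strict_child() are hypothetical, and the ELF-magic check merely stands in for real parsing. One assumption worth stating: the kernel refuses strict mode for a process that already has a seccomp filter installed (as can be the case inside some container runtimes), and the sketch reports that case as 'X' rather than failing.

```c
#include <linux/seccomp.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child body: enter strict mode, then use only read(), write(), exit(). */
static void strict_worker(int in, int out)
{
    unsigned char buf[64];
    char verdict;
    int strict = prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) == 0;

    ssize_t n = read(in, buf, sizeof(buf));
    if (!strict)
        verdict = 'X';              /* strict mode was refused */
    else
        verdict = (n >= 4 && memcmp(buf, "\177ELF", 4) == 0) ? 'E' : '?';
    n = write(out, &verdict, 1);
    (void)n;
    /* glibc's _exit() uses exit_group(), which strict mode would kill,
     * so the plain exit system call is invoked directly. */
    syscall(SYS_exit, 0);
}

/* Parent: feed up to 64 bytes to the confined child and collect its
 * one-byte verdict. Returns 'E', '?', 'X', or -1 on error. */
static int sniff_in_strict_child(const void *data, size_t len)
{
    int to_child[2], from_child[2];
    char verdict = 0;

    if (pipe(to_child) != 0 || pipe(from_child) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Close unused ends first: close() is forbidden afterwards. */
        close(to_child[1]);
        close(from_child[0]);
        strict_worker(to_child[0], from_child[1]);
    }
    close(to_child[0]);
    close(from_child[1]);
    int ok = write(to_child[1], data, len) == (ssize_t)len;
    close(to_child[1]);
    ok = ok && read(from_child[0], &verdict, 1) == 1;
    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return ok ? verdict : -1;
}
```

Because strict mode has no whitelist to maintain, there is nothing to update when glibc or the kernel changes; the cost is that all file opening and setup has to happen before the mode is entered or in a separate process.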