Announcing the FreeBSD/Firecracker platform

source link: https://www.daemonology.net/blog/2022-10-18-FreeBSD-Firecracker.html

The Firecracker Virtual Machine Monitor was developed at Amazon Web Services as a building block for services like AWS Lambda and AWS Fargate. While there are many ways of launching and managing VMs, Firecracker distinguishes itself with its focus on minimalism, which matters both for security (fewer devices means less attack surface) and for reducing startup time; the latter is very important if you're launching VMs on demand in response to incoming HTTP requests. When Firecracker was first released, the only OS which it supported was Linux; six months later, Waldek Kozaczuk ported the OSv unikernel to run on Firecracker. As of a few minutes ago, there are three options: FreeBSD can now run in Firecracker.

I started working on this on June 20th mainly out of curiosity: I had heard that Firecracker had PVH boot support (which was in fact mistaken!) and I knew that FreeBSD could boot in PVH mode from Xen, so I wondered just how hard it would be to get FreeBSD up and running. Not impossible, as it turned out, but a bit more work than I was hoping for.

I had a lot of help from other FreeBSD developers, and I'd like to thank in particular Bryan, Ed, Jessica, John, Kyle, Mark, Roger, and Warner for explaining code to me, helping review my patches, and even writing entirely new code which I needed. Among the changes which went into getting the FreeBSD/Firecracker platform working:

  • The PVH boot mechanism uses an ELF Note to tell the loader where the PVH kernel entry point is located; FreeBSD was using an SHT_NOTE while Firecracker (or rather, linux-loader) was looking only for PT_NOTEs. Once I tracked down the problem this was fixed quite quickly by Ed and Roger.
  • When PVH booting, the loader provides the requested images (kernel, and potentially kernel modules and ramdisk), and also a "start info" structure with metadata needed for the boot process. In Xen, the kernel and modules are loaded into memory first and the start info structure is placed immediately after them; Firecracker places the start info page first and loads the rest later. Very early in the boot process, FreeBSD needs a page of temporary space, and it was using the page immediately after the start info page. Mark and Roger reworked the FreeBSD PVH boot code to use the first page after all of the data provided by the PVH loader; not overwriting important data makes a difference.
  • Firecracker doesn't provide ACPI, instead providing information about CPUs and interrupt controllers via the MPTable interface defined in the historical Intel MultiProcessor Specification. Support for this isn't included in the FreeBSD GENERIC kernel — no matter, I was going to provide a customized "FIRECRACKER" kernel configuration anyway — but Firecracker's implementation had two bugs: It placed the MPTable in the wrong place (above the advertised top of system memory rather than in the last kB) and it set a field containing the number of table entries to zero rather than the appropriate count. In both cases, Linux accepts the broken behaviour; so I added a "bug for bug compatibility" option to the FreeBSD MPTable code.
  • Upon entering userland, the FreeBSD serial console died after printing 16 characters. This bug I recognized, since I ran into it with EC2: The UART is losing an interrupt on the transmission FIFO. Fortunately the FreeBSD kernel still had a workaround in place and setting hw.broken_txfifo="1" fixed that problem.
  • The serial console also couldn't read input — in fact, Firecracker wouldn't read input, and any keypresses stayed in the terminal buffer until after Firecracker exited. This turned out to be due to a bug — or perhaps I should say missing feature — in Firecracker's UART emulation: Firecracker doesn't emulate the FCR (FIFO Control Register), which FreeBSD uses to flush the FIFO. I added code to check if flushing the FIFO via the FCR succeeded, and if not switched to the (slower) approach of reading bytes and discarding them. (Why do we need to flush the FIFO? When the UART is first attached, we write data into it to see how large the buffers are, and then throw away the dummy data.)
  • Firecracker uses Virtio to present virtual devices to guest operating systems; no problem, FreeBSD has Virtio support. Except... FreeBSD discovers Virtio devices via ACPI, which doesn't exist on Firecracker. Instead, Firecracker exposes device parameters (memory-mapped I/O address and interrupt number) via the kernel command line. This took quite a bit of plumbing to handle, not least because FreeBSD interprets the kernel command line as environment variables, and the Virtio MMIO specification calls for devices to be exposed as a series of virtio_mmio.device=... arguments (i.e., with the same "variable name" for each of them). The FreeBSD kernel now handles such duplicate environment variables by appending suffixes, so that we end up with virtio_mmio.device, virtio_mmio.device_1, virtio_mmio.device_2, et cetera, and the Virtio driver looks for those environment variables to create device instances.
  • Most Virtio hosts handle disk I/Os consisting of multiple segments of data; QEMU for example handles 128 segments. Firecracker is more minimalist: It rejects I/Os with more than one segment. This causes problems for FreeBSD with unaligned I/Os from userland, since a buffer which is contiguous in virtual address space might span non-contiguous pages in physical address space. I modified FreeBSD's virtio block device driver to make use of the busdma system, which "bounces" (aka. copies via a buffer) data as needed to comply with alignment (and other) requirements. Now when a Virtio block device only supports single-segment I/Os, if we get an unaligned request we bounce the data.
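
To make the duplicate-variable naming from the Virtio item above concrete, here is a minimal userland sketch in shell. It only illustrates the resulting names (first occurrence unchanged, later ones suffixed _1, _2, ...); it is not the FreeBSD kernel code, the helper name suffix_duplicates is hypothetical, and the device strings are made-up values in a size@address:irq form.

```shell
#!/bin/sh
# Illustrative sketch only: mimics the suffix naming described above.
# suffix_duplicates is a hypothetical helper, not a FreeBSD or Firecracker tool.
suffix_duplicates() {
    printf '%s\n' "$@" | awk '{
        eq = index($0, "=")
        if (eq == 0) { print; next }   # no "=": pass the argument through unchanged
        name  = substr($0, 1, eq - 1)
        value = substr($0, eq + 1)
        n = seen[name]++               # number of earlier arguments with this name
        if (n == 0)
            print name "=" value
        else
            print name "_" n "=" value
    }'
}

# Made-up example arguments, all sharing one "variable name".
suffix_duplicates \
    "virtio_mmio.device=4K@0xd0000000:5" \
    "virtio_mmio.device=4K@0xd0001000:6" \
    "virtio_mmio.device=4K@0xd0002000:7"
```

Run as written, this prints virtio_mmio.device, virtio_mmio.device_1 and virtio_mmio.device_2, each with its own value, which is the shape of environment a driver can then walk to create one device instance per entry.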

Now that FreeBSD supported Firecracker, there was one more thing to do: Make Firecracker support FreeBSD. I mentioned earlier that I mistakenly thought that Firecracker supported PVH booting; as it turned out, Alejandro Jimenez contributed patches two years ago, but they were never merged. Some of his code ended up in the linux-loader project (which Firecracker uses); but I spent a few weeks digging through his thousand lines of changes to figure out which went into linux-loader, which still applied cleanly to Firecracker, and which I had to rewrite from scratch — a task made more difficult by the fact that Firecracker is written in Rust, and I had never used Rust before! Nevertheless, I was eventually successful, and opened a PR with updated patches which I hope to see merged into mainline Firecracker in the upcoming weeks.

How to try FreeBSD/Firecracker

To try FreeBSD on Firecracker, you'll need to build a FreeBSD amd64 FIRECRACKER kernel, and build Firecracker with my patches:

  • To build the FreeBSD kernel (on a FreeBSD system):
    # git clone https://git.freebsd.org/src.git /usr/src
    # cd /usr/src && make buildkernel TARGET=amd64 KERNCONF=FIRECRACKER
    
    and the built kernel will be found in /usr/obj/usr/src/amd64.amd64/sys/FIRECRACKER/kernel.
  • To build Firecracker with PVH boot support (on a Linux system):
    # git clone -b pvh-v3 https://github.com/cperciva/firecracker.git 
    
    and follow the instructions in the getting started documentation to build from source.

You'll probably also want to build a disk image so that FreeBSD has something to boot from; place vfs.root.mountfrom=ufs:/dev/vtbd0 into Firecracker's boot_args to tell FreeBSD to use the disk you attach (aka. the first Virtio block device) as the root disk. If there's significant community interest in experimenting with FreeBSD/Firecracker, I'll provide a prebuilt FreeBSD kernel, FreeBSD root disk, and Firecracker binary so people can skip the process of building these themselves.
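
As a rough sketch of where that boot_args setting goes, the script below writes a Firecracker JSON configuration file. The file names (freebsd-kernel, freebsd-rootfs.img, freebsd-vm.json), the drive_id, and the CPU and memory sizes are all placeholders chosen for illustration. Setting is_root_device to false is an assumption, made so that Firecracker does not add Linux-style root arguments of its own. Whether the patched branch needs any further settings is not covered here.

```shell
#!/bin/sh
# Sketch under stated assumptions: every path and size below is a placeholder.
KERNEL=${KERNEL:-./freebsd-kernel}       # the FIRECRACKER kernel, copied to the Linux host
ROOTFS=${ROOTFS:-./freebsd-rootfs.img}   # a FreeBSD disk image with a UFS root filesystem
CONFIG=${CONFIG:-./freebsd-vm.json}      # where to write the configuration

# boot_args tells FreeBSD to mount the first Virtio block device as root.
# is_root_device=false is an assumption, not something the article specifies.
cat > "$CONFIG" <<EOF
{
  "boot-source": {
    "kernel_image_path": "$KERNEL",
    "boot_args": "vfs.root.mountfrom=ufs:/dev/vtbd0"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "$ROOTFS",
      "is_root_device": false,
      "is_read_only": false
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 512
  }
}
EOF

echo "Wrote $CONFIG; to boot: firecracker --no-api --config-file $CONFIG"
```

The script only writes the file and prints the command, so it is safe to run before the kernel and disk image exist.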

Have fun!

Posted at 2022-10-18 06:05
