7

My Favorite Bits of OSDI/ATC'23

 1 year ago
source link: https://brooker.co.za/blog/2023/07/13/osdi.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

My Favorite Bits of OSDI/ATC'23

Talking to 3D people is cool again.

This week brought USENIX ATC'23 and OSDI'23 together in Boston. While I've followed OSDI and ATC papers for years, it's the first time I've been to either of them (I've have been to NSDI a couple times). It was a really good time. In this post I'll cover a couple of my favorite papers1, and trends I noticed.

Overall, it was great to meet a bunch of folks in person who I've only interacted with online, and nice to be back to in-person conferences.

Thoughts and Trends

  1. When we presented the Firecracker paper at NSDI'20, several people said to me that they were worried about the fact we had chosen Rust, because it raised the risk that Firecracker wouldn't be useful once Rust was no longer in vogue. This year at OSDI, pretty much everybody I talked to was building in Rust. Obvious exceptions are folks doing AI/ML work (Python still seems big there), and folks looking to get into the mainline Linux kernel. I couldn't be more happy to see memory safety start to become the default practice in systems.

  2. Loads of folks were talking about emergent system properties like metastability. Unfortunately, not a lot of folks seem to be writing papers about it, or getting grants to work on it. I did talk to a couple folks with upcoming papers, and I really hope the hallway interest turns into more publications. Metastable failures in distributed systems and Metastable Failures in the Wild are some of the most important systems work of the last few years, in my opinion. There's a lot more to do here.

  3. I got a rough feeling that more papers were paying more attention to security issues than in years past. Subtle issues like timing side-channels especially. Another trend I like to see. Security and systems have always been linked, so this isn't new, but there does seem to be a reduction in completely security-naive work.

Some of the Papers I Enjoyed The Most

  • Take Out the Trache by Audrey Cheng et al2. This paper makes an astute observation about how caches help with latency the most when everything a transaction needs is cached, and so traditional cache eviction strategies don't make the right decisions. They then present new metrics, and a nice design for improving things. Worth reading if you're building any kind of database or distributed cache.
  • VectorVisor by Samuel Ginzburg et al. What if we compiled normal applications to WASM, then ran them on GPUs? And it actually worked? This is the kind of academic systems work I love the most: bold, innovative, and solving a problem that doesn't really exist yet but definitely could in the future.
  • EPF: Evil Packet Filter by Di Jin et al. Operating system kernels like Linux use various internal mechanisms that make it harder to go from kernel bug to working exploit. This paper looks at how useful the current BPF implementation can be for thwarting these mechanisms.
  • Triangulating Python Performance Issues with SCALENE by Emery Berger et al. A selection of cool approaches for profiling CPU, GPU, and memory in Python programs. Emery finished his talk with a tantalizing demo: generating performance patches automatically by combining LLMs with profiler results.

There are many papers I haven't read yet, but have heard good things about. I want to look at MELF, zpoline, Ensō, and vMVCC in more detail.

Amazon's Papers

We presented two papers at ATC this year:

Cloning and Snapshot Safety

A number of papers in the program implemented VM or process cloning, typically for accelerating serverless workloads. This thread of work, related to our own work on Lambda Snapstart, is bound to have a lot of influence over how systems are built in the coming decades. But I was disappointed to see most of these papers not paying attention to some of the uniqueness risks of cloning. As we describe in Restoring Uniqueness in MicroVM Snapshots, naively cloning VMs leads to situations where UUIDs, cryptographic keys, or IVs can be duplicated between clones. I'd love to see folks working on cloning insist on solving this problem in their solutions.

Soapbox

Two things came up that I found extremely disappointing. First, there were a lot of folks who should have been there (especially paper authors) who couldn't get visas to come to the US. It's unacceptable and counterproductive to have a visa policy where folks who are doing cutting-edge research in an economically-critical areas can't trivially travel to the USA.

Second, a group of folks presented the results of the CS Conference Climate & Harassment Survey. I'd recommend reading Dan Ports' post for a summary of the results. In short, 40% of the community have experienced harassment at conferences (not necessarily this conference, or a USENIX conference), and 30% of non-male attendees don't feel welcome. This is unacceptable, and we need to do better3.

Footnotes

  1. These are some of my favorites of the ones I've read, or saw talks for. If you presented a paper and it's not on this list, you can safely assume I haven't had time to check out your excellent work yet.
  2. Great DB work from the folks at UC Berkley? Hard to believe.
  3. I unfortunately missed the dedicated session on this topic, and look forward to attending similar sessions at future conferences.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK