The Swift Runtime: Class Metadata

Welcome to the fifth in a series of posts on the Swift runtime. The goal is to go over the functions of the Swift runtime, using what I learned in my Swift on Mac OS 9 project as a reference. Last time we finished talking about how the metadata for structs and enums gets set up; this time we’re going to talk about classes.

As mentioned previously, I implemented my stripped-down runtime in Swift as much as possible, though I had to use a few undocumented Swift features to do so. I’ll be showing excerpts of my runtime code throughout these posts, and you can check out the full thing in the ppc-swift repository.

Structs and classes

One way in which Swift differs from some of its contemporaries (Rust, Go, Kotlin) is that it makes a distinction between structs and classes. The biggest difference is that struct instances are passed around by value and class instances by reference.[1] But both structs and classes can have stored properties, declare methods, and conform to protocols. There are just a few important ways that structs and classes differ:[2]

Because class instances aren’t implicitly copied when you do an assignment or call a function, they can have deinitializers to clean up resources. (Move-only value types will also be able to do this.)
Because class instances are allocated in one place and stay there for their whole life, their address in memory can be used to uniquely identify them, at least while they’re alive. (Raw pointers work like this too.)
Because class instances carry their type with them, they can exhibit polymorphic behavior without using generics. In practice this means “you can inherit from another class and override its methods”. (Protocol-typed values work like this too.)

You can see that you can build something very much like classes out of the component parts of move-only types, pointers, and protocol-typed values (what Rust would spell Arc<dyn View>), but Swift still found it useful to bundle all that behavior together, especially since it had to support interoperation with Objective-C from the get-go.
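The three class-only capabilities listed above (deinitializers, identity, and overriding) can be seen together in a small sketch. All of the type names here are made up for illustration; nothing in it comes from the runtime.

```swift
// Hypothetical example types, used only to illustrate the three points above.
final class EventLog {
  var events: [String] = []
}

class Resource {
  let name: String
  let log: EventLog

  init(name: String, log: EventLog) {
    self.name = name
    self.log = log
  }

  // 1. Deinitializer: runs once, when the last reference goes away.
  deinit { log.events.append("deinit \(name)") }

  // 3. Polymorphism: subclasses may override this.
  func describe() -> String { "Resource \(name)" }
}

final class FileResource: Resource {
  override func describe() -> String { "File \(name)" }
}

func demo() -> (sameObject: Bool, description: String, events: [String]) {
  let log = EventLog()
  var sameObject = false
  var description = ""
  do {
    let a: Resource = FileResource(name: "a.txt", log: log)
    let b = a  // copies the reference, not the instance
    // 2. Identity: both references point at the same allocation.
    sameObject = a === b
    // Dispatches to the override even though the static type is Resource.
    description = b.describe()
  }
  // Both references are gone by now, so the deinitializer has run.
  return (sameObject, description, log.events)
}

print(demo())
```

Note that the instance is only deinitialized once, no matter how many references to it were made along the way.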

On the implementation side, there are a number of differences that fall out of this design (and from some other implementation choices in the language). Let’s take a look.

The structure of class metadata

Remember how simple struct metadata was?

Struct metadata
  value witness table
  kind*
  type descriptor

* remember, the value witness table pointer is stored before the table; the “kind” field is the “first” value, i.e. the value at offset 0.

Yeah, classes ain’t so simple.

Class metadata
  destroyer
  value witness table
  kind*
  superclass
  ObjC method cache
  ObjC method cache
  ObjC-compatible data
  flags
  instance “address point”
  instance size
  instance align mask + some reserved bits
  class size
  class “address point”
  type descriptor
  ivar destroyer
  (methods and generic args)
  (methods and generic args)
  …

* “kind” is still at offset 0.
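As a rough reading aid for the diagram above, here is a hypothetical Swift struct that mirrors the fixed-size fields from "kind" onward. The struct name and the field widths (32-bit flags and sizes, a 16-bit align mask next to 16 reserved bits) are my assumptions for illustration, not definitions from the runtime. Swift also doesn't promise a struct layout, so the offsets only reflect what current compilers do with declaration order.

```swift
// Illustrative only: a guess at the "positive offset" fields of class metadata.
// The destroyer and value witness table sit *before* this, at negative offsets.
struct ClassMetadataHeaderSketch {
  var kind: UInt                        // offset 0; a metaclass pointer on Apple platforms
  var superclass: UnsafeRawPointer?     // nil for root classes
  var objcCache0: UnsafeRawPointer?     // unused by Swift
  var objcCache1: UnsafeRawPointer?     // unused by Swift
  var objcCompatibleData: UInt          // unused by Swift
  var flags: UInt32
  var instanceAddressPoint: UInt32
  var instanceSize: UInt32
  var instanceAlignMask: UInt16
  var reserved: UInt16                  // the "some reserved bits" part
  var classSize: UInt32
  var classAddressPoint: UInt32
  var typeDescriptor: UnsafeRawPointer?
  var ivarDestroyer: UnsafeRawPointer?
  // ...followed by a variable number of method pointers and generic arguments.
}

let word = MemoryLayout<UnsafeRawPointer>.size
typealias Header = ClassMetadataHeaderSketch

print("kind at", MemoryLayout<Header>.offset(of: \.kind) ?? -1)
print("superclass at", MemoryLayout<Header>.offset(of: \.superclass) ?? -1)
print("flags at", MemoryLayout<Header>.offset(of: \.flags) ?? -1)
print("type descriptor at", MemoryLayout<Header>.offset(of: \.typeDescriptor) ?? -1)
print("fixed part is", MemoryLayout<Header>.size, "bytes with", word, "byte words")
```

Under these assumptions the five pointer-sized fields come first, then 24 bytes of 32-bit and 16-bit fields, then two more pointers.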

There’s a lot going on there! What is all this stuff? Why can’t we get away with storing all the interesting stuff in the type descriptor again? Why do we need any of this? *takes a deep breath* Okay, let’s go through it step by step:

The destroyer calls the deinitializer and then deallocates the class’s memory. We talked about it in the first post in this series.
We still haven’t talked in depth about value witness tables, and we still aren’t going to, but every class has the same one, since all “values” used to manipulate classes are just references, and all object references behave the same.[3]
Every metadata has a kind. As noted in the first post in this series, the kind for classes is carefully chosen not to overlap with any valid addresses…except on modern Apple platforms, where it’s replaced by a pointer to a “metaclass” object for compatibility with Objective-C.
Classes can have superclasses! And if they don’t have one, the field is nil.
The next three fields are for compatibility with Objective-C. Swift does not use them for anything, and in fact they’ve been removed upstream for non-Apple platforms, only a few weeks after I cut my own branch for this project. (Thanks to Alejandro for the tip.)
The flags are, well, flags, but I didn’t need any of the info they store in my runtime.
The “address point” of an instance specifies whether any fields should be allocated before the object’s metadata pointer when the class is instantiated. Why would you want to do this? Well, it would mean that even when you subclassed a class, you’d still be able to reference “the first ‘negative’ field” without having to know how big the superclass is.

Swift currently does not implement this (i.e. the field is always 0), so I didn’t worry about it in my own runtime, but in theory the real compiler and runtime could start using it.
The size and alignment of instances has to be stored in the metadata because the type layout in the value witness table is the layout of a reference, not the layout of the actual class instance. Since you can’t have really big alignments anyway, some of the bits in this field are reserved for the runtime to store arbitrary data. For us that’s unused.
The class metadata also has a size and address point, which are much the same as for instances. Instead of stored properties, though, we’re counting methods and generic arguments that need to get stored in the class.

This isn’t actually used for much, because it’s not often that you need to know how much memory the class metadata itself takes up. It’s used by the Objective-C runtime for dynamic subclassing, because dynamic subclasses are also going to expect the Swift methods and generic arguments to be there, and by reflection that wants to read the entire metadata. Neither of these is relevant to my runtime.
We talked about type descriptors when we talked about struct metadata: they have information on how to instantiate generic classes, and also store extra metadata for both generic and non-generic classes.
The ivar destroyer is used in the special case of a failable initializer that fails before calling super.init but after a subclass has been initialized. In that case, the subclass’s fields have already been initialized, but running the full deinitializer wouldn’t be safe. (I didn’t actually implement support for this—it uses the runtime function swift_deallocPartialClassInstance—but it’s not complicated.)
And finally we’ve got methods and generic arguments: first the generic args of the root class, if any, then the methods, then the generic args of the first-level subclass, then the methods, and so on. The methods form a vtable, or virtual dispatch table, and so overrides are implemented by replacing a method pointer in the “superclass’s section” of methods.
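To make the earlier item about size and alignment concrete: from the value's point of view, every class looks like one pointer, which is why the instance layout has to live somewhere else. The two classes below are hypothetical, but MemoryLayout is the standard library's own API.

```swift
// Two made-up classes with very different amounts of stored data.
final class Small {
  var a: UInt8 = 0
}

final class Big {
  var a: (Int64, Int64, Int64, Int64) = (0, 0, 0, 0)
  var b: (Int64, Int64, Int64, Int64) = (0, 0, 0, 0)
}

// MemoryLayout describes the *reference*, so both report the size of a pointer.
let pointerSize = MemoryLayout<UnsafeRawPointer>.size
let smallReferenceSize = MemoryLayout<Small>.size
let bigReferenceSize = MemoryLayout<Big>.size

print(pointerSize, smallReferenceSize, bigReferenceSize)
```

The real instance sizes differ, of course, and that is the information the "instance size" and "instance align mask" fields carry.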

(My colleague David Smith has commented that it’s odd that Swift made method calls so efficient—an offset lookup in the class metadata—and then turned around and endorsed patterns that didn’t involve class hierarchies. Making method dispatch as fast as C++’s is a constraint that cuts off some interesting ideas.)
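Here is a toy model of that vtable arrangement, assuming closures in an array stand in for method pointers in the metadata. It leaves out the generic arguments that are interleaved with the methods in the real layout, and every name in it is hypothetical.

```swift
// A toy model of a vtable, not the runtime's real data structure.
typealias Method = () -> String

struct ToyClassMetadata {
  var name: String
  var vtable: [Method]
}

// A subclass starts with a copy of the superclass's section, replaces the
// slots it overrides, and appends its own new methods at the end.
func makeSubclass(
  of parent: ToyClassMetadata,
  name: String,
  overrides: [Int: Method],
  newMethods: [Method]
) -> ToyClassMetadata {
  var vtable = parent.vtable
  for (slot, implementation) in overrides {
    precondition(vtable.indices.contains(slot), "can only override an inherited slot")
    vtable[slot] = implementation
  }
  vtable.append(contentsOf: newMethods)
  return ToyClassMetadata(name: name, vtable: vtable)
}

// Root class: speak() lives in slot 0 and move() in slot 1.
let animal = ToyClassMetadata(name: "Animal", vtable: [
  { "generic noise" },
  { "walks" },
])

// Subclass: overrides speak() and adds fetch() in slot 2.
let dog = makeSubclass(
  of: animal,
  name: "Dog",
  overrides: [0: { "woof" }],
  newMethods: [{ "fetches" }])

// A call is just an index into the table at a slot known at compile time.
let speakSlot = 0
print(animal.vtable[speakSlot](), dog.vtable[speakSlot]())
```

Because inherited slots keep their positions, a caller that only knows about the superclass can use the same slot number for any subclass.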

Honestly this tour of class metadata is probably more informative than the runtime functions associated with classes, but we’ll go through those too.

Allocating generic class metadata

Classes, structs, and enums all use the same entry point for accessing possibly-cached metadata, swift_getGenericMetadata. But when it comes to actually allocating that metadata, the needs are different for classes. It starts out with a fairly familiar pattern:

@_cdecl("swift_allocateGenericClassMetadata")
func swift_allocateGenericClassMetadata(
  _ rawDescription: TypeErasedPointer<ClassDescriptor>,
  _ arguments: UnsafePointer<UnsafeRawPointer>,
  _ rawPattern: TypeErasedPointer<GenericClassMetadataPattern>
) -> TypeErasedPointer<ClassMetadata> {
  let description = rawDescription.assumingMemoryBound(to: ClassDescriptor.self)
  let pattern = rawPattern.assumingMemoryBound(to: GenericClassMetadataPattern.self)

  let allocationBounds = description[].metadataBounds

The very first thing we need to know is the “bounds” of the metadata, which in this case means its negative and positive extents. This will give us the total size we need to allocate, as well as where to put the “zero offset” pointer that we’ll eventually end up returning. I’ve factored that out into a helper property on ClassDescriptor:

var metadataBounds: ClassMetadataBounds {
  let immediateMembersOffsetInWords =
    self._.metadataPositiveSizeInWords &- self._.numImmediateMembers
  let immediateMembersOffset =
    Int(immediateMembersOffsetInWords) &* MemoryLayout<Int>.size
  return ClassMetadataBounds(
    negativeSizeInWords: self._.metadataNegativeSizeInWords,
    positiveSizeInWords: self._.metadataPositiveSizeInWords,
    immediateMembersOffset: immediateMembersOffset)
}

We’ll come back to immediateMembersOffset soon. For now, this gives us enough info to actually call the allocator.

let bytes = swift_slowAlloc(
  size: allocationBounds.totalSizeInBytes,
  alignMask: MemoryLayout<UnsafeRawPointer>.alignment &- 1)
let rawMetadata = (bytes + allocationBounds.addressPointOffsetInBytes)
let metadata = rawMetadata.bindMemory(to: ClassMetadata.self, capacity: 1)
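The excerpt above leans on totalSizeInBytes and addressPointOffsetInBytes, which aren't shown here. A minimal sketch of what they plausibly compute follows, using a stand-in BoundsSketch type of my own and the standard allocator instead of swift_slowAlloc. It also shows why the returned pointer is offset: fields stored at negative offsets land inside the same allocation.

```swift
// Hypothetical stand-in for ClassMetadataBounds; the bodies are assumptions.
struct BoundsSketch {
  var negativeSizeInWords: UInt32
  var positiveSizeInWords: UInt32

  // Everything before and after the address point, in bytes.
  var totalSizeInBytes: Int {
    (Int(negativeSizeInWords) + Int(positiveSizeInWords)) * MemoryLayout<Int>.size
  }

  // How far into the allocation "offset 0" of the metadata sits.
  var addressPointOffsetInBytes: Int {
    Int(negativeSizeInWords) * MemoryLayout<Int>.size
  }
}

func roundTrip() -> (before: Int, atZero: Int) {
  // Two negative words (destroyer, value witness table) and ten positive ones.
  let bounds = BoundsSketch(negativeSizeInWords: 2, positiveSizeInWords: 10)
  let bytes = UnsafeMutableRawPointer.allocate(
    byteCount: bounds.totalSizeInBytes,
    alignment: MemoryLayout<UnsafeRawPointer>.alignment)
  defer { bytes.deallocate() }

  let addressPoint = bytes + bounds.addressPointOffsetInBytes

  // One word before the address point, and the word at the address point.
  addressPoint.storeBytes(of: 111, toByteOffset: -1 * MemoryLayout<Int>.size, as: Int.self)
  addressPoint.storeBytes(of: 222, toByteOffset: 0, as: Int.self)

  // Seen from the start of the allocation, the "-1" word is the second word.
  let before = bytes.load(fromByteOffset: MemoryLayout<Int>.size, as: Int.self)
  let atZero = addressPoint.load(as: Int.self)
  return (before, atZero)
}

print(roundTrip())
```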

Aside: Wait, what happens if the superclass size changes? Won’t these values be invalidated? Indeed they will, but in Swift that’s a change that requires recompiling clients unless the base library is built with library evolution support (or unless one of the class’s ancestors comes from Objective-C). My Swift-on-Classic runtime doesn’t support that, so I just left out all of that logic.

Now that we have our allocated metadata, we’ll start by filling in the “negative offset” fields:

rawMetadata.storeBytes(
  of: pattern.destroyFn[],
  toByteOffset: -2 &* MemoryLayout<Int>.size,
  as: Optional<UnsafeRawPointer>.self)
rawMetadata.storeBytes(
  of: swift_getObjectValueWitnessTable(),
  toByteOffset: -1 &* MemoryLayout<Int>.size,
  as: UnsafePointer<ValueWitnessTable>.self)

And then the regular fields:

metadata[]._.base.rawKind = TypeMetadata.Kind.class.rawValue
metadata[]._.superclass = nil
// This is an "is Swift" bit that isn't really needed on non-ObjC platforms.
metadata[]._.objcCompatibleData = 1
metadata[]._.flags = pattern[]._.classFlags

// Layout, filled in later.
metadata[]._.instanceAddressPoint = 0
metadata[]._.instanceSize = 0
metadata[]._.instanceAlignMask = 0

metadata[]._.classSize = UInt32(bounds.totalSizeInBytes)
metadata[]._.classAddressPoint = UInt32(bounds.addressPointOffsetInBytes)

metadata[]._.description = description
metadata[]._.ivarDestroyer = pattern.ivarDestroyer?[]

As you can see, there’s not much going on there! Most of the information is either copied directly from what we already have, or gets a dummy value to be filled in later. Is that really all we have to do?

Well, not quite. Turns out I skipped over two things from before: the “extra data” pattern that’s also in struct metadata, and an additional data pattern that’s used for the class’s “immediate” members: the members that can be overridden but are not themselves overrides. The former actually adds on to the allocation size of the class; the latter is already included. So the first part of swift_allocateGenericClassMetadata actually looks like this:

let bounds = description[].metadataBounds
var allocationBounds = bounds
if let extraDataPattern = pattern.extraDataPattern {
  allocationBounds.positiveSizeInWords &+=
    UInt32(extraDataPattern[]._.offsetInWords) &+
    UInt32(extraDataPattern[]._.sizeInWords)
}

And once we have our metadata, we need to initialize it from those patterns:

if let extraDataPattern = pattern.extraDataPattern {
  // Note: not using allocationBounds here
  let extraDataOffset =
    Int(bounds.positiveSizeInWords) &* MemoryLayout<Int>.size
  (rawMetadata + extraDataOffset).initialize(from: extraDataPattern)
}

let immediateMembers = rawMetadata + bounds.immediateMembersOffset
memset(
  immediateMembers,
  0,
  Int(description[]._.numImmediateMembers) &* MemoryLayout<Int>.size)
if let immediateMembersPattern = pattern.immediateMembersPattern {
  immediateMembers.initialize(from: immediateMembersPattern)
}
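The initialize(from:) helper used above isn't shown in this post. The sketch below is one guess at the idea, modeled with arrays rather than raw memory: zero the section, then copy the pattern's words in at its offset. PartialPatternSketch and both functions are hypothetical, and treating offsetInWords as relative to the start of the section being filled is an assumption.

```swift
// Hypothetical model of a partial pattern. The real pattern types refer to
// their data through a pointer instead of holding an array.
struct PartialPatternSketch {
  var offsetInWords: Int
  var words: [UInt]
}

// Mirrors the allocation-size adjustment: extra data grows the positive size
// by the pattern's offset plus its length.
func allocationWords(basePositiveWords: Int, extraData: PartialPatternSketch?) -> Int {
  guard let extraData = extraData else { return basePositiveWords }
  return basePositiveWords + extraData.offsetInWords + extraData.words.count
}

// Mirrors the memset-then-copy sequence used for the immediate members.
func fillSection(wordCount: Int, from pattern: PartialPatternSketch?) -> [UInt] {
  var section = [UInt](repeating: 0, count: wordCount)
  if let pattern = pattern {
    precondition(
      pattern.offsetInWords + pattern.words.count <= wordCount,
      "pattern must fit inside the section")
    for (index, word) in pattern.words.enumerated() {
      section[pattern.offsetInWords + index] = word
    }
  }
  return section
}

let pattern = PartialPatternSketch(offsetInWords: 1, words: [0xAA, 0xBB])
print(allocationWords(basePositiveWords: 10, extraData: pattern))
print(fillSection(wordCount: 5, from: pattern))
```

Zeroing first means any word the pattern doesn't cover ends up as a well-defined null rather than leftover allocator contents.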

Okay, now we’ve got everything initialized. There’s one last thing to do, and it’s the same as for structs: store the generic arguments in the metadata.

installGenericArguments(
  in: rawMetadata.assumingMemoryBound(to: TypeMetadata.self),
  at: bounds.immediateMembersOffset,
  description: rawDescription.assumingMemoryBound(to: TypeContextDescriptor.self),
  from: arguments)
return UnsafeRawPointer(rawMetadata)

We saw installGenericArguments(in:at:description:from:) back in the third entry in this series. Nothing’s changed, except that the offset of the generic arguments is based on the particular class’s bounds. As noted, it goes at the beginning of the “immediate members” section.
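For readers who haven't seen that entry, a rough sketch of the idea: compute where the immediate members start, then copy the argument pointers there one word at a time. The function names and the explicit count parameter are hypothetical (the real helper presumably reads the count from the descriptor), and the fake addresses only stand in for type metadata pointers.

```swift
// Same arithmetic as the metadataBounds helper earlier in this post.
func immediateMembersOffset(positiveSizeInWords: UInt32, numImmediateMembers: UInt32) -> Int {
  Int(positiveSizeInWords &- numImmediateMembers) &* MemoryLayout<Int>.size
}

// Hypothetical simplification of installGenericArguments: one pointer per word.
func installArgumentsSketch(
  in metadata: UnsafeMutableRawPointer,
  at byteOffset: Int,
  count: Int,
  from arguments: UnsafePointer<UnsafeRawPointer>
) {
  for index in 0..<count {
    metadata.storeBytes(
      of: arguments[index],
      toByteOffset: byteOffset + index * MemoryLayout<UnsafeRawPointer>.size,
      as: UnsafeRawPointer.self)
  }
}

func demoInstall() -> Bool {
  let wordSize = MemoryLayout<Int>.size
  // Made-up class: 14 positive words, the last 4 of which are its own members.
  let metadata = UnsafeMutableRawPointer.allocate(
    byteCount: 14 * wordSize,
    alignment: MemoryLayout<Int>.alignment)
  defer { metadata.deallocate() }
  let offset = immediateMembersOffset(positiveSizeInWords: 14, numImmediateMembers: 4)

  // Stand-ins for two generic argument metadata pointers; never dereferenced.
  let fakeArguments: [UnsafeRawPointer] = [
    UnsafeRawPointer(bitPattern: 0x1000)!,
    UnsafeRawPointer(bitPattern: 0x2000)!,
  ]
  fakeArguments.withUnsafeBufferPointer { buffer in
    installArgumentsSketch(
      in: metadata, at: offset, count: buffer.count, from: buffer.baseAddress!)
  }

  // Read them back from the start of the immediate members section.
  let first = metadata.load(fromByteOffset: offset, as: UnsafeRawPointer.self)
  let second = metadata.load(fromByteOffset: offset + wordSize, as: UnsafeRawPointer.self)
  return first == fakeArguments[0] && second == fakeArguments[1]
}

print(demoInstall())
```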

With that, our class is allocated…but we can hardly say it’s ready to use.

Wrap-up

This post is getting a bit long, so I’m going to call it here, even though we only went through one function. Next time we’ll look at the other half of class metadata initialization, which is mostly about filling in data from the superclass.

[1] This isn’t quite the same as “value semantics” vs. “reference semantics”, which is a whole talk in and of itself. In fact, Alexis Gallagher gave such a talk in 2016, so you can check that out if you’re interested.
[2] I made an exploratory diagram about this a while back.
[3] This is not quite true when Swift has to interoperate with Objective-C! Because not all Objective-C objects are represented as pointers, there are slightly different rules for what the runtime can do with them. I’m not going to go into that in detail here, but the main thing is that there are more spare bits for Swift object references, and that lets them get packed more tightly into certain enums.

This entry was posted on September 29, 2020 and is filed under Technical. Tags: Swift, Swift runtime
