[RFC] MDL: A Micro-Architecture Description Language for LLVM

TL;DR:

We’ve created a DSL and compiler for modeling micro-architecture that handles a very broad class of architectures - CPUs, GPUs, VLIWs, DSPs, ML accelerators, and embedded devices. This effort grew out of a need to quickly develop and experiment with high-quality compilers and tools to facilitate rapid architecture exploration. We named the DSL “MDL” for “Microarchitecture Description Language”.

While significantly more expressive than the TableGen Schedules and Itineraries used in LLVM, MDL is also more concise and simpler to read and write, and it supports a much broader class of embedded and accelerator architectures. We can currently _generate_ MDL descriptions automatically for all upstream targets; in many cases these are 1/10 the size of the equivalent TableGen descriptions. We’ve integrated this with LLVM, and are sending out this RFC because we believe it could be valuable to the larger LLVM community.

The MDL compiler, associated tools, and documentation are available as open source (in the work branch of MPACT-ORG/llvm-project on GitHub). We would like to explore adding them to the LLVM project, and we encourage contributions from others.

Background

Over the last few years, we have been using LLVM to develop a compiler backend for Google’s TPU machine learning accelerators. TPUs have complex microarchitectures and pose a number of challenges that are not seen in typical LLVM targets:

- Clustered VLIW with partitioned register files
- Extremely deep pipelines with complex hazard conditions
- Instructions with functional-unit-specific and/or cluster-specific behaviors
  - Non-trivial and/or instance-specific latencies
  - Complex resource usage
  - Functional-unit-specific register constraints
- Shared/allocated encoding resources (instructions need 1…M of N resources)
- Explicitly managed hardware resources (register ports, internal datapaths, busses, etc.)
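
To make the “1…M of N” constraint concrete, here is a minimal C++ sketch of the check a bundle packer has to perform against a pooled resource. It is only an illustration: `ResourcePool`, `tryReserve`, and `packBundle` are hypothetical names invented for this example, not part of MDL or of any LLVM API, and the greedy packing order is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a pooled encoding resource: N identical slots
// are available per bundle, and each instruction claims some of them.
struct ResourcePool {
  std::size_t capacity; // N: slots available per bundle
  std::size_t used;     // slots already claimed in the current bundle

  explicit ResourcePool(std::size_t n) : capacity(n), used(0) {}

  // Try to reserve `count` slots; the pool is left unchanged on failure.
  bool tryReserve(std::size_t count) {
    assert(used <= capacity && "pool invariant violated");
    if (count > capacity - used)
      return false;
    used += count;
    return true;
  }
};

// Greedily packs instructions, given as their slot demands, into one
// bundle and returns the indices of the instructions that fit.
std::vector<std::size_t> packBundle(const std::vector<std::size_t> &demands,
                                    std::size_t poolSize) {
  ResourcePool pool(poolSize);
  std::vector<std::size_t> packed;
  for (std::size_t i = 0; i < demands.size(); ++i)
    if (pool.tryReserve(demands[i]))
      packed.push_back(i);
  return packed;
}
```

With a pool of 4 slots and demands of 2, 3, 1, and 1, the second instruction is rejected because only 2 slots remain, while the other three fit. A real target layers many such pools on top of latency and register constraints, which is where hand-written C++ of this kind tends to grow quickly.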

While some of these problems manifest in a few upstream targets, this collection of problems is a superset of the problems directly addressed by LLVM - Schedules and Itineraries are simply not sufficient to model everything. Supporting this class of architecture is therefore code-intensive - it takes around 20,000 lines of C++ code to model the TPU sub-targets. Such code is brittle and hard to write, debug, test, and evolve over time. In contrast, the MDL description for these sub-targets is ~2,000 lines of text.

Status

We’ve created the MDL language and compiler for describing microarchitecture details, a methodology for integrating it with TableGen files for any target, and a set of APIs that can be used in a machine-independent way to inform back-end passes such as bundle-packing, instruction scheduling, and register allocation.
To facilitate integration with LLVM, we built a tool that scrapes architectural information from TableGen files and produces MDL descriptions for all upstream targets.
We’ve modified the CodeGen and MC libraries to (optionally) use our methodology for latency management.

There is a lot more to do. For example, we plan to enhance existing back-end scheduling passes and register allocation passes to cleanly handle a larger class of embedded and accelerator architectures, based on MDL-generated information.

We welcome feedback on the language design, the associated tools, and the usage model. You can find the MDL design documentation in our GitHub repo in llvm/docs/Mdl.

-Reid
