
Chromium M105 launched with Control Flow Integrity - Operating Systems blog - Ar...

 1 year ago
source link: https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/control-flow-integrity

Enhancing Chromium's Control Flow Integrity with Armv9


The Android and Browser Enablement Team is pleased to mark the late-August release of Chromium M105 – the first with Arm’s Pointer Authentication (PAC) and Branch Target Identification (BTI) features fully enabled. M105 marks the end of over three years of work by Arm to bring these industry-leading control flow integrity technologies to Chromium and is an important security boost for customers using devices built on the Armv9 architecture like the Samsung Galaxy S22 and Vivo X80 5G.

What is Control Flow Integrity (CFI) and why is it important?

Android-powered devices store our most sensitive personal data. Security is therefore a top priority for Arm and this extends through all levels of the system, including user-space applications like Chromium. Alongside protecting our photos, passwords, and bank details, we also expect Chromium to safely download and execute untrusted content from all over the Internet, which it does using industry-standard techniques like sandboxing.

While Chromium is rightfully recognised for its strong security, all software is built on abstraction and the implicit promise that execution matches the developer’s intent: that the right subroutines are executed entirely, in the intended context, and in the correct order. Jumping or returning to an arbitrary location lets an attacker chain together fragments of previously innocuous code, which can then lead to more serious attacks against the kernel or Android’s system services. Making these attacks more difficult is why Arm has developed the architecture, hardware, and software behind PAC and BTI.

How do PAC and BTI work?

Return-oriented programming (ROP) attacks occur when an attacker overwrites the return address stored in the current stack frame. Pointer authentication helps prevent this attack and protects the backward edge by cryptographically signing the link register with a unique per-process key, using a PACIASP instruction, before saving it to the stack (see figure 1). When a function returns, PAC authenticates the restored link register with a matching AUTIASP instruction before returning to it. Arm calls this specific use of PAC to protect the return address ‘PAC-ret’.


Figure 1: A basic function protected with pointer authentication instructions

While PAC is a good improvement, it only protects the backward edge. That’s where BTI comes in. As an example, consider a function like this:

void do_something(int message, void *payload) {
    switch(message) {
        case 0:
            handle_type_0(payload);
        case 1:
            handle_type_1(payload);
        case 2:
            handle_type_2(payload);
        case 3:
            handle_type_3(payload);
        default:
            assert("shouldn't be here");
    }
}

This might compile to something like this with the -mbranch-protection=pac-ret compiler flag:

do_something(int, void*): 
        cmp     w0, #3
        b.hi    .LBB0_6
        paciasp 
        stp     x29, x30, [sp, #-32]!           
        str     x19, [sp, #16]                  
        mov     x29, sp
        mov     w8, w0
        adrp    x9, .LJTI0_0
        add     x9, x9, :lo12:.LJTI0_0
        mov     x19, x1
        adr     x10, .LBB0_2
        ldrb    w11, [x9, x8]
        add     x10, x10, x11, lsl #2
        br      x10
.LBB0_2:
        mov     x0, x19
        bl      handle_type_0(void*)
.LBB0_3:
        mov     x0, x19
        bl      handle_type_1(void*)
.LBB0_4:
        mov     x0, x19
        bl      handle_type_2(void*)
.LBB0_5:
        mov     x0, x19
        ldr     x19, [sp, #16]                  
        ldp     x29, x30, [sp], #32             
        autiasp
        b       handle_type_3(void*)
.LBB0_6:
        ret

If an attacker can achieve a small out-of-bounds write (say, by corrupting a C++ vtable and branching to it via a use-after-free), they could land in an arbitrary place in this function. This means that they could sign a bad link register by landing on the PACIASP, which also conveniently skips the range check: if the attacker has the right register value in x0, they can continue the attack by jumping straight to the function return. BTI mitigates this kind of attack by requiring that all indirect branches taken from registers land on a BTI or PACIASP landing pad instruction, which signals that the location is intended as a valid function start or other branch target. With BTI (enabled via the -mbranch-protection=standard compiler flag), the function now looks like this:

do_something(int, void*):                    // @do_something(int, void*)
        bti     c
        cmp     w0, #3
        b.hi    .LBB0_6
        paciasp 
        stp     x29, x30, [sp, #-32]!           // 16-byte Folded Spill
        str     x19, [sp, #16]                  // 8-byte Folded Spill
        mov     x29, sp
        mov     w8, w0
        adrp    x9, .LJTI0_0
        add     x9, x9, :lo12:.LJTI0_0
        mov     x19, x1
        adr     x10, .LBB0_2
        ldrb    w11, [x9, x8]
        add     x10, x10, x11, lsl #2
        br      x10
.LBB0_2:
        bti     j
        mov     x0, x19
        bl      handle_type_0(void*)
.LBB0_3:
        bti     j
        mov     x0, x19
        bl      handle_type_1(void*)
.LBB0_4:
        bti     j
        mov     x0, x19
        bl      handle_type_2(void*)
.LBB0_5:
        bti     j
        mov     x0, x19
        ldr     x19, [sp, #16]                  // 8-byte Folded Reload
        ldp     x29, x30, [sp], #32             // 16-byte Folded Reload
        autiasp
        b       handle_type_3(void*)
.LBB0_6:
        ret

With BTI, the attacker can only reach PACIASP and sign an arbitrary link register if they branch there from the x16 or x17 registers. If they do not control these registers, they’re only permitted to land on the function’s starting BTI C via a BLR (branch with link to register), which indicates that this is a valid function entry point (so they can’t skip the range check either). Because of the new BTI J instructions, the other parts of the switch statement can only be reached via a BR (branch to register), which also makes it harder to control the backward edge. The relevant instructions are fully backwards compatible and execute as no-operations on CPUs that predate Armv9.

What did it take to bring these technologies to Chromium?

For applications like Chrome, which are statically linked into a single binary, we need to ensure that every possible indirect call target is annotated with a BTI landing pad. Support is declared through a special ELF note – BTI is disabled by default if BTI and non-BTI object files are linked together. To enable BTI in Chromium (and also provide the strongest possible PAC protection), we needed to enable it everywhere.

Enabling PAC and BTI in Chromium started with enabling them in the Android Open Source Project (AOSP), since build artefacts from AOSP are used as part of the Android Native Development Kit (NDK). The Android and Browser Enablement team worked together with our Development Solutions Group and other teams in CE-OSS to add PAC and BTI support to AOSP’s constituent components (such as its build system, toolchain, BoringSSL, and v5.10 of the Linux kernel) and extensively validated AOSP 12 before its release in 2021.

In 2020, we also started work with Google to track down any pre-compiled binary blobs in Chromium and rebuild them with BTI. We needed to prepare Chromium to use version r23 of the NDK, which involved extensive surgery to remove any remaining dependencies on the prebuilt non-BTI libgcc 4.9 libraries. We also needed to migrate to LLVM’s tools from their GCC equivalents. Various issues and regressions in LLVM’s compiler-rt and libunwind libraries needed to be investigated and corrected before NDK r23 could be adopted. We also needed to integrate V8’s BTI support by providing the right memory mapping primitives in Chromium, and we manually added BTI landing pads to components like libdav1d, libffmpeg, and breakpad.

In January 2022, we received the first Android 12 devices with hardware support for PAC and BTI, which sped up this effort. In March 2022, we began running all of Chromium’s test suites on this hardware. In M101, we enabled Pointer Authentication for all C++ code in Chromium, followed by support for V8-generated code in M102. We kept BTI as an opt-in flag until M105, where it is now the default. Arm has implemented linker checks in M106 to maintain BTI support and continues regular testing on PAC/BTI hardware as it becomes widespread.

What’s next?

Thanks to our work on AOSP and the NDK, most Android applications and libraries using native code can adopt enhanced CFI by compiling with the -mbranch-protection=standard compiler flag, and we highly recommend this for most projects.

However, security is a process – not a result. Although we’ve made a good start by protecting Chromium, we’re continuing our efforts to protect end users by ensuring that future-looking technologies like Rust support PAC and BTI well, improving performance, and bringing PAC and BTI support to Chromium on Linux systems. We’re also researching new ways to enhance the security of the Arm architecture in places where we generate executable code (like in Chromium’s V8), improving Arm’s Application Binary Interface to make better use of pointer authentication, and continuing to focus on other categories of security problems like memory safety.

You can be part of this journey too and help us build the future of computing on Arm for everyone – we’re hiring for several positions (see careers.arm.com for details).

Thanks and acknowledgements

Shipping innovative features like PAC and BTI would be impossible without a lot of collaboration across companies and projects. The ABET team would particularly like to thank Elliott Hughes, Kentaro Hara, Nico Weber, Hans Wennborg and Andrew Grieve at Google, as well as Georgia Kouveli on Arm’s V8 team.

