
BPF Performance Tools (Book)

source link: https://www.brendangregg.com/bpf-performance-tools-book.html

This is the official site for the book BPF Performance Tools: Linux System and Application Observability, published by Addison-Wesley (2019). This book can help you get the most out of your systems and applications by improving performance, reducing costs, and solving software issues. Here I'll describe the book, link to related content, and list errata and updates.

The book is available on Amazon.com (paperback, kindle), InformIT (paperback, PDF, etc), and Safari (here and here). The paper book was released in December 2019 but sold out immediately. ISBN-13: 9780136554820. (If you purchase through the Amazon or InformIT links, the book's technical editor earns a commission.)

The Amazon Kindle preview shows the first 100 pages of this 880-page book.

There is also a companion book, Systems Performance: 2nd Edition (2020), that provides balanced coverage of performance analysis and methodologies using all tool types.

On this page: BPF, Screenshots, OSes, Audience, Tools, TOC, Related, Errata, Updates.

What is BPF?

Berkeley Packet Filter (BPF) is an in-kernel execution engine that processes a virtual instruction set. It has more recently been extended (as eBPF) to provide a safe way to extend kernel functionality. In some ways, eBPF does to the kernel what JavaScript does to websites: it allows all sorts of new applications to be created. BPF is now used for software defined networking, observability (this book), security enforcement, and more. The main front-ends for BPF performance tools are BCC and bpftrace. BPF itself is also becoming a technology name, and is no longer treated as an abbreviation.

Screenshots

As an example of a new tool from the book, readahead.bt provides a new view of file system read-ahead performance: it shows the age of read-ahead pages when they are finally referenced, and the number of read-ahead pages left unused while tracing:

# readahead.bt
Attaching 5 probes...
^C
Readahead unused pages: 128

Readahead used page age (ms):
@age_ms: 
[1]             2455 |@@@@@@@@@@@@@@@                                     |
[2, 4)          8424 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4, 8)          4417 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |
[8, 16)         7680 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     |
[16, 32)        4352 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
[32, 64)           0 |                                                    |
[64, 128)          0 |                                                    |
[128, 256)       384 |@@                                                  |
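
The bucket labels in the histogram above are power-of-two ranges. As a minimal sketch of how such labels can be derived (this is illustrative Python, not code from readahead.bt or bpftrace; the function names and sample ages are hypothetical, and the K/M suffixes that appear on larger buckets are not handled):

```python
from collections import Counter


def bucket_label(value: int) -> str:
    """Return a power-of-two bucket label for a non-negative integer."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    if value < 2:
        # 0 and 1 each get a single-value bucket: "[0]" and "[1]".
        return f"[{value}]"
    low = 1 << (value.bit_length() - 1)  # largest power of two <= value
    return f"[{low}, {low * 2})"


def histogram(values):
    """Count how many values fall into each power-of-two bucket."""
    return Counter(bucket_label(v) for v in values)


# Hypothetical page ages in milliseconds, for illustration only.
ages_ms = [1, 2, 3, 5, 9, 15, 20, 130]
counts = histogram(ages_ms)
for label, count in counts.items():
    print(f"{label:<12} {count}")
```

Each range is half-open, so a page age of 4 ms is counted in [4, 8) and not in [2, 4).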

The book covers many of the existing tools as well, for example, tcplife for efficiently logging TCP session details:

# tcplife
PID   COMM   LADDR          LPORT RADDR          RPORT TX_KB RX_KB MS
4169  java   100.1.111.231  40158 100.2.116.192  6001      7    33 3590.91
4169  java   100.1.111.231  56940 100.5.177.31   6101      0     0 2.48
4169  java   100.1.111.231  6001  100.2.176.45   49482     0     0 17.94
4169  java   100.1.111.231  18926 100.5.102.250  6101      0     0 0.90
4169  java   100.1.111.231  44530 100.2.31.140   6001      0     0 2.64
4169  java   100.1.111.231  44406 100.2.8.109    6001     11    28 3982.11
34781 sshd   100.1.111.231  22    100.2.17.121   41566     5     7 2317.30
[...]

Apart from kernel resources, applications are also analyzed. The following book tool counts Java JNI usage by stack trace:

# bpftrace --unsafe jnistacks.bt
Tracing jni_NewObject* calls... Ctrl-C to end.
^C
Running /usr/local/bin/jmaps to create Java symbol files in /tmp...
Fetching maps for all java processes...
Mapping PID 25522 (user bgregg):
wc(1):   8350  26012 518729 /tmp/perf-25522.map

[...]
@[
    jni_NewObject+0
    Lsun/awt/X11GraphicsConfig;::pGetBounds+171
    Ljava/awt/MouseInfo;::getPointerInfo+2048
    Lnet/sf/freecol/client/gui/plaf/FreeColButtonUI;::paint+1648
    Ljavax/swing/plaf/metal/MetalButtonUI;::update+232
    Ljavax/swing/JComponent;::paintComponent+672
    Ljavax/swing/JComponent;::paint+2208
[...]
    Ljavax/swing/RepaintManager;::prePaintDirtyRegions+1556
    Ljavax/swing/RepaintManager$ProcessingRunnable;::run+572
    Ljava/awt/EventQueue$4;::run+1100
    call_stub+138
    JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Th...
]: 232

This book doesn't just show the tools, it also explains caveats and gotchas. In this case jnistacks.bt is a simple tool, but getting it to work in production can mean fixing stack traces and symbols. These real-world gotchas are explained with recommended fixes and workarounds.

The book explains these and over 150 other BPF tools, as well as summarizing over 30 traditional performance analysis tools (top, vmstat, iostat, perf, Ftrace, etc) so that you can use the right tool for the job.

Operating Systems

Extended BPF is a built-in Linux kernel technology, added in parts since 3.18. At least Linux 4.9 is necessary to use the tools in this book. All Linux distributions can use the BPF tools (Ubuntu, CentOS, Fedora, Red Hat, etc.), although the status of BCC and bpftrace varies by distribution: some have packages, others still require a build from source. See the install instructions for BCC and bpftrace.
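
As a rough sketch of checking the 4.9 minimum, the following Python compares the running kernel's release string against that version. The helper names are hypothetical, and the version number is only a guide: distribution kernels may backport or disable BPF features, so this does not prove that a given tool will run.

```python
import platform
import re

MIN_KERNEL = (4, 9)  # minimum kernel version stated for the book's tools


def parse_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple = MIN_KERNEL) -> bool:
    """Compare numerically, so that 4.10 correctly ranks above 4.9."""
    return parse_release(release) >= minimum


if platform.system() == "Linux":
    release = platform.release()
    status = "meets" if meets_minimum(release) else "is below"
    print(f"Kernel {release} {status} the {MIN_KERNEL[0]}.{MIN_KERNEL[1]} minimum")
```

Comparing integer tuples avoids the trap of string comparison, where "4.10" would sort before "4.9".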

Other operating systems including BSD (where BPF originated) are not covered in this book. As extended BPF is being ported elsewhere, a future edition of this book may cover more than Linux.

Audience

This book is primarily for engineers, developers, and support staff in enterprise and cloud environments. No programming is required unless you want to write your own tools, as you can use this book as either:

  • A reference of ready-to-run performance analysis and debugging tools.
  • A guide for learning how to develop new tools.

This book is also useful for students as a way to learn system internals in an interactive way: you can run and develop tools to examine the workings of the system.

Tools

Over 150 BPF tools are covered in the book, for performance analysis, troubleshooting, and other uses (e.g., security forensics). These tools provide observability for CPUs, memory, disks, file systems, networking, languages, applications, containers, hypervisors, security, and the Linux kernel. To explain how to analyze different languages, three types of execution are studied: compiled, JIT-compiled, and interpreted, using C, Java, and the bash shell as examples. The same approaches can be applied to other languages, and summaries for Node.js, C++, and Golang are included.

To cover all these targets, many new tools needed to be developed for this book. The diagram on the top right shows these new tools colored red. The source for these tools is included in the book, and can also be found here:

https://github.com/brendangregg/bpf-perf-tools-book

The /originals directory contains an as-is snapshot of the published tools, and /updated contains those tools plus updated versions.

Table of Contents

  • Preface
  • Part I: Technologies
      1. Intro
      2. Technology Background
      3. Performance Analysis
      4. bcc
      5. bpftrace
  • Part II: Using BPF Tools
      6. CPUs
      7. Memory
      8. File Systems
      9. Disk I/O
      10. Networking
      11. Security
      12. Languages
      13. Applications
      14. Kernel
      15. Containers
      16. Hypervisors
  • Part III: Additional Topics
      17. Other BPF Tools
      18. Tips and Tricks
  • Apx A. bpftrace One-Liners
  • Apx B. bpftrace Cheat Sheet
  • Apx C. bcc Tool Development
  • Apx D. C BPF
  • Apx E. BPF Instructions
  • Glossary
  • Bibliography

PDF Download eBook EPUB

The Safari online book store features early drafts of books for feedback, called "rough cuts." I'd never published one before, but did this time to see if it helped. It did not. This happened:

  1. I received next to no feedback from the rough cut.
  2. A badly-formatted EPUB version immediately appeared on pirate sites, months before the book was finished.

This pirate version is missing bug fixes and content I later added. It is really frustrating as I've worked hard to give readers the best possible experience, but some of you may be studying this draft instead, thinking that it's the final book. There is also (obviously) no way for the publisher to ask the pirates to update their version. Please only read the finished book, preferably "second printing" or later (as the second printing should include the errata fixes, listed below). One tell-tale sign: the cover of the final book includes the text "Foreword by Alexei Starovoitov...," and the early draft versions did not.

Errata

1st Printing

  • pxxvii, Preface: the footnote 1 text is somehow from chapter 6 by mistake; it should be: "The exercises include some advanced and "unsolved" problems, for which I have yet to see a working solution. It is possible that some of these problems are impossible to solve without kernel or application changes."
  • pxxxiv, Preface: the Kindle version has a conversion error where two early page numbers are inserted into the text, appearing as "xxxivtracepoints" and "xxxvmost of whom".
  • 2.6, p45: "This figure also shows the Linux kernel versions ...": that was the old figure, but not this new one.
  • 2.10.2, p60: "The location of the probe from the previous readelf(1) output was 0x6a2."; that previous output was deleted.
  • 2.13, p64: "Linux 2.6.21" -> "Linux 2.6.31".
  • 5.9.6, p154: "A a rate of 99" extra "a".
  • 6.2.3, p192: "perf list" should be "perf script" (twice).
  • 6.2.5, p196: "perf script to show the rate" should be "perf stat ..." (matching the screenshot).
  • 9.1.3, p346: the Safari version misnumbers step 2a as another step 1.
  • 9.3.2, p359: "biostoop(8)" -> "biosnoop(8)".
  • 9.3.7, p370: "kprobe:blk_start_request,kprobe:blk_mq_start_request" -> "kprobe:blk_account_io_done", to trace the full I/O latency (and not just OS queued time).
  • 12.5.1, p583: JavaScript (Node.js): "v8 can run Java functions": "Java" should be "JavaScript".
  • ApxC, p749: "line 4 imports the BPF library" should be line 2.
  • ApxC, p749: "predate his capability" typo his->this.
  • ApxC, p767: "make $(getconf" -> "make -j $(getconf" (missing -j).
  • ApxC, p767: "thesamples" -> "the samples".

1st & 2nd Printing

  • ApxE, p786: Dest and Source Register are both 4-bit (not 8-bit).
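
To illustrate the corrected field widths, here is a minimal Python sketch that packs and unpacks one 8-byte BPF instruction. It assumes the little-endian layout of the Linux `struct bpf_insn` (8-bit opcode, 4-bit destination register in the low nibble, 4-bit source register in the high nibble, 16-bit signed offset, 32-bit signed immediate). The function names are hypothetical and this is not code from the book.

```python
import struct


def encode_insn(opcode, dst_reg, src_reg, offset=0, imm=0):
    """Pack one BPF instruction into 8 bytes (little-endian layout assumed)."""
    for name, reg in (("dst_reg", dst_reg), ("src_reg", src_reg)):
        if not 0 <= reg <= 0xF:
            raise ValueError(f"{name} must fit in 4 bits, got {reg}")
    # Both registers share a single byte: source in the high nibble,
    # destination in the low nibble.
    regs = (src_reg << 4) | dst_reg
    return struct.pack("<BBhi", opcode, regs, offset, imm)


def decode_insn(raw):
    """Unpack 8 bytes into (opcode, dst_reg, src_reg, offset, imm)."""
    opcode, regs, offset, imm = struct.unpack("<BBhi", raw)
    return opcode, regs & 0xF, regs >> 4, offset, imm


# 0xBF is the 64-bit register-to-register move opcode: this encodes r6 = r1.
raw = encode_insn(0xBF, dst_reg=6, src_reg=1)
print(raw.hex(" "))  # bf 16 00 00 00 00 00 00
```

Because the two 4-bit registers share one byte, the whole instruction fits in 8 bytes: 1 (opcode) + 1 (registers) + 2 (offset) + 4 (immediate).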

Updates

These are updates to BPF and its front-ends, many of which were mentioned in the book as "planned" and have since been implemented:

  • 5.5.1 p173: bpf_probe_read_kernel() and bpf_probe_read_user() have now been implemented and may show up in Linux 5.5.
  • 5.15.2 p174: bpftrace added signal() (thanks Bas Smit).
  • 5.15.2 p175: bpftrace added override_return() (thanks Bas Smit).
  • bpftrace added strncmp() (thanks Jay Kamat, Bas Smit).
  • 5.10.3, p155: bpftrace added if else support (thanks Daniel Xu).
  • 5.10.4, p155: bpftrace added while() loops (thanks Bas Smit).
  • bpftrace curtask is now a task_struct if type info is available (headers or BTF).
  • Appendix C covers the BCC Python interface, which is now considered deprecated as we switch to BCC libbpf C.

Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!

