[5-2 Golang] In Practice: Debugging with dlv

 1 year ago
source link: https://studygolang.com/articles/35915

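  What do you do when a Go program misbehaves? You analyze it with pprof. But what if the problem is a bug in the code itself? Tracking down a code bug sometimes means following the actual execution. You could add logging, but some failures are hard to reproduce once the service has been restarted. This is where breakpoint debugging helps: you can examine the execution of every line and the value of every variable. C programs are usually debugged with GDB; Go has its own dedicated debugger, dlv. This article covers the basics of using dlv.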

dlv Overview

  dlv is short for Delve. Installation is simple; go install is all it takes:

//Download and install
$ git clone https://github.com/go-delve/delve
$ cd delve
$ go install github.com/go-delve/delve/cmd/dlv

//Go 1.16 and later
# Install at a specific version or pseudo-version:
$ go install github.com/go-delve/delve/cmd/dlv@<version>

#On macOS make sure you also install the command line developer tools:
xcode-select --install

  dlv supports several ways of tracing a Go program; the help command lists them:

dlv help

//Passing arguments
Pass flags to the program you are debugging using `--`, for example:
`dlv exec ./hello -- server --config conf/config.toml`

Usage:
  dlv [command]

Available Commands:
  //Commonly used to debug a misbehaving process that is already running
  attach      Attach to running process and begin debugging.
  //Launch and debug a binary
  exec        Execute a precompiled binary, and begin a debug session.
  debug       Compile and begin debugging main package in current directory, or the package specified.
  ......

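  dlv is quite similar to GDB: it can print variable values, set breakpoints, single-step, and show the call stack. It can also list all goroutines and threads of the current Go process. The commonly used commands are: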

Running the program:
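    //Run until a breakpoint is hit or the program terminates
    continue (alias: c) --------- Run until breakpoint or program termination.
    //Single-step to the next source line
    next (alias: n) ------------- Step over to next source line.
    //Restart the process
    restart (alias: r) ---------- Restart process.
    //Step into a function; with a plain n, a function call counts as one line and is stepped over
    step (alias: s) ------------- Single step through program.
    //Step out of the current function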
    stepout (alias: so) --------- Step out of the current function.

Manipulating breakpoints:
    //Set a breakpoint
    break (alias: b) ------- Sets a breakpoint.
    //List all breakpoints
    breakpoints (alias: bp)  Print out info for active breakpoints.
    //Delete a breakpoint
    clear ------------------ Deletes breakpoint.
    //Delete all breakpoints
    clearall --------------- Deletes multiple breakpoints.

Viewing program variables and memory:
    //Print function arguments
    args ----------------- Print function arguments.
    //Print local variables
    locals --------------- Print local variables.
    //Print a single variable
    print (alias: p) ----- Evaluate an expression.
    //Print the contents of CPU registers
    regs ----------------- Print contents of CPU registers.
    //Change the value of a variable
    set ------------------ Changes the value of a variable.

Listing and switching between threads and goroutines:
    //Print a goroutine's call stack, or switch to the specified goroutine
    goroutine (alias: gr) -- Shows or changes current goroutine
    //List all goroutines
    goroutines (alias: grs)  List program goroutines.
    //Switch to the specified thread
    thread (alias: tr) ----- Switch to the specified thread.
    //List all threads
    threads ---------------- Print out info for every traced thread.

Viewing the call stack and selecting frames:
    //Print the call stack
    stack (alias: bt)  Print stack trace.

Other commands:
    //Print the program's assembly instructions
    disassemble (alias: disass)  Disassembler.
    //Show source code
    list (alias: ls | l) ------- Show source code.

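  dlv has many commands, but only a handful are used regularly. Knowing how to set breakpoints, single-step, and print variables and the call stack is generally enough for basic debugging.

dlv in Practice

  Let's write a small program and debug it with dlv, reviewing the channel reads and writes and the scheduler flow covered in earlier articles. Note that a Go program runs many threads and goroutines, so actual execution can be fairly complex, and I have also omitted parts of the debugging session. Even if you follow the steps exactly, your results may differ. The program is: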

package main

import (
    "fmt"
    "time"
)

func main() {
    queue := make(chan int, 1)
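    go func() {
        for {
            data := <-queue      // receive; implemented by runtime.chanrecv
            fmt.Print(data, " ") // print each value as it arrives
        }
    }()

    for i := 0; i < 10; i++ {
        queue <- i // send; implemented by runtime.chansend
    }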
    time.Sleep(time.Second * 1000)
}

  Compile the Go program and launch it under dlv:

//Note the build flags -N -l, which disable compiler optimizations and inlining
go build -gcflags '-N -l' test.go

dlv exec test
Type 'help' for list of commands.
(dlv)

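  From here you can enter any of the debugging commands introduced above. Earlier articles covered how channels are implemented and how the Go scheduler works: channel reads and writes are implemented by runtime.chanrecv and runtime.chansend, and the scheduler's main logic is runtime.schedule. Also note that the main goroutine, that is, the main function, compiles to the function main.main. Set a breakpoint on each of these functions.

//Sometimes a function name alone is ambiguous, and the breakpoint must include the package name, e.g. runtime.chansend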
(dlv) b chansend
Breakpoint 1 set at 0x1003f0a for runtime.chansend() /go1.18/src/runtime/chan.go:159
(dlv) b chanrecv
Breakpoint 2 set at 0x1004c2f for runtime.chanrecv() /go1.18/src/runtime/chan.go:455
(dlv) b schedule
Breakpoint 3 set at 0x1037aea for runtime.schedule() /go1.18/src/runtime/proc.go:3111
(dlv) b main.main
Breakpoint 4 set at 0x1089a0a for main.main() ./test.go:8
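
  The names shown in the breakpoint output (main.main, runtime.chansend) are package-qualified function names. As a minimal sketch, the program below prints the qualified name of the calling function using the standard runtime package; callerName and worker are hypothetical helpers introduced only for this illustration, and the go:noinline directives are there so that inlining does not blur the frames.

```go
package main

import (
	"fmt"
	"runtime"
)

// callerName returns the package-qualified name of the function that
// called it, e.g. "main.main". Hypothetical helper for illustration.
//
//go:noinline
func callerName() string {
	pcs := make([]uintptr, 1)
	// Skip 2 frames: runtime.Callers itself and callerName.
	n := runtime.Callers(2, pcs)
	if n == 0 {
		return ""
	}
	frame, _ := runtime.CallersFrames(pcs[:n]).Next()
	return frame.Function
}

// worker exists only to show that any function gets a qualified name.
//
//go:noinline
func worker() string {
	return callerName()
}

func main() {
	fmt.Println(callerName()) // main.main
	fmt.Println(worker())     // main.worker
}
```

  These are the same names dlv accepts in a break command, which is why a package prefix resolves an ambiguous function name.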

  The continue command (c for short) runs to the next breakpoint:

(dlv) c
> runtime.schedule() /go1.18/src/runtime/proc.go:3111 (hits total:1) (PC: 0x1037aea)

=>3111:    func schedule() {
  3112:        _g_ := getg()
  3113:
  3114:        if _g_.m.locks != 0 {
  3115:            throw("schedule: holding locks")
  3116:        }

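  => marks the code currently being executed. Surprisingly, the first stop is runtime.schedule rather than main. Keep in mind that main itself is scheduled and run as the main goroutine, so it cannot be the first thing to execute: before the main goroutine can be scheduled, the runtime has to set up threads, create the main goroutine, run the scheduling logic, and so on. So what is the first line of code in a Go program? Let's look at the call stack: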

(dlv) bt
0  0x0000000001037aea in runtime.schedule
   at /go1.18/src/runtime/proc.go:3111
1  0x000000000103444d in runtime.mstart1
   at /go1.18/src/runtime/proc.go:1425
2  0x000000000103434c in runtime.mstart0
   at /go1.18/src/runtime/proc.go:1376
3  0x00000000010585e5 in runtime.mstart
   at /go1.18/src/runtime/asm_amd64.s:368
4  0x0000000001058571 in runtime.rt0_go
   at /go1.18/src/runtime/asm_amd64.s:331

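  The first line of code in a Go program lives in runtime/asm_amd64.s, and the entry function is runtime.rt0_go; take a look if you are interested, it is all assembly. If you keep continuing with c, the program still stops at runtime.schedule, or even at runtime.chanrecv, because a lot of initialization work (which uses these functions) happens before the main goroutine is scheduled. The usual approach is therefore to set a breakpoint on main.main first, run to it with c, and only then set the other breakpoints. Here we restart the program with restart, delete the existing breakpoints, set a breakpoint on main.main again, and continue to it: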

(dlv) r
Process restarted with PID 57676

(dlv) clearall

(dlv) b main.main
Breakpoint 5 set at 0x1089a0a for main.main() ./test.go:8

(dlv) c
> main.main() ./test.go:8 (hits goroutine(1):1 total:1) (PC: 0x1089a0a)

=>   8:    func main() {
     9:        queue := make(chan int, 1)
    10:        go func() {

  Now the program has finally reached main.main. Next, set breakpoints on the channel read and write functions and continue to the next breakpoint:

(dlv) b chansend
Breakpoint 1 set at 0x1003f0a for runtime.chansend() /go1.18/src/runtime/chan.go:159
(dlv) b chanrecv
Breakpoint 2 set at 0x1004c2f for runtime.chanrecv() /go1.18/src/runtime/chan.go:455

(dlv) c
> runtime.chansend() /go1.18/src/runtime/chan.go:159 (hits goroutine(1):1 total:1) (PC: 0x1003f0a)

=> 159:    func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
   160:        if c == nil {
   161:            if !block {
   162:                return false
   163:            }

  The program has stopped in runtime.chansend, which should correspond to the line "queue <- i". Use bt to inspect the stack frames and confirm:

(dlv) bt
0  0x0000000001003f0a in runtime.chansend
   at /go1.18/src/runtime/chan.go:159
1  0x0000000001003edd in runtime.chansend1
   at /go1.18/src/runtime/chan.go:144
2  0x0000000001089aa9 in main.main
   at ./test.go:18

//Inspect the arguments
(dlv) args
c = (*runtime.hchan)(0xc00005a070)
ep = unsafe.Pointer(0xc000070f58)
block = true    //the goroutine will block
callerpc = 17341097
~r0 = (unreadable empty OP stack)

//The first value the loop writes to the channel should be 0; the x command examines memory
(dlv) x 0xc000070f58
0xc000070f58:   0x00

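  Here the args command shows the input parameters. block being true means the current goroutine will block if the channel is not writable. ep is an address that holds the data to be written; examining that memory with the x command shows the value 0.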

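  Remember how channels are implemented? Under the hood, a buffered channel maintains a circular queue, and a write goes through these steps: 1) if the channel is nil, block the current goroutine (block=true); 2) if the channel is closed, panic; 3) if a goroutine is waiting to read, hand the data directly to that goroutine and wake it up; 4) if the channel has remaining capacity, write the data into the buffer; 5) otherwise the channel is full, so block the current goroutine (block=true).

  From here you can single-step through the channel write. The process is simple and repetitive, so we will not go through all of it and only show one intermediate step: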
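
  Several of these cases can be observed from ordinary Go code. The sketch below uses a select with a default branch, which, as far as I understand the Go 1.18 runtime, reaches chansend with block=false, so the cases that would block a normal send simply report false. trySend and sendPanics are hypothetical helpers written for this illustration; the direct hand-off to a waiting receiver (step 3) is not distinguishable from user code and is left out.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the value
// was accepted by the channel.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

// sendPanics reports whether a send on ch panics, as it does when the
// channel has been closed.
func sendPanics(ch chan int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch <- 1
	return false
}

func main() {
	var nilCh chan int
	fmt.Println("nil channel:", trySend(nilCh, 1)) // false: a blocking send would wait forever

	buffered := make(chan int, 1)
	fmt.Println("free slot:", trySend(buffered, 1))   // true: value goes into the buffer
	fmt.Println("buffer full:", trySend(buffered, 2)) // false: a blocking send would park here

	closed := make(chan int, 1)
	close(closed)
	fmt.Println("closed:", sendPanics(closed)) // true: send on closed channel panics
}
```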

(dlv) n
1 > runtime.chansend() /go1.18/src/runtime/chan.go:208 (PC: 0x10040e0)
Warning: debugging optimized function
   203:        if c.closed != 0 {
   204:            unlock(&c.lock)
   205:            panic(plainError("send on closed channel"))
   206:        }
   207:
=> 208:        if sg := c.recvq.dequeue(); sg != nil {
   209:            // Found a waiting receiver. We pass the value we want to send
   210:            // directly to the receiver, bypassing the channel buffer (if any).
   211:            send(c, sg, ep, func() { unlock(&c.lock) }, 3)
   212:            return true
   213:        }

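  While single-stepping you may notice that a blocked goroutine is switched out by the gopark function, which hands control to the scheduler loop. Let's set breakpoints on runtime.schedule and runtime.gopark as well and watch the goroutine switch: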

(dlv) b schedule
Breakpoint 8 set at 0x1037aea for runtime.schedule() /go1.18/src/runtime/proc.go:3111
(dlv) b gopark
Breakpoint 9 set at 0x1031aca for runtime.gopark() /go1.18/src/runtime/proc.go:344

(dlv) c
> runtime.gopark() /go1.18/src/runtime/proc.go:344 (hits goroutine(1):2 total:2) (PC: 0x1031aca)

=> 344:    func gopark(unlockf func(*g, unsafe.Pointer) bool, lock unsafe.Pointer, reason waitReason, traceEv byte, traceskip int) {
   345:        if reason != waitReasonSleep {
   346:            checkTimeouts() // timeouts may expire while two goroutines keep the scheduler busy
   347:        }
   348:        mp := acquirem()
   349:        gp := mp.curg

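  runtime.gopark mainly switches to the scheduler stack and runs the scheduler, runtime.schedule, which finds a runnable goroutine and schedules it. Another continue therefore stops at the runtime.schedule breakpoint: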

(dlv) c
> [b] runtime.schedule() /go1.18/src/runtime/proc.go:3111 (hits total:19) (PC: 0x1037aea)

=>3111:    func schedule() {
  3112:        _g_ := getg()


(dlv) bt
0  0x0000000001037aea in runtime.schedule
   at /Users/lile/Documents/go1.18/src/runtime/proc.go:3111
1  0x000000000103826d in runtime.park_m
   at /Users/lile/Documents/go1.18/src/runtime/proc.go:3336
2  0x0000000001058663 in runtime.mcall
   at /Users/lile/Documents/go1.18/src/runtime/asm_amd64.s:425

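  bt shows that the function at the bottom of the stack is runtime.mcall. Why is the call stack so short, and why is runtime.gopark missing? Because the stack has been switched here, from the user goroutine's stack to the scheduler stack (the g0 stack), so the call chain is necessarily different and the frames on the user stack are no longer visible. runtime.mcall is the function that performs this stack switch.

  dlv is an excellent tool for debugging Go programs. It helps not only with learning and understanding Go, but also with quickly tracking down bugs, so it is well worth mastering.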

