Changes to defer in Go 1.13
source link: https://www.tuicool.com/articles/BZvuMnU
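Go 1.13 has been officially released, and the release notes say that defer is now about 30% faster in most cases. Where does that 30% come from?
As we know, a defer statement used to be translated into two runtime calls: deferproc and deferreturn.
In Go 1.13, a new runtime function, deferprocStack, sits alongside deferproc, and the compiler chooses which of the two to call. Since the release notes say that most uses of defer got faster, most defer statements are presumably compiled to a deferprocStack call.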
    // All other fields can contain junk.
    // The defer record must be immediately followed in memory by
    // the arguments of the defer.
    // Nosplit because the arguments on the stack won't be scanned
    // until the defer record is spliced into the gp._defer list.
    //go:nosplit
    func deferprocStack(d *_defer) {
        gp := getg()
        if gp.m.curg != gp {
            // go code on the system stack can't defer
            throw("defer on system stack")
        }
        // siz and fn are already set.
        // The other fields are junk on entry to deferprocStack and
        // are initialized here.
        d.started = false
        d.heap = false
        d.sp = getcallersp()
        d.pc = getcallerpc()
        // The lines below implement:
        //   d.panic = nil
        //   d.link = gp._defer
        //   gp._defer = d
        // But without write barriers. The first two are writes to
        // the stack so they don't need a write barrier, and furthermore
        // are to uninitialized memory, so they must not use a write barrier.
        // The third write does not require a write barrier because we
        // explicitly mark all the defer structures, so we don't need to
        // keep track of pointers to them with a write barrier.
        *(*uintptr)(unsafe.Pointer(&d._panic)) = 0
        *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
        *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

        return0()
        // No code can go here - the C return register has
        // been set and must not be clobbered.
    }
A quick check:
    package main

    func main() {
        defer println(1)
    }
    0x003a 00058 (deferstack.go:4) LEAQ ""..autotmp_1+8(SP), AX
    0x003f 00063 (deferstack.go:4) PCDATA $0, $0
    0x003f 00063 (deferstack.go:4) MOVQ AX, (SP)
    0x0043 00067 (deferstack.go:4) CALL runtime.deferprocStack(SB)
    0x0048 00072 (deferstack.go:4) TESTL AX, AX
    0x004a 00074 (deferstack.go:4) JNE 92
    0x004c 00076 (deferstack.go:5) XCHGL AX, AX
The original deferproc still exists, so the corresponding _defer struct has to record whether a given defer record was allocated on the stack or on the heap:
    type _defer struct {
        siz     int32 // includes both arguments and results
        started bool
        heap    bool // newly added field
        sp      uintptr // sp at time of defer
        pc      uintptr
        fn      *funcval
        _panic  *_panic // panic that is running defer
        link    *_defer
    }
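Before deferprocStack existed, every defer went through deferproc. That path does have a deferpool for reusing records, but when the pool runs dry it still has to fall back to a heap allocation like this: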
d = (*_defer)(mallocgc(total, deferType, true))
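The "pool first, allocate on a miss" pattern described above can be sketched with a toy free list. This is only an illustration, not the runtime's code: `deferRecord`, `recordPool`, `get`, and `put` are hypothetical names, and the real deferpool is considerably more involved.

```go
package main

import "fmt"

// deferRecord is a hypothetical stand-in for the runtime's _defer struct.
type deferRecord struct {
	heap bool // true when the record came from the heap path
}

// recordPool is a toy free list that falls back to a fresh allocation
// when it has nothing to hand out.
type recordPool struct {
	free   []*deferRecord
	allocs int // number of fresh allocations performed so far
}

// get reuses a pooled record if one is available and allocates otherwise.
func (p *recordPool) get() *deferRecord {
	if n := len(p.free); n > 0 {
		r := p.free[n-1]
		p.free = p.free[:n-1]
		return r
	}
	p.allocs++ // pool miss: this models the mallocgc fallback
	return &deferRecord{heap: true}
}

// put resets a record and returns it to the pool for later reuse.
func (p *recordPool) put(r *deferRecord) {
	*r = deferRecord{heap: true}
	p.free = append(p.free, r)
}

func main() {
	p := &recordPool{}
	a, b := p.get(), p.get() // empty pool, so both calls allocate
	fmt.Println("allocations after two gets:", p.allocs)

	p.put(a)
	p.put(b)
	p.get() // served from the pool, no new allocation
	fmt.Println("allocations after reuse:", p.allocs)
}
```

The point of the sketch is that a pool only removes allocations once records have been returned to it; every miss still pays for a heap allocation, which is the cost that stack-allocated records avoid.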
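People in the community have long complained that defer is slow, so this change can be read as the Go team responding to that feedback.
Why weren't all defer calls optimized to use stack allocation?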
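To get a rough feel for the difference on your own machine, a small stand-alone benchmark can help. This is a sketch under assumptions: it uses a defer inside a for loop as the stand-in for the heap path, which is what the compiler code examined later in this article suggests, and the helper names `bump`, `topLevelDefer`, and `loopDefer` are made up for illustration. On Go 1.14 and later, further defer optimizations change the picture, so the numbers depend on the toolchain version.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the deferred work observable so it cannot be optimized away.
var sink int

//go:noinline
func bump() { sink++ }

// topLevelDefer has its defer at the top level of the function body,
// the case that is expected to get a stack-allocated defer record.
func topLevelDefer() {
	defer bump()
}

// loopDefer places the same defer inside a for loop, the case that is
// expected to keep using the heap-allocating path.
func loopDefer() {
	for i := 0; i < 1; i++ {
		defer bump()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	top := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			topLevelDefer()
		}
	})
	loop := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			loopDefer()
		}
	})
	fmt.Println("top-level defer:", top)
	fmt.Println("defer in loop:  ", loop)
}
```

No expected numbers are given here on purpose: the results vary with hardware and Go version, and a micro-benchmark like this only shows the relative cost of the two paths.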
    case ODEFER:
        d := callDefer
        if n.Esc == EscNever {
            d = callDeferStack
        }
        s.call(n.Left, d)
n.Esc holds the escape-analysis result for the AST node. It is set to EscNever mainly by this piece of code:
    case ODEFER:
        if e.loopdepth == 1 { // top level
            n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
            break
        }
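How should we understand loopdepth? Roughly speaking, it goes up by one for each enclosing for loop. Following that idea, here is an example in which the defer record is still allocated on the heap: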
    package main

    import "fmt"

    func main() {
        for i := 0; i < 10; i++ {
            defer func() {
                for {
                    var a = make([]int, 128)
                    fmt.Println(a)
                }
            }()
        }
    }
go tool compile -S
    0x0043 00067 (deferproc.go:7) PCDATA $0, $0
    0x0043 00067 (deferproc.go:7) MOVQ AX, 8(SP)
    0x0048 00072 (deferproc.go:7) CALL runtime.deferproc(SB)
    0x004d 00077 (deferproc.go:7) TESTL AX, AX
    0x004f 00079 (deferproc.go:7) JNE 83
    0x0051 00081 (deferproc.go:7) JMP 33
    0x0053 00083 (deferproc.go:7) XCHGL AX, AX
Yep, the familiar deferproc is still there.
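The rule discussed above (top-level defers get a stack record, defers inside loops do not) can be modeled in a few lines. This is a toy model only: the `node` type and the `classify` function are hypothetical and far simpler than the compiler's real syntax tree, and the model assumes loopdepth starts at 1 for a function body and increases by one per enclosing for loop.

```go
package main

import "fmt"

// node is a hypothetical, heavily simplified syntax-tree node.
type node struct {
	kind string // "func", "for", or "defer"
	name string // label, used only for defer nodes
	body []*node
}

// classify walks the tree and records, for each defer, which runtime
// entry point this toy rule would choose.
func classify(n *node, loopdepth int, out map[string]string) {
	switch n.kind {
	case "for":
		loopdepth++ // everything inside the loop is one level deeper
	case "defer":
		if loopdepth == 1 {
			out[n.name] = "deferprocStack" // top level: stack record
		} else {
			out[n.name] = "deferproc" // inside a loop: heap record
		}
	}
	for _, child := range n.body {
		classify(child, loopdepth, out)
	}
}

func main() {
	fn := &node{kind: "func", body: []*node{
		{kind: "defer", name: "top"},
		{kind: "for", body: []*node{
			{kind: "defer", name: "inLoop"},
			{kind: "for", body: []*node{
				{kind: "defer", name: "nested"},
			}},
		}},
	}}

	out := map[string]string{}
	classify(fn, 1, out)
	fmt.Println("top:   ", out["top"])    // deferprocStack
	fmt.Println("inLoop:", out["inLoop"]) // deferproc
	fmt.Println("nested:", out["nested"]) // deferproc
}
```

A loop can run its body any number of times, so a single stack slot reserved at compile time cannot hold one record per iteration; that is one plausible reading of why the compiler restricts the optimization to loopdepth 1.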
After working through all of this, though, I realized there was no need to go to so much trouble: the official tests show the same thing.