Changes to defer in Go 1.13

 5 years ago
source link: https://www.tuicool.com/articles/BZvuMnU

Go 1.13 has been officially released, and the release notes say that defer is now about 30% faster in most cases. Where does that 30% come from?

As we know, a defer statement used to be compiled into two runtime calls, deferproc and deferreturn (see here).

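The deferproc step now has a new sibling, deferprocStack, and the compiler chooses between deferproc and deferprocStack. Since the release notes say that most use cases are optimized, most defers should end up compiled to deferprocStack.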

// All other fields can contain junk.
// The defer record must be immediately followed in memory by
// the arguments of the defer.
// Nosplit because the arguments on the stack won't be scanned
// until the defer record is spliced into the gp._defer list.
//go:nosplit
func deferprocStack(d *_defer) {
	gp := getg()
	if gp.m.curg != gp {
		// go code on the system stack can't defer
		throw("defer on system stack")
	}
	// siz and fn are already set.
	// The other fields are junk on entry to deferprocStack and
	// are initialized here.
	d.started = false
	d.heap = false
	d.sp = getcallersp()
	d.pc = getcallerpc()
	// The lines below implement:
	//   d.panic = nil
	//   d.link = gp._defer
	//   gp._defer = d
	// But without write barriers. The first two are writes to
	// the stack so they don't need a write barrier, and furthermore
	// are to uninitialized memory, so they must not use a write barrier.
	// The third write does not require a write barrier because we
	// explicitly mark all the defer structures, so we don't need to
	// keep track of pointers to them with a write barrier.
	*(*uintptr)(unsafe.Pointer(&d._panic)) = 0
	*(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
	*(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

	return0()
	// No code can go here - the C return register has
	// been set and must not be clobbered.
}

A quick check:

package main

func main() {
	defer println(1)
}

The relevant part of the go tool compile -S output:

	0x003a 00058 (deferstack.go:4)	LEAQ	""..autotmp_1+8(SP), AX
	0x003f 00063 (deferstack.go:4)	PCDATA	$0, $0
	0x003f 00063 (deferstack.go:4)	MOVQ	AX, (SP)
	0x0043 00067 (deferstack.go:4)	CALL	runtime.deferprocStack(SB)
	0x0048 00072 (deferstack.go:4)	TESTL	AX, AX
	0x004a 00074 (deferstack.go:4)	JNE	92
	0x004c 00076 (deferstack.go:5)	XCHGL	AX, AX

The original deferproc still exists, so the _defer struct needs to record whether a given defer record was allocated on the stack or on the heap:

type _defer struct {
	siz     int32 // includes both arguments and results
	started bool
	heap    bool // newly added field
	sp      uintptr // sp at time of defer
	pc      uintptr
	fn      *funcval
	_panic  *_panic // panic that is running defer
	link    *_defer
}

Before deferprocStack existed, every defer went through deferproc. There is a deferpool, but when the pool runs dry an allocation like this is unavoidable:

d = (*_defer)(mallocgc(total, deferType, true))
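
The pool-then-allocate pattern described above can be sketched at user level as follows. This is a simplified model under stated assumptions: the record and pool types and the get and put methods are hypothetical, and the real runtime keeps per-P pools indexed by size class rather than a single slice.

```go
package main

import "fmt"

// record is a hypothetical stand-in for a heap-allocated _defer.
type record struct {
	heap bool
	link *record
}

// pool is a hypothetical free list; allocs counts fallback allocations.
type pool struct {
	free   []*record
	allocs int
}

// get reuses a pooled record if one is available and allocates otherwise.
func (p *pool) get() *record {
	if n := len(p.free); n > 0 {
		d := p.free[n-1]
		p.free[n-1] = nil
		p.free = p.free[:n-1]
		return d
	}
	p.allocs++ // pool is empty: this is the slow path
	return &record{heap: true}
}

// put resets a record and returns it to the free list.
func (p *pool) put(d *record) {
	*d = record{heap: true}
	p.free = append(p.free, d)
}

func main() {
	var p pool
	a := p.get() // pool empty, allocates
	p.put(a)
	b := p.get() // reuses a
	c := p.get() // pool empty again, allocates
	fmt.Println(a == b, b == c, p.allocs) // prints true false 2
}
```

Even when the pool hits, get and put still cost more than writing into a record that already lives in the caller's frame, which is the case deferprocStack targets.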

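People in the community have long complained that defer is slow, slow, slow. So this change is effectively the Go team responding to popular demand.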
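
To see the overhead on your own machine, a rough measurement along these lines may help. It is a sketch, not a rigorous benchmark: withDefer, withoutDefer and measure are hypothetical helper names, and the numbers depend on the Go version and hardware, so no expected output is shown.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

var (
	mu      sync.Mutex
	counter int
)

// withDefer unlocks through a top-level defer.
func withDefer() {
	mu.Lock()
	defer mu.Unlock()
	counter++
}

// withoutDefer unlocks explicitly.
func withoutDefer() {
	mu.Lock()
	counter++
	mu.Unlock()
}

// measure runs f under the standard benchmark driver outside of "go test".
func measure(f func()) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			f()
		}
	})
}

func main() {
	fmt.Println("with defer:   ", measure(withDefer))
	fmt.Println("without defer:", measure(withoutDefer))
}
```

Running the same program under Go 1.12 and Go 1.13 is one way to compare the two toolchains on the defer path.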

Why aren't all defer calls optimized into stack allocation? The choice is made in the compiler (ssa.go):

case ODEFER:
		d := callDefer
		if n.Esc == EscNever {
			d = callDeferStack
		}
		s.call(n.Left, d)

n.Esc is the escape analysis result stored on the AST node. It is set to EscNever mainly by this snippet in the escape analysis pass:

case ODEFER:
		if e.loopdepth == 1 { // top level
			n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
			break
		}

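How should we read loopdepth? Roughly, it goes up by one for each enclosing for loop, with the top level of a function at depth 1. Following that idea, here is an example where the defer record is still allocated on the heap. (It is meant to be compiled and inspected, not run: the deferred function loops forever.)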

package main

import "fmt"

func main() {
	for i := 0; i < 10; i++ {
		defer func() {
			for {
				var a = make([]int, 128)
				fmt.Println(a)
			}
		}()
	}
}

go tool compile -S

0x0043 00067 (deferproc.go:7)	PCDATA	$0, $0
	0x0043 00067 (deferproc.go:7)	MOVQ	AX, 8(SP)
	0x0048 00072 (deferproc.go:7)	CALL	runtime.deferproc(SB)
	0x004d 00077 (deferproc.go:7)	TESTL	AX, AX
	0x004f 00079 (deferproc.go:7)	JNE	83
	0x0051 00081 (deferproc.go:7)	JMP	33
	0x0053 00083 (deferproc.go:7)	XCHGL	AX, AX

Yep, the same familiar flavor: still deferproc.

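After digging through all of this, though, I realized there was no need to go to so much trouble. You can just read the official test, haha: here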

