defer的妙用之跟踪函数调用链

本文参考：Tony Bai 老师的极客时间《Tony Bai · Go语言第一课》课程

使用 defer 可以跟踪函数的执行过程
defer 会预计算参数(表达式进行求值): 详情见 https://luenci.com

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// trace.go
package main

func Trace(name string) func() {
	println("enter:", name)
	return func() {
		println("exit:", name)
	}
}
func foo() {
	defer Trace("foo")()
	bar()
}
func bar() {
	defer Trace("bar")()
}
func main() {
    defer Trace("main")() 
	foo()
}

out:
    enter: main
    enter: foo
    enter: bar
    exit: bar
    exit: foo
    exit: main

程序的函数调用的全过程一目了然地展现在了我们面前：程序按 main -> foo -> bar的函数调用次序执行，代码在函数的入口与出口处分别输出了跟踪日志。

Go 会对 defer 后面的表达式Trace("foo")()进行求值。由于这个表达式包含一个函数调用Trace("foo")，所以这个函数会被执行。

程序存在的问题

调用 Trace 时需手动显式传入要跟踪的函数名；
如果是并发应用，不同 Goroutine 中函数链跟踪混在一起无法分辨；
输出的跟踪结果缺少层次感，调用关系不易识别；
对要跟踪的函数，需手动调用 Trace 函数

接下来我们一步一步的解决这些问题，来完善我们的函数调用链。

自动获取所跟踪函数的函数名

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
func Trace() func() {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		panic("not found caller")
	}
	fn := runtime.FuncForPC(pc)
    name := fn.Name()
	println("enter:", name)
	return func() { println("exit:", name) }
}

...

改进后的Trace()函数，通过 runtime.Caller 函数获得当前 Goroutine 的函数调用栈上的信息，runtime.Caller 的参数标识的是要获取的是哪一个栈帧的信息。当参数为 0 时，返回的是 Caller 函数的调用者的函数信息，在这里就是 Trace 函数。但我们需要的是 Trace 函数的调用者的信息，于是我们传入 1。

Caller 函数有四个返回值：

第一个返回值代表的是程序计数（pc）；
第二个和第三个参数代表对应函数所在的源文件名以及所在行数，这里我们暂时不需要
最后一个参数代表是否能成功获取这些信息

如果获取失败，我们抛出 panic。然后，我们通过 runtime.FuncForPC 函数和程序计数器（PC）得到被跟踪函数的函数名称。runtime.FuncForPC 返回的名称中不仅仅包含函数名，还包含了被跟踪函数所在的包名

增加 Goroutine 标识

注意：Go 核心团队为了避免 Goroutine ID 的滥用，故意没有将 Goroutine ID 暴露给开发者。但是在 Go 标准库的 h2_bundle.go 中发现了一个官方的获取 Goroutine ID的用法。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// curGoroutineID 获取当前goroutine的id
func curGoroutineID() uint64 {
	b := make([]byte, 64)
	b = b[:runtime.Stack(b, false)]
	// Parse the 4707 out of "goroutine 4707 ["
	b = bytes.TrimPrefix(b, goroutineSpace)
	i := bytes.IndexByte(b, ' ')
	if i < 0 {
		panic(fmt.Sprintf("No space found in %q", b))
	}
	b = b[:i]
	n, err := strconv.ParseUint(string(b), 10, 64)
	if err != nil {
		panic(fmt.Sprintf("Failed to parse goroutine ID out of %q: %v", b, err))
	}
	return n
}

func Trace(name string) func() {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		panic("not found caller")
	}
	fn := runtime.FuncForPC(pc)
	name = fn.Name()
	gid := curGoroutineID()
	fmt.Printf("g[%05d]: enter: [%s]\n", gid, name)
	return func() { fmt.Printf("g[%05d]: exit: [%s]\n", gid, name) }
}
....

out:
    g[00001]: enter: [main.main]
    g[00001]: enter: [main.foo]
    g[00001]: enter: [main.bar]
    g[00001]: exit: [main.bar]
    g[00001]: exit: [main.foo]
    g[00001]: exit: [main.main]

改进后的代码在出入口输出的跟踪信息中加入了 Goroutine ID 信息，我们输出的 Goroutine ID 为 5 位数字，如果 ID 值不足 5 位，则左补零，这一切都是 Printf 函数的格式控制字符串%05d帮助我们实现的。这样对齐 Goroutine ID 的位数，为的是输出信息格式的一致性更好。如果你的 Go 程序中 Goroutine 的数量超过了 5 位数可以表示的数值范围，也可以自行调整控制字符串。

多Goroutine 测试

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
// trace.go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"strconv"
	"sync"
)

var goroutineSpace = []byte("goroutine ")

// curGoroutineID 获取当前goroutine的id
func curGoroutineID() uint64 {
	b := make([]byte, 64)
	b = b[:runtime.Stack(b, false)]
	// Parse the 4707 out of "goroutine 4707 ["
	b = bytes.TrimPrefix(b, goroutineSpace)
	i := bytes.IndexByte(b, ' ')
	if i < 0 {
		panic(fmt.Sprintf("No space found in %q", b))
	}
	b = b[:i]
	n, err := strconv.ParseUint(string(b), 10, 64)
	if err != nil {
		panic(fmt.Sprintf("Failed to parse goroutine ID out of %q: %v", b, err))
	}
	return n
}

func Trace() func() {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		panic("not found caller")
	}
	fn := runtime.FuncForPC(pc)
	name := fn.Name()
	gid := curGoroutineID()
	fmt.Printf("g[%05d]: enter: [%s]\n", gid, name)
	return func() { fmt.Printf("g[%05d]: exit: [%s]\n", gid, name) }
}

func A1() {
	defer Trace()()
	B1()
}
func B1() {
	defer Trace()()
	C1()
}
func C1() {
	defer Trace()()
	D()
}
func D() {
	defer Trace()()
}
func A2() {
	defer Trace()()
	B2()
}
func B2() {
	defer Trace()()
	C2()
}
func C2() {
	defer Trace()()
	D()
}
func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		A2()
		wg.Done()
	}()
	A1()
	wg.Wait()
}

out:
	g[00001]: enter: [main.A1]
    g[00001]: enter: [main.B1]
    g[00001]: enter: [main.C1]
    g[00001]: enter: [main.D]
    g[00001]: exit: [main.D]
    g[00001]: exit: [main.C1]
    g[00001]: exit: [main.B1]
    g[00001]: exit: [main.A1]
    g[00006]: enter: [main.A2]
    g[00006]: enter: [main.B2]
    g[00006]: enter: [main.C2]
    g[00006]: enter: [main.D]
    g[00006]: exit: [main.D]
    g[00006]: exit: [main.C2]
    g[00006]: exit: [main.B2]
    g[00006]: exit: [main.A2]

由于 Go 运行时对 Goroutine 调度顺序的不确定性，各个 Goroutine 的输出还是会存在交织在一起的问题，这会给你查看某个 Goroutine 的函数调用链跟踪信息带来阻碍。这里提供一个小技巧：你可以将程序的输出重定向到一个本地文件中，然后通过 Goroutine ID 过滤出（可使用 grep 工具）你想查看的 groutine 的全部函数跟踪信息。

优化输出的跟踪信息

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
var goroutineSpace = []byte("goroutine ")

var mu sync.Mutex
var m = make(map[uint64]int)

// printTrace 层次输出追踪栈
func printTrace(id uint64, name, arrow string, indent int) {
	indents := ""
	for i := 0; i < indent; i++ {
		indents += " "
	}
	fmt.Printf("g[%05d]:%s%s%s\n", id, indents, arrow, name)
}

// curGoroutineID 获取当前goroutine的id
func curGoroutineID() uint64 {
	b := make([]byte, 64)
	b = b[:runtime.Stack(b, false)]
	// Parse the 4707 out of "goroutine 4707 ["
	b = bytes.TrimPrefix(b, goroutineSpace)
	i := bytes.IndexByte(b, ' ')
	if i < 0 {
		panic(fmt.Sprintf("No space found in %q", b))
	}
	b = b[:i]
	n, err := strconv.ParseUint(string(b), 10, 64)
	if err != nil {
		panic(fmt.Sprintf("Failed to parse goroutine ID out of %q: %v", b, err))
	}
	return n
}

func Trace() func() {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		panic("not found caller")
	}
	fn := runtime.FuncForPC(pc)
	name := fn.Name()
	gid := curGoroutineID()

	mu.Lock()
	indents := m[gid]    // 获取当前gid对应的缩进层次
	m[gid] = indents + 1 // 缩进层次+1后存入map
	mu.Unlock()
	printTrace(gid, name, "->", indents+1)
	return func() {
		mu.Lock()
		indents := m[gid]    // 获取当前gid对应的缩进层次
		m[gid] = indents - 1 // 缩进层次-1后存入map
		mu.Unlock()
		printTrace(gid, name, "<-", indents)
	}
}

...


out:
	g[00001]: ->main.A1
    g[00001]:  ->main.B1
    g[00001]:   ->main.C1
    g[00001]:    ->main.D
    g[00001]:    <-main.D
    g[00001]:   <-main.C1
    g[00001]:  <-main.B1
    g[00001]: <-main.A1
    g[00006]: ->main.A2
    g[00006]:  ->main.B2
    g[00006]:   ->main.C2
    g[00006]:    ->main.D
    g[00006]:    <-main.D
    g[00006]:   <-main.C2
    g[00006]:  <-main.B2
    g[00006]: <-main.A2

为了格式化输出我们使用了一个 map 类型变量 m 来保存每个 Goroutine 当前的缩进信息：

m 的 key 为 Goroutine 的 ID，值为缩进的层次。

然后，考虑到 Trace 函数可能在并发环境中运行，因为的“map 不支持并发写”的，我们增加了一个 sync.Mutex 实例 mu 用于同步对 m 的写操作。这样，对于一个 Goroutine 来说，每次刚进入一个函数调用，我们就在输出入口跟踪信息之前，将缩进层次加一，并输出入口跟踪信息，加一后的缩进层次值也保存到 map 中。然后，在函数退出前，我们取出当前缩进层次值并输出出口跟踪信息，之后再将缩进层次减一后保存到 map中。

除了增加缩进层次信息外，在这一版的 Trace 函数实现中，我们也把输出出入口跟踪信息的操作提取到了一个独立的函数 printTrace 中，这个函数会根据传入的 Goroutine ID、函数名、箭头类型与缩进层次值，按预定的格式拼接跟踪信息并输出。

利用代码生成自动注入 Trace 函数

将一些核心的逻辑封装程一个 module （完整代码在地址在 [trace](luenci/trace: 一个跟踪函数调用链的工具库 (github.com)) ）

扩展：函数调用可视化

推荐一个三方库：https://github.com/ofabry/go-callvis

defer的妙用之跟踪函数调用链#

自动获取所跟踪函数的函数名#

增加 Goroutine 标识#

优化输出的跟踪信息#

利用代码生成自动注入 Trace 函数#

扩展：函数调用可视化#