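I built a tool that dumps the runtime information of a target Go process without modifying its source code. It also works on processes running inside containers: https://github.com/monsterxx03/gospy
Limitations:
It can be used to detect goroutine leaks and to dump details of a Go process's GMP scheduling model. Understanding some of the displayed fields may require some familiarity with the Go runtime; most of them come from runtime/runtime2.go and runtime/proc.go.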
Show what each goroutine is doing: sudo ./gospy summary --pid 123
bin: /proc/5134/exe, goVer: 1.12.8, gomaxprocs: 6
Sched: NMidle 6, NMspinning 0, NMfreed 0, NPidle 5, NGsys 16, Runqsize: 0
P0 idle, schedtick: 642, syscalltick: 81, curM: M0
P1 idle, schedtick: 959, syscalltick: 67, curM: nil
P2 idle, schedtick: 992, syscalltick: 32, curM: nil
P3 idle, schedtick: 581, syscalltick: 17, curM: nil
P4 idle, schedtick: 89, syscalltick: 8, curM: nil
P5 idle, schedtick: 231, syscalltick: 5, curM: nil
Threads: 14 total, 0 running, 14 sleeping, 0 stopped, 0 zombie
Goroutines: 44 total, 0 idle, 0 running, 5 syscall, 39 waiting
goroutines:
1 - waiting for chan receive: main (/usr/local/go/src/runtime/proc.go:110)
2 - waiting for force gc (idle): forcegchelper (/usr/local/go/src/runtime/proc.go:242)
3 - waiting for GC sweep wait: bgsweep (/usr/local/go/src/runtime/mgcsweep.go:64)
7(M8)- syscall: timerproc (/usr/local/go/src/runtime/time.go:247)
8 - waiting for select: start (/app/vendor/go.opencensus.io/stats/view/worker.go:149)
14 - waiting for GC worker (idle): gcBgMarkWorker (/usr/local/go/src/runtime/mgc.go:1807)
15 - waiting for GC worker (idle): gcBgMarkWorker (/usr/local/go/src/runtime/mgc.go:1807)
17 - waiting for finalizer wait: runfinq (/usr/local/go/src/runtime/mfinal.go:161)
19(M4)- syscall: loop (/usr/local/go/src/os/signal/signal_unix.go:21)
...
Group goroutines by the function they are currently executing and count each group; a function whose count keeps growing indicates a leak. sudo gospy top --pid 123:
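The grouping idea can be sketched in a few lines of Go. This is not gospy's implementation, only an illustration that works on text shaped like the goroutine lines in the summary output above. The regular expression, the `countByFunc` helper, and the assumption that the summary is written to stdout are all mine.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// goroutineLine matches lines shaped like the summary output, e.g.
//   "8 - waiting for select: start (/app/.../worker.go:149)"
//   "7(M8)- syscall: timerproc (/usr/local/go/src/runtime/time.go:247)"
// and captures the function name between the status and the source location.
// The pattern is an assumption based on the sample output only.
var goroutineLine = regexp.MustCompile(`^\s*\d+(?:\(M\d+\))?\s*-\s*.*: (\S+) \(.*\)\s*$`)

// countByFunc returns how many goroutines are currently in each function.
// Lines that do not look like goroutine entries are ignored.
func countByFunc(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		if m := goroutineLine.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Read the summary text from stdin.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}

	counts := countByFunc(lines)

	// Sort by count (descending), then by name, so the largest groups come first.
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		if counts[names[i]] != counts[names[j]] {
			return counts[names[i]] > counts[names[j]]
		}
		return names[i] < names[j]
	})
	for _, name := range names {
		fmt.Printf("%6d  %s\n", counts[name], name)
	}
}
```

If the summary does go to stdout, something like `sudo ./gospy summary --pid 123 | go run count.go` (`count.go` being a hypothetical file name) run at intervals would show which function's count keeps climbing.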
1
sherlockmao 2019-09-30 14:16:12 +08:00
Thanks! I learned a lot. The logic in the binary and proc files is very clear. Thank you so much!