Monitoring raised an alert, so I checked with the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1171 root 20 0 1155672 785112 77040 S 120.3 9.6 157:44.39 kube-apiserver
7903 root 20 0 10.742g 777632 46784 S 5.3 9.5 8:23.43 etcd
8957 root 20 0 1365948 123764 73864 S 1.3 1.5 2:57.95 kubelet
10369 root 20 0 44012 31584 20276 S 1.3 0.4 1:53.49 calico-felix
1147 root 20 0 451168 89944 68120 S 1.0 1.1 1:51.80 kube-scheduler
As you can see, kube-apiserver CPU usage has shot up to about 120%, and I don't know what is causing it.
The apiserver log shows:
E0425 10:00:25.721663 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error: 'dial tcp 10.43.42.227:443: i/o timeout'
Trying to reach: 'https://10.43.42.227:443/openapi/v2', Header: map[]
I0425 10:00:25.721833 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0425 10:00:25.722490 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0425 10:00:26.573574 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0425 10:00:27.436960 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
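To see at a glance which aggregated APIs are failing and how, the `available_controller.go` error lines can be tallied per APIService. What follows is only a minimal Python sketch, not something from the thread: the names `LINE_RE`, `classify`, and `summarize` are made up for illustration, the regex assumes the log format shown in this thread, and the sample lines are copied from the logs posted here.

```python
import re
from collections import Counter

# Assumed line shape (taken from the logs in this thread):
# E0425 10:00:26.573574 1 available_controller.go:311] <apiservice> failed with: <message>
LINE_RE = re.compile(
    r"^E\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ available_controller\.go:\d+\] "
    r"(?P<apiservice>\S+) failed with: (?P<message>.*)$"
)


def classify(message: str) -> str:
    """Bucket an error message into a coarse, hypothetical category."""
    if "the object has been modified" in message:
        return "conflict"  # optimistic-concurrency update conflict
    if "timeout" in message.lower():
        return "timeout"   # the apiserver could not reach the backing service
    return "other"


def summarize(log_text: str) -> Counter:
    """Count (apiservice, category) pairs over all matching log lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["apiservice"], classify(match["message"]))] += 1
    return counts


SAMPLE = """\
E0425 09:11:16.391080 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Get https://10.43.42.227:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0425 10:00:26.573574 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0425 10:00:27.436960 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
"""

if __name__ == "__main__":
    for (apiservice, kind), count in summarize(SAMPLE).most_common():
        print(f"{apiservice}\t{kind}\t{count}")
```

If the timeouts dominate for one APIService, that suggests first checking whether the service behind it (here the cert-manager admission webhook or metrics-server) is actually reachable from the apiserver.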
1
HypoChen 2019-04-25 17:01:23 +08:00
Check the logs first and see whether anything looks abnormal, for example whether some buggy service is DoS-ing your API server.
2
salamanderMH OP @HypoChen
I looked at the apiserver log:
```
E0425 09:11:11.383772 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0425 09:11:14.341853 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0425 09:11:16.391080 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Get https://10.43.42.227:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0425 09:11:19.349480 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.43.219.61:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0425 09:11:21.400839 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.admission.certmanager.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0425 09:11:24.367592 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
```
3
HypoChen 2019-04-25 18:30:02 +08:00
@salamanderMH How much network request traffic is the API server handling?
4
salamanderMH OP @HypoChen I see about 1.29 Mbit/s of outbound traffic and 400 kbit/s of inbound traffic on the internal network.
5
0312birdzhang 2019-04-26 08:16:42 +08:00
Which version are you running? I suspect that version has a bug; restarting kubelet can mitigate it.
6
salamanderMH OP @0312birdzhang 1.11
7
0312birdzhang 2019-04-26 12:35:39 +08:00
@salamanderMH Please give the exact patch version.
8
salamanderMH OP @0312birdzhang v1.11.6
9
0312birdzhang 2019-04-26 13:36:31 +08:00
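@salamanderMH #8 You can upgrade straight to 1.11.7; there is a bug that was fixed in 1.11.7. That said, your errors are not quite the same as ours: ours said something about the version having been changed.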
10
salamanderMH OP @0312birdzhang OK, I'll give it a try.