Node-metrics: an agent for in-memory node statistics and computation
Background
See the first article in this series: https://stack.kubeservice.cn/docs/%E8%AE%BE%E8%AE%A1%E6%96%87%E6%A1%A3/k8s-crane-plus-schduler/
Implementation
Node Metrics is an in-memory statistics and computation module that provides aggregate queries over metrics, such as avg, min, and max.
Node Metrics = Node exporter + Prometheus PromQL
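Node Metrics adds the following components:
- Memory TSDB: a lightweight in-memory time-series store.
- Statistics: general-purpose static functions over the in-memory data, such as avg, min, and max.
- Scheduler: periodic collection, with a unified method for reading data from proc.
- Server Handler: exposes the data externally through the metrics and statistics endpoints.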
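Taking one day of stored data as an example: a sample is stored every 10s, and each sample holds 3 raw values (cpu, memory, and disk).
The total storage comes to just under 300KB:
(3 * 8Byte(float64) + 8Byte(timestamp)) * 24 * 3600 / 10 = 276480Byte = 270KB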
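The estimate above can be checked with a few lines of arithmetic. The function and constant names are illustrative only, and the calculation counts raw payload bytes, ignoring any overhead of the data structures that hold the samples.

```python
FLOAT64_BYTES = 8    # one float64 value
TIMESTAMP_BYTES = 8  # one timestamp per sample


def daily_storage_bytes(values_per_sample: int = 3, interval_seconds: int = 10) -> int:
    """Raw bytes needed to keep one day of samples in memory."""
    samples_per_day = 24 * 3600 // interval_seconds
    bytes_per_sample = values_per_sample * FLOAT64_BYTES + TIMESTAMP_BYTES
    return bytes_per_sample * samples_per_day


total = daily_storage_bytes()
print(total, total / 1024)  # 276480 270.0
```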
Usage
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-metrics
  name: node-metrics
  namespace: crane-system
spec:
  selector:
    matchLabels:
      app: node-metrics
  template:
    metadata:
      labels:
        app: node-metrics
    spec:
      containers:
      - image: dongjiang1989/node-metrics:latest
        name: node-metrics
        args:
        - --web.listen-address=0.0.0.0:19101
        resources:
          limits:
            cpu: 102m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
      hostNetwork: true
      hostPID: true
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
Two kinds of endpoints are exposed:
- The “/metrics” endpoint
...
# HELP node_cpu_usage_active cpu usage active.
# TYPE node_cpu_usage_active gauge
node_cpu_usage_active 6.801955214695443
# HELP node_cpu_usage_avg_5m cpu usage avg 5m.
# TYPE node_cpu_usage_avg_5m gauge
node_cpu_usage_avg_5m 6.8018810008297335
# HELP node_cpu_usage_max_avg_1d cpu usage max avg 1d.
# TYPE node_cpu_usage_max_avg_1d gauge
node_cpu_usage_max_avg_1d 6.801955214695443
# HELP node_cpu_usage_max_avg_1h cpu usage max avg 1h.
# TYPE node_cpu_usage_max_avg_1h gauge
node_cpu_usage_max_avg_1h 6.801955214695443
# HELP node_mem_usage_active mem usage active.
# TYPE node_mem_usage_active gauge
node_mem_usage_active 44.272822236553765
# HELP node_mem_usage_avg_5m mem usage avg 5m.
# TYPE node_mem_usage_avg_5m gauge
node_mem_usage_avg_5m 43.68676937682602
# HELP node_mem_usage_max_avg_1d mem usage max avg 1d.
# TYPE node_mem_usage_max_avg_1d gauge
node_mem_usage_max_avg_1d 44.447325557125225
# HELP node_mem_usage_max_avg_1h mem usage max avg 1h.
# TYPE node_mem_usage_max_avg_1h gauge
node_mem_usage_max_avg_1h 44.447325557125225
...
- The “/statistics” endpoint
{
  "cpu_usage_active": 6.801955214695443,
  "cpu_usage_avg_5m": 6.8018810008297335,
  "cpu_usage_max_avg_1d": 6.801955214695443,
  "cpu_usage_max_avg_1h": 6.801955214695443,
  "mem_usage_active": 44.272822236553765,
  "mem_usage_avg_5m": 43.68676937682602,
  "mem_usage_max_avg_1d": 44.447325557125225,
  "mem_usage_max_avg_1h": 44.447325557125225
}
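As a rough illustration of consuming the two endpoints, the sketch below parses a shortened copy of each sample payload shown above, without contacting a live node. The helper parse_metrics_text is a hypothetical name and only handles label-free "name value" lines like the ones in the sample; it is not a full Prometheus exposition-format parser.

```python
import json
from typing import Dict


def parse_metrics_text(text: str) -> Dict[str, float]:
    """Parse label-free `name value` samples; skip comments and other lines."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        parts = line.split()
        if len(parts) < 2:
            continue  # e.g. an elision marker such as "..."
        samples[parts[0]] = float(parts[1])
    return samples


# Shortened copies of the sample payloads from this section.
METRICS_SAMPLE = """\
# HELP node_cpu_usage_active cpu usage active.
# TYPE node_cpu_usage_active gauge
node_cpu_usage_active 6.801955214695443
# HELP node_mem_usage_avg_5m mem usage avg 5m.
# TYPE node_mem_usage_avg_5m gauge
node_mem_usage_avg_5m 43.68676937682602
"""
STATISTICS_SAMPLE = (
    '{"cpu_usage_active": 6.801955214695443, '
    '"mem_usage_avg_5m": 43.68676937682602}'
)

metrics = parse_metrics_text(METRICS_SAMPLE)
statistics = json.loads(STATISTICS_SAMPLE)

# In the samples, each /metrics name is the /statistics key with a "node_" prefix.
matches = all(metrics["node_" + key] == value for key, value in statistics.items())
print(matches)  # True
```

Against a running pod, the same text would come from port 19101 on the node, as configured by --web.listen-address in the DaemonSet above.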
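Real-world usage
In production, each node-metrics pod started by the DaemonSet uses about 7MB of memory.
Taking a cluster of 1000 nodes as an example: 7MB * 1000 ~= 7GB
This is a considerable saving compared with two 32Core 64GB machines.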