节点内存态统计和计算Node-metrics

节点内存态统计和计算agent - Node-metrics

背景

请查看第一篇:https://stack.kubeservice.cn/docs/%E8%AE%BE%E8%AE%A1%E6%96%87%E6%A1%A3/k8s-crane-plus-schduler/

实现

Node Metrics是内存态统计计算模块,实现metrics的avgminmax 等级的数据聚合查询。

Node Metrics = Node exporter + Prometheus PromSQL

Node Metrics中添加了:

  • Memory TSDB, 添加轻量内存化内存存储
  • Statistics, 实现通用内存avgminmax等静态function方法
  • Scheduler, 实现定时采集,数据从proc中采集统一方法
  • Server Handler, 数据通过metricsstatistics 方法对外提供

以存储一天数据为例: 每10s存储一次,每次存储cpumemorydisk 原生数据 3个 整个存储数量为: 也就是 300KB 不到.

(38Byte(float64)+8Byte(time数据)) 24 * 3600/10 = 276480Byte = 270KB

使用

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-metrics
  name: node-metrics
  namespace: crane-system
spec:
  selector:
    matchLabels:
      app: node-metrics
  template:
    metadata:
      labels:
        app: node-metrics
    spec:
      containers:
      - image: dongjiang1989/node-metrics:latest
        name: node-metrics
        args:
        - --web.listen-address=0.0.0.0:19101
        resources:
          limits:
            cpu: 102m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
      hostNetwork: true
      hostPID: true
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master

两类接口:

  1. 接口“/metrics”接口
...
# HELP node_cpu_usage_active cpu usage active.
# TYPE node_cpu_usage_active gauge
node_cpu_usage_active 6.801955214695443
# HELP node_cpu_usage_avg_5m cpu usage avg 5m.
# TYPE node_cpu_usage_avg_5m gauge
node_cpu_usage_avg_5m 6.8018810008297335
# HELP node_cpu_usage_max_avg_1d cpu usage max avg 1d.
# TYPE node_cpu_usage_max_avg_1d gauge
node_cpu_usage_max_avg_1d 6.801955214695443
# HELP node_cpu_usage_max_avg_1h cpu usage max avg 1h.
# TYPE node_cpu_usage_max_avg_1h gauge
node_cpu_usage_max_avg_1h 6.801955214695443
# HELP node_mem_usage_active mem usage active.
# TYPE node_mem_usage_active gauge
node_mem_usage_active 44.272822236553765
# HELP node_mem_usage_avg_5m mem usage avg 5m.
# TYPE node_mem_usage_avg_5m gauge
node_mem_usage_avg_5m 43.68676937682602
# HELP node_mem_usage_max_avg_1d mem usage max avg 1d.
# TYPE node_mem_usage_max_avg_1d gauge
node_mem_usage_max_avg_1d 44.447325557125225
# HELP node_mem_usage_max_avg_1h mem usage max avg 1h.
# TYPE node_mem_usage_max_avg_1h gauge
node_mem_usage_max_avg_1h 44.447325557125225
...
  1. 接口“/statistics”接口
{
  "cpu_usage_active": 6.801955214695443,
  "cpu_usage_avg_5m": 6.8018810008297335,
  "cpu_usage_max_avg_1d": 6.801955214695443,
  "cpu_usage_max_avg_1h": 6.801955214695443,
  "mem_usage_active": 44.272822236553765,
  "mem_usage_avg_5m": 43.68676937682602,
  "mem_usage_max_avg_1d": 44.447325557125225,
  "mem_usage_max_avg_1h": 44.447325557125225
}

真实使用

线上每一个DaemonSetnode-metrics 占用7MB内存.

以1000个node节点为例子: 7MB*1000 ~= 7GB 比2台32Core 64GB机器节约不少

Source

https://github.com/kubeservice-stack/node-metrics