「PROMETHEUS」- kubelet cAdvisor

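cAdvisor collects CPU, memory, filesystem, and network usage statistics for every container running on a given node. The kubelet has cAdvisor built in and uses it to monitor resource usage and analyze container performance (cAdvisor works at the container level, not the Pod level).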

Metrics

cadvisor/prometheus.md/Prometheus container metrics

Disk and filesystem metrics

Bandwidth = irate(container_fs_(reads|writes)_bytes_total{}[5m])
1) container_fs_reads_bytes_total	Counter	Cumulative count of bytes read	bytes	diskIO
2) container_fs_writes_bytes_total	Counter	Cumulative count of bytes written	bytes	diskIO

IOPS = irate(container_fs_(reads|writes)_total{}[5m])
1) container_fs_reads_total	Counter	Cumulative count of reads completed		diskIO
2) container_fs_writes_total	Counter	Cumulative count of writes completed		diskIO

Latency = irate(container_fs_(read|write)_seconds_total{}[5m]) / irate(container_fs_(reads|writes)_total{}[5m])
1) container_fs_read_seconds_total	Counter	Cumulative count of seconds spent reading	seconds	diskIO
2) container_fs_write_seconds_total	Counter	Cumulative count of seconds spent writing	seconds	diskIO

Merged = irate(container_fs_(reads|writes)_merged_total{}[5m])
1) container_fs_reads_merged_total	Counter	Cumulative count of reads merged		diskIO
2) container_fs_writes_merged_total	Counter	Cumulative count of writes merged		diskIO
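
To make the arithmetic behind these formulas concrete, here is a minimal Python sketch. It is an illustration only, not how Prometheus is implemented: the hypothetical `irate` helper imitates PromQL's `irate()` by using just the last two samples in the window and treating a decreasing value as a counter reset. All sample values are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def irate(samples: List[Sample]) -> float:
    """Per-second rate from the last two samples, in the spirit of PromQL irate()."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be strictly increasing")
    # A counter that went down is assumed to have been reset to zero in between.
    delta = v1 if v1 < v0 else v1 - v0
    return delta / (t1 - t0)


# Hypothetical samples for one container, scraped 15 seconds apart.
reads_bytes = [(0.0, 1_000_000.0), (15.0, 4_000_000.0)]  # container_fs_reads_bytes_total
reads_total = [(0.0, 100.0), (15.0, 400.0)]              # container_fs_reads_total
read_seconds = [(0.0, 2.0), (15.0, 3.5)]                 # container_fs_read_seconds_total

bandwidth = irate(reads_bytes)                       # bytes read per second
iops = irate(reads_total)                            # reads completed per second
latency = irate(read_seconds) / irate(reads_total)   # average seconds per read

print(bandwidth, iops, latency)
```

Because the latency formula divides by the operation rate, a container with no I/O in the window yields a division by zero (NaN in PromQL), so dashboards typically have to tolerate missing points there.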

container_blkio_device_usage_total	Counter	Blkio device bytes usage	bytes	diskIO

container_fs_inodes_free	Gauge	Number of available Inodes		disk	
container_fs_inodes_total	Gauge	Total number of Inodes		disk	

container_fs_io_current	Gauge	Number of I/Os currently in progress		diskIO	
container_fs_io_time_seconds_total	Counter	Cumulative count of seconds spent doing I/Os	seconds	diskIO	
container_fs_io_time_weighted_seconds_total	Counter	Cumulative weighted I/O time	seconds	diskIO	
container_fs_limit_bytes	Gauge	Number of bytes that can be consumed by the container on this filesystem	bytes	disk	

container_fs_sector_reads_total	Counter	Cumulative count of sector reads completed		diskIO	
container_fs_sector_writes_total	Counter	Cumulative count of sector writes completed		diskIO	
container_fs_usage_bytes	Gauge	Number of bytes that are consumed by the container on this filesystem	bytes	disk
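
One possible use of the gauges above is to derive utilization ratios. The sketch below assumes the common convention of used divided by capacity; the metric list itself does not prescribe these ratios, and the helper name and all values are hypothetical.

```python
def utilization(used: float, capacity: float) -> float:
    """Fraction of capacity in use; 0.0 when the capacity is unknown or zero."""
    if capacity <= 0:
        return 0.0
    return used / capacity


# Hypothetical gauge values for one container filesystem.
fs_usage_bytes = 25 * 1024**3    # container_fs_usage_bytes
fs_limit_bytes = 100 * 1024**3   # container_fs_limit_bytes
inodes_total = 1_000_000         # container_fs_inodes_total
inodes_free = 750_000            # container_fs_inodes_free

fs_util = utilization(fs_usage_bytes, fs_limit_bytes)
inode_util = utilization(inodes_total - inodes_free, inodes_total)

print(f"filesystem: {fs_util:.0%}, inodes: {inode_util:.0%}")
```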

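Limitations:
1) Not every block device can be monitored. Take container_fs_(read|write)_seconds_total as an example: according to the source code (cadvisor/prometheus.go), this metric is tied to a filesystem, so a block device that is not mounted has no metrics at all.
2) Network filesystems cannot be monitored.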

Deployment

Configuring the Exporter instance

The kubelet exposes the metrics through the API server at /api/v1/nodes/<node>/proxy/metrics/cadvisor, so no separate Exporter instance is needed.
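
As a small illustration of that endpoint, the following Python sketch builds the proxy path for a node. The function name and the node name `worker-1` are hypothetical; an actual request would also need the API server address and a bearer token with permission to access the node proxy.

```python
from urllib.parse import quote


def cadvisor_proxy_path(node_name: str) -> str:
    """Return the API server proxy path for a node's cAdvisor metrics."""
    if not node_name:
        raise ValueError("node_name must not be empty")
    # Escape the node name so it cannot break out of its path segment.
    return f"/api/v1/nodes/{quote(node_name, safe='')}/proxy/metrics/cadvisor"


path = cadvisor_proxy_path("worker-1")  # hypothetical node name
print(path)
```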

Configuring the Prometheus scrape job

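  - job_name: 'kubelet-cadvisor'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      # Note: while insecure_skip_verify is true, the server certificate is not verified against ca_file.
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
      # Copy the Kubernetes node labels onto the target.
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      # Send every scrape to the API server instead of the node itself.
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      # Reach each node's cAdvisor endpoint through the API server proxy.
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor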
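
To show what the three relabel rules do to a discovered node target, here is a rough Python simulation. It is a simplified sketch, not Prometheus's relabeling code: the function name and the discovered label values are hypothetical, and only the behavior these rules need is modeled (fully anchored regexes, `labelmap`, and the default `replace` action).

```python
import re
from typing import Dict


def apply_relabeling(discovered: Dict[str, str]) -> Dict[str, str]:
    """Apply the three relabel rules of the scrape job to one target's labels."""
    labels = dict(discovered)

    # Rule 1 (labelmap): copy __meta_kubernetes_node_label_<name> to <name>.
    # Relabel regexes are anchored at both ends, hence fullmatch.
    labelmap = re.compile(r"__meta_kubernetes_node_label_(.+)")
    for name, value in discovered.items():
        match = labelmap.fullmatch(name)
        if match:
            labels[match.group(1)] = value

    # Rule 2 (replace, no source_labels): always point at the API server.
    labels["__address__"] = "kubernetes.default.svc:443"

    # Rule 3 (replace): build the per-node proxy path from the node name.
    match = re.fullmatch(r"(.+)", labels.get("__meta_kubernetes_node_name", ""))
    if match:
        labels["__metrics_path__"] = (
            f"/api/v1/nodes/{match.group(1)}/proxy/metrics/cadvisor"
        )
    return labels


# Hypothetical labels as service discovery might produce them for one node.
target = apply_relabeling({
    "__address__": "10.0.0.5:10250",
    "__meta_kubernetes_node_name": "worker-1",
    "__meta_kubernetes_node_label_kubernetes_io_hostname": "worker-1",
})
print(target["__address__"], target["__metrics_path__"])
```

The effect is that Prometheus never contacts the kubelet directly: every node becomes a separate target that differs only in its metrics path on the API server.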

Configuring Grafana dashboards

Grafana Labs/Docker-cAdvisor
Grafana Labs/Docker and OS metrics ( cadvisor, node_exporter )

FAQ

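container="POD"

What is the container="POD" label in Prometheus and why do most examples exclude it?

Series labeled container="POD" belong to the pause containers, the infrastructure container created for each Pod. Queries about application containers usually filter these series out.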
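
The same exclusion can be sketched in Python on already-fetched series. In PromQL it corresponds to adding the matcher `container!="POD"`. The helper name, labels, and values below are hypothetical.

```python
from typing import Dict, List, Tuple

Series = Tuple[Dict[str, str], float]  # (labels, sample value)


def exclude_pause(series: List[Series]) -> List[Series]:
    """Drop series that belong to pause containers (container="POD")."""
    return [(labels, value) for labels, value in series
            if labels.get("container") != "POD"]


# Hypothetical per-container values for a single Pod.
samples: List[Series] = [
    ({"pod": "web-0", "container": "POD"}, 1.0),
    ({"pod": "web-0", "container": "nginx"}, 40.0),
    ({"pod": "web-0", "container": "sidecar"}, 10.0),
]

app_total = sum(value for _, value in exclude_pause(samples))
print(app_total)
```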

References

Kubernetes cAdvisor: Native Monitoring and Metrics
Container monitoring: cAdvisor – prometheus-book

Filed under: K4NZDROID - @ 11:19 PM
