Prometheus Fundamentals Explained

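1. Why Prometheus?

1.1 The limits of traditional monitoring

Before cloud computing and containers took over, traditional tools such as Zabbix and Nagios dominated monitoring. Their design philosophy:

Host-centric: the IP address is the primary identifier, and each physical or virtual machine is one monitored object
Push-oriented: agents typically report data to the server, which receives it passively (Zabbix also offers a polling mode)
Configuration-driven: templates, triggers, and actions require a lot of manual configuration

This model works well for static infrastructure but shows serious weaknesses in dynamic, cloud-native environments: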

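Scenario	Pain point with traditional monitoring
Kubernetes Pod autoscaling	Pod IPs change on every reschedule or scale event, so targets cannot be configured in advance
Microservice architecture	Hundreds or thousands of service instances make template maintenance unmanageable
Short-lived jobs	A container may be destroyed 30 seconds after it starts, before the agent has registered
Multi-dimensional analysis	Data can only be grouped by host, not sliced freely by `namespace`, `pod`, or `status_code`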

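1.2 Prometheus design philosophy

Prometheus was started in 2012 at SoundCloud by ex-Google engineers Matt Proud and Julius Volz. It joined the CNCF in 2016 as the second hosted project after Kubernetes, and in 2018 it became the second project to graduate. Its core design targets cloud-native environments: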

"Metrics are king, labels are everything, pull is better than push."

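Feature	Description
Multi-dimensional data model	A metric name plus an arbitrary set of labels, which supports dynamic grouping, aggregation, and slicing
Pull-based collection	Targets only expose an HTTP endpoint and never push, so their lifecycle is decoupled from the server
Service discovery	Target lists come automatically from Kubernetes, Consul, and cloud APIs
PromQL	A query language with on-the-fly aggregation, functions, and subqueries
Single-node autonomy	No dependency on external storage; a single binary is enough to run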
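
To make the label model concrete, here is a minimal Python sketch of what grouping by labels amounts to. The sample data, the `http_requests_total` series, and the `sum_by` helper are hypothetical illustrations, not part of Prometheus; the helper only mimics the spirit of PromQL's `sum by (...)`.

```python
from collections import defaultdict

# Hypothetical samples: (metric name, label set, value).
samples = [
    ("http_requests_total", {"namespace": "shop", "pod": "web-1", "status_code": "200"}, 120.0),
    ("http_requests_total", {"namespace": "shop", "pod": "web-2", "status_code": "200"}, 80.0),
    ("http_requests_total", {"namespace": "shop", "pod": "web-2", "status_code": "500"}, 5.0),
    ("http_requests_total", {"namespace": "pay", "pod": "api-1", "status_code": "200"}, 40.0),
]

def sum_by(samples, metric, by):
    """Group one metric's samples on the chosen labels and add up the values."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name != metric:
            continue
        key = tuple(labels.get(label, "") for label in by)
        totals[key] += value
    return dict(totals)

# The same raw data sliced along two different dimensions.
by_namespace = sum_by(samples, "http_requests_total", ["namespace"])
by_status = sum_by(samples, "http_requests_total", ["status_code"])
print(by_namespace)
print(by_status)
```

Because the grouping key is chosen at query time, no per-host template has to exist in advance: any label present on the series can become a dimension.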

2. Core architecture: four layers of data flow

2.1 Architecture overview


┌─────────────────────────────────────────────────────────┐
│            Application layer (visualization)            │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐ │
│  │   Grafana    │   │  Prometheus  │   │Notifications │ │
│  │  dashboards  │◄──│    Web UI    │   │Slack/DingTalk│ │
│  │  (ID 1860)   │   │ (expr debug) │   │  PagerDuty   │ │
│  └──────────────┘   └──────────────┘   └──────────────┘ │
│          ▲                  │                           │
│          │                  │ Webhook                   │
│          │                  ▼                           │
│   ┌─────────────────────────────────────────────────┐   │
│   │    Control layer (query + rules + alerting)     │   │
│   │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │   │
│   │ │PromQL engine│ │ Alert rules │ │Alertmanager │ │   │
│   │ │  (instant   │ │ (Alerting)  │ │  (routing,  │ │   │
│   │ │   queries)  │ │             │ │ inhibition) │ │   │
│   │ └─────────────┘ └─────────────┘ └─────────────┘ │   │
│   └─────────────────────────────────────────────────┘   │
│                         │                               │
│                         ▼                               │
│   ┌─────────────────────────────────────────────────┐   │
│   │   Storage layer (TSDB, time series database)    │   │
│   │ ┌────────┐ ┌───────────────┐ ┌────────────────┐ │   │
│   │ │  Head  │ │      WAL      │ │   Block (2h)   │ │   │
│   │ │(memory)│ │write-ahead log│ │compressed, disk│ │   │
│   │ └────────┘ └───────────────┘ └────────────────┘ │   │
│   └─────────────────────────────────────────────────┘   │
│                         ▲                               │
│                         │ Pull (HTTP GET /metrics)      │
│   ┌─────────────────────────────────────────────────┐   │
│   │           Collection layer (targets)            │   │
│   │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────┐ │   │
│   │  │Node     │  │MySQL    │  │Custom   │  │K8s  │ │   │
│   │  │Exporter │  │Exporter │  │app      │  │Pod  │ │   │
│   │  │(:9100)  │  │(:9104)  │  │(:8080)  │  │     │ │   │
│   │  └─────────┘  └─────────┘  └─────────┘  └─────┘ │   │
│   │  ┌─────────┐                                    │   │
│   │  │Push     │  ◄── short-lived/batch jobs        │   │
│   │  │Gateway  │      (not long-running)            │   │
│   │  └─────────┘                                    │   │
│   └─────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘

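2.2 Layer by layer

2.2.1 Collection layer: targets

Exporter ecosystem: Prometheus scrapes any endpoint that serves its text format. Applications instrumented with a client library expose metrics directly; for third-party systems, an Exporter converts their native data into that format.

Exporter	Monitored target	Port	Key metrics
Node Exporter	Linux/Unix hosts	9100	CPU, memory, disk, network, load
DCGM Exporter	NVIDIA GPUs	9400	GPU memory, temperature, utilization, ECC errors
MySQL Exporter	MySQL databases	9104	QPS, connections, slow queries, replication lag
Blackbox Exporter	Network probing	9115	HTTP/TCP/ICMP probes, certificate expiry
JMX Exporter	Java applications	Custom	JVM heap, GC, threads
KSM (kube-state-metrics)	Kubernetes objects	8080	Pod/Deployment/Node state

Metric format: plain text, Content-Type: text/plain; version=0.0.4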


# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 123456.78
node_cpu_seconds_total{cpu="0",mode="user"} 98765.43
node_cpu_seconds_total{cpu="0",mode="system"} 54321.09

Key design points: HELP describes what the metric means, TYPE declares its data type (counter/gauge/histogram/summary), and labels such as cpu="0" provide the dimensions.
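
The format is simple enough to parse by hand. The following Python sketch is a simplified illustration, not the official parser: `parse_exposition` and both regular expressions are hypothetical names, HELP lines are skipped, and escaped characters in label values are not unescaped.

```python
import re

# Metric name, optional {labels}, value, optional millisecond timestamp.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(-?\d+))?$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

def parse_exposition(text):
    """Return ({metric: type}, [(name, labels, value), ...]) for a text-format payload."""
    types, samples = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            _, _, name, metric_type = line.split(None, 3)
            types[name] = metric_type
        elif not line.startswith("#"):
            match = SAMPLE_RE.match(line)
            if match is None:
                raise ValueError(f"unparseable sample line: {line!r}")
            name, raw_labels, value, _timestamp = match.groups()
            labels = dict(LABEL_RE.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return types, samples

payload = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 123456.78
node_cpu_seconds_total{cpu="0",mode="user"} 98765.43
"""
types, samples = parse_exposition(payload)
for name, labels, value in samples:
    print(name, labels, value)
```

Each sample line becomes one (name, labels, value) triple, which is exactly the unit the multi-dimensional data model works with.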

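2.2.2 Storage layer: the TSDB

The Time Series Database (TSDB) built into Prometheus is optimized for monitoring workloads:

Feature	Implementation	Effect
In-memory Head	Roughly the most recent 2 hours of data stay in memory	Very fast queries on recent data
WAL (write-ahead log)	Replayed on restart after a crash	Samples not yet persisted in a block survive a crash
Block compaction	Data is written to disk as a block about every 2 hours, with Gorilla-style compression	Roughly 1-2 bytes per sample instead of 16 raw, about a tenth of the size
Inverted index	Maps labels to time series	Label lookups typically take milliseconds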
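
The inverted index row can be illustrated with a small Python sketch. This is a toy model under simplifying assumptions (equality matchers only, everything in memory); the `LabelIndex` class is hypothetical and does not mirror the on-disk index format.

```python
from collections import defaultdict

class LabelIndex:
    """Toy inverted index: (label name, label value) -> set of series IDs."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.series = {}

    def add(self, series_id, labels):
        self.series[series_id] = labels
        for pair in labels.items():
            self.postings[pair].add(series_id)

    def select(self, matchers):
        """Return the sorted IDs of series matching every equality matcher."""
        sets = [self.postings.get(pair, set()) for pair in matchers.items()]
        if not sets:
            return sorted(self.series)
        return sorted(set.intersection(*sets))

index = LabelIndex()
index.add(1, {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle"})
index.add(2, {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "user"})
index.add(3, {"__name__": "node_cpu_seconds_total", "cpu": "1", "mode": "idle"})

idle_all = index.select({"mode": "idle"})
idle_cpu0 = index.select({"cpu": "0", "mode": "idle"})
print(idle_all, idle_cpu0)
```

A query with several label matchers turns into a set intersection over posting lists, so its cost depends on the size of those lists rather than on the total number of series.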

Storage layout:


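/opt/prometheus/data/
├── 01J8X.../           # Block (2 hours when first written)
│   ├── chunks/         # compressed sample data
│   ├── index           # inverted index
│   ├── meta.json       # metadata
│   └── tombstones      # deletion markers
├── wal/                # write-ahead log
└── chunks_head/        # memory-mapped chunks of the Head block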

2.2.3 Control layer: queries and alerting

Core PromQL capabilities:


# 1. Instant query: current CPU utilization (%), averaged over all CPUs and hosts
100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# 2. Aggregation: highest memory utilization (%) per instance
max by (instance) (
  (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)
  / node_memory_MemTotal_bytes * 100
)

# 3. Range vector with offset: per-second request rate as it was 1 hour ago
rate(http_requests_total[5m] offset 1h)

# 4. Subquery: highest 5-minute average load over the past 10 minutes
max_over_time(
  avg_over_time(node_load1[5m])[10m:1m]
)
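
Since `rate()` appears in most counter queries, it helps to see what it computes. The Python sketch below is a deliberately simplified model: the `simple_rate` function is hypothetical, and it leaves out the extrapolation to the window boundaries that the real `rate()` performs. What it does show is the handling of counter resets.

```python
def simple_rate(samples):
    """Per-second increase of a counter over a window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples in the window
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset (for example a process restart): count from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    return increase / elapsed

steady = simple_rate([(0, 0.0), (60, 120.0)])
with_reset = simple_rate([(0, 100.0), (15, 130.0), (30, 160.0), (45, 10.0), (60, 40.0)])
print(steady, with_reset)
```

In the second series the counter drops from 160 to 10, which the function treats as a restart rather than a negative increase; this is why counters should be queried through `rate()` or `increase()` instead of being subtracted by hand.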

Alert rule lifecycle:


Scrape → PromQL evaluation → condition holds for the `for` duration → Alertmanager
              ↓
         Recording rules
         precompute frequently used queries to speed them up and reduce load
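
The role of `for` in this lifecycle is easiest to see as a small state machine. The Python sketch below is an illustrative model only: `alert_states` is a hypothetical helper, and the evaluation interval and hold duration are assumed values.

```python
def alert_states(condition_by_eval, interval_s, for_s):
    """Return the alert state after each rule evaluation.

    condition_by_eval: one boolean per evaluation (True = expression returned a result).
    interval_s: evaluation interval in seconds.
    for_s: the rule's `for` duration in seconds.
    """
    states = []
    active_since = None
    for i, condition in enumerate(condition_by_eval):
        now = i * interval_s
        if not condition:
            active_since = None  # any gap resets the pending timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states

states = alert_states([False, True, True, True, False, True], interval_s=15, for_s=30)
print(states)
```

Only alerts that reach the firing state are sent to Alertmanager, so a brief spike shorter than the `for` duration never produces a notification.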

Advanced Alertmanager routing:


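route:
  receiver: 'slack-infra'                 # default receiver (required on the root route)
  group_by: ['alertname', 'severity']    # group alerts that share these labels
  group_wait: 30s                         # wait to batch the first alerts of a new group
  group_interval: 5m                      # minimum interval between notifications for a group
  repeat_interval: 4h                     # re-send a still-firing alert after this long

  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-oncall'        # critical alerts → phone call
      continue: true                      # keep evaluating the sibling routes below

    - match:
        severity: warning
      receiver: 'slack-infra'             # warnings → chat notification

inhibit_rules:                            # inhibition rules
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']       # while a critical alert fires, mute the matching warning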
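
The inhibition rule above can be read as a simple predicate. The following Python sketch models that logic under simplifying assumptions (equality matching only); `is_inhibited` and the sample alerts are hypothetical and are not Alertmanager code.

```python
def is_inhibited(target, firing_alerts, rule):
    """Return True if `target` is muted by some other firing alert under `rule`."""
    if not all(target.get(k) == v for k, v in rule["target_match"].items()):
        return False  # the rule does not apply to this alert at all
    for source in firing_alerts:
        if source is target:
            continue
        if not all(source.get(k) == v for k, v in rule["source_match"].items()):
            continue
        # Source and target must agree on every label listed under `equal`.
        if all(source.get(label) == target.get(label) for label in rule["equal"]):
            return True
    return False

rule = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "instance"],
}
firing = [
    {"alertname": "HighCPU", "instance": "10.0.0.5:9100", "severity": "critical"},
    {"alertname": "HighCPU", "instance": "10.0.0.5:9100", "severity": "warning"},
    {"alertname": "HighCPU", "instance": "10.0.0.6:9100", "severity": "warning"},
]
muted = [is_inhibited(alert, firing, rule) for alert in firing]
print(muted)
```

Only the warning on 10.0.0.5 is muted, because it shares both `alertname` and `instance` with a firing critical alert; the warning on 10.0.0.6 still notifies.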

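2.2.4 Extension layer: high availability and long-term storage

Solution	Architecture	Best suited for
Thanos	Sidecar + Query + Store + Compact + Receive	Global view, object storage, downsampling
Cortex	Microservices + sharding + cloud storage	Multi-tenancy, very large clusters
VictoriaMetrics	Single binary or cluster version	PromQL-compatible, resource-efficient
Mimir	Built by Grafana Labs	Cloud-native, S3 backend

Federation: a higher-level Prometheus scrapes only aggregated metrics from lower-level instances through their /federate endpoint, which suits hierarchical setups across data centers.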
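
A federation scrape is an ordinary HTTP GET against `/federate` with one `match[]` parameter per series selector. The Python sketch below only builds such a URL; the host name and the `federate_url` helper are hypothetical, and the `job:` prefix assumes recording rules named by the common `level:metric:operation` convention.

```python
from urllib.parse import parse_qs, urlencode, urlparse

def federate_url(base_url, selectors):
    """Build a /federate URL with one match[] parameter per series selector."""
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"

selectors = ['{job="node-exporter"}', '{__name__=~"job:.*"}']
url = federate_url("http://prometheus-dc1.example.internal:9090", selectors)
print(url)

# Decoding the query string gives back the original selectors.
decoded = parse_qs(urlparse(url).query)["match[]"]
```

Restricting the selectors to pre-aggregated series keeps the upper-level instance from re-ingesting every raw series of the lower levels.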

3. Production deployment in practice

3.1 Single-node deployment (quick start)


# 1. Download and extract
wget https://github.com/prometheus/prometheus/releases/download/v2.51.2/prometheus-2.51.2.linux-amd64.tar.gz
tar -xzf prometheus-2.51.2.linux-amd64.tar.gz
sudo mv prometheus-2.51.2.linux-amd64 /opt/prometheus

# 2. Create the data directory and a dedicated system user
sudo mkdir -p /opt/prometheus/data
sudo useradd -r -s /bin/false prometheus
sudo chown -R prometheus:prometheus /opt/prometheus

# 3. Configuration file (includes a Node Exporter scrape job)
sudo tee /opt/prometheus/prometheus.yml <<'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: 'prod'
    replica: 'prometheus-01'

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

rule_files:
  - /opt/prometheus/rules/*.yml

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: /metrics

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100', '10.0.0.5:9100', '10.0.0.6:9100']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance

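  # Assumes Prometheus can reach the Kubernetes API. When it runs outside the
  # cluster, as in this systemd setup, kubernetes_sd_configs also needs
  # api_server and credentials.
  - job_name: 'kubernetes-pods'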
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: ['default', 'kube-system']
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
EOF

# 4. systemd unit (production-oriented flags)
sudo tee /etc/systemd/system/prometheus.service <<'EOF'
[Unit]
Description=Prometheus Monitoring System
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/prometheus \
  --config.file=/opt/prometheus/prometheus.yml \
  --storage.tsdb.path=/opt/prometheus/data \
  --storage.tsdb.retention.time=30d \
  --storage.tsdb.retention.size=50GB \
  --storage.tsdb.wal-compression \
  --web.console.templates=/opt/prometheus/consoles \
  --web.console.libraries=/opt/prometheus/console_libraries \
  --web.enable-lifecycle \
  --web.enable-admin-api \
  --web.listen-address=0.0.0.0:9090 \
  --log.level=info \
  --log.format=json

Restart=always
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
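
The `kubernetes-pods` job in the configuration above relies on a `keep` relabel action. The Python sketch below models what that action does to the discovered target list. It is an approximation: `apply_keep` and the sample targets are hypothetical, and Python's `re` module stands in for the RE2 engine Prometheus uses.

```python
import re

def apply_keep(targets, source_labels, regex, separator=";"):
    """Keep only targets whose joined source label values fully match the regex."""
    pattern = re.compile(regex)
    kept = []
    for labels in targets:
        value = separator.join(labels.get(name, "") for name in source_labels)
        # Relabel regexes are anchored at both ends, hence fullmatch.
        if pattern.fullmatch(value):
            kept.append(labels)
    return kept

ANNOTATION = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"
targets = [
    {"__meta_kubernetes_pod_name": "web-1", ANNOTATION: "true"},
    {"__meta_kubernetes_pod_name": "web-2", ANNOTATION: "false"},
    {"__meta_kubernetes_pod_name": "db-1"},  # annotation missing
    {"__meta_kubernetes_pod_name": "web-3", ANNOTATION: "trueish"},
]
kept = apply_keep(targets, [ANNOTATION], "true")
kept_names = [t["__meta_kubernetes_pod_name"] for t in kept]
print(kept_names)
```

Pods opt in to scraping by carrying the `prometheus.io/scrape: "true"` annotation; everything else is dropped before any scrape happens.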

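Key startup flags:

Flag	Purpose	Suggested value
`--storage.tsdb.retention.time`	How long to keep data	15d-30d
`--storage.tsdb.retention.size`	Maximum disk usage of stored blocks	50GB-200GB
`--storage.tsdb.wal-compression`	Compresses the WAL	On by default since v2.20; roughly halves WAL size
`--web.enable-lifecycle`	Enables hot reload through `/-/reload`	Enable if you reload over HTTP (SIGHUP also works)
`--web.enable-admin-api`	Enables snapshots and data deletion	Use with caution in production
`--log.format=json`	Structured logs	Easy to ship to ELK/Loki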
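
To choose the two retention values together, the Prometheus storage documentation gives a rough sizing formula: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with 1-2 bytes per sample on average. The Python sketch below applies it; the `estimate_disk_bytes` helper and the figure of 500,000 active series are assumptions made for illustration.

```python
def estimate_disk_bytes(retention_days, active_series, scrape_interval_s, bytes_per_sample=2.0):
    """Rough TSDB disk estimate: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample

# Assumed workload: 500,000 active series scraped every 15s, kept for 30 days.
estimate = estimate_disk_bytes(retention_days=30, active_series=500_000, scrape_interval_s=15)
print(f"{estimate / 1e9:.1f} GB")
```

With these assumed numbers the estimate is about 172.8 GB, far above a 50GB size limit, so size-based retention would delete the oldest blocks well before the 30-day limit is reached. Whichever limit is hit first wins.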

3.2 Node Exporter deployment (host monitoring)


# Install
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar -xzf node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/

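# Explicitly enable a set of collectors. Most are on by default; systemd,
# tcpstat, and processes are not.
# Node Exporter runs directly on the host here, so the --path.* flags keep
# their defaults. They only need overriding (for example to /host/proc) when
# it runs in a container with the host filesystem mounted.
sudo tee /etc/systemd/system/node_exporter.service <<'EOF'
[Unit]
Description=Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter \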
  --collector.cpu \
  --collector.meminfo \
  --collector.diskstats \
  --collector.filesystem \
  --collector.loadavg \
  --collector.stat \
  --collector.time \
  --collector.uname \
  --collector.vmstat \
  --collector.systemd \
  --collector.tcpstat \
  --collector.processes \
  --web.listen-address=0.0.0.0:9100 \
  --web.telemetry-path=/metrics

Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo useradd -r -s /bin/false node_exporter
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter

3.3 Grafana visualization (dashboard ID: 1860)


# Install Grafana
sudo apt-get install -y apt-transport-https software-properties-common
sudo mkdir -p /etc/apt/keyrings/
wget -q -O- https://apt.grafana.com/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install -y grafana

sudo systemctl enable --now grafana-server

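# Log in at http://<IP>:3000; the default credentials are admin/admin
# Setup steps (menu names vary between Grafana versions):
# 1. Configuration → Data Sources → Add Prometheus → URL: http://localhost:9090
# 2. Create → Import → enter 1860 (Node Exporter Full) → select the Prometheus data source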

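4. Prometheus vs Zabbix: choosing between them

Dimension	Prometheus	Zabbix
Design era	2012 (cloud-native era)	1998 (traditional IT era)
Core abstraction	Multi-dimensional time series (metric + labels)	Host + template + trigger
Identity	Multi-dimensional labels (the IP is just one label)	IP as the primary key; strongly host-centric
Auto-discovery	Native support for K8s, Consul, DNS, cloud APIs	Requires templates + LLD; complex to configure
Collection model	Pull by default (Pushgateway relays short-lived jobs)	Push or pull, selectable
Agent architecture	One Exporter per kind of metric (multiple processes, stateless)	A single Zabbix Agent (extended by plugins)
Custom monitoring	Requires writing an Exporter or embedding an SDK; more development work	A shell/Python script is enough; Agent user parameters go live in seconds
Storage engine	Built-in TSDB, blocks + WAL, high compression ratio	MySQL/PostgreSQL/Oracle; the DB must be maintained separately
Query language	PromQL (rich functions, flexible aggregation)	Built-in calculations + SQL; limited aggregation
Alerting	Alertmanager deduplication, inhibition, routing, silencing	Built-in actions + media types; full-featured but heavy to configure
Deployment complexity	A single binary + systemd is enough to start	Server + DB + Agent; many components
Cloud-native integration	Prometheus Operator with the ServiceMonitor CRD	Community templates; not a first-class citizen
Learning curve	Moderate (requires PromQL and label-oriented thinking)	Steep (nested templates, triggers, actions)
Best suited for	Containers, microservices, dynamic cloud environments	Traditional physical machines, VMs, network devices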

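4.1 Recommendations

Scenario	Recommendation	Rationale
Kubernetes cluster monitoring	Prometheus	Native integration; auto-discovers Pods and Services
Basic monitoring of physical machines/VMs	Zabbix	Mature Agent, rich templates, quick script-based extension
Hybrid or multi-cloud environments	Prometheus + Thanos	Unified querying, long-term storage
Traditional enterprise IT (Zabbix already in place)	Zabbix as the primary, Prometheus for containers	Protects the existing investment; evolves gradually
Rapid prototyping/personal projects	Prometheus	Simple to deploy, active community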

4.2 Hybrid architecture in practice


┌─────────────────────────────────────────────────┐
│           Zabbix (traditional layer)            │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │  Physical   │ │   Network   │ │  Databases  │ │
│ │  servers    │ │   devices   │ │             │ │
│ │Zabbix Agent │ │    SNMP     │ │    Agent    │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────┘
                         │
                         ▼  host metadata synced via Zabbix API
┌─────────────────────────────────────────────────┐
│         Prometheus (cloud-native layer)         │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ K8s cluster │ │Microservices│ │ App metrics │ │
│ │  Operator   │ │   Sidecar   │ │Embedded SDK │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────────────────────────────────────┐ │
│ │            Thanos (global query)            │ │
│ │   single view across Prometheus instances   │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘
                         │
                         ▼
                  ┌──────────────┐
                  │   Grafana    │
                  │ unified view │
                  └──────────────┘