Grafana Monitoring

Official site: https://grafana.com/

Chinese site: https://grafana.com/zh-cn/grafana/

Download: https://grafana.com/grafana/download?pg=graf&plcmt=deploy-box-1

Dashboard template community: https://grafana.com/grafana/dashboards/

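Introduction

Grafana is an open-source platform for visualizing and analyzing metrics. Think of it as a dedicated tool for building and displaying dashboards.

Its core job is to connect to a variety of data sources and turn raw numeric metrics into charts, graphs, and alerts, so you can monitor infrastructure, application performance, and business data.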

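Grafana vs. Prometheus

The two are often confused:

Prometheus: mainly scrapes and stores time-series data. It is the collector and the database.
Grafana: mainly queries the data held in Prometheus (or other data sources) and displays it. It is the visualization front end.
Relationship: "Prometheus is the engine and the fuel tank; Grafana is the dashboard and the steering wheel."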

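Core Features and Strengths

Multiple data sources (the key strength)

Grafana is not tied to any particular database. It can query many data sources at once and show the results together on one dashboard, which is a major reason it is chosen over competing tools.

Time-series databases: Prometheus, InfluxDB, TimescaleDB, Graphite, OpenTSDB
Logging systems: Loki (Grafana's own log aggregation system), Elasticsearch
Distributed tracing: Tempo, Jaeger, Zipkin
Cloud services: Alibaba Cloud, Tencent Cloud, Huawei Cloud
Relational databases: MySQL, PostgreSQL, SQL Server
...and many other data sources.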

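Flexible visualization

Rich panel types: beyond the basic line chart, bar chart, gauge, and stat panels, Grafana supports heatmaps, geographic maps, histograms, log lists, and more.
Customization: each panel exposes detailed options for colors, units, axes, thresholds, legends, and so on.
Plugin ecosystem: official and third-party panel plugins cover more specialized visualization needs.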

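Dynamic, interactive dashboards

Template variables (templating): add drop-down selectors (for example host, service, or environment), and every panel on the dashboard refreshes to match the selected value. This is the key to building reusable, general-purpose dashboards.
Drill-down: configure links so that clicking a panel opens a more detailed dashboard, letting you narrow down a fault step by step.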

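Alerting

Grafana has a built-in alerting engine that lets you define alert rules based on the query results used in your dashboards.

Multiple notification channels: alerts can reach the right people through Slack, PagerDuty, email, webhooks, and other channels.
Visual configuration: draw a threshold line directly on a chart and create an alert from it.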

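Typical Use Cases

IT infrastructure monitoring
- Monitor servers (CPU, memory, disk, network traffic).
- Monitor performance metrics of databases and middleware (MySQL, Redis, Nginx).
- Stack: Node Exporter + Prometheus + Grafana is one of the most widely used combinations.
Cloud-native and container monitoring
- Monitor Kubernetes clusters (node status, Pod/container resource usage, deployment status).
- Stack: cAdvisor + node-exporter + kube-state-metrics + Prometheus + Grafana.
Application performance monitoring (APM)
- Monitor application throughput, response time, and error rate (for example QPS, latency, and HTTP 500 counts).
- Usually requires instrumenting the application or exporting metrics through a sidecar.
Business intelligence (BI) and reporting: because it supports traditional SQL databases, Grafana can also produce business reports such as daily active users, order counts, and revenue.
Unified observability: combined with Grafana Loki (logs), Tempo (traces), and Prometheus (metrics), Grafana can correlate metrics, logs, and traces in one platform, which speeds up troubleshooting.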

Grafana Deployment

bash

# Red Hat, CentOS, RHEL, and Fedora
## Official package
dnf install -y https://dl.grafana.com/grafana-enterprise/release/12.1.1/grafana-enterprise_12.1.1_16903967602_linux_amd64.rpm
## Alibaba Cloud mirror (note: this URL pins the older 10.1.0 release)
dnf -y install https://mirrors.aliyun.com/grafana/yum/rpm/Package/grafana-enterprise-10.1.0-1.x86_64.rpm

# Ubuntu and Debian
sudo apt-get install -y adduser libfontconfig1 musl
wget https://dl.grafana.com/grafana-enterprise/release/12.1.1/grafana-enterprise_12.1.1_16903967602_linux_amd64.deb
sudo dpkg -i grafana-enterprise_12.1.1_16903967602_linux_amd64.deb

# Start Grafana now and on every boot (applies to both package types)
sudo systemctl enable --now grafana-server

Grafana listens on port 3000 by default. Open http://ip:3000 in a browser; the default username and password are both admin.
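Before opening the browser it can help to confirm that the service answers. The sketch below is a minimal illustration: it builds the URL of Grafana's /api/health endpoint and checks a response body. The host 192.0.2.10 and the sample body are assumptions made up for the example, and the actual HTTP request is left out so the snippet runs without a live server.

```python
import json


def health_url(host: str, port: int = 3000) -> str:
    """Build the URL of Grafana's health endpoint for a given host."""
    return f"http://{host}:{port}/api/health"


def grafana_is_healthy(health_body: str) -> bool:
    """Return True when a health response body reports database == "ok"."""
    try:
        payload = json.loads(health_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("database") == "ok"


# Hypothetical response body, shaped like what /api/health returns.
sample_body = '{"commit": "abc123", "database": "ok", "version": "12.1.1"}'
print(health_url("192.0.2.10"))
print(grafana_is_healthy(sample_body))
```

In practice the body would come from an HTTP client such as urllib.request or curl pointed at the printed URL.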

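Grafana Configuration

Chinese localization

Multi-language support has been more complete since **v7.x+**; some versions (such as v7.3.1) have a fully compatible Chinese language pack.

Through the UI

Through the configuration file

Edit Grafana's configuration file grafana.ini; for a yum install it is located at /etc/grafana/grafana.ini.
Set default_language (or default) to zh-CN. Recent Grafana releases define default_language in the [users] section and list Simplified Chinese as zh-Hans, so check which locale code your version accepts.
Save the file and restart the Grafana service for the change to take effect.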

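Add a connection plugin

Click Configuration -> Add new connection -> search for redis in the search box -> click Install in the upper-right corner.

Add a data source

Click Configuration -> Data Sources to open the data source page, then click the "Add data source" button.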
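The same step can be scripted against Grafana's HTTP API, which accepts a JSON body on POST /api/datasources. The following is a sketch of how such a body might be assembled for a Prometheus data source; the name and URL are example values, and sending the request (with an API token or basic auth) is deliberately left out.

```python
import json


def prometheus_datasource(name: str, url: str, is_default: bool = False) -> dict:
    """Build a request body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",  # Grafana's backend proxies the queries
        "isDefault": is_default,
    }


# Example values; replace them with your own Prometheus address.
body = prometheus_datasource("Prometheus", "http://localhost:9090", is_default=True)
print(json.dumps(body, indent=2))
```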

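Common Grafana Dashboard Templates

The numbers below are dashboard IDs from the template community; import one through Dashboards -> Import.

Dashboard IDs for physical/virtual machines (Linux)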

8919
9276
1860

Dashboard IDs for physical/virtual machines (Windows)

10467
10171
2129

Dashboard IDs for containers

3146
8685
10000
8588
315

Dashboard IDs for databases

7362
10101

Dashboard IDs for websites and protocol ports

http: monitor a website
icmp: monitor a host
tcp: monitor a port
dns: monitor DNS
9965

Nginx

14900
9614
2949

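Grafana Optimization

Configure caching

For an rpm install, Grafana's data directory is /var/lib/grafana. Confirm the option names below against the configuration reference for your Grafana version; query caching is documented as a Grafana Enterprise and Grafana Cloud feature.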

ini

# grafana.ini
[metrics]
# Enable caching of metric results
enable_metrics_source_cache = true

# Cache results for this many seconds
metrics_source_cache_ttl_seconds = 60

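Data source caching

Some data source plugins, such as the Prometheus plugin, support their own caching mechanism. Treat the keys below as illustrative and verify them for your version.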

ini

# grafana.ini
[datasources]
# Prometheus data source cache settings
prometheus:
  # Enable caching
  enable_metrics_source_cache = true
  # Cache results for this many seconds
  metrics_source_cache_ttl_seconds = 60

Configure an Nginx reverse proxy

Nginx can act as a reverse proxy in front of Grafana and cache static content.

nginx

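# The cache zone must be declared in the http context, outside server {}.
# The path and the zone name grafana_cache are example values.
proxy_cache_path /var/cache/nginx/grafana levels=1:2 keys_zone=grafana_cache:10m max_size=1g inactive=60m;

server {
  listen 80;
  server_name grafana.example.com;

  location / {
    proxy_pass http://localhost:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # Enable caching (proxy_cache selects the zone; without it nothing is cached)
    proxy_cache grafana_cache;
    proxy_cache_valid 200 60m;
    proxy_cache_use_stale error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_cache_key "$scheme$request_method$host$request_uri";
  }
}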

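Grafana Plugins

Much of Grafana's power comes from its plugin ecosystem. Plugins add data sources, new panel types, and even complete application integrations.

Installing and Managing Plugins

Grafana offers several ways to install and manage plugins.

Install with the Grafana CLI (recommended)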

bash

# List all plugins available remotely
grafana-cli plugins list-remote

# Install a plugin (for example, the Zabbix app plugin)
grafana-cli plugins install alexanderzobnin-zabbix-app

# Install a specific version of a plugin
grafana-cli plugins install <plugin-id> <version>

# List installed plugins
grafana-cli plugins ls

# Update all installed plugins
grafana-cli plugins update-all

# Remove a plugin
grafana-cli plugins remove <plugin-id>

# After installing or removing a plugin, restart Grafana for the change to take effect
systemctl restart grafana-server

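Install from the web UI

In some environments you can also manage plugins from the Grafana web UI:

Log in to Grafana and go to Configuration -> Plugins and data -> Plugins.
You can view all plugins (All) or only the installed ones (Installed).
Find the plugin you want and click Install.

Install manually from a ZIP file

In environments without direct internet access, you can download and install plugins manually.

Download page: https://grafana.com/grafana/plugins/
Upload the ZIP package to the server and extract it into Grafana's plugin directory (usually /var/lib/grafana/plugins).
Restart the Grafana service.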

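Popular and Useful Plugins

| Plugin | Type | Description |
| --- | --- | --- |
| alexanderzobnin-zabbix-app | App | Integrates the Zabbix monitoring system and ships ready-made dashboards and templates for monitoring infrastructure and applications |
| grafana-clock-panel | Panel | Shows a highly customizable clock on a dashboard, with time, countdown, and timer modes |
| grafana-piechart-panel | Panel | A pie chart panel with more advanced features than the built-in pie chart |
| grafana-kubernetes-app | App | Comprehensive Kubernetes monitoring, with dashboards for clusters, nodes, Pods/containers, and deployments |
| DevOpsProdigy KubeGraf | App | Advanced Kubernetes cluster monitoring and visualization, showing metrics and characteristics of the main services in the cluster |
| Pie Chart | Panel | Creates pie charts on a dashboard, for example to visualize the distribution of resource types (Pods, Services, and so on) |
| Status Panel | Panel | Creates status indicators that show the state of a cluster or a specific resource (green for healthy, red for a problem) |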

Redis Plugin

Requires Grafana >= 8.0.0

Install the plugin

bash

# Online install
grafana-cli plugins install redis-datasource

# Offline install: download the ZIP from either source
## GitHub
wget https://github.com/RedisGrafana/grafana-redis-datasource/releases/download/v2.2.0/redis-datasource-2.2.0.zip
## grafana.com (-O names the file, because the URL does not end in a file name)
wget -O redis-datasource-2.2.0.zip https://grafana.com/api/plugins/redis-datasource/versions/2.2.0/download

# Extract into the plugin directory
unzip redis-datasource-2.2.0.zip -d /var/lib/grafana/plugins

# Restart
systemctl restart grafana-server

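Configure the data source

Left menu -> Connections -> Data sources -> Add data source -> search for Redis and open its detail page -> Install

Name: the data source name (for example, Redis).
Address: the Redis server address and port, for example redis://127.0.0.1:6379.
Password: fill this in if Redis requires a password.
Click "Save & Test" to test the connection and save.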

Zabbix Plugin

Requires Grafana >= 10.4.8

Install the plugin

bash

# Online install
grafana-cli plugins install alexanderzobnin-zabbix-app

# Offline install: download the ZIP from either source
## GitHub
wget https://github.com/grafana/grafana-zabbix/releases/download/v5.0.1/alexanderzobnin-zabbix-app-5.0.1.linux_amd64.zip
## grafana.com (quote the URL so the shell does not interpret "&"; -O names the file)
wget -O alexanderzobnin-zabbix-app-5.0.1.linux_amd64.zip "https://grafana.com/api/plugins/alexanderzobnin-zabbix-app/versions/5.0.1/download?os=linux&arch=amd64"

# Extract into the plugin directory
unzip alexanderzobnin-zabbix-app-5.0.1.linux_amd64.zip -d /var/lib/grafana/plugins

# Restart
systemctl restart grafana-server

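Configure the data source

Create an API token in Zabbix beforehand.

Left menu -> Connections -> Data sources -> Add data source -> search for Zabbix and open its detail page -> Install

Name: defaults to Zabbix.
URL: the Zabbix API address, for example http://192.168.148.104/api_jsonrpc.php
Zabbix Connection: select API Token.
Trends: for queries over long time ranges, Grafana uses Zabbix trend data (trends), which can be orders of magnitude faster than querying raw history.
Cache TTL: how long cached results are kept. To improve performance, set a short value (such as 30s).
Direct DB Connection (optional): to use some advanced features (such as template variables), the plugin can read data directly from the Zabbix database.
- SQL Data source: choose your database type, usually MySQL or PostgreSQL.
- SQL Host: the address and port of the Zabbix database, for example 192.168.1.100:3306.
- SQL Database: the database name; the default is zabbix.
- SQL User / SQL Password: credentials of a database user with read-only privileges (important: do not use root; create a dedicated user).
Click Save & test.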
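If Save & test fails, it is worth checking that the API address itself responds. The Zabbix API speaks JSON-RPC 2.0, and the apiinfo.version method can be called without authentication, which makes it a convenient probe. The sketch below only builds the request body; the helper name zabbix_rpc_body is made up for this example, and posting the body to api_jsonrpc.php (with a JSON content type) is left to the HTTP client of your choice.

```python
import json


def zabbix_rpc_body(method: str, params, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body for api_jsonrpc.php."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "method": method,
            "params": params,
            "id": request_id,
        }
    )


# apiinfo.version needs no authentication, so it works as a connectivity probe.
probe = zabbix_rpc_body("apiinfo.version", [])
print(probe)
```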

Dashboards

In Grafana, click Create -> Import -> enter 5363 or 8677 -> select Zabbix as the data source.

TencentCloud

Install

bash

# Online install
grafana-cli plugins install tencentcloud-monitor-app

# Offline install
wget https://github.com/TencentCloud/tencentcloud-monitor-grafana-app/releases/download/v2.9.5/tencentcloud-monitor-app-2.9.5.zip
# The archive comes first; -d gives the target directory
unzip tencentcloud-monitor-app-2.9.5.zip -d /var/lib/grafana/plugins

# Restart the service
systemctl restart grafana-server

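Configure the data source

Hover over the gear icon in the left navigation bar and click Plugins to open the plugin management page. If Tencent Cloud Monitor App appears in the plugin list, the plugin was installed successfully.

Open the app's detail page and click Enable. Once it is enabled, the Tencent Cloud monitoring app plugin can be used in Grafana.

The plugin retrieves monitoring metrics for each cloud product by calling the Cloud Monitor API. Configure a data source for the relevant cloud product as follows.

Hover over the gear icon in the left navigation bar and click Data Sources to open the data source management page.
Click Add data source in the upper-right corner, then click the Tencent Cloud Monitoring data source to open its configuration page.
Name: the data source name. Any name is accepted; the default is Tencent Cloud Monitoring.
SecretId and SecretKey: the security credentials required to call the Cloud Monitor API. Both can be obtained from the API key page of the Tencent Cloud console.
Select the cloud products whose monitoring data you want to retrieve.
Click Save & Test to check the configuration. Once it succeeds, the data source can be used in dashboards.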

aliyun-log-grafana-datasource-plugin

Connecting Grafana to Alibaba Cloud Simple Log Service

Reference: https://help.aliyun.com/zh/sls/developer-reference/connect-log-service-to-grafana?spm=a2c4g.11186623.help-menu-28958.d_9_6_6.31ff512ebnfwHO&scm=20140722.H_60952._.OR_help-T_cn~zh-V_1

Install the plugin

bash

# -O gives the archive the name that unzip expects (the URL itself ends in master.zip)
wget -O aliyun-log-grafana-datasource-plugin-master.zip https://github.com/aliyun/aliyun-log-grafana-datasource-plugin/archive/refs/heads/master.zip
unzip aliyun-log-grafana-datasource-plugin-master.zip -d /var/lib/grafana/plugins

# Edit the Grafana configuration file:
#   - Grafana installed with YUM or RPM: /etc/grafana/grafana.ini
#   - Grafana installed from a .tar.gz: {PATH_TO}/grafana-11.4.0/conf/defaults.ini
#
# In the [plugins] section, set allow_loading_unsigned_plugins:
#   allow_loading_unsigned_plugins = aliyun-log-service-datasource

# Restart the service
systemctl restart grafana-server
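The [plugins] change can also be applied programmatically. The sketch below uses Python's standard configparser on an in-memory sample rather than the real file; the function name allow_unsigned_plugin is made up for this example. One caveat to keep in mind: configparser discards comments when it writes, so on a real grafana.ini a manual edit or an append to the [plugins] section may be preferable.

```python
import configparser
import io


def allow_unsigned_plugin(ini_text: str, plugin_id: str) -> str:
    """Return ini_text with plugin_id listed in [plugins] allow_loading_unsigned_plugins."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("plugins"):
        parser.add_section("plugins")
    current = parser.get("plugins", "allow_loading_unsigned_plugins", fallback="")
    # The option holds a comma-separated list of plugin IDs.
    ids = [item.strip() for item in current.split(",") if item.strip()]
    if plugin_id not in ids:
        ids.append(plugin_id)
    parser.set("plugins", "allow_loading_unsigned_plugins", ",".join(ids))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Minimal stand-in for a grafana.ini file.
sample_ini = "[server]\nhttp_port = 3000\n"
updated_ini = allow_unsigned_plugin(sample_ini, "aliyun-log-service-datasource")
print(updated_ini)
```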

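Add the data source

Log in to Grafana.
In the left menu, choose Connections > Data Sources.
On the Data Sources tab, click Add data source.
On the Add data source page, search for log-service-datasource and click it.

On the aliyun-log-service-datasource page, configure the following settings.

| Parameter | Description |
| --- | --- |
| Endpoint | The service endpoint of the Project, for example `http://cn-qingdao.log.aliyuncs.com`. Replace it with your actual endpoint. For details, see the service endpoint documentation. |
| Project | The name of the Simple Log Service Project to connect to. |
| AccessKeyID | The AccessKey ID identifies the user; for details, see the access key documentation. Follow the principle of least privilege and grant the RAM user only the permissions it needs. For the authorization steps, see the documentation on creating and authorizing RAM users and the RAM custom authorization examples. |
| AccessKeySecret | The AccessKey Secret is the key used by the user to sign requests and by Simple Log Service to verify the signature. It must be kept confidential. |
| Name | The name of the data source. Default: aliyun-log-service-datasource. |
| Default | Enabled by default. |
| Default Logstore | If you leave Logstore empty, make sure the AccessKey you entered has the ListProject permission on the current Project. |
| RoleArn | When configuring STS redirection, enter the ARN of the corresponding RAM role. |
| HTTP headers | Custom headers are supported and take effect only when the data source type is MetricStore (PromQL). For details, see the query acceleration section of the MetricStore FormValue options. Supported headers: x-sls-parallel-enable: whether to enable parallel computation (off by default); x-sls-parallel-time-piece-interval: length of each time slice in seconds, range [3600, 86400*30], default 21600 (6 hours); x-sls-parallel-time-piece-count: number of time slices, 1-16, default 8; x-sls-parallel-count: global concurrency, 2-64, default 8; x-sls-parallel-count-per-host: per-host concurrency, 1-8, default 2; x-sls-global-cache-enable: whether to enable the global cache (off by default). |
| Region | Enables v4 signing, which provides stronger security. |

When the configuration is complete, click Save & Test.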

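Add a dashboard

In the left navigation bar, click Dashboards.
On the Dashboards page, click **+ Create dashboard**, then click **+ Add visualization**.
On the Select data source page, choose aliyun-log-service-datasource.
Add a visualization.
- Data source type: the types differ mainly in query syntax (SQL or PromQL) and in the kind of store, giving four options: ALL(SQL), Logstore(SQL), MetricStore(SQL), and MetricStore(PromQL).
  - A Logstore supports SQL query and analysis.
  - A MetricStore supports both SQL and PromQL query and analysis.
  - **MetricStore(PromQL)** supports custom headers, which are added on the data source configuration page.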

Exporter Monitoring Examples

node_exporter

Install the exporter

bash

# Download the release
wget https://github.com/prometheus/node_exporter/releases/download/v1.9.1/node_exporter-1.9.1.linux-amd64.tar.gz
tar -xvf node_exporter-1.9.1.linux-amd64.tar.gz
mv node_exporter-1.9.1.linux-amd64/node_exporter /usr/local/bin

# Verify
node_exporter --version

# Create the service account
useradd -Ms /sbin/nologin prometheus

# Create the systemd unit file
cat > /usr/lib/systemd/system/node_exporter.service <<"EOF"
[Unit]
Description=Node Exporter for Prometheus
Documentation=https://prometheus.io/docs/guides/node-exporter/
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/node_exporter \
    --web.listen-address=0.0.0.0:9100 \
    --no-collector.ipvs 

Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable node_exporter --now

# Verify
# Open http://localhost:9100/metrics in a browser, or run: curl http://localhost:9100/metrics
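The /metrics page is plain text in the Prometheus exposition format: comment lines start with #, and every other line is a series followed by a value. As a rough illustration of how that output can be read, here is a minimal parser. It assumes samples carry no trailing timestamp, and the sample text is hand-written for this example rather than captured from a real host.

```python
def parse_metrics(text: str) -> dict:
    """Parse samples from Prometheus text exposition format.

    Keys are the full series string (metric name plus any {labels});
    values are floats. Comment lines (# HELP / # TYPE) and blank lines
    are skipped. Assumes there is no timestamp after the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        if not series:
            continue  # no separator found, not a sample line
        samples[series] = float(value)
    return samples


# Hand-written sample in the shape node_exporter produces.
sample_text = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""
metrics = parse_metrics(sample_text)
print(metrics["node_load1"])
```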

Configure Prometheus to scrape the metrics

yaml

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "linux"
    static_configs:
      # node_exporter listens on 9100 (9090 is Prometheus itself)
      - targets: ["localhost:9100"]
        labels:
          app: "linux"

# Alternatively, use file-based service discovery for the same job
scrape_configs:
  - job_name: "linux"
    file_sd_configs:
      - files: ['/usr/local/prometheus/sd_configs/*.yaml']
        refresh_interval: 5s
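With file-based discovery, Prometheus reads target groups from the files matched by the glob, so hosts can be added without touching prometheus.yml. A minimal sketch of generating such a file's content follows. The second host address and the function name file_sd_yaml are assumptions for illustration, and writing the text into the sd_configs directory is left out.

```python
def file_sd_yaml(targets, labels=None) -> str:
    """Render one target group in the YAML form read by file_sd_configs."""
    lines = ["- targets:"]
    lines += [f"    - '{target}'" for target in targets]
    if labels:
        lines.append("  labels:")
        lines += [f"    {key}: '{value}'" for key, value in labels.items()]
    return "\n".join(lines) + "\n"


# Example hosts; 192.168.80.144 is a made-up second machine.
content = file_sd_yaml(
    ["192.168.80.143:9100", "192.168.80.144:9100"],
    {"app": "linux"},
)
print(content)
```

Saving the printed text as, for example, linux.yaml under the directory from the files pattern would let Prometheus pick it up at the next refresh.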

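Configure the Grafana dashboard

In Grafana, click "Dashboards" -> "Import".
Enter a dashboard ID (8919 is recommended; it is a popular community Linux monitoring dashboard) and click Load.
Select the matching Prometheus data source and click Import.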

Monitoring MySQL

Create a monitoring user

sql

create user mysql_monitor@'localhost' identified by '123456';
GRANT PROCESS,REPLICATION CLIENT,SELECT ON *.* TO mysql_monitor@'localhost';
FLUSH PRIVILEGES;

Install the exporter

bash

wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.17.2/mysqld_exporter-0.17.2.linux-amd64.tar.gz

tar -xvf mysqld_exporter-0.17.2.linux-amd64.tar.gz
mv mysqld_exporter-0.17.2.linux-amd64/mysqld_exporter /usr/local/bin

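# Create the service account (skip if it already exists)
useradd -Ms /sbin/nologin prometheus

# Create the MySQL client credentials file
cat > /etc/mysqld_exporter.conf <<'EOF'
[client]
user=mysql_monitor
password=123456
EOF
chown prometheus:prometheus /etc/mysqld_exporter.conf
chmod 600 /etc/mysqld_exporter.conf   # the file holds a password

# Create the systemd unit file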
cat > /usr/lib/systemd/system/mysqld_exporter.service << 'EOF'
[Unit]
Description=MySQL Exporter
After=network.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/mysqld_exporter --config.my-cnf=/etc/mysqld_exporter.conf
  
Restart=always

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable mysqld_exporter --now

# Verify: open http://<server-IP>:9104/metrics; if MySQL metrics appear, the exporter is running

Configure Prometheus to scrape the metrics

yaml

scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets:
          - '192.168.80.143:9104'
        labels:
          yunjisuan: mysql

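Configure the Grafana dashboard

In Grafana, click "Dashboards" -> "Import".
Enter a dashboard ID (7362 is recommended; it is a popular community MySQL monitoring dashboard) and click Load.
Select the matching Prometheus data source and click Import.

Monitoring Redis

Install the exporter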

bash

wget https://github.com/oliver006/redis_exporter/releases/download/v1.76.0/redis_exporter-v1.76.0.linux-amd64.tar.gz
tar -xvf redis_exporter-v1.76.0.linux-amd64.tar.gz
mv redis_exporter-v1.76.0.linux-amd64/redis_exporter /usr/local/bin

# Create the service account (skip if it already exists)
useradd -Ms /sbin/nologin prometheus

# Create the systemd unit file
cat > /usr/lib/systemd/system/redis_exporter.service << 'EOF'
[Unit]
Description=Redis Exporter
After=network.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/redis_exporter \
  --redis.addr=redis://localhost:6379 \
  --redis.password=your_redis_password
  
Restart=always

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable redis_exporter --now

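# Verify: open http://<server-IP>:9121/metrics; if Redis metrics (such as redis_up and redis_connected_clients) appear, the exporter started successfully

Configure Prometheus to scrape the metrics

Edit the Prometheus configuration file prometheus.yml and add a job that scrapes redis_exporter.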

yaml

scrape_configs:
  - job_name: 'redis'
    static_configs:
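      - targets: ['localhost:9121']  # address and port of redis_exporter

Restart Prometheus or send it SIGHUP to reload the configuration.

Configure the Grafana dashboard

In Grafana, click "Dashboards" -> "Import".
Enter a dashboard ID (763 or 11835 are recommended; both are popular community Redis monitoring dashboards) and click Load.
Select the matching Prometheus data source and click Import.

Key metrics

Availability: redis_up (1 means the instance is reachable).
Connections: redis_connected_clients (current client connections).
Memory: redis_memory_used_bytes (memory in use), redis_memory_used_peak_bytes (peak memory use).
Hit rate: redis_keyspace_hits_total (hits) and redis_keyspace_misses_total (misses); hit rate = hits / (hits + misses).
Persistence: redis_rdb_last_save_timestamp_seconds (time of the last RDB save).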
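The hit-rate formula is easy to get subtly wrong because the hit and miss metrics are counters that only ever grow: dividing the raw totals gives the rate since Redis started, while the rate over a recent window needs the difference between two readings. A small sketch of both, using made-up counter values:

```python
def hit_rate(hits: float, misses: float) -> float:
    """hits / (hits + misses), or 0.0 when there were no lookups at all."""
    total = hits + misses
    return hits / total if total else 0.0


def windowed_hit_rate(previous, current) -> float:
    """Hit rate between two (hits, misses) counter readings."""
    return hit_rate(current[0] - previous[0], current[1] - previous[1])


# Made-up readings taken some time apart.
earlier = (100, 50)
later = (190, 60)
print(hit_rate(*later))                   # lifetime rate
print(windowed_hit_rate(earlier, later))  # rate within the window
```

In a Grafana panel the windowed form corresponds to applying rate() to both counters before dividing.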

Monitoring Nginx

Check for the Nginx module

Nginx's stub_status module provides basic status information. First make sure it is compiled in and configured.

bash

nginx -V 2>&1 | grep -o with-http_stub_status_module

Configure the status endpoint

nginx

server {
    listen 80;
    server_name localhost;
    location /nginx_status {
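        stub_status on;       # enable the status module
        access_log off;       # optional: skip access logging for this location
        allow 127.0.0.1;      # allow local access (important: the exporter connects from here)
        deny all;             # deny all other access
    }
}

# Reload the Nginx configuration: sudo nginx -s reload
# Verify from the server itself (remote clients are denied by the rules above): curl http://127.0.0.1/nginx_status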
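The status page is a four-line text block: the active connection count, a header line, three cumulative counters (accepts, handled, requests), and the Reading/Writing/Waiting breakdown. To show what an exporter has to extract from it, here is a minimal parser. The sample text is hand-written in the stub_status layout, not captured from a real server.

```python
import re


def parse_stub_status(text: str) -> dict:
    """Extract the counters from an Nginx stub_status page."""
    active = re.search(r"Active connections:[ \t]+(\d+)", text)
    counters = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    states = re.search(r"Reading:[ \t]+(\d+)[ \t]+Writing:[ \t]+(\d+)[ \t]+Waiting:[ \t]+(\d+)", text)
    if not (active and counters and states):
        raise ValueError("input does not look like a stub_status page")
    accepts, handled, requests = (int(v) for v in counters.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


# Hand-written sample in the stub_status layout.
status_page = (
    "Active connections: 291 \n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465 \n"
    "Reading: 6 Writing: 179 Waiting: 106 \n"
)
status = parse_stub_status(status_page)
print(status["active"], status["requests"])
```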

Install the Nginx exporter

nginx-prometheus-exporter scrapes the /nginx_status page and converts the data into a format Prometheus can read.

bash

wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v1.0.0/nginx-prometheus-exporter_1.0.0_linux_amd64.tar.gz
tar -xzf nginx-prometheus-exporter_1.0.0_linux_amd64.tar.gz -C /usr/local/bin/

# Verify
nginx-prometheus-exporter --version
# Create the service account (skip if it already exists)
useradd -Ms /sbin/nologin prometheus

# Create the systemd unit file
cat > /usr/lib/systemd/system/nginx_exporter.service << 'EOF'
[Unit]
Description=NGINX Prometheus Exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/nginx-prometheus-exporter --nginx.scrape-uri=http://localhost/nginx_status
Restart=always

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start nginx_exporter
sudo systemctl enable nginx_exporter

Configure Prometheus to scrape the metrics

Edit the Prometheus configuration file prometheus.yml and add a scrape job.

yaml

scrape_configs:
  - job_name: 'nginx'
    static_configs:
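      - targets: ['localhost:9113']  # address and port of nginx-prometheus-exporter
    scrape_interval: 15s             # scrape interval; adjust as needed

Import the dashboard

In Grafana, click "Dashboards" -> "Import".
Enter dashboard ID 12708 (a popular Nginx dashboard) and click Load.
Select the matching Prometheus data source and click Import.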

Monitoring systemd Services

Enable systemd monitoring

To monitor service state, start node_exporter with the systemd collector enabled.

bash

/usr/local/bin/node_exporter \
    --web.listen-address=0.0.0.0:9100 \
    --no-collector.ipvs \
    --collector.systemd \
    --collector.systemd.unit-include="(nginx|httpd).service"
# If node_exporter is managed by systemd, add the same flags to the ExecStart line
# of its unit file (for example /usr/lib/systemd/system/node_exporter.service), then run
# systemctl daemon-reload and restart the service:
# ExecStart=/usr/local/bin/node_exporter --collector.systemd

# --collector.systemd                          enable the systemd collector
# --collector.systemd.unit-whitelist=".+"      units to monitor (older flag name; newer releases use unit-include)
# --collector.systemd.unit-include="(nginx|mysql|docker).service"   monitor only specific units to limit the data volume
# --collector.systemd.enable-task-metrics      collect the task count of each service
# --collector.systemd.enable-restarts-metrics  collect the restart count of each service

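Metrics

| Metric | Type | Key labels | Description |
| --- | --- | --- | --- |
| `node_systemd_unit_state` | Gauge | `name`, `state`, `type` | State of a unit (one series per state; 1 marks the current state, 0 the others) |
| `node_systemd_units` | Gauge | `state` | Number of units in each state |
| `node_systemd_system_running` | Gauge | - | Overall system state (1 means running normally) |
| `node_systemd_service_restart_total` | Counter | `name` | Number of times a service has restarted |
| `node_systemd_timer_last_trigger_seconds` | Gauge | `name` | Timestamp of the last time a timer fired |
| `node_systemd_socket_accepted_connections_total` | Counter | `name` | Total connections accepted by a socket unit |
| `node_systemd_socket_current_connections` | Gauge | `name` | Current connections of a socket unit |
| `node_systemd_socket_refused_connections_total` | Counter | `name` | Connections refused by a socket unit |

Viewing in a dashboard

Open your Grafana dashboard, click the "Add" button at the top, choose "Add new panel", and enter a query such as:

promql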

node_systemd_unit_state{name="docker.service", state="inactive"}
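The unit-state metric is exported as one series per possible state, and the series whose value is 1 names the state the unit is currently in. The query above therefore returns 1 only while docker.service is inactive. The following sketch mimics that lookup on a few hand-written samples (the tuples are illustrative, not real scrape output):

```python
def current_state(samples, unit):
    """Return the state whose series has value 1 for the given unit, or None.

    samples is an iterable of (name, state, value) tuples, mirroring the
    name and state labels of node_systemd_unit_state.
    """
    for name, state, value in samples:
        if name == unit and value == 1:
            return state
    return None


# Hand-written samples: one series per (unit, state) pair.
unit_samples = [
    ("docker.service", "active", 0),
    ("docker.service", "inactive", 1),
    ("docker.service", "failed", 0),
    ("nginx.service", "active", 1),
]
print(current_state(unit_samples, "docker.service"))
```

For alerting on a stopped service, filtering on state="active" and comparing the value with 0 is a common alternative to querying the inactive series.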

JMX Exporter

Download the JMX Exporter agent

bash

wget https://github.com/prometheus/jmx_exporter/releases/download/1.4.0/jmx_prometheus_javaagent-1.4.0.jar

# Or the older 0.17.2 release from Maven Central (the version used in the commands further down)
wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.17.2/jmx_prometheus_javaagent-0.17.2.jar

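Create the configuration file

JMX Exporter needs a configuration file that defines which metrics to collect. Create a YAML file (for example config.yaml). For initial testing a simple configuration is enough; uncomment the ".*" rule to collect every metric.

yaml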

startDelaySeconds: 0
lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
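  # - pattern: ".*"	# match everything
  # HeapMemoryUsage is a composite attribute, so the key goes inside <> and the field (used) follows it
  - pattern: 'java.lang<type=Memory><HeapMemoryUsage>used'
    name: jvm_memory_heap_usage_bytes
    help: JVM heap memory usage
    type: GAUGE

startDelaySeconds: how long to wait (in seconds) before metric collection starts.
lowercaseOutputName, lowercaseOutputLabelNames: convert metric names and label names to lowercase, in line with Prometheus conventions.
rules: define how JMX MBeans are matched and converted. pattern: ".*" matches every MBean.

Start the application with JMX Exporter

The recommended approach is to run JMX Exporter as a Java agent, so that it starts inside the same JVM process as your Java application.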

bash

java -javaagent:/path/to/jmx_prometheus_javaagent-0.17.2.jar=<暴露端口>:<配置文件路径> -jar your_application.jar

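# Example
java -javaagent:./jmx_prometheus_javaagent-0.17.2.jar=8088:./config.yaml -jar myapp.jar

8088: the HTTP port on which JMX Exporter exposes metrics.
config.yaml: the path to your configuration file.
myapp.jar: your application's jar file.

For web containers such as Tomcat, edit the startup script (for example catalina.sh) and add the -javaagent option to the JAVA_OPTS environment variable.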
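The agent option follows a fixed shape, -javaagent:<jar>=<port>:<config>, and a typo in it usually surfaces only as a JVM that fails to start. When the command line is generated by a deployment script, a small helper keeps the pieces in the right order. The sketch below is one way to do that; the function names are made up for this example.

```python
def javaagent_option(jar_path: str, port: int, config_path: str) -> str:
    """Build the -javaagent option in the form <jar>=<port>:<config>."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"-javaagent:{jar_path}={port}:{config_path}"


def java_command(jar_path: str, port: int, config_path: str, app_jar: str) -> list:
    """Assemble the full argument list for launching the application."""
    return ["java", javaagent_option(jar_path, port, config_path), "-jar", app_jar]


option = javaagent_option("./jmx_prometheus_javaagent-0.17.2.jar", 8088, "./config.yaml")
print(option)
print(" ".join(java_command("./jmx_prometheus_javaagent-0.17.2.jar", 8088, "./config.yaml", "myapp.jar")))
```

For Tomcat, the string returned by javaagent_option is what would be appended to JAVA_OPTS.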

Configure Prometheus to scrape the metrics

Once JMX Exporter is running and exposing metrics, configure Prometheus to scrape them.

Edit the Prometheus configuration file prometheus.yml and add a new job under scrape_configs.

yaml

scrape_configs:
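  - job_name: 'jvm'  # job name; choose one that fits the application
    static_configs:
      - targets: ['<your-app-host-IP>:8088']  # address and port exposed by JMX Exporter
    scrape_interval: 15s  # scrape interval; adjust as needed

After the change, restart Prometheus or send it SIGHUP to reload the configuration.

Visualize JVM metrics in Grafana

Once Prometheus is collecting the data, Grafana can visualize it.

Add the data source: in Grafana, add your Prometheus server as a data source (the URL is usually http://<prometheus-server>:9090).
Import a dashboard: the Grafana community offers ready-made JVM monitoring dashboard templates.
In Grafana, go to Dashboards -> Import.
Enter a dashboard ID (for example 4701 or 8563) and click Load.
Select the matching Prometheus data source and click Import.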

Monitoring Containers

Monitoring Docker

Set up the Docker monitoring agent

Pull the monitoring image

bash

docker pull docker.io/google/cadvisor

Start the container

bash

cat > cadvisor.yaml << 'EOF'
version: '3.8'
services:
  cadvisor:
    image: google/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"
    volumes:
      - "/:/rootfs:ro"
      - "/var/run:/var/run:rw"
      - "/sys:/sys:ro"
      - "/var/lib/docker/:/var/lib/docker:ro"
      - "/dev/disk/:/dev/disk:ro"
    privileged: true
    restart: unless-stopped
EOF

docker compose -f cadvisor.yaml up -d 

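# Access
# Open http://$ip:8080 in a browser to view the cAdvisor web UI

Configure Prometheus

cAdvisor is easy to use: besides detailed metrics it offers a web UI with charts. However, cAdvisor itself keeps data for only 2 minutes, and with multiple hosts it is tedious to log in to each machine to view its Docker data.

A better approach is to integrate it with Prometheus to collect and retain the container data. cAdvisor exposes a metrics endpoint in Prometheus format, so Prometheus only needs a job defined the same way as for any other exporter.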

yaml

- job_name: 'docker'
  static_configs:
    - targets:
        - '192.168.80.143:8080'
      labels:
        group: docker

Metrics

bash

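# CPU metrics
container_cpu_load_average_10s # average CPU load of the container over the last 10 seconds
container_cpu_usage_seconds_total # cumulative CPU time consumed by the container

# Memory metrics
container_memory_max_usage_bytes # maximum memory usage of the container (bytes)
container_memory_usage_bytes # current memory usage of the container (bytes)
container_spec_memory_limit_bytes # memory limit of the container (bytes)

# Network metrics
container_network_receive_bytes_total # cumulative bytes received over the network (bytes)
container_network_transmit_bytes_total # cumulative bytes transmitted over the network (bytes)

# Storage metrics
container_fs_usage_bytes # filesystem space used by the container (bytes)
container_fs_limit_bytes # total filesystem space available to the container (bytes)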
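Several of these metrics only become meaningful after a small calculation. The CPU metric is a cumulative counter, so usage is the increase between two scrapes divided by the elapsed time, and memory is usually shown as a percentage of the limit. The sketch below works through both with made-up numbers; it treats a limit of 0 as "no limit configured", which is an assumption about how an unlimited container is reported.

```python
def cpu_cores_used(previous_seconds: float, current_seconds: float, interval_seconds: float) -> float:
    """Average CPU cores used between two readings of a CPU-seconds counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (current_seconds - previous_seconds) / interval_seconds


def memory_percent(usage_bytes: float, limit_bytes: float):
    """Memory usage as a percentage of the limit, or None when no limit is set."""
    if limit_bytes <= 0:
        return None  # assumption: 0 means the container has no memory limit
    return 100.0 * usage_bytes / limit_bytes


# Made-up readings taken 60 seconds apart.
print(cpu_cores_used(100.0, 130.0, 60))           # cores in use on average
print(memory_percent(256 * 1024**2, 1024**3))     # percent of a 1 GiB limit
```

In PromQL the first calculation is what rate(container_cpu_usage_seconds_total[1m]) performs for each container.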

Display in Grafana

Dashboards -> Import -> 193 -> select Prometheus as the data source
