Deploying Prometheus on KubeSphere


Related article: using PrometheusAlert + Prometheus + Alertmanager for various alert channels (WeCom, Feishu, DingTalk):

https://blog.csdn.net/W1124824402/article/details/128846493

Prometheus is stateful, because it has to persist its time-series database.

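1- Image

bitnami/prometheus  # data could not be mounted with this image, so I passed on it

prom/prometheus:v2.34.0

With this image the data path can be mounted at /prometheus.

Do not attach the storage volume or the ConfigMap to the workload yet; complete steps 2 and 3 first.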

2- Create the storage volume

prometheus-db
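
For readers who prefer a manifest over the console, the volume above could be declared roughly as follows. This is only a sketch: the name prometheus-db comes from this post, while the access mode and the 10Gi size are assumptions, and the default StorageClass of the cluster is assumed to be usable.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-db          # volume name used in this post
spec:
  accessModes:
    - ReadWriteOnce            # assumption: a single Prometheus replica
  resources:
    requests:
      storage: 10Gi            # assumption: size it for your retention period
  # storageClassName is omitted, so the cluster's default StorageClass is used
```

Apply it in the same project (namespace) where the Prometheus workload will run.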

3- Create the ConfigMap

prometheus-yml

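Note: because of how the image is set up, I also put the other alert rule files into this same ConfigMap, which is convenient in practice.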
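
As a rough illustration of that layout, a single ConfigMap can carry both files as separate keys. The name prometheus-yml and the two file names come from this post. The file bodies are shortened here on purpose, so treat this as a skeleton rather than the full configuration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-yml
data:
  # Key 1: the main Prometheus configuration (shortened for illustration)
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    rule_files:
      - "/etc/prometheus/*_rules.yml"
    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets: ["localhost:9090"]
  # Key 2: an alert rule file kept in the same ConfigMap (shortened)
  mysql_rules.yml: |
    groups:
      - name: MySQLStatsAlert
        rules:
          - alert: MySQL is down
            expr: mysql_up == 0
            for: 1m
```

Each key becomes one file when the ConfigMap is mounted, so more rule files can be added as further keys ending in _rules.yml.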

Contents of prometheus.yml

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # - alertmanager:9093
            - 10.0.0.201:31007

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "/etc/prometheus/*_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "mysql-exporter"
    static_configs:
      - targets: ["10.0.0.201:31004"]

  - job_name: "node-exporter"
    static_configs:
      - targets: ["10.0.0.201:31003"]

  - job_name: "nginx-exporter"
    static_configs:
      - targets: ["10.0.0.201:31005"]

  - job_name: "tomcat-exporter"
    static_configs:
      - targets: ["10.0.0.1:8080"]

  - job_name: "es-exporter"
    static_configs:
      - targets: ["10.0.0.201:31006"]

  - job_name: "baimei-node-exporter"
    static_configs:
      - targets:
          - "10.0.0.205:9100"
          - "10.0.0.207:9100"

4- Mount the storage volume and the ConfigMap

(1) Mounting prometheus.yml

Mount path: /etc/prometheus/prometheus.yml
ConfigMap key (subpath): prometheus.yml

(2) Mounting the alert rule file

Mount path: /etc/prometheus/mysql_rules.yml

ConfigMap key (subpath): mysql_rules.yml

(3) Storage volume

/prometheus
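
The three mounts above could be expressed in a workload manifest roughly like this. It is a sketch under several assumptions: a StatefulSet with one replica, the label app: prometheus, a headless Service named prometheus, and an fsGroup of 65534 so that the non-root user of the prom/prometheus image can write to the volume. Only the image tag, the mount paths, and the names prometheus-yml and prometheus-db come from this post.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  serviceName: prometheus            # assumption: headless Service with this name
  replicas: 1
  selector:
    matchLabels:
      app: prometheus                # assumption: label chosen for illustration
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        fsGroup: 65534               # assumption: lets the "nobody" user write to /prometheus
      containers:
        - name: prometheus
          image: prom/prometheus:v2.34.0
          ports:
            - containerPort: 9090
          volumeMounts:
            # (1) main config, mounted as a single file via subPath
            - name: config
              mountPath: /etc/prometheus/prometheus.yml
              subPath: prometheus.yml
            # (2) alert rule file from the same ConfigMap
            - name: config
              mountPath: /etc/prometheus/mysql_rules.yml
              subPath: mysql_rules.yml
            # (3) TSDB data directory
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-yml
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-db
```

One consequence of subPath mounts is that the files are not refreshed when the ConfigMap changes, so the pod has to be restarted after editing the configuration or the rules.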

Verification

http://10.0.0.201:31010/alerts
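
The URL above implies that Prometheus is exposed on node port 31010. If the service is created from a manifest instead of the console, it might look like the sketch below. The Service name and the selector label app: prometheus are assumptions for illustration; only the node port comes from the URL.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-nodeport          # hypothetical name
spec:
  type: NodePort
  selector:
    app: prometheus                  # assumption: must match the pod labels of the workload
  ports:
    - port: 9090
      targetPort: 9090               # Prometheus listens on 9090 by default
      nodePort: 31010                # port used in the verification URL
```

Once the pod is running, the /alerts page should list the rules loaded from the mounted rule files.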

Contents of mysql_rules.yml

groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} MySQL is down"
        description: "MySQL database is down. This requires immediate action!"


  - alert: Mysql_High_QPS
    expr: rate(mysql_global_status_questions[5m]) > 500
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_High_QPS detected"
        description: "{{$labels.instance}}: MySQL operations are more than 500 per second (current value is: {{ $value }})"

  - alert: Mysql_Too_Many_Connections
    # threads_connected is a gauge, so compare its value directly instead of using rate().
    expr: mysql_global_status_threads_connected > 200
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
        description: "{{$labels.instance}}: MySQL has more than 200 open connections (current value is: {{ $value }})"

  - alert: Mysql_Too_Many_slow_queries
    expr: rate(mysql_global_status_slow_queries[5m]) > 3
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_Too_Many_slow_queries detected"
        description: "{{$labels.instance}}: MySQL slow queries are more than 3 per second (current value is: {{ $value }})"

  - alert: SQL thread stopped
    expr: mysql_slave_status_slave_sql_running != 1
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} SQL thread stopped"
        description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."

  - alert: Slave lagging behind Master
    # seconds_behind_master is a gauge (current lag in seconds), so no rate() is needed.
    expr: mysql_slave_status_seconds_behind_master > 30
    for: 1m
    labels:
        severity: warning
    annotations:
        summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
        description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
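
Before mounting a rule file it can be worth checking it offline with promtool's rule unit tests. The sketch below exercises only the "MySQL is down" rule. The test file name mysql_rules_test.yml is hypothetical, and the instance and job label values are taken from the mysql-exporter scrape job purely for illustration.

```yaml
# mysql_rules_test.yml (hypothetical file name, placed next to mysql_rules.yml)
rule_files:
  - mysql_rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # mysql_up stays at 0 for 10 minutes, i.e. the exporter cannot reach MySQL
      - series: 'mysql_up{instance="10.0.0.201:31004", job="mysql-exporter"}'
        values: '0+0x10'
    alert_rule_test:
      # After 5 minutes the 1m "for" clause has long been satisfied
      - eval_time: 5m
        alertname: MySQL is down
        exp_alerts:
          - exp_labels:
              severity: critical
              instance: 10.0.0.201:31004
              job: mysql-exporter
            exp_annotations:
              summary: "Instance 10.0.0.201:31004 MySQL is down"
              description: "MySQL database is down. This requires immediate action!"
```

It can be run with promtool test rules mysql_rules_test.yml, and promtool check rules mysql_rules.yml validates the syntax of the rule file itself.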


References: