Deploying Prometheus on KubeSphere

PrometheusAlert + Prometheus + Alertmanager for all kinds of alerts (WeCom, Feishu, DingTalk):

https://blog.csdn.net/W1124824402/article/details/128846493

Prometheus is stateful, because it has to persist its time-series database on disk.

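1- Image

bitnami/prometheus  # I could not mount a data volume with it, so skipped
prom/prometheus:v2.34.0

With prom/prometheus, the data path /prometheus can be mounted as a volume.

Do not attach the storage volume and ConfigMap to the workload yet; do steps 2 and 3 first.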

2- Configure the storage volume

prometheus-db
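
As a manifest, the volume created in the KubeSphere console corresponds roughly to a PersistentVolumeClaim. A minimal sketch: only the name prometheus-db comes from this post, while the namespace, size, and access mode are assumptions to adjust for your cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-db        # volume name used in this post
  namespace: monitoring      # assumption: hypothetical project/namespace
spec:
  accessModes:
    - ReadWriteOnce          # assumption: a single Prometheus replica writes the data
  resources:
    requests:
      storage: 20Gi          # assumption: size it to your retention needs
  # storageClassName is omitted so the cluster's default StorageClass is used
```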

3- Configure the ConfigMap

prometheus-yml

Note: because of the image, I also put the additional alerting rules into this same ConfigMap, which is convenient.
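
Expressed as a manifest, a ConfigMap holding both files might look like the sketch below. The name prometheus-yml and the two keys come from this post; the namespace is an assumption, and the file bodies are abbreviated here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-yml       # ConfigMap name used in this post
  namespace: monitoring      # assumption: hypothetical project/namespace
data:
  prometheus.yml: |
    # full file contents go here (abbreviated in this sketch)
    global:
      scrape_interval: 15s
    rule_files:
      - "/etc/prometheus/*_rules.yml"
  mysql_rules.yml: |
    # alerting rules live in the same ConfigMap (abbreviated in this sketch)
    groups:
      - name: MySQLStatsAlert
        rules:
          - alert: MySQL is down
            expr: mysql_up == 0
            for: 1m
```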

Contents of prometheus.yml

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
          - 10.0.0.201:31007

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "/etc/prometheus/*_rules.yml"

# Scrape configurations: one job per target group.
# The first job scrapes Prometheus itself; the rest scrape exporters.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "mysql-exporter"
    static_configs:
      - targets: ["10.0.0.201:31004"] 
  - job_name: "node-exporter"
    static_configs:
      - targets: ["10.0.0.201:31003"] 
  - job_name: "nginx-exporter"
    static_configs:
      - targets: ["10.0.0.201:31005"] 

  - job_name: "tomcat-exporter"
    static_configs:
      - targets: ["10.0.0.1:8080"]

  - job_name: "es-exporter"
    static_configs:
      - targets: ["10.0.0.201:31006"]

  - job_name: "baimei-node-exporter"
    static_configs:
      - targets:
          - "10.0.0.205:9100"
          - "10.0.0.207:9100"

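4- Mount the storage volume and ConfigMap

(1) Mount prometheus.yml

Mount path: /etc/prometheus/prometheus.yml
ConfigMap key: prometheus.yml

(2) Mount the alerting rules file

Mount path: /etc/prometheus/mysql_rules.yml
ConfigMap key: mysql_rules.yml

(3) Storage volume

Mount path: /prometheus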
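
The three mounts can be sketched as a StatefulSet manifest. The image, mount paths, ConfigMap name, and volume name come from this post; the workload name, namespace, labels, serviceName, and fsGroup value are assumptions for illustration, not what the KubeSphere console necessarily generates.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus             # assumption: hypothetical workload name
  namespace: monitoring        # assumption: hypothetical project/namespace
spec:
  serviceName: prometheus      # assumption: hypothetical governing Service name
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        fsGroup: 65534         # assumption: lets the image's non-root user write to the volume
      containers:
        - name: prometheus
          image: prom/prometheus:v2.34.0
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config     # single file from the ConfigMap
              mountPath: /etc/prometheus/prometheus.yml
              subPath: prometheus.yml
            - name: config     # rules file from the same ConfigMap
              mountPath: /etc/prometheus/mysql_rules.yml
              subPath: mysql_rules.yml
            - name: data       # TSDB data directory
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-yml
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-db
```

One design note: files mounted through subPath are not refreshed when the ConfigMap changes, so after editing the configuration or rules the Pod has to be restarted to pick them up.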

Verify

http://10.0.0.201:31010/alerts
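
The URL above implies that Prometheus is reachable on node port 31010. A NodePort Service along these lines would expose it that way; the Service name, namespace, and selector label are assumptions and must match your own workload.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-nodeport    # assumption: hypothetical Service name
  namespace: monitoring        # assumption: hypothetical project/namespace
spec:
  type: NodePort
  selector:
    app: prometheus            # assumption: must match the Pod labels of your workload
  ports:
    - name: web
      port: 9090               # Prometheus web/API port
      targetPort: 9090
      nodePort: 31010          # port seen in the URL above
```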

Contents of mysql_rules.yml

groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} MySQL is down"
        description: "MySQL database is down. This requires immediate action!"

  - alert: Mysql_High_QPS
    expr: rate(mysql_global_status_questions[5m]) > 500 
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_High_QPS detected"
        description: "{{$labels.instance}}: Mysql operations are more than 500 per second (current value is: {{ $value }})"
  - alert: Mysql_Too_Many_Connections
    # threads_connected is a gauge, so compare it directly instead of taking rate()
    expr: mysql_global_status_threads_connected > 200
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
        description: "{{$labels.instance}}: Mysql connections are more than 200 (current value is: {{ $value }})"

  - alert: Mysql_Too_Many_slow_queries
    expr: rate(mysql_global_status_slow_queries[5m]) > 3
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_Too_Many_slow_queries detected"
        description: "{{$labels.instance}}: Mysql slow_queries is more than 3 per second ,(current value is: {{ $value }})"  

  - alert: SQL thread stopped
    expr: mysql_slave_status_slave_sql_running != 1
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} SQL thread stopped"
        description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."
  - alert: Slave lagging behind Master
    # seconds_behind_master is a gauge, so compare it directly instead of taking rate()
    expr: mysql_slave_status_seconds_behind_master > 30
    for: 1m
    labels:
        severity: warning 
    annotations:
        summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
        description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
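
Before loading the rules into the cluster, they can be checked offline with promtool's rule unit tests. The sketch below is a test file for the "MySQL is down" alert only; the file name mysql_rules_test.yml and the sample series labels are assumptions, and it expects mysql_rules.yml to sit in the same directory.

```yaml
# mysql_rules_test.yml (hypothetical file name)
rule_files:
  - mysql_rules.yml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # assumption: sample series where the exporter reports MySQL as down
      - series: 'mysql_up{instance="10.0.0.201:31004", job="mysql-exporter"}'
        values: '0 0 0 0 0'
    alert_rule_test:
      - eval_time: 3m            # well past the rule's "for: 1m"
        alertname: MySQL is down
        exp_alerts:
          - exp_labels:
              severity: critical
              instance: 10.0.0.201:31004
              job: mysql-exporter
            exp_annotations:
              summary: "Instance 10.0.0.201:31004 MySQL is down"
              description: "MySQL database is down. This requires immediate action!"
```

It would be run with `promtool test rules mysql_rules_test.yml`.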
