51工具盒子

依楼听风雨

Deploying Prometheus on KubeSphere

Related reading: using PrometheusAlert + Prometheus + Alertmanager to send alerts through different channels (WeCom, Feishu, DingTalk):

https://blog.csdn.net/W1124824402/article/details/128846493

Prometheus is stateful, because it has to persist its time-series database.

1- Image

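bitnami/prometheus  # I could not mount a data volume with this image, so I skipped it
prom/prometheus:v2.34.0

With prom/prometheus, the data path can be mounted at /prometheus.

Do not attach the volume or the ConfigMap to the workload yet; complete steps 2 and 3 first.

2- Create the volume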

prometheus-db
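
For readers who prefer a manifest over the KubeSphere console, the volume could be sketched as a PersistentVolumeClaim like the one below. Only the name prometheus-db comes from this post; the access mode and the 10Gi size are assumptions, and the cluster's default StorageClass is assumed to exist.

```yaml
# A minimal sketch, not the exact object KubeSphere generates.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-db          # volume name used in this post
spec:
  accessModes:
    - ReadWriteOnce            # assumption: a single Prometheus replica writes the TSDB
  resources:
    requests:
      storage: 10Gi            # assumption: size it according to your retention needs
```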

3- Create the ConfigMap

prometheus-yml

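Note: because of how the image is set up, I also put the other alerting rules files into this same ConfigMap, which is convenient in practice.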
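
As a rough sketch, a ConfigMap holding both files might look like the following. The name prometheus-yml and the two keys come from this post; the file bodies are shortened here for illustration and should be replaced with the full contents of each file.

```yaml
# A minimal sketch: one ConfigMap, one key per file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-yml
data:
  prometheus.yml: |
    # shortened; use the full prometheus.yml
    global:
      scrape_interval: 15s
    rule_files:
      - "/etc/prometheus/*_rules.yml"
    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets: ["localhost:9090"]
  mysql_rules.yml: |
    # shortened; use the full mysql_rules.yml
    groups:
      - name: MySQLStatsAlert
        rules:
          - alert: MySQL is down
            expr: mysql_up == 0
            for: 1m
            labels:
              severity: critical
```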

Contents of prometheus.yml

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # - alertmanager:9093
            - 10.0.0.201:31007

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "/etc/prometheus/*_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "mysql-exporter"
    static_configs:
      - targets: ["10.0.0.201:31004"]
  - job_name: "node-exporter"
    static_configs:
      - targets: ["10.0.0.201:31003"]
  - job_name: "nginx-exporter"
    static_configs:
      - targets: ["10.0.0.201:31005"]
  - job_name: "tomcat-exporter"
    static_configs:
      - targets: ["10.0.0.1:8080"]
  - job_name: "es-exporter"
    static_configs:
      - targets: ["10.0.0.201:31006"]
  - job_name: "baimei-node-exporter"
    static_configs:
      - targets:
          - "10.0.0.205:9100"
          - "10.0.0.207:9100"

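4- Mount the volume and the ConfigMap

(1) Mount prometheus.yml

Mount path: /etc/prometheus/prometheus.yml
ConfigMap key: prometheus.yml

(2) Mount the alerting rules file

Mount path: /etc/prometheus/mysql_rules.yml

ConfigMap key: mysql_rules.yml

(3) Volume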

/prometheus
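
The three mounts above could be expressed in a workload manifest roughly as follows. This is a sketch under assumptions: the workload name, the `app: prometheus` label, the `serviceName`, and the `fsGroup` value are hypothetical choices, not taken from this post. The image, mount paths, ConfigMap name, and volume name are the ones used above.

```yaml
# A minimal sketch of a single-replica StatefulSet with the three mounts.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus                 # hypothetical workload name
spec:
  serviceName: prometheus          # hypothetical; should match a headless Service
  replicas: 1
  selector:
    matchLabels:
      app: prometheus              # hypothetical label
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        fsGroup: 65534             # assumption: lets the image's non-root user write to the volume
      containers:
        - name: prometheus
          image: prom/prometheus:v2.34.0
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config         # (1) prometheus.yml
              mountPath: /etc/prometheus/prometheus.yml
              subPath: prometheus.yml
            - name: config         # (2) alerting rules file
              mountPath: /etc/prometheus/mysql_rules.yml
              subPath: mysql_rules.yml
            - name: data           # (3) TSDB data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-yml
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-db
```

One thing to keep in mind with this layout: files mounted through `subPath` are not refreshed when the ConfigMap changes, so the pod needs a restart after editing the configuration or the rules.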

Verify that the rules are loaded:

http://10.0.0.201:31010/alerts
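
The URL above suggests that Prometheus is exposed through a NodePort on 31010. Assuming that is the case, a Service along these lines would produce it; the Service name and the `app: prometheus` selector are hypothetical and must match the labels on your own pod.

```yaml
# A minimal sketch of a NodePort Service in front of Prometheus.
apiVersion: v1
kind: Service
metadata:
  name: prometheus-nodeport        # hypothetical name
spec:
  type: NodePort
  selector:
    app: prometheus                # hypothetical; must match the pod labels
  ports:
    - port: 9090
      targetPort: 9090             # Prometheus web port
      nodePort: 31010              # port seen in the verification URL
```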

Contents of mysql_rules.yml

groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} MySQL is down"
        description: "MySQL database is down. This requires immediate action!"
  - alert: Mysql_High_QPS
    expr: rate(mysql_global_status_questions[5m]) > 500
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_High_QPS detected"
        description: "{{$labels.instance}}: MySQL operations exceed 500 per second (current value is: {{ $value }})"
  - alert: Mysql_Too_Many_Connections
    # threads_connected is a gauge, so its value is compared directly rather than through rate().
    expr: mysql_global_status_threads_connected > 200
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
        description: "{{$labels.instance}}: MySQL has more than 200 open connections (current value is: {{ $value }})"
  - alert: Mysql_Too_Many_slow_queries
    expr: rate(mysql_global_status_slow_queries[5m]) > 3
    for: 2m
    labels:
        severity: warning
    annotations:
        summary: "{{$labels.instance}}: Mysql_Too_Many_slow_queries detected"
        description: "{{$labels.instance}}: MySQL slow queries exceed 3 per second (current value is: {{ $value }})"
  - alert: SQL thread stopped
    expr: mysql_slave_status_slave_sql_running != 1
    for: 1m
    labels:
        severity: critical
    annotations:
        summary: "Instance {{ $labels.instance }} SQL thread stopped"
        description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."
  - alert: Slave lagging behind Master
    # seconds_behind_master is a gauge, so its value is compared directly rather than through rate().
    expr: mysql_slave_status_seconds_behind_master > 30
    for: 1m
    labels:
        severity: warning
    annotations:
        summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
        description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
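
Before relying on these rules, it can help to exercise one of them offline. Below is a sketch of a rule unit test file in the format read by `promtool test rules`; the file name mysql_rules_test.yml and the input series values are made up for illustration, and the test assumes it sits next to mysql_rules.yml.

```yaml
# mysql_rules_test.yml (hypothetical file name)
rule_files:
  - mysql_rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    # Simulated samples: mysql_up stays at 0 for five minutes.
    input_series:
      - series: 'mysql_up{instance="10.0.0.201:31004", job="mysql-exporter"}'
        values: '0 0 0 0 0'
    alert_rule_test:
      # The rule has "for: 1m", so it should be firing well before the 3-minute mark.
      - eval_time: 3m
        alertname: MySQL is down
        exp_alerts:
          - exp_labels:
              severity: critical
              instance: 10.0.0.201:31004
              job: mysql-exporter
            exp_annotations:
              summary: "Instance 10.0.0.201:31004 MySQL is down"
              description: "MySQL database is down. This requires immediate action!"
```

It would be run with `promtool test rules mysql_rules_test.yml`; promtool ships in the prom/prometheus image, so it can also be run from inside the container.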
