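vim prometheus.yml    # add the MySQL nodes

rule_files:           # rules section: add the monitoring rule file
  - "/usr/local/prometheus_all/prometheus/mysql_rule.yml"

scrape_configs:       # scrape section: add the MySQL nodes
  - job_name: "mysql"
    static_configs:
      - targets: ['192.168.92.131:9104','192.168.92.132:9104','192.168.92.133:9104']

# The file name must match the rule_files entry above.
# These are the example rules from the mysqld_exporter GitHub repository; no PXC-specific monitoring has been added.
vim /usr/local/prometheus_all/prometheus/mysql_rule.yml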
groups:
- name: example.rules
  rules:
  - record: mysql_slave_lag_seconds
    expr: mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay
  - record: mysql_heartbeat_lag_seconds
    expr: mysql_heartbeat_now_timestamp_seconds - mysql_heartbeat_stored_timestamp_seconds
  - record: job:mysql_transactions:rate5m
    expr: sum(rate(mysql_global_status_commands_total{command=~"(commit|rollback)"}[5m])) WITHOUT (command)
  - alert: MySQLGaleraNotReady
    expr: mysql_global_status_wsrep_ready != 1
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not ready.'
      summary: Galera cluster node not ready
  - alert: MySQLGaleraOutOfSync
    expr: (mysql_global_status_wsrep_local_state != 4 and mysql_global_variables_wsrep_desync == 0)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not in sync ({{$value}} != 4).'
      summary: Galera cluster node out of sync
  - alert: MySQLGaleraDonorFallingBehind
    expr: (mysql_global_status_wsrep_local_state == 2 and mysql_global_status_wsrep_local_recv_queue > 100)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is a donor (hotbackup) and is falling behind (queue size {{$value}}).'
      summary: xtradb cluster donor node falling behind
  - alert: MySQLReplicationNotRunning
    expr: mysql_slave_status_slave_io_running == 0 or mysql_slave_status_slave_sql_running == 0
    for: 2m
    labels:
      severity: critical
    annotations:
      description: Slave replication (IO or SQL) has been down for more than 2 minutes.
      summary: Slave replication is not running
  - alert: MySQLReplicationLag
    expr: (mysql_slave_lag_seconds > 30) and ON(instance) (predict_linear(mysql_slave_lag_seconds[5m], 60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLReplicationLag
    expr: (mysql_heartbeat_lag_seconds > 30) and ON(instance) (predict_linear(mysql_heartbeat_lag_seconds[5m], 60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLInnoDBLogWaits
    expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
    labels:
      severity: warning
    annotations:
      description: The innodb logs are waiting for disk at a rate of {{$value}} / second
      summary: MySQL innodb log writes stalling
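
Since an indentation mistake in either file will keep Prometheus from starting, it may be worth validating both files before the restart. The following is only a sketch: it assumes promtool (shipped in the Prometheus release tarball) is on PATH and that prometheus.yml sits in the same directory as the rule file. PROM_DIR and validate_config are names made up for this example.

```shell
#!/bin/sh
# Assumed install directory, taken from the rule_files path above.
PROM_DIR=/usr/local/prometheus_all/prometheus

# validate_config CONFIG RULES
# Checks the main config first, then the rule file; stops at the first failure.
validate_config() {
    promtool check config "$1" && promtool check rules "$2"
}

# Run the checks only when promtool is actually available.
if command -v promtool >/dev/null 2>&1; then
    validate_config "$PROM_DIR/prometheus.yml" "$PROM_DIR/mysql_rule.yml"
else
    echo "promtool not found on PATH; skipping validation" >&2
fi
```

If either check reports an error, fix the file it names before restarting.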
systemctl restart prometheus    # restart Prometheus; for the service setup, see "Configuring Prometheus as a system service"