PXC monitoring items based on mysqld_exporter
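Cluster integrity checks:
- wsrep_cluster_status: the state of the cluster component this node belongs to. Any value other than "Primary" means the cluster has been partitioned or is in a split-brain condition. mysqld_exporter exposes the status as a number; in the sample below, 1 corresponds to Primary.
HELP mysql_global_status_wsrep_cluster_status Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_cluster_status untyped
mysql_global_status_wsrep_cluster_status 1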
- wsrep_cluster_size: the number of nodes currently in the cluster. If this value matches the expected node count, all cluster nodes are connected.
HELP mysql_global_status_wsrep_cluster_size Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_cluster_size untyped
mysql_global_status_wsrep_cluster_size 3
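The two integrity checks above can be sketched as a small script that reads the exporter's text output. This is a minimal sketch, not part of mysqld_exporter: `parse_samples` and `cluster_integrity_problems` are hypothetical helper names, the parser handles only label-free samples without timestamps, and it assumes the exporter reports 1 for Primary (as in the sample above) and that real scrape output marks HELP/TYPE lines with a leading `#`.

```python
def parse_samples(text):
    """Parse label-free Prometheus exposition text into {metric_name: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comment lines
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def cluster_integrity_problems(samples, expected_size):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # Assumption: the exporter reports 1 for Primary and another value otherwise.
    if samples.get("mysql_global_status_wsrep_cluster_status") != 1:
        problems.append("cluster component is not Primary")
    size = samples.get("mysql_global_status_wsrep_cluster_size")
    if size != expected_size:
        problems.append(f"cluster size is {size}, expected {expected_size}")
    return problems


SCRAPE = """\
# HELP mysql_global_status_wsrep_cluster_status Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_wsrep_cluster_status untyped
mysql_global_status_wsrep_cluster_status 1
# HELP mysql_global_status_wsrep_cluster_size Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_wsrep_cluster_size untyped
mysql_global_status_wsrep_cluster_size 3
"""

if __name__ == "__main__":
    print(cluster_integrity_problems(parse_samples(SCRAPE), expected_size=3))
```

A missing metric is treated as a failure, since a node that stops reporting wsrep status is itself worth investigating.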
Node state checks:
- wsrep_ready: when this value is ON (exported as 1 in the sample below), the node can accept SQL load. When it is OFF, check wsrep_connected.
HELP mysql_global_status_wsrep_ready Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_ready untyped
mysql_global_status_wsrep_ready 1
- wsrep_connected: if this value is OFF and wsrep_ready is also OFF, the node is not connected to the cluster. A likely cause is a misconfigured wsrep_cluster_address or wsrep_cluster_name; check the error log for the specific error.
HELP mysql_global_status_wsrep_connected Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_connected untyped
mysql_global_status_wsrep_connected 1
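The decision order described above (look at wsrep_ready first, then wsrep_connected) might be written as follows. This is an illustrative sketch: `diagnose_node` and the returned labels are hypothetical, and the inputs are assumed to be the exported numeric values, with 1 for ON and 0 for OFF.

```python
def diagnose_node(wsrep_ready, wsrep_connected):
    """Classify a node from its exported wsrep_ready / wsrep_connected values."""
    if wsrep_ready == 1:
        return "ready"  # node accepts SQL load
    if wsrep_connected == 0:
        # Not ready and not connected: the node has not joined the cluster.
        return "disconnected"
    # Connected but not ready: the text above gives no specific guidance here.
    return "connected_not_ready"


# Hypothetical follow-up hints keyed by the labels returned above.
ADVICE = {
    "ready": "no action needed",
    "disconnected": "check wsrep_cluster_address, wsrep_cluster_name and the error log",
    "connected_not_ready": "inspect the node further, starting with the error log",
}

if __name__ == "__main__":
    state = diagnose_node(wsrep_ready=0, wsrep_connected=0)
    print(state, "->", ADVICE[state])
```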
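Replication health checks:
- wsrep_flow_control_paused: the fraction of time, since the last FLUSH STATUS, that replication was paused by flow control. It shows how much slave lag is slowing the cluster down. The value ranges from 0 to 1: the closer to 0 the better, and 1 means replication is completely stopped. Tuning wsrep_slave_threads can improve it.
HELP mysql_global_status_wsrep_flow_control_paused Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_flow_control_paused untyped
mysql_global_status_wsrep_flow_control_paused 0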
- wsrep_cert_deps_distance: how many transactions can, on average, be applied in parallel. wsrep_slave_threads should not be set much higher than this value.
HELP mysql_global_status_wsrep_cert_deps_distance Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_cert_deps_distance untyped
mysql_global_status_wsrep_cert_deps_distance 0
- wsrep_flow_control_sent: the number of flow control pause events this node has sent, that is, how many times it has asked the cluster to pause replication.
HELP mysql_global_status_wsrep_flow_control_sent Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_flow_control_sent untyped
mysql_global_status_wsrep_flow_control_sent 0
- wsrep_local_recv_queue_avg: the average length of the slave (receive) transaction queue. A rising value is an early sign of a slave bottleneck.
HELP mysql_global_status_wsrep_local_recv_queue_avg Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_local_recv_queue_avg untyped
mysql_global_status_wsrep_local_recv_queue_avg 0.125
The slowest node has the highest values for both wsrep_flow_control_sent and wsrep_local_recv_queue_avg. Lower values are better.
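To find the slowest node as described above, the per-node values can be compared side by side. The sketch below is only an illustration: the node names and numbers in `NODES` are made up, the 0.1 threshold for wsrep_flow_control_paused is an arbitrary assumption rather than a recommended value, and ranking by wsrep_flow_control_sent first with wsrep_local_recv_queue_avg as a tie-breaker is one possible ordering among several.

```python
def rank_slowest(nodes):
    """Return node names ordered from slowest to fastest."""
    return sorted(
        nodes,
        key=lambda name: (
            nodes[name]["flow_control_sent"],
            nodes[name]["local_recv_queue_avg"],
        ),
        reverse=True,  # highest values first, i.e. slowest node first
    )


def paused_too_long(nodes, threshold=0.1):
    """Return nodes whose flow_control_paused fraction exceeds the threshold."""
    return [name for name, m in nodes.items() if m["flow_control_paused"] > threshold]


# Hypothetical values for a three-node cluster, for illustration only.
NODES = {
    "pxc-1": {"flow_control_paused": 0.0, "flow_control_sent": 0, "local_recv_queue_avg": 0.125},
    "pxc-2": {"flow_control_paused": 0.02, "flow_control_sent": 4, "local_recv_queue_avg": 1.5},
    "pxc-3": {"flow_control_paused": 0.35, "flow_control_sent": 17, "local_recv_queue_avg": 9.0},
}

if __name__ == "__main__":
    print("slowest first:", rank_slowest(NODES))
    print("paused above threshold:", paused_too_long(NODES))
```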
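Detecting slow network issues:
- wsrep_local_send_queue_avg: the average length of the send queue, an early sign of a network bottleneck. A high value suggests the network may be the limiting factor.
HELP mysql_global_status_wsrep_local_send_queue_avg Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_local_send_queue_avg untyped
mysql_global_status_wsrep_local_send_queue_avg 0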
- wsrep_last_committed: the sequence number (seqno) of the last committed transaction.
HELP mysql_global_status_wsrep_last_committed Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_last_committed untyped
mysql_global_status_wsrep_last_committed 20
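- wsrep_local_cert_failures and wsrep_local_bf_aborts: the number of conflicts detected, each of which causes a rollback. wsrep_local_cert_failures counts local transactions that failed certification; wsrep_local_bf_aborts counts local transactions that were aborted by slave transactions while executing.
HELP mysql_global_status_wsrep_local_bf_aborts Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_local_bf_aborts untyped
mysql_global_status_wsrep_local_bf_aborts 0
HELP mysql_global_status_wsrep_local_cert_failures Generic metric from SHOW GLOBAL STATUS.
TYPE mysql_global_status_wsrep_local_cert_failures untyped
mysql_global_status_wsrep_local_cert_failures 0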
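Because wsrep_local_cert_failures and wsrep_local_bf_aborts only ever increase, their absolute values say little on their own; the change between two scrapes is more telling. The following is a rough sketch under stated assumptions: `conflict_summary` is a hypothetical helper, both scrapes come from the same node with no restart in between (a restart resets the counters and would produce negative deltas), and the growth of wsrep_last_committed is used as an approximation of how many transactions were committed in the interval.

```python
def conflict_summary(previous, current):
    """Compare two scrapes of one node and summarize conflicts in the interval."""
    cert = current["wsrep_local_cert_failures"] - previous["wsrep_local_cert_failures"]
    bf = current["wsrep_local_bf_aborts"] - previous["wsrep_local_bf_aborts"]
    committed = current["wsrep_last_committed"] - previous["wsrep_last_committed"]
    # Guard against division by zero when nothing was committed in the interval.
    ratio = (cert + bf) / committed if committed > 0 else 0.0
    return {
        "cert_failures": cert,
        "bf_aborts": bf,
        "committed": committed,
        "conflict_ratio": ratio,
    }


if __name__ == "__main__":
    # Hypothetical values for two consecutive scrapes, for illustration only.
    earlier = {"wsrep_local_cert_failures": 0, "wsrep_local_bf_aborts": 0, "wsrep_last_committed": 20}
    later = {"wsrep_local_cert_failures": 2, "wsrep_local_bf_aborts": 1, "wsrep_last_committed": 120}
    print(conflict_summary(earlier, later))
```

A conflict ratio that keeps growing suggests that several nodes are writing to the same rows at the same time.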