Galera Cluster node receive queue
Job details
Name: |
Galera Cluster node receive queue |
Platform: |
Mariadb |
Category: |
Cluster and Replication |
Description: |
Checks current length of the receive queue, which is the number of writesets waiting to be applied. |
Long description: |
Checks current length of the receive queue, which is the number of writesets waiting to be applied. |
Version: |
1.1 |
Default schedule: |
1,11,21,31,41,51 * * * |
Requires engine install: |
Yes |
Compatibility tag: |
.[type=‘instance’ & is_mariadb_branch=‘1′]/.[hasengine=‘YES’ & use_global_variables_information_schema = ‘1′ & wsrep_cluster != ‘0′] |
Parameters
Name |
Default value |
Description |
warning threshold |
20 |
The maximum value of queue length before a warning is triggered. |
alarm threshold |
100 |
The maximum value of queue length before an alarm is triggered. |
enable warnings and alarms |
NO |
If set to “NO”, the alert will only collect statistics without returning status warning or alarm. Value “YES“ will activate the alert. |
history threshold |
7 |
The maximum number of day to kept statistics for in the historic tables. |
Job Summary
- Purpose: The purpose of this job is to monitor and check the current length of the receive queue in a MariaDB Galera Cluster.
- Why: This job is crucial because the receive queue length indicates the number of writesets waiting to be applied. Monitoring this helps in identifying delays or backlogs in node synchronization, which can impact database performance and consistency.
- Manual checking: You can check this manually in the database by issuing the following SQL command:
SELECT VARIABLE_VALUE AS 'Receive Queue Length'
FROM information_schema.global_status
WHERE variable_name = 'wsrep_local_recv_queue';
Implementation Details
The monitoring job involves the following SQL implementational commands:
- Create history and last value tables to hold past and current data for comparison and tracking:
CREATE TABLE dbw_galera_cluster_node_receive_queue_histr (receive_queue BIGINT, receive_queue_diff BIGINT, histr_date DATETIME);
CREATE TABLE dbw_galera_cluster_node_receive_queue_last (receive_queue BIGINT, histr_date DATETIME);
ALTER TABLE dbw_galera_cluster_node_receive_queue_last ADD PRIMARY KEY (histr_date);
* Insert current receive queue data obtained from the global status:
INSERT INTO dbw_galera_cluster_node_receive_queue_last
SELECT CAST(VARIABLE_VALUE AS SIGNED), NOW()
FROM information_schema.global_status
WHERE variable_name = 'wsrep_local_recv_queue';
* The stored procedure to handle the logic for monitoring, comparison, alerting (if enabled), and data purging:
CREATE PROCEDURE dbw_galera_cluster_node_receive_queue(taskId INT)
DECLARE ... (various declarations)
...
BEGIN
...
SELECT ... INTO ... FROM ...
INSERT INTO dbw_galera_cluster_node_receive_queue_histr VALUES (...)
DELETE FROM dbw_galera_cluster_node_receive_queue_histr WHERE histr_date < DATE_SUB(...);
END;
Dependencies and Cleanup
- The job depends on previously mentioned tables and procedure:
- dbw_galera_cluster_node_receive_queue
- dbw_galera_cluster_node_receive_queue_histr
- dbw_galera_cluster_node_receive_queue_last
- Cleanup commands to be executed upon failure:
DROP TABLE dbw_galera_cluster_node_receive_queue_histr;
DROP TABLE dbw_galera_cluster_node_receive_queue_last;
DROP PROCEDURE dbw_galera_cluster_node_receive_queue;
Reporting
- A report template is defined to visually display the history of the queue:
- The report shows changes in the ‘receive_queue’ over time, helping to visualize trends or issues:
SELECT receive_queue_diff "Receive queue", histr_date "History date"
FROM dbw_galera_cluster_node_receive_queue_histr
ORDER BY histr_date;
* Presentation includes:
** A category chart plotting 'Receive queue' values against 'History date' to provide a graphical representation of data over time.
These features ensure that DBA teams can effectively monitor and manage the performance and health of their MariaDB Galera Clusters, contributing to overall database reliability and performance.