Data Guard role switch
Job details
| Name: | Data Guard role switch |
| Platform: | Oracle |
| Category: | Cluster and Replication |
| Description: | Checks size of all snapshot logs. |
| Long description: | |
| Version: | 1.5 |
| Default schedule: | 60s |
| Requires engine install: | No |
| Compatibility tag: | .[type='instance' & databasetype='oracle'] |
Parameters
| Name | Default value | Description |
| --- | --- | --- |
| return_status | 1 | Status value returned when a Primary/Standby switch occurs (2 = ALARM, 1 = WARNING, 0 = OK). |
| keep_status | 10 | How long (in minutes) the job's status is kept after a Primary/Standby switch occurs. |
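As an illustration of how the two parameters might be interpreted, the following Python sketch maps the status codes to their names and converts keep_status to seconds. The names `STATUS_NAMES` and `describe_parameters` are hypothetical and not part of the job; only the codes 0, 1, 2 and the defaults 1 and 10 come from the table above.

```python
# Status codes as listed in the return_status description.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "ALARM"}


def describe_parameters(return_status=1, keep_status=10):
    """Validate the two job parameters and summarize them (hypothetical helper)."""
    if return_status not in STATUS_NAMES:
        raise ValueError(f"return_status must be 0, 1 or 2, got {return_status}")
    if keep_status < 0:
        raise ValueError("keep_status must not be negative")
    return {
        "status_name": STATUS_NAMES[return_status],
        "keep_seconds": keep_status * 60,  # keep_status is given in minutes
    }


print(describe_parameters())
```

With the defaults, a detected switch would be reported as WARNING and held for 600 seconds.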
Job Summary
- Purpose: The purpose of this job is to monitor the Oracle Data Guard environment and detect Primary/Standby role switches.
- Why: Detecting role switches promptly helps maintain the availability and reliability of the Oracle database. After a switch, you can verify that all database instances are functioning correctly, which matters because Data Guard role changes are central to disaster recovery and data integrity.
- Manual checking: You can check this manually in the database by issuing the following SQL command:
select database_role from v$database;
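The same manual check could be scripted. The sketch below assumes a Python DB-API 2.0 cursor connected to the Oracle instance; `fetch_database_role` and `FakeCursor` are hypothetical names, and the fake cursor only stands in for a real driver cursor so the example runs on its own.

```python
def fetch_database_role(cursor):
    """Run the manual check through a DB-API 2.0 cursor and return the role string."""
    cursor.execute("select database_role from v$database")
    row = cursor.fetchone()
    return row[0] if row else None


class FakeCursor:
    """Stand-in cursor for demonstration; a real run would use an Oracle driver's cursor."""

    def __init__(self, role):
        self._row = (role,)
        self.sql = None

    def execute(self, sql):
        self.sql = sql  # remember the statement instead of sending it to a database

    def fetchone(self):
        return self._row


print(fetch_database_role(FakeCursor("PRIMARY")))
```

On a real instance the query returns values such as PRIMARY or PHYSICAL STANDBY.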
Job Mechanics
- The job uses XML configuration and JavaScript engines to check and display the status of Oracle Data Guard.
- It queries instance roles and timestamps to monitor any changes that indicate a switch in roles from Primary to Standby or vice versa.
- A status is returned based on the result of these checks: a non-primary role detected after a switch, or a switch that is still within the configured keep_status duration, raises the status set by return_status.
- The job automatically updates its internal properties to keep track of the latest Data Guard statuses and switch counts.
- By default the job runs every 60 seconds, which gives near-real-time monitoring of the database state.
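The bookkeeping described in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the job's actual code: `GuardState` and `record_observation` are hypothetical names, and the three fields simply mirror the idea of a stored last status, last switch time, and switch count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GuardState:
    """Hypothetical container mirroring the job's stored properties."""
    last_status: Optional[str] = None    # last observed database role
    last_switch: Optional[float] = None  # time (seconds) of the last detected switch
    switch_count: int = 0


def record_observation(state, current_role, now):
    """Update state with a new role reading; return True if a switch was detected."""
    # The very first reading has nothing to compare against, so it is never a switch.
    switched = state.last_status is not None and state.last_status != current_role
    if switched:
        state.last_switch = now
        state.switch_count += 1
    state.last_status = current_role
    return switched


state = GuardState()
# Simulated readings taken at the default 60-second interval.
for t, role in [(0, "PRIMARY"), (60, "PRIMARY"), (120, "PHYSICAL STANDBY")]:
    if record_observation(state, role, now=t):
        print(f"switch detected at t={t}s, count={state.switch_count}")
```

In this simulated run only the third reading differs from the stored role, so a single switch is recorded.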
Output and Reporting
| Field | Detail |
| --- | --- |
| Status | Indicates the current alarm status based on role switch detection |
| Details | Provides descriptive text about the current Data Guard role and recent switch activities. For example, messages like “Running as PRIMARY host. Last switch 10 minutes ago.” are dynamically generated based on the context. |
| Last Data Guard Status | Stores the last known state of Data Guard to compare against new fetches |
| Last Data Guard Switch | Timestamp for the last detected role switch, used for calculating time elapsed |
| Last Data Guard Switch Count | Counter that increments with each detected switch, useful for historical context |
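A message of the kind shown in the Details field could be assembled as in the Python sketch below. The helper name `format_details` is hypothetical, and the wording is modeled only on the single example message quoted above; the job's real messages may differ.

```python
def format_details(role, minutes_since_switch=None):
    """Build a details message in the style of the quoted example (hypothetical helper)."""
    message = f"Running as {role} host."
    if minutes_since_switch is not None:
        unit = "minute" if minutes_since_switch == 1 else "minutes"
        message += f" Last switch {minutes_since_switch} {unit} ago."
    return message


print(format_details("PRIMARY", 10))
```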
Scenario Handling
- The job is designed to address different scenarios involving role state changes and timings.
- It differentiates between switches to and from primary states and adjusts status and alerts accordingly.
- For instance, if a switch to Standby is detected within the configured keep_status time, the job raises an alert. Conversely, if a stable Primary state is observed after a recent switch, the status message reflects this stability.
- Alerts and statuses are set based on the current role, the last known state, the number of switches, and the time elapsed since the last switch, providing a comprehensive insight into the Data Guard status.
This monitoring supports the robustness and readiness of Oracle databases that use Data Guard for high availability and disaster recovery.
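The timing rule behind these scenarios can be sketched in Python as follows. This is a simplification under stated assumptions: it applies the configured status to any switch still inside the keep_status window regardless of direction, whereas the actual job may treat switches to and from Primary differently. The function name `evaluate_status` is hypothetical.

```python
def evaluate_status(switch_count, minutes_since_switch, return_status=1, keep_status=10):
    """Return the configured status while a switch is recent, otherwise OK (0)."""
    recent = (
        switch_count > 0
        and minutes_since_switch is not None
        and minutes_since_switch < keep_status
    )
    return return_status if recent else 0


# Defaults: a switch 3 minutes ago is still held, one 15 minutes ago has expired,
# and an instance with no recorded switch is OK.
print(evaluate_status(1, 3), evaluate_status(1, 15), evaluate_status(0, None))
```

Raising return_status to 2 would report the same recent switch as ALARM instead of WARNING.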