Suspect pages
Job details
Name: |
Suspect pages |
Platform: |
Sqlserver |
Category: |
Maintenance |
Premium package: |
Maintenance package |
Description: |
Monitors suspect pages statistics in suspect_pages table. |
Long description: |
This alert monitors suspect pages by parsing statistics from suspect_pages table. |
Version: |
1.2 |
Default schedule: |
0 * * * |
Requires engine install: |
Yes |
Compatibility tag: |
.[type=‘instance’ & databasetype=‘sqlserver’]/instance[maj_version > ‘2000′ & hasengine=‘YES’]/.[eng_inst_priv = 0 & (engine_edition = ‘Microsoft SQL Server’ |
engine_edition = ‘Azure SQL Managed Instance’)] |
Parameters
Name |
Default value |
Description |
return status when error 823/824 |
1 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when 823 error caused by an operating system CRC error or 824 error other than a bad checksum or a torn page (for example, a bad page ID). |
return status when bad checksum |
1 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when bad checksum. |
return status when torn page |
1 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when torn page. |
return status when restored |
0 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when restored (the page was restored after it was marked bad) |
return status when repaired |
0 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when repaired (DBCC repaired the page) |
return status when deallocated |
0 |
Return status value (ALARM – 2, WARNING – 1, or OK – 0) when deallocated by DBCC |
history threshold |
60 |
Ignores records in msdb..dbo.suspect_pages which are older than this parameter value (in days). |
Monitoring Job Summary for dbWatch Control Center
Purpose of the Job
- The key purpose of this monitoring job is to track the status of suspect pages in the msdb..suspect_pages table of SQL Server databases.
- It is designed to maintain system integrity by monitoring pages that have failed due to errors like bad checksums, torn pages, and other related issues.
Why This Job is Important
- This task is crucial for ensuring the health and recoverability of the SQL Server databases.
- It helps in deciding if a restore or repair operation is necessary by monitoring specific pages that have been flagged as suspect.
- Avoiding or delaying the detection of these errors could lead to severe data loss or corruption, potentially disrupting business operations.
Manual Checking
- To manually check suspect pages in your SQL Server database, you can use the following SQL command:
SELECT * FROM msdb..suspect_pages;
Core Functionality
- The job automates the monitoring process by checking the suspect_pages table for any pages marked with errors and checks them against defined criteria, such as severity of errors and error counts.
- It processes different types of errors including bad checksums, torn pages, and errors related to page restoration or repair.
- Depending on the results, different statuses (ALARM – 2, WARNING – 1, or OK – 0) are returned based on the severity of the issue detected.
Implementation Details
- The monitoring job uses a stored procedure “dbw_suspect_pages_proc” that queries the suspect_pages table and analyses entries based on pre-defined thresholds.
- The procedure dynamically adjusts its responses based on the type and severity of the errors encountered, ensuring accurate monitoring.
Reporting
- The job generates reports listing suspect pages per database, detailing the type of error, error count, and date information.
- Reports differentiate between fresh instances of suspect pages and ones that have been resolved (restored, repaired, or deallocated).
Scheduling
- The job is scheduled to run periodically (default schedule: every hour) to ensure timely detection and resolution of potential problems.
Upgrading and Dependencies
- Contains logic for upgrading from previous versions, verifying and adjusting parameters as needed.
- Needs access to the msdb.dbo.suspect_pages table and relies on internal stored procedure adjustments reflecting newer SQL Server editions’ capabilities.
This job represents a critical part of database maintenance strategy, automating the surveillance of potential points of failure and ensuring timely intervention when database pages become suspect due to various errors.