Thursday, January 15, 2009

Exchange Check Log Gen Checkpoint Depth

When a full backup occurs, Commvault uses the Exchange APIs to inform the selected Exchange 2003 Mailbox server that a full backup is in progress. Exchange takes note of this instruction and begins queuing log files instead of committing transactions to the databases. Although backups are done at the storage group level, Commvault actually backs up only one store within the storage group at any given time. During this backup process, the logs are generated but not committed to the databases. This is where the Log Gen Checkpoint Depth is measured. It is the performance counter that reports the number of uncommitted transaction log files, and only the uncommitted ones.

If a backup is not occurring, the checkpoint depth should remain around 2 or slightly higher, but no more than that. If you’re consistently seeing reports of high checkpoint depths, there may be other issues at hand.

If the transaction log file depth reaches the hardcoded limit of 1000, ESE (Extensible Storage Engine) dismounts all the stores in the respective storage group as a safety measure.

Since Commvault backs up one database at a time, Exchange is able to commit any uncommitted transaction log files to all the databases before Commvault begins backing up the next database. This is the point where the Log Generation Checkpoint Depth drops, which also indicates that a store backup is complete.
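To make the "drop" idea concrete, here is a rough Python sketch that scans a series of sampled depth values and reports where the depth fell sharply. The function name, the sample list, and the `min_drop` threshold of 50 are all my own assumptions for illustration. Nothing here comes from Commvault or Exchange, so tune the threshold to what you actually see.

```python
def find_depth_drops(samples, min_drop=50):
    """Return the indexes where the depth fell by at least min_drop
    compared with the previous sample (hypothetical helper)."""
    drops = []
    for i in range(1, len(samples)):
        if samples[i - 1] - samples[i] >= min_drop:
            drops.append(i)
    return drops


if __name__ == "__main__":
    # Made-up samples: depth climbs during each store backup, then falls.
    print(find_depth_drops([2, 40, 120, 3, 60, 150, 4]))
```

Each reported index would be a candidate for "one store finished backing up", assuming the samples were taken often enough to catch the fall.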

You can monitor this value constantly and simply execute a script that kills all backup jobs in progress on the client server when a critical depth (801) has been reached.
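A minimal sketch of that check in Python might look like the following. It covers only the decision logic: `read_depths` and `kill_backup_jobs` are hypothetical callables that you would have to supply yourself (one that returns the current depth per storage group, one that stops the backup jobs). I'm not showing any real Commvault or Exchange command here.

```python
CRITICAL_DEPTH = 801  # critical depth used in this post


def storage_groups_at_risk(depths, critical=CRITICAL_DEPTH):
    """Return the names of storage groups whose depth has reached `critical`."""
    return sorted(name for name, depth in depths.items() if depth >= critical)


def check_once(read_depths, kill_backup_jobs, critical=CRITICAL_DEPTH):
    """Read the depths once and call kill_backup_jobs if any group is at risk.

    read_depths:      hypothetical callable returning {storage group: depth}
    kill_backup_jobs: hypothetical callable taking the list of at-risk groups
    """
    at_risk = storage_groups_at_risk(read_depths(), critical)
    if at_risk:
        kill_backup_jobs(at_risk)
    return at_risk


if __name__ == "__main__":
    # Made-up readings, with a list standing in for the real kill action.
    sample = {"First Storage Group": 3, "Second Storage Group": 812}
    killed = []
    print(check_once(lambda: sample, killed.append))
```

You would run `check_once` on a schedule. Keeping the reader and the kill action as parameters means the threshold logic can be tested without touching a live server.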

If you’re interested in determining the current Log Gen Checkpoint Depth for a server, you can use Perfmon to find the answer.

Conveniently, Microsoft provides you with many Perfmon counters for Exchange 2003. You can start Perfmon by doing the following:

Start -> Run -> Perfmon -> OK

To add more counters, click the + (plus) button in the window and select the appropriate Object and Counter.

For the Log Gen Checkpoint Depth, find the “Database --> Instances” performance object and select “Log Generation Checkpoint Depth” from the counter list below. On the right, you will be shown all the available instances for that particular counter. In this case, you’re looking for the names of your Storage Groups.

Click “Add” and watch your new counters.

Change the graph type around to what suits you best. Don’t forget to take the “Scale” value into consideration.

For example: if the checkpoint depth for a storage group is currently 12 and the scale is 10.000, your graph will display “120”. If your graph’s vertical scale is set to 100, the graph can be very misleading. Hence, it’s more important to focus on the actual values being displayed. Those are Last, Average, Minimum, Maximum, and Duration.
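The arithmetic behind that example is just a multiplication, sketched below in Python. The function names are made up for illustration, and I'm assuming the graph's vertical axis starts at 0.

```python
def plotted_value(raw_value, scale):
    """Value the graph draws: the real counter value times the scale."""
    return raw_value * scale


def fits_on_graph(raw_value, scale, vertical_max=100):
    """True if the plotted value stays within the vertical axis (0..vertical_max)."""
    return 0 <= plotted_value(raw_value, scale) <= vertical_max


if __name__ == "__main__":
    # The example from this post: depth 12, scale 10.000, vertical scale 100.
    print(plotted_value(12, 10.0))
    print(fits_on_graph(12, 10.0, vertical_max=100))
```

With a scale of 1.000 the same depth of 12 plots as 12 and fits comfortably, which is why the scale is worth checking before reading anything off the graph.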

1 comment:

JB said...

We're running into a situation where Commvault is unable to complete its backup before the number of uncommitted log files reaches 1000, at which point, of course, the store shuts down. I don't administer the backups, but I can only assume that the backup job is ruined as well when that happens.