Tips for improved monitoring of your DNG environment

Let’s assume that you are convinced that application monitoring of your IBM CE/CLM environment is a best practice. There are many JMX MBeans defined in the MBeans Reference List that your enterprise monitoring application can collect and manage. If you are just getting started, focus on the set of MBeans described in CLM Monitoring Primer. Once you’ve implemented the base set, you can expand from there.

Proactively monitor those MBeans by setting recommended thresholds with corresponding alert notifications to prompt further investigation. Thresholds may need to be adjusted over time based on experience. Monitor normal operations to establish appropriate baselines and adjust thresholds accordingly to reduce false alerts. Note that some monitoring tools can use machine learning and statistical analysis to adapt thresholds.
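
If your monitoring tool does not adapt thresholds for you, a baseline-derived threshold can be approximated by hand. The following is a minimal sketch in Python, assuming you have already exported a list of numeric samples for one metric taken during normal operations. The function names and the "mean plus k standard deviations" rule are my own illustration, not something prescribed by the primer.

```python
from statistics import mean, stdev


def suggest_threshold(baseline_samples, k=3.0):
    """Suggest an alert threshold as mean + k * sample standard deviation.

    baseline_samples: values of one metric observed during normal operations.
    k: how many standard deviations above the mean to place the threshold
       (an assumption; tune it to your own tolerance for false alerts).
    """
    if len(baseline_samples) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(baseline_samples) + k * stdev(baseline_samples)


def breaches(samples, threshold):
    """Return the samples that exceed the threshold, in their original order."""
    return [s for s in samples if s > threshold]
```

Re-running suggest_threshold on a fresh window of normal-operation data from time to time is one simple way to keep the threshold in step with how the system actually behaves.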

Based on some of my recent customer experiences, for DNG in particular, there are two key items I recommend you monitor beyond what’s called out in the primer or even currently available via an MBean. These will help you optimize the performance of your deployment.

JFS indexBacklog
Jena index updates occur when a write is being processed by DNG. The status of the index can be monitored through an indexing page (https://<server>:<port>/rm/indexing). A backlog indicates there are updates yet to be passed on to the Jena indexer for processing (e.g. after a large import). Ideally the backlog of the indexer should be low on a well-performing system. When it is high, system performance may suffer temporarily until the indexer catches up. Symptoms of heavy indexing can be slow performance, or users not seeing data immediately after creation. See technote 1662167.

Even better, for those clients using an enterprise application monitoring tool and gathering our published MBeans, there is one that tracks the index backlog. The JFS Index Information MBean is available as of 6.0.5 and collected by the IndexDataCollectorTask. It can be used to gather not only the size of the index but also the backlog of items waiting to be indexed. By default, data is collected every 60 minutes. Alerts can be set so that if the backlog gets high, e.g. over 1000, admins may choose to warn users and slow down system activity so the indexer can catch up.
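
To make the alerting idea concrete, here is a minimal sketch in Python of the decision an alert rule might encode. It assumes your monitoring tool has already collected the backlog values into a list ordered oldest to newest; how you read the MBean is tool-specific and not shown. The function name, the status labels, and the "draining" state are hypothetical. Only the example threshold of 1000 comes from the text above.

```python
def backlog_status(readings, threshold=1000):
    """Classify the index backlog from collected readings (oldest first).

    Returns one of:
      "unknown"  - no readings collected yet
      "ok"       - latest reading is at or below the threshold
      "draining" - latest reading is above the threshold but lower than
                   the previous one, so the indexer appears to be catching up
      "alert"    - latest reading is above the threshold and not shrinking
    """
    if not readings:
        return "unknown"
    latest = readings[-1]
    if latest <= threshold:
        return "ok"
    if len(readings) >= 2 and latest < readings[-2]:
        return "draining"
    return "alert"
```

With the default hourly collection, two consecutive readings are an hour apart, so a "draining" result only says the backlog was lower than it was an hour earlier.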

DNG write journal file
Once the DNG indexer completes indexing of changes to artifacts, if there are no read operations in progress, the update is committed to the main index; otherwise, it is written to a journal file. Once the last active read operation completes, changes in the journal file are written to the main index. This approach allows in-progress read queries to maintain a consistent view of the index, while allowing new read queries to see the latest data.
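
The commit rule described above can be illustrated with a toy model. This Python sketch is purely my own simplification to show why the journal grows while reads are active and empties when the last one finishes. It is not how DNG or Jena is implemented, the class and method names are hypothetical, and it does not model new readers seeing journaled data.

```python
class ToyIndex:
    """Toy model of 'commit directly if idle, otherwise journal'."""

    def __init__(self):
        self.main = []         # updates committed to the main index
        self.journal = []      # updates waiting for active reads to finish
        self.active_reads = 0  # number of read operations in progress

    def begin_read(self):
        self.active_reads += 1

    def end_read(self):
        if self.active_reads == 0:
            raise RuntimeError("no read operation in progress")
        self.active_reads -= 1
        # When the last active read completes, flush the journal.
        if self.active_reads == 0:
            self.main.extend(self.journal)
            self.journal.clear()

    def write(self, update):
        # No readers: commit straight to the main index.
        if self.active_reads == 0:
            self.main.append(update)
        else:
            self.journal.append(update)
```

In this model, if reads overlap continuously so that active_reads never drops to zero, the journal is never flushed and keeps growing.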

The DNG write journal file (journal.jrnl) is located in server/conf/rm/indices/<id>/jfs-rdfindex. The size of the journal file can be monitored through standard OS commands and scripts or through OS integrations typically available with enterprise monitoring applications. This file will grow over time but should regularly go back to zero. In the unlikely event that it does not, it’s a sign of a bottleneck where read activity is blocking write activity. System performance may be impacted at this point. When this happens, it’s best for DNG users to pause their DNG activity while the system catches up. One customer does this by notifying their users, removing DNG from the Reverse Proxy configuration (commenting out its entry), monitoring the journal file size until it returns to zero, then adding DNG back into the proxy configuration and informing the users.
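
As one way to script the journal check, here is a minimal sketch in Python that uses only the standard library. The path you pass in is the journal.jrnl location given above, with <id> filled in for your own server. The function names and the rule "non-zero for N consecutive samples means stuck" are assumptions of mine, so pick a sample count and polling interval that match your own baseline.

```python
import os


def journal_size(path):
    """Return the journal file size in bytes, or None if the file is missing."""
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        return None


def journal_is_stuck(sizes, max_nonzero_samples=5):
    """Return True if the last `max_nonzero_samples` sizes were all non-zero.

    sizes: journal sizes in bytes, oldest first, one per polling interval.
    A True result suggests the journal has not returned to zero recently.
    """
    recent = sizes[-max_nonzero_samples:]
    return len(recent) == max_nonzero_samples and all(s > 0 for s in recent)
```

A scheduled job could call journal_size at a fixed interval, append each result to a list (skipping None), and raise an alert when journal_is_stuck returns True.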

As a teaser to future content, check out 124663: As an administrator, I should be able to monitor my server using MBeans, which will provide new DNG application MBeans to further aid administrators in proactively monitoring and managing a DNG system.
