Recommended practices when integrating Jenkins builds and RTC SCM

Many of our customers use Jenkins for their software builds, integrated with Rational Team Concert (RTC) software configuration management (SCM). This is a powerful combination, but one that should be managed carefully so as not to put undue load on the RTC server. Based on several customer experiences, my colleagues and I from development and services have put together an article that captures our guidance and recommendations for using RTC as an SCM system in Jenkins jobs. Check it out here: Using Rational Team Concert for source code management in Jenkins jobs.

Supported options for Reverse Proxies and Load Balancers in an IBM Engineering Lifecycle Management solution deployment

As a deployment architect when advising customers on their topology, I’m often asked about the supported options for reverse proxies and load balancing. Further, some customers ask about using DNS aliasing over a Reverse Proxy, which is supported but not a best practice.

While some of these details can be found in the Knowledge Center and deployment wiki, we’ve not had a central article listing the options with any comparisons. My colleagues in IBM Support have recently published an article with these details here: Reverse Proxies and Load Balancers in CLM Deployment.

Type System Manager Part 2

Ralph Schoon has done some very useful work providing some automation around our best practice guidance for DNG type system management. Check out his blog post to learn more.

rsjazz

We finally published Maintaining the Rational DOORS Next Generation type system in a configuration-management-enabled environment. Part 3: Automation tool deep dive on Jazz.net. This was a major effort and took a long time to do. This article provides a closer look at the source code, what it does and how it does it. It also provides some insight into how OSLC4J works and can be used. The information in the article, especially for setup and deployment of the automation prototype, is very reusable for other scenarios, and I hope to be able to reuse it in later articles and blog posts.

Type System Manager

When this effort was planned and performed last year, we had no idea what would come out of this effort. When we finished the first iterations and I started to write Maintaining the Rational DOORS Next Generation type system in a configuration-management-enabled environment. Part 3: Automation tool…

How to register your custom utilities as a resource-intensive scenario

In Resource-intensive scenarios that can degrade CLM application performance I describe how certain IBM Collaborative Lifecycle Management (CLM) application scenarios can be resource-intensive and are known to degrade system performance at times. As I’ve interacted with customers on their deployments and performance concerns, it has become apparent that they are getting more and more creative in building custom automation scripts and utilities using our APIs. At times, these custom utilities have generated significant load on the system.

As a best practice, we now recommend that customers evaluate their custom utilities and determine if any are candidates to be resource-intensive. For those that are, they should be modified and registered as resource-intensive with appropriate start and stop scenario markers included in the code. Until recently, all we could provide to help do this was some code snippets.

Thanks to my colleagues Ralph Schoon, Dinesh Kumar and Shubjit Naik, we now have documented guidance and sample code to help you do this. Have a look at Register Custom Scripts As a Resource Intensive Scenario. Ralph also gives some additional detail behind the motivation for the custom scenario registration in his blog post.

Once your utilities are registered, you will be able to track their occurrence in the appropriate application log. If you’ve implemented enterprise application monitoring, you can also track them through the available JMX MBeans as described in CLM Monitoring.

Detecting mixed use of RTC SCM clients

The IBM Continuous Engineering (CE) solution supports the notion of N-1 client compatibility. This means a v5.0.x client such as Rational Team Concert (RTC) for Eclipse IDE can connect to a v6.0.x RTC server. Customer deployments may have hundreds if not thousands of users with various clients. It is not typically feasible to have them all upgraded at the same time as a server upgrade. In some cases, upgrading the client requires a corresponding upgrade in development tooling, e.g. the version of Eclipse, and that is not yet possible for other reasons.

In an environment where there is a mix of RTC clients (inclusive of build clients), a data format conversion on the server is required to ensure that the data from the older clients is made compatible with the format of the newer version managed by the RTC server. Reducing these conversions means less work for the server, especially when multiplied by hundreds or thousands of clients. Additionally, new capabilities from more recent versions often can’t be fully realized until older clients are upgraded.

There are two ways to determine how much N-1 activity is ongoing in your environment.

ICounterContentService

Enable the RepoDebug service and go to https://<host:port/context>/service/com.ibm.team.repository.service.internal.counters.ICounterContentService. From there, scroll down to the Compatible client logins table.

The sample table above was taken from a 6.0.3 RTC server. It shows that out of 75685 client logins since the server was last started, only 782 of them were from the most recent client. 70% are from 5.x clients. Ideally, these would be upgraded to the latest client.

Compatible client logins MBean

If you are using an enterprise monitoring application integrated with our JMX MBeans, then starting in 6.0.5 the Client Compatible Logins MBean can be used to get details similar to those shown in the table from ICounterContentService. The advantage of the MBean is that the login values can be tracked over time, so you can assess whether efforts to reduce use of old clients are succeeding.

Detecting which users (or builds) are using old clients

In order to determine which users, build servers or other integrations are using older clients, you could parse the access log for your reverse proxy. In particular, find all occurrences of versionCompatibility?clientVersion in the log.

As you can see from the sample access log, there are several instances of calls with a 5.0.2 client. The log entries have the IP address of the end client sourcing the call. If this can be associated to a particular user (or build user), you can then have someone to talk to regarding upgrading their client version.
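As a rough sketch of the log parsing described above, the following Python tallies versionCompatibility calls by source IP address and client version. It assumes an Apache/IHS-style access log where the client IP is the first field and the version is passed as a clientVersion=<value> query parameter. The sample lines, IP addresses and the /ccm context root are made up for illustration, so adjust the pattern to your own log format.

```python
import re
from collections import Counter

# Assumption: the client IP is the first field of each line and the version
# appears as "versionCompatibility?clientVersion=<value>" in the request URL.
LINE_RE = re.compile(r'^(\S+) .*versionCompatibility\?clientVersion=([\w.\-]+)')


def count_client_versions(lines):
    """Return a Counter keyed by (ip_address, client_version)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample lines, not taken from a real server.
sample_log = [
    '10.0.0.5 - - [12/Mar/2018:10:15:32 -0500] "GET /ccm/versionCompatibility?clientVersion=5.0.2 HTTP/1.1" 200 120',
    '10.0.0.5 - - [12/Mar/2018:10:16:01 -0500] "GET /ccm/versionCompatibility?clientVersion=5.0.2 HTTP/1.1" 200 120',
    '10.0.0.9 - - [12/Mar/2018:10:16:40 -0500] "GET /ccm/versionCompatibility?clientVersion=6.0.3 HTTP/1.1" 200 120',
    '10.0.0.9 - - [12/Mar/2018:10:17:02 -0500] "GET /ccm/web HTTP/1.1" 200 2048',
]

version_counts = count_client_versions(sample_log)
for (ip, version), hits in sorted(version_counts.items()):
    print(f"{ip} {version} {hits}")
```

Sorting the result by version puts the oldest clients first, which gives a short list of IP addresses to follow up on.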

Detecting RTC SCM access not using Content Caching Proxy

One of our best practices for improved Rational Team Concert (RTC) Software Configuration Management (SCM) response times and reduced load on the RTC server is to use a content caching proxy server. These are located near users at servers in remote locations where the WAN performance is poor (high latency to the RTC server). What is often missed is that we also recommend they be placed near build servers, especially those with significant continuous integration volume, to speed up the repeated loading of source content for building.

This practice is not enforceable. That is, SCM and build client configurations must be manually set up to use the caching proxy. This is particularly troublesome for large remote user populations or where large numbers of build servers exist, especially when not centrally managed.

The question that naturally comes is how can one detect that a caching proxy is not being used when it should? One way is to look at active services and find service calls beginning with com.ibm.team.scm and com.ibm.team.filesystem for RTC SCM operations or com.ibm.team.build and com.ibm.rational.hudson for RTC build operations.

Since the IP addresses of the available caching proxies are static and known, you can look on the Active Services page for entries whose Service Name is an SCM or build related service call and whose IP address (Scenario Id) does not belong to a caching proxy. Since the active services entry captures the requesting user ID (Requested By), you can then check with the offending user to understand why the proxy wasn’t used and encourage them to correct their usage.

Active services detail is also available via the Active Services JMX MBean. If an enterprise monitoring application is being used and integrated with our JMX MBeans, then it can be configured to capture this detail, parse it and generate appropriate alerts or lists to identify when a proxy is not being used.

One other option is to parse the access log for your reverse proxy.  Shown below is sample output from an IBM HTTP Server (IHS) access log.

The access log does not have user ID information, but it does have the service calls and the IP address they are coming from. You would need a way to associate an IP address with a user machine (for those entries not coming from a caching proxy). Note that if a load balancer is used, the IP address recorded in the access log may not be the true IP address that originated the request. For this reason, and since the user ID information is not directly available, the Active Services method may be better.
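The access log approach can be sketched in a few lines of Python: filter for SCM and build service calls, then drop those whose source IP belongs to a known caching proxy. This is only an illustration under assumptions. It expects an Apache/IHS common log format, and the proxy IP address, the sample lines and the service URLs in them are hypothetical.

```python
import re

# Service name prefixes for RTC SCM and build operations.
SERVICE_PREFIXES = (
    "com.ibm.team.scm",
    "com.ibm.team.filesystem",
    "com.ibm.team.build",
    "com.ibm.rational.hudson",
)

# Assumption: common log format, i.e. ip ident user [timestamp] "METHOD URL ...".
REQUEST_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')


def non_proxy_requests(lines, proxy_ips):
    """Return (ip, url) pairs for SCM/build calls that bypassed the proxies."""
    hits = []
    for line in lines:
        match = REQUEST_RE.match(line)
        if not match:
            continue
        ip, _method, url = match.groups()
        if ip in proxy_ips:
            continue  # request came through a caching proxy, as intended
        if any(prefix in url for prefix in SERVICE_PREFIXES):
            hits.append((ip, url))
    return hits


# Hypothetical proxy address and log lines, for illustration only.
proxy_ips = {"10.0.1.20"}
sample_log = [
    '10.0.1.20 - - [12/Mar/2018:10:15:32 -0500] "POST /ccm/service/com.ibm.team.scm.common.IScmService HTTP/1.1" 200 512',
    '10.0.0.5 - - [12/Mar/2018:10:15:40 -0500] "POST /ccm/service/com.ibm.team.scm.common.IScmService HTTP/1.1" 200 512',
    '10.0.0.7 - - [12/Mar/2018:10:15:55 -0500] "GET /ccm/web HTTP/1.1" 200 2048',
]

bypassing = non_proxy_requests(sample_log, proxy_ips)
for ip, url in bypassing:
    print(ip, url)
```

The caveat about load balancers still applies: if the recorded address is that of the load balancer, the filter cannot tell proxy traffic from direct traffic.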

Tips for improved monitoring of your DNG environment

Let’s assume that you are convinced that application monitoring of your IBM CE/CLM environment is a good best practice. There are many JMX MBeans defined in the MBeans Reference List that your enterprise monitoring application can collect and manage. If just getting started, focus on the set of MBeans described in CLM Monitoring Primer. Once you’ve implemented the base set, you can expand from there.

Proactively monitor those MBeans by setting recommended thresholds with corresponding alert notifications to prompt further investigation. Thresholds may need to be adjusted over time based on experience. Monitor normal operations to establish appropriate baselines and adjust thresholds accordingly to reduce false alerts. Note that some monitoring tools can use machine learning and statistical analysis to adapt thresholds.

Based on some of my recent customer experiences, for DNG in particular, there are two key items I recommend you monitor beyond what’s called out in the primer or even currently available via an MBean. These will help you optimize the performance of your deployment.

JFS indexBacklog
Jena index updates occur when a write is being processed by DNG. The status of the index can be monitored through an indexing page (https://<server:port>/rm/indexing). A backlog indicates there are updates yet to be passed on to the Jena indexer for processing (e.g. after a large import). Ideally the backlog of the indexer should be low on a well performing system. When high, system performance may suffer temporarily until the indexer catches up. Symptoms of heavy indexing can be slow performance, or users not seeing data immediately after creation. See technote 1662167.

Even better, for those clients using an enterprise application monitoring tool and gathering our published MBeans, there is one that tracks the index backlog. The JFS Index Information MBean is available as of 6.0.5 and is collected by the IndexDataCollectorTask. It can be used to gather not only the size of the index but also the backlog of items waiting to be indexed. By default, data is collected every 60 minutes. Alerts can be set so that if the backlog gets high, e.g. over 1000, admins may choose to warn users and slow down system activity so the indexer can catch up.

DNG write journal file
Once the DNG indexer completes indexing of changes to artifacts, the update is committed to the main index if there are no read operations in progress; otherwise, it is written to a journal file. Once the last active read operation completes, changes in the journal file are written to the main index. This approach allows in-progress read queries to maintain a consistent view of the index, while allowing new read queries to see the latest data.

The DNG write journal file (journal.jrnl) is located in server/conf/rm/indices/<id>/jfs-rdfindex. The size of the journal file can be monitored through standard OS commands and scripts or through OS integrations typically available with enterprise monitoring applications. This file will grow over time but should regularly go back to zero. In the unlikely event that it does not, it’s a sign of a bottleneck where read activity is blocking write activity. System performance may be impacted at this point. When this happens, it’s best for DNG users to pause their DNG activity while the system catches up. One customer does this by notifying their users, removing DNG from the Reverse Proxy configuration (commenting out its entry), monitoring the journal file size until it returns to zero, then adding DNG back into the proxy configuration and informing the users.
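For those without an OS integration in their monitoring tool, the journal check can be scripted. The sketch below is a minimal illustration, not a supported utility: it reads the file size and flags the case where the last few polled sizes never returned to zero. The function names, the number of samples and the polling approach are all assumptions, and the path you pass in must include the <id> directory from your own installation.

```python
import os


def journal_size_bytes(path):
    """Return the size of the journal file in bytes, or None if it is missing."""
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        return None


def journal_stuck(sizes, min_samples=3):
    """Return True if the last min_samples polled sizes were all non-zero.

    A journal that keeps growing without returning to zero suggests that
    read activity is blocking writes to the main index.
    """
    recent = sizes[-min_samples:]
    return len(recent) == min_samples and all(size > 0 for size in recent)


# Hypothetical sizes collected by polling journal_size_bytes() at intervals.
healthy_history = [0, 4096, 0, 8192, 0]
stuck_history = [0, 4096, 8192, 16384]

print(journal_stuck(healthy_history))
print(journal_stuck(stuck_history))
```

A scheduler such as cron could call journal_size_bytes at a fixed interval, append the result to a history list or file, and raise an alert when journal_stuck returns True.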

As a teaser to future content, check out 124663: As an administrator, I should be able to monitor my server using MBeans, which will provide new DNG application MBeans to further aid administrators in proactively monitoring and managing a DNG system.