Follow-up to Help! My RTC Database is getting big!

Almost five years ago I posted Help! My RTC Database is getting big! There have been a number of improvements since then so I think an update is appropriate.

First, to keep the offering managers and marketing team happy, I should point out that Rational Team Concert (RTC) is no longer the name but is now Engineering Workflow Management (EWM). See Renaming the IBM Continuous Engineering Portfolio for more information.

It is still true that build results, work item attachments and versioned conent are the largest contributors to EWM database size. Previously I referenced out of the box reports useful for determining which EWM namespaces occupied the most space. Since then I have documented web services and repotools reports, and Item Count Details JMX MBean that can be used to get the same information. See How many artifacts do I have in my Jazz application repository?.

Once you know which artifacts are taking up the most space, you then need to know what can and cannot be deleted. I point out some of this in the previous blog but we now have an article on the deployment wiki that goes into more detail. See Deleting data in Rational Team Concert.

A new technique for reducing the size of your EWM database is using an external content repository such as Artifactory for managing large versioned files. See Rational Team Concert: External content repositories. The article describes how to configure EWM to use an external content repository, move content in and out of it as well as serveral new JMX MBeans to monitor the size of external content repositories, size of EWM SCM components and size of the largest files.

Techniques for publishing build results outside of the EWM repository have been around for a long time. See one example in my colleague Ralph Schoon’s blog post Build Artifacts Publishing and Automated Build Output Management Using the Plain Java Client Libraries.

Finally, just released in 6.0.6.1 is a capability to move an SCM component from one EWM repository to another. This allows you to reparitition your SCM content across existing or new EWM repositories. See Moving an Engineering Workflow Management (EWM) SCM component to a new EWM SCM server.

I think the improvements since the original posting make it easier to monitor the growth of your repositories, understand what is causing it and provides better strategies for mitigating growth. As always your feedback is appreciated. If you have other techniques you find useful, please pass them on.

Recommended practices when integrating Jenkins builds and RTC SCM

Many of our customers are using Jenkins for their software builds integrated with Rational Team Concert (RTC) software configuration management (SCM). This is a powerful combination but one that should be managed carefully so as not to put undue load on the RTC server. Based on several customer experiences my colleagues and I from development and services have put together an article that captures our guidance and recommendations when using RTC as an SCM system in Jenkins jobs. Check it out here: Using Rational Team Concert for source code management in Jenkins jobs.

Detecting mixed use of RTC SCM clients

The IBM Continuous Engineering (CE) solution supports the notion of N-1 client compatibility. This means a v5.0.x client such as Rational Team Concert (RTC) for Eclipse IDE can connect to a v6.0.x RTC server. Customer deployments may have 100s if not 1000s of user deployments with various clients. It is not typically feasible to have them all upgraded at the same time as when a server upgrade occurs. In some cases, to upgrade the client requires a corresponding upgrade in development tooling, e.g. version of Eclipse, but doing so is not possible yet for other reasons. 

In an environment when there is a mix of RTC clients (inclusive of build clients), a data format conversion on the server is required to ensure that the data from the older clients is made compatible with the format of the newer version managed by the RTC server. Reducing these conversions is one less thing for the server to do, especially when multiplied by 100s or 1000s of clients.  Additionally, new capabilities from more recent versions often can’t be fully realized until older clients are upgraded.

There are two ways to determine the extent of how much N-1 activity is ongoing in your environment.

ICounterContentService

Enable the RepoDebug service and go to https://<host:port/context>/service/com.ibm.team.repository.service.internal.counters.ICounterContentService. From there, scroll down to the Compatible client logins table.

The sample table above was taken from a 6.0.3 RTC server. It shows that out of 75685 client logins since the server was last started, only 782 of them were from the most recent client. 70% are from 5.x clients. Ideally, these would be upgraded to the latest client.

Compatible client logins MBean

If using an enterprise monitoring application integrated with our JMX MBeans, starting in 6.0.5, the Client Compatible Logins MBean can be used to get similar details as shown in the table from ICounterContentService. The advantage of the MBean is that the login values can be tracked over time so you can assess whether efforts to reduce use of old clients are succeeding

Detecting which users (or builds) are using old clients

In order to determine which users, build servers or other integrations are using older clients, you could parse the access log for your reverse proxy. In particular, find all occurrences of versionCompatibility?clientVersion in the log.

As you can see from the sample access log, there are several instances of calls with a 5.0.2 client. The log entries have the IP address of the end client sourcing the call. If this can be associated to a particular user (or build user), you can then have someone to talk to regarding upgrading their client version.

Detecting RTC SCM access not using Content Caching Proxy

One of our best practices for improved Rational Team Concert (RTC) Software Configuration Management (SCM) response times and reduced load on the RTC server is to use a content caching proxy server. These are located near users at servers in remote locations where the WAN performance is poor (high latency to RTC server). What is often missed, is that we also recommend they be placed near build servers, especially with significant continuous integration volume, to improve the repeated loading of source content for building.

This practice is not enforceable. That is, SCM and build client configurations must be manually setup to use the caching proxy. This is particularly troublesome for large remote user populations or where large numbers of build servers exist, especially when not centrally managed.

The question that naturally comes is how can one detect that a caching proxy is not being used when it should? One way is to look at active services and find service calls beginning with com.ibm.team.scm and com.ibm.team.filesystem for RTC SCM operations or com.ibm.team.build and com.ibm.rational.hudson for RTC build operations.

Since the IP address of the available caching proxies are static and known, you can find any entries on the Active Services page with a Service Name of any SCM or build related service calls that are not coming from an IP address (Scenario Id) belonging to a caching proxy. Since the active services entry captures the requesting user ID (Requested By), you can then check with the offending user to understand why the proxy wasn’t used and encourage them to correct their usage.

Active services detail is also available via the Active Services JMX MBean. If an an enterprise monitoring application is being used and integrated with our JMX MBeans, then it can be configured to capture this detail, parse it and generate appropriate alerts or lists to identify when a proxy is not being used.

One other option is to parse the access log for your reverse proxy.  Shown below is sample output from an IBM HTTP Server (IHS) access log.

The access log does not have user ID information but it does have the service calls and the IP address they are coming from.  You would need to have a way to determine associate an IP address with a user machine (for those entries not coming from a caching proxy).  Note that if a load balancer is used, the IP address recorded in the access log may not be the true IP address that originated the request.  For this reason and since the user ID information is not directly available, the Active Services method may be better.

Help! My RTC Database is getting big!

Many customers who have been using RTC for some time are seeing their database size grow, in some cases, to 100s of GBs if not TBs of data.  While growth is to be expected and is of no issue to modern DBMSes, proactive administrators want to understand the growth and how it can be mitigated especially in light of anticipated user growth in their environments.  Naturally, they come to us and ask what can be done.  While at this time we don’t have the solutions many customers are asking for (project copy, move, archive, delete, etc.), that isn’t to say we don’t have approaches that may be of value in some situations.

Working with our own self-hosted RTC environment as well as those of our enterprise customers, we generally find that the largest contributors to database size our build results, work item attachments and versioned content.  How would you know that?  Fortunately, we have a couple of out of the box reports you can run: Latest Repository Metrics and Latest Repository Metrics by Namespace.  Below are some samples showing a subset of the available namespaces and item types.

image

image

Looking at all the namespaces and item types begs the question…what do they all mean?  Yeah, they aren’t all obvious to me either.  Luckily, I have access  to the smart developers who wrote this stuff and can tell me.  If you find one you don’t know, send me a note/comment or post it on the jazz.net forums.  image

Once you find the larger contributors to size, the next questions asked are can they be deleted and who (that is, which project) is producing them.  In keeping with my team’s No Nonsense Tech Talk theme, I’ll be honest, there’s not much we can delete/archive and we certainly can’t do it at a project level, which would be of greater value, nor can we easily tell who produced it all.  It’s not all doom and gloom because there are some things we can do.

As mentioned earlier, we can delete build results, which often are a huge contributor to size growth.  We can delete work items, even attachments from work items.  Versioned content can be deleted, though you don’t usually want to do that, except for security reasons or to remove binaries versioned by mistake.  Then there are plans, streams, workspaces, etc, that can be deleted, but these don’t tend to take up much space.

So what happens when something is deleted? Well, in some cases, it’s not really removed from the database, only reference to it is removed or made less accessible.  For example, work item attachments don’t really go away when removed from a work item.  Try this.  Add an attachment to a work item then save the work item.  Hover over the attachment link and copy it.  Now remove the attachment from the work item then save the work item.  In the browser, paste the copied attachment URL and it will be found.  Similarly, if you delete a work item that has attachments, the attachment links still remain valid.  However, if you delete (not remove) a work item from the Eclipse client, the work item is actually deleted.

If you find that you’ve removed but not deleted an attachment, it is possible to go back and have it truly deleted.  To do so, using the Eclipse client, paste the URL to the attachment (which should be visible in a discussion comment from when it was first attached) somewhere into the work item (into a comment or the description), right click over that link and select “add to favorites”. Once it is in the favorites, you can drag it from Favorites and drop onto the Attachments section, which re-attaches it to the work item, at which point you can then delete it in the normal way.

Now some things like build results and versioned content once deleted can truly be deleted from the database.

At the repository level there are two key background tasks that are used to mark content as deletable then later deletes it.

  • an “Item Cleanup Task” background task at repository level marks newly orphaned content blobs as needing deletion (runs every 17 mins by default)
  • a “Deleted Items Scrub Task” background task at repository level deletes any content blobs that have been orphaned for more than ~2 hours (runs every 24 hours by default)

Once these both run, any content blobs that were deleted more than 2 hours ago should be fully deleted from the database.

However, DBMSes (particularly those in production) don’t generally release storage allocated for their tables immediately.  A compaction task usually needs to be run to reclaim the disk space.  The DBMS should have tools to indicate in advance how much space can be reclaimed by compaction.  Typical utilities to do this are shown below.  Details for using them should be left to a qualified DBA.

  • Oracle – ALTER TABLE … SHRINK SPACE
  • DB2 – REORG
  • SQL Server – SHRINKDATABASE

My teammate Takehiko Amano has done a very nice job of showing how deleting versioned content and later running DB2 commands reduces database size.  See his article Reducing the size of the Rational Team Concert repository database.

We find the build results often take up a good amount of the RTC database size.  These results often include all the log files from compilations, tests and other activities performed during the build.  Some times they will contain downloadable outputs, e.g. application archives and executables.  What happens is these results are often kept around and never deleted.  In some cases, especially for release milestones, they should be, but all those personal or interim continuous builds don’t need to be kept.  Build results can be deleted and will result in their database content being orphaned and subsequently deleted per the aforementioned process.  Rather than manually deleting results, consider setting up a pruning policy to automatically delete old results.  For those results you want to keep around and not be pruned, just mark them as not deletable.

In cases you know your build results are taking up a lot of space, the natural follow on question is which builds and who owns them.  Our development team recently had cause to address that question which resulted in a very useful script written by Nick Edgar, the RTC Build Component Lead.

Nick created a Groovy script that collects into a CSV file the pruning policy settings and footprint data for each build in all projects across a repository.

SNAGHTML2fbd67b2

He further creates an interactive HTML report that parses the CSV file for display in a more visual form.  image

With this information you can find out which builds from which projects are taking up the most space and whether they have a pruning policy in place.  Armed with this information, an administrator could go to the appropriate release teams and take action.  Imagine running it weekly and posting to a dashboard or emailing it weekly to release teams.  The Groovy script to collect the data and index.html to render the report are attached to task work item 330478.

For gathering the CSV data you’ll need to 1) install Groovy, 2) install a build system toolkit that’s compatible (ideally at same version) as RTC server, 3) set environment variables (see top of Groovy script), and 4) run the script with: groovy -cp “$BUILD_TOOLKIT/*” <groovy file name> <any arguments needed by script>.

For the chart, just put the chart index.html file and CSV in the same directory and open the HTML file. Some browsers will require these to be served up by a web server to allow the HTML file to read the CSV file. For my testing, I used Python’s simple server support for this: python -m SimpleHTTPServer.

Given I am referencing code samples, I’ll keep our lawyers happy by stating that any code referenced is derived from examples on Jazz.net as well as the RTC SDK. The usage of such code is governed by this license. Please also remember, as stated in the disclaimer, that this code comes with the usual lack of promise or guarantee. Enjoy!

Being able to monitor the size and growth of your data, getting granular and actionable information about it and ultimately, being able to do something positive about that growth, is a key concern for IBM Rational and something we are exploring as part of our Platinum Initiative.  I welcome your input in this area.  Perhaps we can interact at IBM InterConnect 2015.

Automating changes to a build workspace configuration

Here’s the scenario.  A build engineer gets a request to establish a build workspace and corresponding build workspace.  He creates the build workspace, gets the desired components added to it, sets the flow targets then changes ownership of the workspace to the build user’s functional ID.  After that, he creates the build definition and associates it with the newly created workspace.  All is good and developers can begin to use the new build definition.  After a time, a request comes in to change the configuration of the build workspace, specifically the components to use.  Since the build workspace is owned by the build user, any changes to it must be made by that user.  Many organizations eschew the use of functional IDs or at least minimize who knows their credentials and are concerned about extra maintenance brought on by password expiration rules.  What to do?

This specific scenario came up recently with a customer of mine.  In particular, their build workspaces have three flow targets.  The current/default is the stream with the application source and it is scoped to include a subset of the components in the stream.  The other two streams are build script and configuration related.  At times, the release team needs to change the components included in the scope of the application source stream.  They do so currently by logging in as the build user, something they detest doing.

What they much prefer is support for team ownership of repository workspaces but that isn’t currently possible (though requested via 271760).  We instead proposed a solution that put the workspace configuration change in the build script which is already performed as the build user and owner of the workspace.

As of 4.0.1, the SCM CLI includes the capability to add/change workspace flow targets.  It was later refactored in 4.0.6 to the current verb-oriented form.

${scmExe} –non-interactive set flowtarget -r ${repositoryAddress} -u ${userId} -P ${password} ${team.scm.workspaceUUID} ${targetStreamUUID} –flow-components ${componentsToLoad}

Where:

  • scmExe – path to scm CLI executable
  • repositoryAddress – URL of CCM server
  • userId – build user ID
  • password – password of build user
  • team.scm.workspaceUUID – build repository workspace UUID
  • targetStreamUUID – stream flow target to set component scope
  • componentsToLoad – space delimited list of components

Unfortunately the SCM CLI does not understand the password file format used by the Ant tasks. You either need to give it as plain text using –P shown above or login outside the build system with the option to remain logged in (on each build machine).

A simple way to get the UUID of a workspace or stream is to open it up in the Eclipse editor and select Copy URL workspace editor menu in the view header or browse to it in the Web UI to get the URL.  The end portion of the URL is the UUID.  For example, the UUID for the repository workspace URL shown below is _GVqXYLRpEeOdavKqgVc36Q

https://clm.jkebanking.net:9443/ccm/resource/itemOid/com.ibm.team.scm.Workspace/_GVqXYLRpEeOdavKqgVc36Q

For components with spaces in their name, care must be taken to offset them by appropriate quotes.  For my tests, single quotes worked on Linux and double on Windows.

The command will set the scope of the specified flow target (e.g. application development stream) for the specified workspace (e.g. build workspace) to include only those components in the specified list.  The current configuration of the component(s) in the target stream is used.

Note if multiple flow targets exist for the workspace and a component listed is included in multiple flow targets, then scoped flows need to be used to avoid conflicts.  See How should my source code be loaded from Jazz SCM?

Let’s take a look at an example.  Below, the build.brm.continuous repository workspace, owned by the build user, has six components and three flow targets in its configuration. Banking Logic, Database, Java UI and Prerequisites are from the BRM StreamBuild comes from Build Scripts and Build Config comes from the stream of the same name.

image

The BRM Stream is scoped to only include only a subset of the components.

SNAGHTML9b0895d

Assume that we wanted to add the Database component to the scope.  Using SCM CLI command similar to that shown in the screenshot below, the scope can be changed to add it in.

SNAGHTML9bded5d

This results in the flow target scope being changed.

SNAGHTML9beca53

Now to add this to the build script so it can be automated.  Observe in the editor screenshot below that the command has been added to an exec statement in an Ant build script.  image

The targetStreamUUID and componentsToLoad values need to be passed to the build script.  Add these as properties of the build definitions.  For example:

image

When the build is requested, the componentsToLoad value can be changed.  In our original example, the Database component can be added in by editing the componentsToLoad build property at the time of the build request.

SNAGHTML9cfe091

Should you be concerned that adding the ‘set flowtarget’ command to the build script will add unnecessary overhead, albeit very minimal, to every build execution even when component list isn’t changed, you can create a build script and definition that only performs the ‘set flowtarget’ command and run it when needed.

The example shown was for changing the flow target component scope.  The technique used can be applied for other build workspace manipulations needed and supported by the SCM CLI.

Thanks to Nick Edgar, RTC Build component lead, for making me aware of the ‘set flowtarget’ command and suggesting its application to the problem described.