Getting to a right-sized Jazz environment

You’ve just made the decision to adopt one of the Jazz solutions from IBM.  Of course, being the conscientious and proactive IT professional that you are, you want to ensure that you deploy the solution to an environment that is performant and scalable.  Undoubtedly you begin scouring the IBM Knowledge Center and the latest System Requirements.  You’ll find some help and guidance on Deployment and Installation and even a reference to advanced information on the Deployment wiki.  Much like the incongruous electric vehicle charging station in a no parking zone, though, the guidance doesn’t quite add up: you are looking for something definitive but come away scratching your head, still unsure of how many servers are needed and how big they should be.

This is a question I am asked often, especially lately.  I’ve been advising customers in this regard for several years now and thought it would be good to start capturing some of my thoughts.  As much as we’d like it to be a cut-and-dried process, it’s not.  It is an art, not a science.

My aim here is to capture my thought process and some of the questions I ask and references I use to arrive at a recommendation.  Additionally, I’ll add in some useful tips and best practices.  If this proves useful, it will eventually move over to the Deployment wiki.

I find that the topology and sizing recommendations are similar regardless of whether the server is to be physical or virtual, on-prem or in the cloud, managed or otherwise.  These impact other aspects of your deployment architecture to be sure, but generally not the number of servers to include in your deployment or their size.

From the outset, let me say that no matter what recommendation I or one of my colleagues gives you, it’s only a point-in-time recommendation based on the limited information given, the fidelity of which will increase over time.  You must monitor your Jazz solution environment.  In this way you can watch for trends and know when a given server is at capacity and needs to scale by increasing system resources, changing the distribution of applications in the topology and/or adding a new server.  See Monitoring: Where to Start? for some initial guidance.  There’s a lot going on in the monitoring area, ranging from publishing additional information to existing monitoring solutions to providing a lightweight appliance with some monitoring capabilities.  Keep an eye on work items 386672 and 390245.

Before we even talk about how many servers and their size, the other standard recommendation is to ensure you have a strategy for keeping the Public URI stable, which maximizes your flexibility in changing your topology.  We’ve also spent a lot of time deriving standard topologies based on our knowledge of the solution, functional and performance testing, and our experience with customers.  Those topologies show a range in the number of servers included.  The evaluation topology is really only useful for demonstrations.  The departmental topology is useful for a small proof of concept or a sandbox environment for developing your processes, procedures and required configuration and customization.  For most production environments, a distributed enterprise topology is needed.

The tricky part is that the enterprise topology specifies a minimum of 8 servers to host just the Jazz-based applications, not counting the Reverse Proxy Server, Database Server, License Server, Directory Server or any of the servers required for non-Jazz applications (IBM or 3rd party).  For ‘large’ deployments of 1000 users or more, that seems reasonable.  What about smaller deployments of 100, 200 or 300 users?  Clearly 8+ servers is overkill and will be a deterrent to standing up an environment.  This is where some of the ‘art’ comes in.  More often than not, I find I am recommending a topology that is somewhere between the departmental and enterprise topologies.  In some cases, a federated topology is needed when a deployment has separate and independent Jazz instances but needs to provide a common view from a reporting perspective and/or for global configurations, as in the case of a product line strategy.  The driving need for separate instances could be isolation, sizing, reduced exposure to failures, organizational boundaries, mergers/acquisitions, customer/supplier separation, etc.

The other part of the ‘art’ is recommending the sizing for a given server.  Here I make extensive use of all the performance testing that has been done.

The CLM Sizing Strategy provides a comfortable range of concurrent users that a given Jazz application can support on a given-sized server for a given workload.  Should your range of users be higher or lower, your server be bigger or smaller, or your workload be more or less demanding, then you can expect to need a different sizing.  In other words, judge your sizing or expected range of users up or down based on how closely you match the test environment and workload used to produce the CLM Sizing Strategy.  Concurrent use can come from direct use by the Jazz users but also from 3rd party integrations as well as build systems and scripts.  All such usage drives load, so be sure to factor that into the sizing.  There are other factors, such as isolating one group of users and projects from another, that would motivate you to have separate servers even if all those users could be supported on a single server.

Should your expected number of concurrent users be beyond the range for a given application, you’ll likely need an additional application server of that type.  For example, the CLM Sizing Strategy indicates a comfortable range of 400-600 concurrent users on a CCM (RTC) server if it is just being used for work items (tracking and planning functions).  If you expect to have 900 concurrent users, it’s a reasonable assumption that you’ll need two CCM servers.  As of v6.0.2, scaling a Jazz application to support higher loads involves adding an additional server, which the Jazz architecture easily supports.  Be aware though that there are some behavioral differences and limitations when working with multiple applications (of the same type) in a given Jazz instance.  See Planning for multiple Jazz application server instances and its related topic links to get a sense of considerations to be aware of up front as you define your topology and supporting usage models.  Note that we are currently investigating a scalable and highly available clustered solution which would, in most cases, remove the need for distributing projects and users across multiple application servers and thus avoid the behavioral differences mentioned.  Follow this investigation in work item 381515.
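As a back-of-the-envelope illustration only (the real judgment involves workload, geography and growth, not just arithmetic), the server-count estimate above amounts to dividing the expected concurrent users by the comfortable per-server range and rounding up:

```python
import math

def servers_needed(expected_concurrent_users, comfortable_max_per_server):
    """Rough rule of thumb: round up the ratio of expected concurrent
    users to the comfortable per-server maximum for that workload."""
    return math.ceil(expected_concurrent_users / comfortable_max_per_server)

# Example from the text: 900 concurrent work-item users against the
# 400-600 user comfortable range for a single CCM server.
print(servers_needed(900, 600))  # -> 2 CCM servers
```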

This post doesn’t address other servers likely needed in your topology such as a Reverse Proxy, Jazz Authorization Server (which can be clustered), Content Caching Proxy and License Key Server Administration and Reporting tool.  Be sure to read up on those so you understand when/how they should be incorporated into your topology.  Additionally, many of the performance and sizing references I listed earlier include recommendations for various JVM settings.  Review those and others included in the complete set of Performance Datasheets and Sizing Guidelines.  It is critical not just to get the server sizing right but also to have the JVM properly tuned for a given application.

To get to the crux of the primary question of number of servers and their size, I ask a number of questions.  Here’s a quick checklist of them.

  1. What Jazz applications are you deploying?
  2. What other IBM or 3rd party tools are you integrating with your Jazz applications?
  3. How many total and concurrent users by role and geography are you targeting and expect to have initially?  What is the projected adoption rate?
  4. What is the average latency from each of the remote locations?
  5. How much data (number of artifacts by domain) are you migrating into the environment? What is the projected growth rate?
  6. If adopting Rational Team Concert, which capabilities will you be using (tracking and planning, SCM, build)?
  7. What is your build strategy? frequency/volume?
  8. Do you have any hard boundaries needed between groups of users, e.g. organizational, customer/supplier, etc. such that these groups should be separated onto distinct servers?
  9. Do you anticipate adopting the global or local configuration management capability (released in v6.0)?
  10. What are your reporting needs? document generation vs. ad hoc? frequency? volume/size?

Most of these questions primarily allow me to get a sense of what applications are needed and what could contribute to load on the servers.  This helps me determine whether the sizing guidance from the previously mentioned performance reports needs to be judged higher or lower and how many servers to recommend.  Other uses are to determine if some optimization strategies are needed (questions 4 and 7).

As you answer these questions, document them and revisit them periodically to determine if the original assumptions that led to a given recommended topology and size have changed and thus necessitate a change in the deployment architecture.  Validate them too with a cohesive monitoring strategy to determine if the environment usage is growing slower/faster than expected or to detect if a server is nearing capacity.  Another good best practice is to create a suite of tests to establish a baseline of response times for common day-to-day scenarios from each primary location.  As you make changes in the environment, e.g. server hardware, memory or cores, software versions, network optimizations, etc., rerun the tests to check the effect of the changes.  How you construct the tests can be as simple as a manual run of a scenario with a tool to monitor and measure network activity (e.g. Firebug).  Alternatively, you can automate the tests using a performance testing tool.  Our performance testing team has begun to capture their practices and strategies in a series of articles starting with Creating a performance simulation for Rational Team Concert using Rational Performance Tester.
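For illustration, a minimal scripted baseline might simply time a handful of representative requests from each location.  The scenario names and URLs below are placeholders; a real baseline would authenticate and exercise genuine user scenarios rather than simple page fetches:

```python
import time
import requests  # assumes the requests library is available

# Hypothetical scenarios; replace with URLs for your own Public URI and projects.
scenarios = {
    "open project dashboard": "https://jazz.example.com/jts/dashboards",
    "query work items": "https://jazz.example.com/ccm/web/projects/MyProject",
}

for name, url in scenarios.items():
    start = time.time()
    response = requests.get(url, verify=True)
    elapsed = time.time() - start
    print(f"{name}: HTTP {response.status_code} in {elapsed:.2f} s")
```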

In closing, the kind of guidance I’ve talked about often comes out in the context of a larger discussion which looks at the technical deployment architecture from a more holistic perspective, taking into account several of the non-functional requirements for a deployment.  This discussion is typically in the form of a Deployment Workshop and covers many of the deployment best practices captured on the Deployment wiki.  These non-functional requirements can impact your topology and deployment strategy.  Take advantage of the resources on the wiki or engage IBM to conduct one of these workshops.

My first foray into IBM Internet of Things Foundation (part 2)

In part 1 of this series, I set up an environment in which a Python application running on a Raspberry Pi collected temperature, humidity and soil moisture data from connected barometer and moisture sensors.  This data was published to an MQTT message broker and then retrieved by an application hosted on IBM Bluemix.  This application stored the data in a Cloudant database for later visualization via an HTML5 application.

There are many different types of MQTT brokers available given that MQTT is an open protocol.  These brokers vary in the feature sets they provide, some of which go beyond the standard MQTT functionality.  IBM’s IoT Foundation, now known as Watson IoT Platform, serves as a managed and secure MQTT broker and much more.  With the Watson IoT Platform, you have a fully managed IoT platform allowing secure connection of devices, with further access to a scalable platform-as-a-service environment and a plethora of other services such as storage and analytics.

The diagram below provides a simple depiction of the interactions between a ‘Device’ and ‘Application’ through an MQTT Broker.  In part 1, I used an open source MQTT Broker.  In part 2, I will be using IBM Watson IoT platform as the MQTT Broker.  In my environment, I am only publishing ‘Events’ to the MQTT Broker for action by the Node-Red app.  I am not currently publishing ‘Commands’ back to the Raspberry Pi device.

image

My goal then for this next step in the series is to modify the Poseidon Client (Python application running on Raspberry Pi) to publish events to Watson IoT Platform and change the application running on Bluemix to receive that published data from the new source.  In this entry, I will outline the changes to each of these applications.

Before making any application modifications, I first need to configure the Pi to connect to the Watson IoT Platform using this recipe. At a high level, this recipe had me do the following:

  1. Download, install and start the IoT service which would manage the connection and publish events to the IBM IoT Foundation Quickstart.
  2. Register my Raspberry Pi device in Watson IoT Platform following another recipe.
  3. Use the credentials provided by the registration process to connect the device to Watson IoT.

When complete, I had a device registered with the IBM Watson IoT Platform as shown below.  In the process I was provided a device ID and authentication token, to be used later when connecting the device to the Watson IoT Platform service.

image

Now I am set to modify the application code to publish events directly to Watson IoT Platform using the IoT service now running on my Raspberry Pi.  I will keep the existing code that is using the paho MQTT client libraries to publish to an open source MQTT broker.  This code is gated by a configuration parameter that I will just turn off so it no longer executes.  I will add new code and configuration parameters to use the Watson IoT Platform Device API to publish events to the Watson IoT Platform.

The config.py file contains a number of configuration parameters for the Poseidon Client.  Set ‘sendToCloud’ to ‘False’ in order to turn off sending data through the open source MQTT broker.  Add a new ‘sendToIoTF’ parameter and set it to ‘True’ to publish via the Watson IoT Platform.

image

In the config.py file, add a parameter to capture the credentials used to connect the Raspberry Pi device I registered earlier to the Watson IoT Platform.

image
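Since the screenshots aren’t reproduced here, a minimal sketch of what these config.py additions might look like follows.  Only sendToCloud and sendToIoTF come from the text; the credentials dictionary name and the placeholder values are illustrative:

```python
# config.py additions (sketch)
sendToCloud = False   # stop publishing through the open source MQTT broker
sendToIoTF = True     # publish events via the Watson IoT Platform instead

# Device credentials from the registration step; values are placeholders
iotfCredentials = {
    "org": "your-org-id",
    "type": "raspberrypi",
    "id": "your-device-id",
    "auth-method": "token",
    "auth-token": "your-auth-token",
}
```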

In the PoseidonClient.py code add the requisite import statement to utilize the Watson IoT Platform Device API.

image

Skip ahead in PoseidonClient.py to the section of code that initializes the sensors and device interfaces.  The new code below creates a device client instance using the credentials added to config.py and then connects to the Watson IoT Platform.

image
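As a sketch of those two changes, assuming the ibmiotf client library mentioned later in this series (the option keys follow that library’s device client conventions; the variable names and config import are illustrative):

```python
import ibmiotf.device  # Watson IoT Platform device client library

from config import iotfCredentials  # hypothetical name from the config.py sketch above

try:
    # Create the device client from the registered device's credentials, then connect
    deviceCli = ibmiotf.device.Client(iotfCredentials)
    deviceCli.connect()
except ibmiotf.ConnectionException as e:
    print("Unable to connect to the Watson IoT Platform: %s" % e)
    raise
```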

Go back in PoseidonClient.py to the processData code and add a publishEvent call (with an appropriate callback) to send the sensor data to the Watson IoT Platform, to be read later by the application running in Bluemix.

image

At the end of the main processing loop, add a call to disconnect from the device client instance.

image
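A combined sketch of those last two changes, again assuming the ibmiotf device client created above; the event name, payload fields and callback are illustrative:

```python
def on_publish():
    # Called by the client once the event has been confirmed as published
    print("Sensor event published to the Watson IoT Platform")

# Inside processData: temperature, humidity and moisture are the readings
# gathered earlier in the loop; "status" and the field names are illustrative.
payload = {"temperature": temperature, "humidity": humidity, "moisture": moisture}
if not deviceCli.publishEvent("status", "json", payload, qos=1, on_publish=on_publish):
    print("Event not published; the device client is not connected")

# At the end of the main processing loop, close the connection cleanly.
deviceCli.disconnect()
```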

Now we move on to the application in Bluemix.  Here is how it looked at the end of part 1.

image

Use the Node-Red editor to add the following nodes and connections.

  1. Add an ibmiot input node and configure it to read events published by the Raspberry Pi device (see the subsequent screen capture showing the parameters for the node)
  2. Add a debug node to capture the data received from the device
  3. Connect the ibmiot input node to the debug node
  4. Connect the ibmiot input node to the existing Cloudant output storage node
  5. Connect the ibmiot input node to the existing json function node to transform the sensor data payload from JSON format to a JavaScript object.

image

This screen capture illustrates the parameter settings for the new ibmiot input node.

image

After making the above changes, running the PoseidonClient on the device and the Node-Red app in Bluemix produced the following debug output in the Node-Red debug console.

image

The first debug output is the sensor data coming in from the Watson IoT Platform.  The next debug output is the formatted message being sent to Twitter.

Nothing changed in the Cloudant queries and views, nor in the HTML5 app.  From part 1 of this series, these will produce a visualization similar to the following:

image

I have successfully altered the device to publish events to the managed IBM Watson IoT platform MQTT Broker and the Node-Red app in Bluemix to receive and process those events.  Next up for me is to investigate possible inclusion of the IBM IoT Real-Time Insights service on Bluemix or perhaps Watson Analytics.  Another area of interest is the currently experimental IBM IoT Workbench.

My first foray into IBM Internet of Things Foundation

For the last year, I have been a part of what is now the IBM Watson Internet of Things (IoT) business unit.  I’m still in the same role but have just been working with a different set of customers, those interested in a solution that addresses their Continuous Engineering (CE) needs.  This solution is only one aspect of the overall IBM Internet of Things solution.  When you consider the broader IoT solution, it involves not just the designing and engineering of ‘things’ (using our CE solution) but the operation/management of them, the collection/control of data from them and the analysis/optimization of that data, with all of that being cloud enabled and done in a secure manner.

Since I like to learn new things and since my solution focus is a part of a broader solution I was less familiar with, I decided to begin exploring one foundational element of it, namely the IoT Foundation.

image

“The IBM IoT Foundation platform allows organizations to securely and easily connect devices, from chips to intelligent appliances to applications and industry solutions. Scaling through cloud-based services and using rich analytics, IoT Foundation platform provides organizations with new insight for innovation and transformation.”

 

 
The next question was how best to get that experience.  My colleague Dennis Schultz, who has written a nice series of blogs on My Internet of Things and MobileFirst adventure, advised that I decide on a project that was practical, real and of interest to me.  At first I thought of something related to a weather station, but then I thought it would be cool to do something that helped me figure out when to water my yard.  You see, I live in Texas, and as I was considering this, it was the middle of summer and our typical water restrictions were in effect such that I couldn’t always water when it was needed.  I also don’t want to over water, so on my day of the week to water, I would typically stick my finger in the soil to see how wet it was and decide whether to turn the sprinklers on or not.  Wouldn’t it be nice if I could somehow do that with a sensor that would let me know through some sort of email, text or other mobile alert?

As I searched around, I found that some IBMers had already done some investigation along those lines.  The Poseidon Project, led by a team in Europe, “is a voluntary initiative that aims to reduce water usage in the world”.  They created a great three-part tutorial series that connects a Raspberry Pi with sensors to monitor temperature and soil moisture, publishes that data to an MQTT Message Broker, then feeds it into an application hosted on IBM Bluemix built using the Node-Red editor.  This application tweets the sensor data and also stores it in a Cloudant database for later visualization through an HTML5 app.  This appeared to have much of what I needed and more, and would provide a nice launching point for my project.  Take a look at the solution architecture below.

Diagram of the solution design

I didn’t want to just jump in and follow the tutorial.  I first wanted to do some background reading and get some basic experience with the IBM IoT Foundation on Bluemix.

As you could imagine, there is a plethora of background information online.  Here are some links I found useful:

To get started with IBM IoT Foundation on Bluemix, I found there is a growing number of recipes on IBM developerWorks.  These are step-by-step tutorials that walk you through some aspect of using the technology of interest to you.  In particular, I followed the recipe that used a simulated device to provide temperature and humidity data that I could visualize using a simple quickstart web app (which can take inputs from real devices too).  From there I tried the IoT starter application, which allowed me to feed the simulated sensor data into an IoT Foundation application using the Node-Red editor in Bluemix.  I found this all to be quite fun and informative.

Thus armed and dangerous with some background information and basic experience I charged ahead into the Poseidon Project tutorial. My setup is depicted below.

image

The specific hardware I purchased included:

The tutorial is well written and I pretty well followed it verbatim.  To keep it impervious to changes in the Bluemix user interface, some of the steps were written more generally, with fewer specific screenshots or navigation details, which made them slightly harder for new users like me to follow.  I did find that several steps were missing in Part 3, Step 2 relating to creating the APIs to be used by the HTML5 visualization app.  The authors intend to submit an update with the missing steps.  Should you decide to follow the tutorial, if it doesn’t show a revision date since its original publication of 16-December-2014, check back with me for the missing steps.

Here’s one output from the visualization app.

image

My experiment isn’t ready to be taken outdoors and used to monitor the soil moisture of my yard; I’d have to weatherproof the setup and find a different moisture sensor, as I subsequently found that the one I am using isn’t meant for outdoor use or prolonged time in soil.  Still, I enjoyed the experience and believe that with a little more work it could be applied as I intended.

It also gave me broad exposure to a number of different technologies enabling rapid development of IoT applications.

  • IBM Bluemix
  • Node.js
  • Node-Red
  • MQTT
  • IBM IoT Foundation
  • Cloudant DB
  • IBM DevOps Services
  • Git
  • Twitter
  • Data-Driven Documents (D3)
  • HTML5

There are a couple of extensions to the tutorial I’d like to do.  First, it was written so that the sensor data is published to an MQTT Message Broker then received by the IoT Foundation app on Bluemix.  I want to instead publish the data directly to the IBM IoT Foundation using the ibmiotf client library.  I am currently working on that.  Second, I’d like to explore making use of the IBM IoT Real-Time Insights service on Bluemix.  I’ll publish any work on these in a subsequent post.

Thinking beyond this project, another idea I have is to make use of the TI SensorTag and Estimote Beacons to monitor temperatures in my home at the thermostat and in areas that seem hotter or colder, compare these to the weather outdoors, and visualize it all on a mobile device with an app built using the IBM MobileFirst Platform, or perhaps use the MyWeatherCenter app.

Finally, I will close with a thanks to Bram Havers and Davy Vanherbergen who helped me with any issues I came across in the Poseidon Client tutorial.

Configuration Management much improved in CLM 6.0.1

The 6.0 release of the Rational solution for Collaborative Lifecycle Management (CLM) included the addition of new configuration management capabilities.  It had some limitations to consider, that is, temporary differences in some CLM capabilities when configuration management is enabled for a project versus not.  Some workarounds for these were detailed in Finding suspect links and requirements to reconcile in CLM 6.0 configuration management enabled projects and Alternatives to filtering on lifecycle traceability links in CLM 6.0 configuration management enabled projects.

I am happy to say that the development team worked hard to address these and more in the 6.0.1 release.  A few considerations still remain, but far less than were in the previous release.  I am now comfortable recommending its use by many of my customers.  In this blog entry, I will compare and contrast the configuration management limitations between 6.0 and 6.0.1 and highlight a few other configuration management enhancements.

Considerations from v6.0 and their improvements in v6.0.1:

  1. Will you use the configuration management capabilities in a production environment or a pilot environment?

  2. Will you upgrade all CLM applications to 6.0?

  3. Do your configuration-enabled RM or QM project areas link to other RM or QM project areas?

    While these first three considerations all remain true, they were moved to the ‘Important Factors’ section as they are more recommendations and best practices than changes in behavior from non-enabled projects.  In v6.0, configuration management was new and we thought it better to draw attention to these recommendations, so we included them at the top of the v6.0 considerations list.

    Piloting use of configuration management is recommended due to its complexities and to ensure it meets your needs.  It also gives opportunity to try out new usage models/workflows before implementing in production.

    Keeping all the CLM apps at the same v6.0.x rev level is the only practical way to take advantage of all the configuration management capabilities and ensure they work correctly.

    Because of the new linking model in v6.0 when using configuration management, you’ll always want to enable configuration management for all the RM and QM projects between which you’ll be linking artifacts; otherwise, linking will not always work as desired.

  4. Which type of reporting do you need?

    In v6.0, configuration-aware reporting was available for document generation only.  All other options were technology preview only and not to be used in production.  In v6.0.1, document generation continues to be available using built-in, template-based reporting in the CLM application or by IBM Rational Publishing Engine; it now includes interactive reporting through the Jazz Reporting Service Report Builder and Rational Engineering Lifecycle Manager. 

    Most configuration-aware data is now available.  v6.0.1 added version-aware data for DOORS Next Generation, versioned data in global and local configurations, and Rational Team Concert enumeration values.  This means, for example, that you can construct a report that spans requirements, work items and tests for a particular configuration, or a report that includes artifacts from multiple project areas within the same domain.  Some gaps remain in the available configuration-aware data and some of it is only available via custom SPARQL queries.  Take a look at the limitations section of Getting started with reporting by using LQE data sources.

    Access control for reporting on LQE data sources is now enforced at the project area level for the RM, CCM and QM applications.  Project-area level access control is not yet implemented for DM and Link Validity (it can be set manually in LQE).

  5. Do you need to link or synchronize your CLM project area data with other tools, products, or databases (including IBM, in-house, or other tools)?

    Most OSLC-based integrations outside the CLM applications do not support versioned artifacts.  To do so requires support for OASIS OSLC Configuration Management spec (draft).  We do expect the list of supporting applications to grow over time.  Note that integrations to RTC work items continue to work as expected, because work items aren’t versioned.  Several RQM test execution adapters have been verified to work correctly with enabled projects.  We expect progress to be made in this area throughout 2016 with other IBM and third-party applications.  For information on setting up configuration aware integrations, see Enabling your application to integrate with configuration-management-enabled CLM applications.

  6. Do you rely on suspect links for requirements in RM or for requirement reconciliation in QM?

    In v6.0.1, the new Link Validity Service replaces “suspect links”.  Now you can show and assert the validity of links between requirements, and between requirements and test artifacts.  Automatic “suspect” assertion occurs when attributes change.  Validity status is reused across configurations when the same artifacts with the same content are linked in multiple configurations.

    QM requirements reconciliation against linked requirements collections is now available in configuration management enabled projects. 

  7. Do you need to filter views based on lifecycle traceability links?

    RTC plans can now be linked to versioned requirements collections and test plans in v6.0.1.  What remains in this limitation area is that it is still not possible in configuration management enabled projects to filter RM views based on lifecycle traceability status, nor to filter QM views based on RTC work item and plan traceability links.  These should all be addressed in a subsequent release.

  8. (For QM users) Do you use command-line tools, the mobile application for off-line execution, or import from Microsoft Excel or Word?

    In the v6.0.1 release, all command-line tools but the Mobile application for offline execution utility are now configuration aware and can be used with configuration management enabled projects.

The v6.0.1 release includes some other configuration management enhancements of note unrelated to the limitations/considerations:

  • One step creation of global configuration baseline hierarchy
  • Bulk create streams for selected baselines in the context of a global stream
  • Requirements change management supported by optionally requiring change sets to edit contents of a stream and further requiring those change sets to be associated with an approved work item to deliver those changes
  • Improved ability to deliver changes across streams that require merging of modules
  • Several improvements to make it easier to work with DNG change sets in personal streams

See v6.0.1 CLM New & Noteworthy for more details on these and other improvements.

One other enhancement not yet fully realized is the provision for Fine Grain Components.  Currently each project area is a component, which could lead to a proliferation of project areas for complex product hierarchies.  The intent in the future is to support a more granular component breakdown within a project area.  More work remains to get this fully supported.  In the meantime, some customers may limit their adoption of configuration management until it is.

To wrap up, I believe we’ve made great strides in improving the configuration management capability and addressing the limitations from its initial release.  To me, the primary limitation that will constrain a customer’s adoption of the capability is whether the needed integrations to 3rd party or other IBM applications are configuration aware, and secondarily whether there are any aspects of the configuration-aware reporting that won’t meet their reporting needs.

To give this release a try, you can download CLM v6.0.1 or access one of our sandboxes already populated with sample data (select the latest CLM milestone “with configuration management” when creating your sandbox).

Alternatives to filtering on lifecycle traceability links in CLM 6.0 configuration management enabled projects

Today I’d like to continue on the theme started in Finding suspect links and requirements to reconcile in Collaborative Lifecycle Management (CLM) 6.0 configuration management enabled projects by addressing another consideration from Enabling configuration management in CLM 6.0 applications.

Do you need to filter views based on lifecycle traceability links?

In CLM 6.0 for those projects with configuration management enabled, you can view artifacts with their lifecycle traceability links, but cannot filter by those relationships.  There are three key areas this impacts:

  • Limit by lifecycle status in DOORS Next Generation (DNG)
  • Traceability views for RQM test artifacts
  • Linking RTC plans to artifact collections

I’ll explore some alternative workarounds to these in this blog.

Limit by lifecycle status in DOORS Next Generation (DNG)

In CLM 5.x, views of requirements artifacts can be filtered by the status of lifecycle artifacts linked to them.  The same is true in CLM 6.0 but only for projects that don’t have configuration management enabled.  This limitation should be addressed in a future release by work item 97071.   Below is an example showing all feature requirements with failed test case runs.

image

Similarly, the following shows all feature requirements whose linked development items have been resolved.

image

In CLM 6.0, for configuration management enabled projects, the lifecycle status filter option doesn’t even appear.

image

It is still possible to show a view of requirements with their lifecycle traceability links shown; it’s only the filtering that isn’t possible (at present).

image

Here you could scroll through the Validated By column, for instance, and manually scan for test cases whose icon indicated a failure occurred.  This wouldn’t be viable for more than a short list of requirements.

What’s needed then is to display a view/query of the linked artifacts, filtered appropriately, and display, if possible, their linked requirements.

For example, show all failed test case runs in Rational Quality Manager (RQM) and their associated requirements.  When displaying all test cases, you can see visually, by their associated icon, whether the test case has been run successfully or not.  This isn’t ideal given you aren’t able to filter by the test case status and must instead visually pick out the failed runs.  It does, however, show the linked requirement being validated.

image

Alternatively, show a view of test case results and filter by their status.  Below is a list of test case results that are Failed or Blocked and their associated test case.

image

Unfortunately, it doesn’t also show the related requirement.  Instead you would need to drill down into the test case from the failed run and see its linked requirements.

Use of the Jazz Reporting Service Report Builder may be an option in limited cases.  First, its use for configuration-aware reporting is a Technology Preview in 6.0 and it only really contains configuration data for RQM.  For DNG, only the default configuration data is included for configuration management enabled DNG projects.  If your requirements configuration management needs are basic, where a single configuration/stream is sufficient, this may be an option.

For example, the following report shows all requirements with failed/blocked test case runs.

image

Now you’ll likely have multiple DNG projects.  DNG doesn’t publish project area data to the Lifecycle Query Engine (LQE) using the Configurations data source, so you can’t choose only those requirements artifacts from a given project, limit scope by that project, or set a condition to query by some DNG project area attribute.  You can, however, choose test artifacts for a given project area (and configuration), so if there’s a 1:1 relationship between DNG and RQM projects, you can produce a report that shows just the requirements from failed test case runs in the desired RQM project belonging to the desired global configuration (this is what is shown in the previous screenshot).

I tried looking at this from the opposite direction, that is, show all failed/blocked test case runs and their associated requirements.  You get the right list of failed runs, but it shows all their associated requirements, not all of which were tested and failed in that run.

image

For the other example I gave earlier, showing all requirements whose linked development items have been resolved, you could go to Rational Team Concert (RTC) and run a lifecycle query such as Plan Items implementing Requirements, but you’d need to visually look for plan items whose status met your criteria, as the query isn’t editable and thus you couldn’t add a filter.

image

Traceability views for RQM test artifacts

In CLM 5.x, views of test artifacts can be filtered by the presence (or not) of a linked development item.

image

The same is true in CLM 6.0 but only for projects that don’t have configuration management enabled.  RQM projects that have configuration management enabled don’t include that filter option.  This limitation should be addressed in a future release by work item 134672.

image

From this view, you would then need to visually scan the Test Development Item column for whichever condition you needed.

RTC has some lifecycle queries, such as Plan Items with no Test Case or Plan Items with failing Tests that could help.

image

image

Here again, Report Builder could help as you could construct a report that shows test cases with or without associated work items.  For example, the report below shows test cases without any work item associated.

image

Linking RTC plans to artifact collections

In CLM 5.x, it is possible to link an RTC plan to a DNG requirements collection and/or an RQM test plan.  Use of these grouped artifacts allows for shared scope and constraints across these lifecycle plans and is useful in auto-filling gaps in plans or reconciling changes.

image

In CLM 6.0, links to collections and test plans from an RTC plan only resolve to the default configuration in the project they reside in.  In other words, you cannot link an RTC plan to a versioned requirements collection or test plan.  This limitation should be addressed in a future release by work item 355613.  The primary impact is that you are unable to auto-generate work items for requirements in a collection when working with projects that have configuration management enabled.

image

Missing that capability, the only workaround is to manually create the work items so that every requirement in the collection is linked to a corresponding work item.

A traceability plan view in RTC that includes a color filter will help identify those plan items without requirements links.

image

Such a view will highlight cases where a work item may need to be removed as the scope has changed, e.g. the collection has had requirements removed.

In DNG, view the collection with the Implemented By column included and scan for requirements with no corresponding work item.

image

If your requirements set is too large to view manually, export the collection view to a CSV file, then open the exported file and filter or sort by the Implemented By column to more easily see those requirements without work items, as sketched below.

image
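If you prefer to script that filtering step rather than doing it in a spreadsheet, a minimal sketch might look like the following; the file name and column headings are assumptions about your particular CSV export and should be adjusted to match it:

```python
import csv

# List requirements from the exported collection view that have no Implemented By link.
# "collection_view.csv", "id", "Name" and "Implemented By" are assumed names.
with open("collection_view.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        if not row.get("Implemented By", "").strip():
            print(row.get("id", ""), row.get("Name", ""))
```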

Conclusion

Of the limitations discussed, I find the first one, the inability to filter by lifecycle status, will be more problematic for customers, though I’ve found its usage to be mixed.  I’m also not particularly enamored with the workarounds described because they too are limited and involve some manual steps.  I would be interested in hearing how significant these limitations are in your environment or if you have additional ideas on workarounds for them.

Finding suspect links and requirements to reconcile in CLM 6.0 configuration management enabled projects

The newly released CLM 6.0 has some great configuration management capabilities across the lifecycle.  As described in Enabling configuration management in CLM 6.0 applications, there are some considerations to be made before enabling configuration management.  One of these is:

Do you rely on suspect links for requirements in RM or for requirement reconciliation in QM?

The reason is that in this release, automatic detection of suspect links (artifacts that need to be evaluated and possibly modified because a linked artifact has changed) is not working in configuration management enabled DOORS Next Generation (DNG) projects.  Further, the requirements reconciliation process in Rational Quality Manager (RQM) to find any changed/deleted requirements impacting associated test cases is also not supported in configuration management enabled RQM projects.  Note that a new mechanism for determining link suspicion is intended for a future release; in the interim, though, teams must use a workaround.  I’ll explore some of those in this blog.

Suspect Links Workaround

To determine what requirements have changed and which impacted test cases may need updating, we’ll need to look at a test coverage view of requirements filtered by a modified date.  It’s likely that you’ll want to know which requirements have changed since the last release baseline.

Let’s assume we are working in the AMR Mobile US global configuration in DNG.  Open the Configuration Context Banner Button (CCBB) and navigate to the global configuration.

image

Show the baselines for this global configuration.

image

Observe the most recent baseline is AMR Mobile 3.0 US GA.  Drill down into it and see the baseline was created on May 27, 2015, 12:35:23 AM.

image

Note that at this time, baselines for each of the contributors to the global configuration must be created before the baseline for the global configuration can be committed/finalized.  This means there will likely be some time disparity between the global configuration baseline and the baselines for contributing local configurations (they could be earlier or later than the global configuration creation date).  For this reason, it’d be more accurate to use the local configuration baseline creation time for the filtering.  While looking at the global configuration baseline, select the row with the desired local configuration and observe its creation time on the right.

image

While viewing the requirements of interest, likely on a module-by-module basis, e.g. AMR System Requirements, open a Test Coverage view and add a Filter by Attribute for Modified on, set to be after the baseline date.

image

Here we see there have been four requirements altered: two that have no linked test case but may need one if they are new, and two that have linked test cases that may need to be updated.

Now you could look at each individual requirement one by one to understand if they are new or modified.  In this example, opening requirement 1830 shows it has been created since the baseline date.

image

You could also add the Created On attribute to the view columns displayed.

image

This requirement doesn’t have a related test case so now you would evaluate whether one should be created.

Looking at requirement 517, you observe that it was created before the baseline and modified since.  There is a related test case but you need to understand what the change was to better evaluate if it necessitates a change in the test case.

Open the requirement history and get a sense of the changes.

image

Should the changes be such that a reevaluation of the test case is warranted, follow the Validated By links to navigate to the test case(s) and check if they need updating.

To track the potential impact of those substantive requirement changes, you could tag the requirements and create a traceability view to look only at those.

image

Alternatively, create a Tracked By link from the suspect requirement to a new RTC work item task that would be assigned to the test team to evaluate whether any linked test cases should be updated.

image

Now rather than going through each requirement individually, an alternative is to make use of the Compare Configuration capability to compare the current local DNG configuration/stream to the DNG baseline included in the AMR Mobile 3.0 US GA baseline.

image

image

image

The compare results will show additions, changes and deletions to project properties, folders and artifacts.  With this information, the analyst can make a reasonable determination of how substantive each change was.  They would then need to return to the Test Coverage view(s) and tag the appropriate requirements as suspect and/or create RTC work items to analyze the linked test cases.  Note that previously we were looking at requirements changes on a per-module basis (if modules were being used), but the Compare Configuration will look at all changes to all artifacts across the stream, consolidating all changes and not giving a module perspective.

Now if you were paying attention to my scenario, you’ll notice that the last screen shot above, showing the results of the compare, doesn’t line up with the Test Coverage view shown earlier, as the system requirements that were shown to have changed since the baseline are not in the compare results.  No, this isn’t a bug in the software.  I was using a shared internal sandbox whose configuration was changed between the time I started writing this blog and the time I tried to capture the compare results.  Rather than trying to recreate the scenario, I left things as they were as I think I still get the concept across (though the anal side of me really has a problem with it all not lining up!).

Requirements Reconciliation Workaround

Requirements reconciliation is a capability in RQM that looks at the requirements linked to a test case, or the requirement collection linked to a test plan, and determines if there are new, updated or deleted requirements that necessitate changes in the test case(s).  In CLM 6.0, requirements reconciliation is not supported in projects that have configuration management enabled.

While you can query for test cases changed since a certain baseline date, this doesn’t really help determine if there are test cases to be updated due to requirements changes.  It’s not possible from RQM to query on test cases and filter based on attributes of the linked requirements.

Thus, the reconciliation process would need to be driven by DNG, such that the tester would use the same technique used by the analyst for the suspect links workaround.  That is, the tester would look at a test coverage view of requirements in DNG, filtered to show requirements updated since the baseline date, and evaluate whether a test case addition, deletion or update was warranted.  This process would be further helped if the analysts used tagging as previously described so that the tester wouldn’t need to sift through all the requirements to find only those with substantive changes.  Use of impact analysis tasks in RTC would help as well.

Use of these test coverage views would only identify requirements added or changed since the baseline.  It would not list requirements removed.  So for a comprehensive view of requirements changes that need to be reconciled with test cases, the RM stream needs to be compared against the baseline to see any requirements that have been deleted.

Conclusion

While it is unfortunate that this initial release of the global configuration management capability doesn’t include support for suspect links and requirements reconciliation, there are some manual workarounds available that, while not ideal, can help mitigate the gap until a replacement is available.  For some customers, such a manual process may be untenable due to the volume of changes between baselines.  Rather than performing the analysis after the fact, perhaps being more proactive about flagging potential impacts from the beginning of and throughout the release is more appropriate.  As requirements are changed, assess whether test case updates are needed by tagging the requirement and/or creating an impact analysis task in RTC.  These can be reviewed/refined at different milestones throughout the release.  Again, not ideal, but it does distribute the analysis burden.

Help! My RTC Database is getting big!

Many customers who have been using RTC for some time are seeing their database size grow, in some cases to 100s of GBs if not TBs of data.  While growth is to be expected and is of no issue to modern DBMSes, proactive administrators want to understand the growth and how it can be mitigated, especially in light of anticipated user growth in their environments.  Naturally, they come to us and ask what can be done.  While at this time we don’t have the solutions many customers are asking for (project copy, move, archive, delete, etc.), that isn’t to say we don’t have approaches that may be of value in some situations.

Working with our own self-hosted RTC environment as well as those of our enterprise customers, we generally find that the largest contributors to database size are build results, work item attachments and versioned content.  How would you know that?  Fortunately, there are a couple of out-of-the-box reports you can run: Latest Repository Metrics and Latest Repository Metrics by Namespace.  Below are some samples showing a subset of the available namespaces and item types.

image

image

Looking at all the namespaces and item types begs the question: what do they all mean?  Yeah, they aren’t all obvious to me either.  Luckily, I have access to the smart developers who wrote this stuff and can tell me.  If you find one you don’t know, send me a note/comment or post it on the jazz.net forums.

Once you find the larger contributors to size, the next questions asked are can they be deleted and who (that is, which project) is producing them.  In keeping with my team’s No Nonsense Tech Talk theme, I’ll be honest, there’s not much we can delete/archive and we certainly can’t do it at a project level, which would be of greater value, nor can we easily tell who produced it all.  It’s not all doom and gloom because there are some things we can do.

As mentioned earlier, we can delete build results, which often are a huge contributor to size growth.  We can delete work items, even attachments from work items.  Versioned content can be deleted, though you don’t usually want to do that, except for security reasons or to remove binaries versioned by mistake.  Then there are plans, streams, workspaces, etc, that can be deleted, but these don’t tend to take up much space.

So what happens when something is deleted?  Well, in some cases, it’s not really removed from the database; only the reference to it is removed or made less accessible.  For example, work item attachments don’t really go away when removed from a work item.  Try this.  Add an attachment to a work item then save the work item.  Hover over the attachment link and copy it.  Now remove the attachment from the work item then save the work item.  In the browser, paste the copied attachment URL and it will still be found.  Similarly, if you delete a work item that has attachments, the attachment links still remain valid.  However, if you delete (not remove) a work item from the Eclipse client, the work item is actually deleted.

If you find that you’ve removed but not deleted an attachment, it is possible to go back and have it truly deleted.  To do so, using the Eclipse client, paste the URL to the attachment (which should be visible in a discussion comment from when it was first attached) somewhere into the work item (into a comment or the description), right click over that link and select “add to favorites”. Once it is in the favorites, you can drag it from Favorites and drop onto the Attachments section, which re-attaches it to the work item, at which point you can then delete it in the normal way.

Now some things, like build results and versioned content, once deleted can truly be removed from the database.

At the repository level there are two key background tasks that mark content as deletable and then later delete it:

  • an “Item Cleanup Task” background task at repository level marks newly orphaned content blobs as needing deletion (runs every 17 mins by default)
  • a “Deleted Items Scrub Task” background task at repository level deletes any content blobs that have been orphaned for more than ~2 hours (runs every 24 hours by default)

Once these both run, any content blobs that were deleted more than 2 hours ago should be fully deleted from the database.

However, DBMSes (particularly those in production) don’t generally release storage allocated for their tables immediately.  A compaction task usually needs to be run to reclaim the disk space.  The DBMS should have tools to indicate in advance how much space can be reclaimed by compaction.  Typical utilities to do this are shown below.  Details for using them should be left to a qualified DBA.

  • Oracle – ALTER TABLE … SHRINK SPACE
  • DB2 – REORG
  • SQL Server – DBCC SHRINKDATABASE

My teammate Takehiko Amano has done a very nice job of showing how deleting versioned content and later running DB2 commands reduces database size.  See his article Reducing the size of the Rational Team Concert repository database.

We find the build results often take up a good amount of the RTC database size.  These results often include all the log files from compilations, tests and other activities performed during the build.  Sometimes they will contain downloadable outputs, e.g. application archives and executables.  What happens is these results are often kept around and never deleted.  In some cases, especially for release milestones, they should be kept, but all those personal or interim continuous builds don’t need to be.  Build results can be deleted, and doing so results in their database content being orphaned and subsequently deleted per the aforementioned process.  Rather than manually deleting results, consider setting up a pruning policy to automatically delete old results.  For those results you want to keep around and not have pruned, just mark them as not deletable.

In cases where you know your build results are taking up a lot of space, the natural follow-on question is which builds and who owns them.  Our development team recently had cause to address that question, which resulted in a very useful script written by Nick Edgar, the RTC Build Component Lead.

Nick created a Groovy script that collects into a CSV file the pruning policy settings and footprint data for each build in all projects across a repository.

image

He further created an interactive HTML report that parses the CSV file for display in a more visual form.

With this information you can find out which builds from which projects are taking up the most space and whether they have a pruning policy in place.  Armed with that, an administrator could go to the appropriate release teams and take action.  Imagine running it weekly and posting the results to a dashboard or emailing them to the release teams.  The Groovy script to collect the data and the index.html to render the report are attached to task work item 330478.

For gathering the CSV data you’ll need to 1) install Groovy, 2) install a build system toolkit that’s compatible with (ideally at the same version as) the RTC server, 3) set environment variables (see the top of the Groovy script), and 4) run the script with: groovy -cp “$BUILD_TOOLKIT/*” <groovy file name> <any arguments needed by script>.

For the chart, just put the chart index.html file and CSV in the same directory and open the HTML file. Some browsers will require these to be served up by a web server to allow the HTML file to read the CSV file. For my testing, I used Python’s simple server support for this: python -m SimpleHTTPServer.

Given I am referencing code samples, I’ll keep our lawyers happy by stating that any code referenced is derived from examples on Jazz.net as well as the RTC SDK. The usage of such code is governed by this license. Please also remember, as stated in the disclaimer, that this code comes with the usual lack of promise or guarantee. Enjoy!

Being able to monitor the size and growth of your data, to get granular and actionable information about it, and ultimately to do something positive about that growth is a key concern for IBM Rational and something we are exploring as part of our Platinum Initiative.  I welcome your input in this area.  Perhaps we can interact at IBM InterConnect 2015.