At Effectual we care about two things: better performance and better data quality from HPE UCMDB. Over the last decade, customers and practitioners alike have settled on several conclusions about UCMDB that are no longer correct, and the problems behind them are not unsolvable. The first is that Discovery and UCMDB are “heavy”, “slow”, and “won’t scale.” The second is that UCMDB has poor operational quality (or that data quality is difficult to manage). When the system is tuned to operate on modern equipment, both of these myths can be completely dispelled.
With the introduction of the latest generation of HP ESX hardware, faster storage, faster SQL Server and Oracle instances, and efficient VMware resource management, previous generations of settings and inherent design features inside UCMDB have become out of date. These settings and features now limit performance and scalability to an unacceptable degree. Since UCMDB has always relied on fast I/O more than any other resource, the outdated settings and requests-per-minute (RPM) features within the product negate the huge potential gains of newer, faster infrastructure.
Effectual has performed consistent testing and tuning against the Mercury and HPE UCMDB platform for several major versions. Outlined below, and in the included White Paper, are our official recommendations for UCMDB and Probe application-side tuning. These changes circumvent processing bottlenecks and greatly increase both performance and data quality. The results we see are:
- The fastest possible completion of every operation
- Large-content UCMDBs perform as fast as small-content UCMDBs
- Errors are vastly reduced in all logs and operations
- High data quality remains consistent and predictable across all lifecycle activities (Discovery, integration, TQL, enrichments, patterns)
- Under extremely high load, the only increase in time is seen in the Data Access Layer, with Hibernate and the RDBMS (the most expensive and intensive operations increase from 1 second to about 16 seconds at peak load; these operations have no impact, in either delay or quality, on any other procedure)
Our Official Recommendations
The following table outlines Effectual’s recommendations for adjusting default UCMDB tuning settings. For further clarification and explanation, please download the included Probe and UCMDB Tuning White Paper.
The most common problems that contribute to misconceptions of both poor performance and poor data quality are artificially low queue sizes, throttled thread and queue processing timers, and a variety of unrelated timeouts on the actual requests sent to the framework. Some timeouts are exceedingly long, while others are very short. This leaves the queues in a messy state. Abandoned processing requests pile up and create a huge amount of error “spam” in the various logs, where users tend to focus on the symptom rather than the root cause. Worse, these timeouts abandon critical data transformations and updates. These abandoned updates can affect everything from saving a changed Model to syncing that model with new or changed data from another system or package.
Many of the events triggered by user actions are spread out across the framework. Some of these collide with background processing and artificially low queue limits. As a result, a user request often gets stuck, causing a wait in the user interface that gives the (false) impression of a slow system.
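The effect of an artificially low queue limit can be sketched in a few lines of Python. This is only an illustration using the standard library, not UCMDB code: the function name, the request names, and the limits of 3 and 50 are hypothetical, and we assume a burst of requests arrives before any worker has drained the queue.

```python
import queue

def submit_all(requests, max_queue_size):
    """Try to enqueue every request; return (accepted, rejected) lists.

    Hypothetical model: the whole burst arrives before a worker
    has had a chance to take anything off the queue.
    """
    q = queue.Queue(maxsize=max_queue_size)
    accepted, rejected = [], []
    for req in requests:
        try:
            q.put_nowait(req)      # raises queue.Full once the limit is hit
            accepted.append(req)
        except queue.Full:
            rejected.append(req)   # this is the request that "gets stuck"
    return accepted, rejected

# A burst of ten user-triggered requests (names are made up).
burst = [f"request-{i}" for i in range(10)]

low_ok, low_rejected = submit_all(burst, max_queue_size=3)
high_ok, high_rejected = submit_all(burst, max_queue_size=50)

print(len(low_rejected), "requests stuck with a limit of 3")
print(len(high_rejected), "requests stuck with a limit of 50")
```

With the low limit, most of the burst never reaches a worker at all, even though nothing is wrong with the work itself. Raising the limit lets the same burst through unchanged.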
When a process is told to wait in the system by design, and the system queues additional requests behind it, many different operational timeouts in different functions come into play. A caller waiting on a processing result can then time out without cleaning up or killing the original request. This not only abandons the request in the thread, but also returns one or more critical errors to different logs. Because of the way exception and thread error handling is done, this bottlenecking in the queue appears to be a failure of the feature. In fact, it is simply the result of either an artificially low queue limit or an exceedingly long timeout (intended to help the “long running process” complete its operation).
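The pattern described above, where a caller gives up on a result while the original request keeps running, can be reproduced with Python's standard `concurrent.futures` module. This is a minimal sketch under our own assumptions, not UCMDB's implementation: the task name "model-sync" and both durations are hypothetical.

```python
import concurrent.futures
import time

def demo(wait_timeout=0.1, work_seconds=0.5):
    """Show a caller timing out while the worker keeps running."""
    completed = []

    def long_running_update(name):
        time.sleep(work_seconds)   # stands in for a slow update
        completed.append(name)

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(long_running_update, "model-sync")
        try:
            future.result(timeout=wait_timeout)
            timed_out = False
        except concurrent.futures.TimeoutError:
            timed_out = True       # the caller logs an error and moves on
        # A task that has already started cannot be cancelled,
        # so the abandoned request keeps occupying the worker thread.
        cancelled = future.cancel()
        still_running = future.running()
    # Leaving the "with" block waits for the worker to finish.
    return timed_out, cancelled, still_running, completed

timed_out, cancelled, still_running, completed = demo()
print("caller timed out:", timed_out)
print("request cancelled:", cancelled)
print("request still running after timeout:", still_running)
print("work that finished anyway:", completed)
```

The caller sees a failure, yet the request was never stopped and eventually completes on its own. In a busy queue, every such abandoned request holds a worker that other requests are waiting for.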
And there can be an even worse outcome: result sets can differ from run to run. When repeated over time, these issues cause impossible errors in the data set. The vast spam of errors means the system (especially with large-content UCMDBs) is spending most of its time and computational energy processing garbage.
This is the equivalent of making fruit salad with different kinds of fruit, each kind requiring different types of cutting and preparation. If you’re making a large fruit salad, imagine the effort of slicing one grape, then switching to peeling and cutting a banana, then an apple, as opposed to doing all the grapes, bananas, and apples in their own sequences. In the same way that a human preparing a fruit salad gains efficiency through repetition, preventing processing context switching in the UCMDB can greatly shorten processing duration.
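The fruit salad analogy can be put into numbers with a short Python sketch. It simply counts how often the kind of work changes between consecutive tasks; the task list and the helper name are made up for illustration and do not model any specific UCMDB queue.

```python
def count_switches(tasks):
    """Count how often consecutive tasks differ in kind."""
    return sum(1 for prev, cur in zip(tasks, tasks[1:]) if prev != cur)

# Twelve tasks arriving interleaved: one grape, one banana, one apple, repeat.
arrival_order = ["grape", "banana", "apple"] * 4

# The same twelve tasks grouped so each kind is handled in one sequence.
batched_order = sorted(arrival_order)

interleaved_switches = count_switches(arrival_order)
batched_switches = count_switches(batched_order)

print("switches when interleaved:", interleaved_switches)
print("switches when batched:", batched_switches)
```

The amount of work is identical in both orderings. Only the number of switches changes, from 11 to 2, and each switch avoided is preparation overhead that no longer has to be paid.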
Why Apply Any Of These Tunings?
User experience can and should be fast. By design, all data from any source moves through the UCMDB in various states, from start to finish, as quickly as it can. Removing pre-existing and outdated limitations will improve data accuracy and quality, remove unintended consequences that impact user experience, and reduce the overall error rate, along with the abandoned threads and multitudinous errors that most customers experience on a daily basis.
Once we identified and resolved the total sum of individual bottlenecks, these changes were then scientifically tested over time with various additional solutions (such as Configuration Manager, CMS contexts of both single and multiple UCMDBs, Universal Discovery, Integrations for Population and Push, and user experience inside the various applets and solutions). We found that spam generation was greatly reduced, data quality greatly improved, and extraneous processes were reduced, freeing up significant system resources and greatly enhancing overall performance.
An Ounce of Prevention
It is Effectual’s position that we must stop perpetuating these easily preventable problems. We must also stop treating each customer problem as being an independent customer issue. We have to overhaul the various queues and thread management processes, update the as-coded features that artificially delay and queue requests per minute, and we must evaluate and change all of the queue timeouts and feature-specific timeouts in the tool. It isn’t 1999 any more; we need to tune the UCMDB to run in modern times, on modern hardware.
If you implement the changes suggested in this document, you’ll see a very different end user experience, Discovery result experience, and Integration performance experience. The system will produce far fewer error messages, be much more responsive, and function more smoothly.
We hope you find this information helpful. If you do, please communicate the settings you’ve changed and the outcome to HPE directly. They have expressed interest in including proven findings in the product. Your support will help HPE assign resources to validate and test their best-informed effort toward making UCMDB a modern and top-performing system.