Preston Thornton
Hello,
While working with products such as HP Openview and OMi, I became familiar with features such as message correlation. Does the current version of xMatters have this capability?
Hi Preston,
I might need some elaboration on what you're looking for when it comes to message correlation, but from what I've been able to deduce, I think our event flood control features come closest to this:
https://help.xmatters.com/ondemand/installadmin/systemadministration/event-flood-control.htm
This allows us to look at the content of any event injected by an integration and check whether it matches criteria for past events within a certain timeframe. If enough of these pile up, we "correlate" future events to the earlier one as their root cause, and we don't over-notify the users or groups about it. We still log and capture each one, though, so you can go back and see what came through from a reporting perspective.
Did I get this right? If you have more to your question please let us know.
Piling onto Francois' response: whenever an integration is created, we automatically configure a "flood control" rule titled "Event Rate Filter". That rule is designed to protect on-call resources from noisy integration points. It can be changed in the user interface, but by default any integration that targets requests to the same recipient list more than 4 times in a minute enters a rolling suppression window. As long as you are meeting or exceeding that submission rate, the window continues to suppress requests and correlate them to a parent event. The parent event has a stacked icon in the Recent Events report, and a "Suppression" tab that shows all of the correlated requests is visible when you drill into the event (see: https://help.xmatters.com/ondemand/installadmin/reporting/suppression-report.htm). The default rule also states that while in a suppression window, we notify the recipient list of the suppression condition on the initial trigger, and then every 900s or 1000 requests for as long as the event submission rate continues. You can change any of the defaults to select or remove properties used for correlation, or adjust any of the time values listed (I personally lower my trigger conditions to 3 events within 5s).
That default filter is designed around the communication needs of incident management, i.e. at some point it isn't helpful to diagnose the situation from notifications; you are better off going into the third-party system and viewing the issue directly.
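If it helps to see the default rule as logic, here is a rough Python sketch of a rate-based filter using the numbers mentioned above (more than 4 requests per minute, a reminder every 900s or 1000 requests). This is only an illustration and not xMatters code: the class name, the `submit` method, the action strings, the correlation key, and the choice of the last delivered event as the parent are all assumptions made for the example.

```python
from collections import defaultdict, deque


class EventRateFilter:
    """Toy model of a rate-based flood-control rule (hypothetical, not xMatters code)."""

    def __init__(self, max_events=4, window_s=60.0, remind_s=900.0, remind_count=1000):
        # max_events is assumed to be at least 1.
        self.max_events = max_events
        self.window_s = window_s
        self.remind_s = remind_s
        self.remind_count = remind_count
        self._recent = defaultdict(deque)  # key -> request timestamps inside the window
        self._parent = {}                  # key -> parent event id while suppressing
        self._last_delivered = {}          # key -> id of the last delivered event
        self._last_notice = {}             # key -> (time of notice, suppressed since then)
        self.children = defaultdict(list)  # parent event id -> suppressed event ids

    def submit(self, key, event_id, now):
        """Return 'notify', 'suppress', or 'suppress+notice' for one request."""
        recent = self._recent[key]
        # Keep only the requests that fall inside the rolling window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        recent.append(now)

        if len(recent) <= self.max_events:
            # Rate is at or below the threshold: deliver and end any suppression.
            self._parent.pop(key, None)
            self._last_delivered[key] = event_id
            return "notify"

        if key not in self._parent:
            # Initial trigger: correlate to the last delivered event (an assumption).
            self._parent[key] = self._last_delivered[key]
            self.children[self._parent[key]].append(event_id)
            self._last_notice[key] = (now, 0)
            return "suppress+notice"

        # Already suppressing: record the request under the parent event.
        self.children[self._parent[key]].append(event_id)
        last_time, count = self._last_notice[key]
        count += 1
        if now - last_time >= self.remind_s or count >= self.remind_count:
            self._last_notice[key] = (now, 0)
            return "suppress+notice"
        self._last_notice[key] = (now, count)
        return "suppress"


flood = EventRateFilter()
key = ("integration-a", ("oncall-db",))  # hypothetical: integration + recipient list
results = [flood.submit(key, f"evt-{i}", now=float(i)) for i in range(7)]
print(results)
print(flood.children["evt-3"])
```

In this sketch the first four requests are delivered, the fifth opens the suppression window and sends the suppression notice, and later requests are quietly stacked under the parent until the rate drops back below the threshold. Lowering `max_events` and `window_s` to 3 and 5 would model the tighter trigger mentioned above.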
The Openview suite likely has other correlation capabilities to support things like service dependencies, which we don't do, so you'll want to determine what problems you can solve using our rule-based approach vs what problems require other approaches.
(That is a lot to grok in one go... but hopefully we are pointing you to the correct behavior).