William Joyner
I am not sure where to go with this, so I figured posting here would be a good start.
Currently Nagios and xMatters are integrated and functioning. I have a few questions that maybe someone can answer or point me in the right direction on.
1. When an alert is acknowledged from any device, it does remove the alert in Nagios, but it seems to pop back after a few minutes. I am not sure where I can solve this. My desired result is that once the alert is acknowledged, it stops triggering.
2. The default responses for alerts within the Nagios integration are good, but I would like to change them to add a sticky acknowledgement that holds until the problem is resolved.
Comments
Hey William!
Good to hear you got things integrated. I'll see what I can do. I'm only superficially familiar with Nagios, but hopefully we can work through it together.
1. Looking at the Nagios integration we have here, it seems there are no outbound integration scripts that would carry the acknowledgement back to Nagios. At least one of these will be required to take any action in Nagios. Did you create one, or do we need to sort that out first?
If you did get one created and xMatters is talking to Nagios, then it sounds like we just have to tweak what xMatters does in Nagios. I poked around serverfault and it looks like Nagios doesn't have a concept of "clear", so when you say "stop triggering" are you talking about not triggering the alert in the first place? Would something like "downtime" mentioned here be helpful, or would it be something else?
Alternatively, we could run some code on the inbound integration script to search for any (active) events with the same X. Where X is hostname, service name, etc, and if there are any, then don't create an event.
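The duplicate check described above could be sketched roughly like this. It is only an illustration of the matching logic in Python, not xMatters integration script code: the `host`, `service`, and `status` field names and the `"ACTIVE"` value are assumptions, and the list of active events would have to come from whatever event lookup your integration has available.

```python
from typing import Iterable, Mapping, Sequence


def has_matching_active_event(
    incoming: Mapping[str, str],
    active_events: Iterable[Mapping[str, str]],
    keys: Sequence[str] = ("host", "service"),
) -> bool:
    """Return True if an active event already exists for the same host/service.

    All field names here are hypothetical placeholders for whatever
    properties the real events carry.
    """
    # Build a fingerprint from the properties that identify "the same alert".
    fingerprint = tuple(incoming.get(key) for key in keys)
    return any(
        event.get("status") == "ACTIVE"
        and tuple(event.get(key) for key in keys) == fingerprint
        for event in active_events
    )


# Example: only create a new event when nothing matching is still active.
active = [
    {"host": "web01", "service": "HTTP", "status": "ACTIVE"},
    {"host": "db01", "service": "Disk", "status": "TERMINATED"},
]
incoming_alert = {"host": "web01", "service": "HTTP"}
create_event = not has_matching_active_event(incoming_alert, active)
```

Matching on both host and service keeps alerts for different services on the same host from suppressing each other; pass a different `keys` tuple to loosen or tighten that.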
2. So the first thing we'll need to do is add a Response Option to the form. Check out the docs here to get started. The second part is to make the API or command calls into Nagios. Do you know which command or API call is needed to make a Sticky Acknowledgement in Nagios? We can backtrack from there.
Happy Thursday!
--- Travis
Travis,
Thank you for the response.
1. I am not sure exactly how the integration agent works, but when you do the installation on the Nagios side of things, there are scripts that seem to handle the response received from xMatters and translate it back to Nagios.
That functionality is there even without the outbound connection. This was my first thought when I had issues with the initial setup, but working with xMatters support resolved that.
The generic responses that are set up when you receive an alert from Nagios are:
Acknowledge - this acknowledges the alert and will add that information to the host if you look in Nagios.
Refuse - this seems to send it to the next on call person in the rotation.
SchedDown60 - this will schedule downtime on the specific Host/Service for 60 minutes so no more alerts will fire from Nagios to xMatters.
Our environment is so large that someone on call can have an enormous number of alerts come through. I want to be able to give them a few more response options that will:
1. Schedule downtime for more than 60 minutes.
2. Do a sticky acknowledgement that stops the alerts for a specific host/service until it is resolved.
I cannot seem to find where these are defined or are being translated from xmatters to Nagios.
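For reference, Nagios Core accepts acknowledgements through its external command file using `ACKNOWLEDGE_SVC_PROBLEM` and `ACKNOWLEDGE_HOST_PROBLEM`, where a sticky value of 2 keeps the acknowledgement in place until the host or service recovers. A non-sticky acknowledgement is dropped on any state change, which may be why the alert pops back. The sketch below only builds the command line in Python; the helper names are made up for illustration, the command file path varies by install, and how the integration's own scripts issue the command is not something this thread confirms.

```python
import time
from typing import Optional


def build_sticky_ack(
    host: str,
    author: str,
    comment: str,
    service: Optional[str] = None,
    now: Optional[float] = None,
) -> str:
    """Build a Nagios external command line for a sticky acknowledgement.

    Note: values must not contain ';', since that is the field separator.
    """
    timestamp = int(time.time() if now is None else now)
    sticky = 2      # 2 = acknowledgement stays until recovery (OK/UP)
    notify = 1      # send an acknowledgement notification to contacts
    persistent = 1  # keep the comment across Nagios restarts
    if service is None:
        body = (
            f"ACKNOWLEDGE_HOST_PROBLEM;{host};"
            f"{sticky};{notify};{persistent};{author};{comment}"
        )
    else:
        body = (
            f"ACKNOWLEDGE_SVC_PROBLEM;{host};{service};"
            f"{sticky};{notify};{persistent};{author};{comment}"
        )
    return f"[{timestamp}] {body}\n"


def send_command(line: str, command_file: str) -> None:
    """Append one command line to the Nagios command file (path is install-specific)."""
    with open(command_file, "a") as handle:
        handle.write(line)
```

For example, `build_sticky_ack("web01", "wjoyner", "On it", service="HTTP")` produces a line that can be passed to `send_command` along with the path to your `nagios.cmd` file.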
Any help is appreciated.
I forgot to answer your other part.
Yes, I do want to be able to set a downtime for an extended period, and to do that from xMatters.
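As a rough illustration of what a longer downtime response would need to send, Nagios Core's external commands `SCHEDULE_SVC_DOWNTIME` and `SCHEDULE_HOST_DOWNTIME` take explicit start and end times. The Python sketch below builds that command line for an arbitrary number of minutes; the function name and arguments are hypothetical, and it does not show where the xMatters integration itself defines its SchedDown60 response.

```python
import time
from typing import Optional


def build_downtime(
    host: str,
    minutes: int,
    author: str,
    comment: str,
    service: Optional[str] = None,
    now: Optional[float] = None,
) -> str:
    """Build a Nagios external command line scheduling fixed downtime."""
    start = int(time.time() if now is None else now)
    end = start + minutes * 60
    fixed = 1       # 1 = downtime runs exactly from start to end
    trigger_id = 0  # 0 = not triggered by another downtime entry
    duration = end - start  # seconds; only consulted for flexible downtime
    if service is None:
        body = (
            f"SCHEDULE_HOST_DOWNTIME;{host};{start};{end};"
            f"{fixed};{trigger_id};{duration};{author};{comment}"
        )
    else:
        body = (
            f"SCHEDULE_SVC_DOWNTIME;{host};{service};{start};{end};"
            f"{fixed};{trigger_id};{duration};{author};{comment}"
        )
    return f"[{start}] {body}\n"


# Example: four hours of downtime for one service, starting now.
command_line = build_downtime("web01", 240, "wjoyner", "Extended maintenance", service="HTTP")
```

The resulting line is written to the Nagios command file in the same way as any other external command.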
Hi William,
Travis can probably come up with something more elegant (and should probably confirm what I'm about to tell you, since it's not my area of expertise), but you should be able to add a new response option fairly easily that will give you a longer specified downtime.
This doesn't give you the 'Thanks Nagios, I'm on it' solution, but maybe someone else here who knows Nagios a bit better can offer some ideas.
Christine,
Thank you! I will try what you suggested. I really needed the location and what to tweak, which you provided.
I will let you know how it goes.