The DevOps Playbook uses the xMatters communications platform to simulate toolchain integrations. This allows you to quickly get a feel for the power of xMatters without spending the time to deploy 3 or more integrations.
Steelbrick Inc. has xMatters integrated into their system. Their Application Performance Monitoring tool (see here for a full list of our APM integrations!) sends an alert about high CPU usage for the Steelbrick application. Let's see how this plays out in xMatters.
Setting up the playbook
In the playbook, we use constants in xMatters to simulate closing the loop back to the APM application and pushing to a chat application, so there's a bit of setup you have to do.
You need access to an xMatters instance, of course, and a device on which you can receive notifications. (If you want to try the conference call scenario, make sure you have a voice device handy...)
Download the sample integration:
To use this playbook, first you need to download the sample integration. (Don't worry, clicking the link just takes you to the bottom of the page.) Don't extract the zip file! You'll import it straight into xMatters.
Now that you've got the file downloaded, let's import the playbook and get it set up.
Get the playbook ready to run:
- Go to the Developer tab and click Import Plan. Select the .zip file you downloaded.
- If you get warnings about languages, it just means you have languages enabled in your instance that we don't have translations for in the sample plan.
- After the import is done, click Edit beside the plan and select Integration Builder, then expand the list of inbound integrations:
- Click "Inbound for Playbook Status Update", scroll down and copy the URL.
- In the real world, you'd copy this URL into a webhook in your APM tool. But for our simulation, we need to use constants to mimic that webhook. So... click the breadcrumbs to go back to the Integration Builder and click Edit Constants.
- Select the "Playbook Status Update Endpoint" constant, paste the path (or the full URL, either works) in the Value field, and click Save Changes.
- Repeat for "Inbound for Post to Chat" > "Post to Chat Endpoint" and "Inbound for Rollback Last Commit" > "Rollback Last Commit Endpoint".
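If you're curious what the real-world version of that webhook might look like, here's a minimal sketch in Python. It only builds the request and never sends it. The base URL, the path, and the payload fields are all made-up placeholders (not values from the sample plan), and treating "path or full URL" as a join against the instance's base URL is our assumption about why either works.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Hypothetical instance address; substitute your own hostname.
BASE_URL = "https://yourcompany.example.com"


def resolve_endpoint(value, base_url=BASE_URL):
    """Accept either a path or a full URL and return a full URL."""
    # urljoin returns `value` unchanged when it is already an absolute URL.
    return urljoin(base_url, value)


def build_status_update(endpoint_value, payload):
    """Build (but do not send) a JSON POST similar to a webhook call."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        resolve_endpoint(endpoint_value),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder path and illustrative payload fields, not a real schema.
request = build_status_update(
    "/hypothetical/inbound/path",
    {"status": "remediation started", "assignee": "your.username"},
)
print(request.get_method(), request.full_url)
```

To actually send it, you would pass `request` to `urllib.request.urlopen`, using the URL you copied from the inbound integration.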
That's it – you're ready to play.
Following the Playbook
Think of the playbook as an ever-changing adventure where you can have hours of fun, starting at point A, but then choosing to go to point B, C, or D. Or even all of the above! So let's start adventuring.
Navigate to the Messaging tab and, in the DevOps Playbook section, click Application Performance Monitoring (aka APM) Alert. Default information has been entered for you, except for the recipient – that's you! So, in the recipient field, start typing your username to set yourself as the recipient.
Once you're satisfied, click Send Message.
All enabled devices are targeted with the notification content appropriate for that device, according to the settings configured in your Devices tab in your user profile.
For this adventure, choose one of the "Run remediation" responses. You should get a notification that the task was successful.
The next sections go over what happens with each response option.
Run remediation task Reboot Server | Run remediation task Increase Resource Pool | Rollback last commit
These three options simulate reaching back into the APM to set the responder – you again – as the assignee and telling your CI/CD application to run a task with one simple click of a button (or tap of a screen).
In a real-world environment, this is considered a "closed-loop" integration: the event is generated in the APM application and fired to xMatters, people are notified, they respond, and that information is delivered back to the APM tool. This helps keep your teams informed and confident that someone owns the incident.
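Here's a rough Python sketch of that closed loop, purely for illustration. The `ApmIncident` class, the task names, and the `handle_response` function are hypothetical stand-ins; they are not part of xMatters or of any APM tool's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApmIncident:
    """Stand-in for an incident record in the APM tool (hypothetical shape)."""
    summary: str
    assignee: Optional[str] = None
    log: list = field(default_factory=list)


# Hypothetical mapping from response option to the task a CI/CD tool would run.
REMEDIATION_TASKS = {
    "Run remediation task Reboot Server": "reboot-server",
    "Run remediation task Increase Resource Pool": "increase-resource-pool",
    "Rollback last commit": "rollback-last-commit",
}


def handle_response(incident, responder, response):
    """Close the loop: record the owner and the chosen task on the incident."""
    task = REMEDIATION_TASKS.get(response)
    if task is None:
        # Other responses are only noted; nothing is sent back.
        incident.log.append(f"{responder} responded '{response}' (noted only)")
        return None
    incident.assignee = responder
    incident.log.append(f"{responder} triggered {task}")
    return task


incident = ApmIncident("High CPU usage for the Steelbrick application")
task = handle_response(incident, "your.username", "Rollback last commit")
print(task, incident.assignee)
```

The point of the sketch is the last step: the response travels back to where the event started, so the incident itself shows who owns it and what was run.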
Post to chat
Do you need to call in your teammates to help figure out a solution? This option simulates creating a chat room in your favorite chat application. From here, you could use the chat integrations we have to invite others to assist in quickly getting the issue resolved.
Escalate
Are you swamped dealing with another issue? Or are you on vacation and forgot to set a replacement? The Escalate response option lets you tell xMatters that you aren't available to deal with the issue, so xMatters can immediately push the alert to the next on-call person. In this playbook scenario, since we only targeted one person (yup, you), the response is simply noted and no further action is taken.
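As a toy illustration of why escalating does nothing when you're the only recipient, here's a small Python sketch. The `escalate` function and the names in the lists are invented for this example; real on-call escalation in xMatters is driven by groups and shifts, which this doesn't model.

```python
def escalate(recipients, current):
    """Return the next recipient after `current`, or None if nobody is left."""
    position = recipients.index(current)
    if position + 1 < len(recipients):
        return recipients[position + 1]
    return None


# With a second person on call, the alert moves on to them.
next_with_team = escalate(["you", "teammate"], "you")

# With only one recipient (the playbook scenario), there is nobody to escalate to.
next_when_alone = escalate(["you"], "you")

print(next_with_team, next_when_alone)
```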
End
Already fixed the problem because you're just that good? The End response option terminates the event in xMatters and stops any further notifications about it. In a real-life scenario, this option could reach into the APM application to resolve the issue. For this playbook, the response is just noted.
Ready for your next adventure?
Try out one of our other playbooks:
If you want to dive right in, set up one of our APM integrations, then maybe link it to one of our continuous integration / delivery or chat integrations, as demonstrated in the playbook.
Or try something new: connect it to an IT service management integration. When the APM application alerts xMatters of an issue, add a response option that creates a ticket for the group responsible for the service to look into the cause.
I didn't receive any notifications
Uh oh! That's not good. First, check the Reports tab to make sure the event was created. If the event was created, then you'll see a nice happy entry in there, like so:
You can click on the title to display a dashboard breaking down the delivery:
...and clicking on each tile shows more information. Make sure there are no errors and that Delivered shows at least 1.
I didn't receive an email/phone call/SMS/push notification
Ok, so you got the notification on some devices, but not all the ones you expected. First, check your devices – click the profile icon in the upper right and select Devices.
Well, here's a problem: I have a 10004-minute delay after my email! Yeah, that's not going to work:
If the delays are in good order, click on the Options dropdown for each one and make sure the device is enabled and the schedule is as expected:
You can also see pertinent information in the Tracking Report on the Reports tab. This shows any system errors encountered delivering to the device. Or you can check out the Log tab in the event report.
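To see why one long delay can hold up everything after it, here's a small Python sketch. It assumes devices are tried in order and that each delay is a wait after that device before moving to the next one; the device names and the `notification_offsets` function are made up for illustration.

```python
def notification_offsets(devices):
    """Return (device, minutes from event start) for each device in order.

    `devices` is a list of (name, delay_after_minutes) pairs.
    """
    offsets = []
    elapsed = 0
    for name, delay_after in devices:
        offsets.append((name, elapsed))
        elapsed += delay_after  # wait before trying the next device
    return offsets


# Hypothetical device list with the runaway delay after the email device.
devices = [("Work Email", 10004), ("SMS Phone", 0), ("Mobile App", 5)]
offsets = notification_offsets(devices)

for name, minutes in offsets:
    print(f"{name}: notified after {minutes} minutes")
```

Under that assumption, everything after the email device waits 10004 minutes, which is nearly a week.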
Choosing a response didn't send me another notification
Well, not ALL of the responses send a notification. For example, the Escalate and End response options won't trigger an additional notification.
Outside of those, you should have received a new notification, so we need to do a little digging. First, review the "Setting up the playbook" section and make sure the constants are set properly. If those look OK, we need to dig a bit deeper.
Head over to the Developer tab and click Edit > Integration Builder next to the DevOps Playbook Communication Plan.
Then, expand the Outbound integrations and click the gear icon next to the response option you chose. Click Activity Stream.
If it is blank, logging might be turned off. Flip the slider and try your response option again; if that doesn't work, inspect the Activity Stream for any errors.
If you're still not sure, reply in the comments below or open up a ticket with our cheerful support peeps...people.