Mattermost Recipe: Handling Incidents with Mattermost and PagerDuty
Update: Since I wrote this article, PagerDuty has released a native integration that works with Mattermost’s incoming webhooks, meaning you can get simple PagerDuty notifications running without installing any other software.
To use it, follow these steps:
- Create an incoming Mattermost webhook and copy the URL
- In PagerDuty, go to Configuration > Services and click on or create a new service
- Under the Integrations panel, add an Extension and search for Mattermost
- Enter a name for your Integration, and then paste the Mattermost webhook URL from step 1 into the box. If you want, you can also specify a user or channel to send to or use the one you configured with Mattermost.
- Create a test incident in PagerDuty. You should see a message similar to this in Mattermost.
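If you want to check the webhook from step 1 before wiring up PagerDuty, a minimal Ruby sketch like this one posts a message to it. The text, channel and username fields are standard incoming webhook parameters. The method names are made up for this example, and the username override only takes effect if your server allows integrations to override usernames.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the JSON body for a Mattermost incoming webhook.
# `text` is required; `channel` and `username` are optional overrides.
def incoming_webhook_payload(text, channel: nil, username: nil)
  payload = { 'text' => text }
  payload['channel'] = channel if channel
  payload['username'] = username if username
  payload
end

# POST the payload to the webhook URL copied in step 1.
# Returns the Net::HTTPResponse so the caller can inspect the status code.
def post_to_mattermost(webhook_url, payload)
  uri = URI.parse(webhook_url)
  Net::HTTP.post(uri, JSON.generate(payload), 'Content-Type' => 'application/json')
end
```

Calling post_to_mattermost(your_webhook_url, incoming_webhook_payload('Webhook check')) should put the message in the channel the webhook was created for.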
Big thanks to David Coleman at PagerDuty for helping roll this out! If you want more functionality out of your PagerDuty notifications, read on.
Here’s the next installment in our Mattermost Recipes series.
The goal of these posts is to provide you with solutions to specific problems, as well as a discussion about the details of the solution and some tips about how to customize it to suit your needs perfectly.
If there’s a Recipe you want us to cook up in the future, drop us a line on our forum.
Problem
Many DevOps teams have incident alerting systems. But responding to the incident and getting a team together can take a while, and incident discussion can clutter up existing Mattermost channels. This is a simple recipe to help notify people that an incident occurred and keep discussion of the incident organized.
Solution
This is a very specific solution. When an incident is triggered, it creates a channel and invites some people. When the incident status changes, it updates the channel header. When the incident is resolved, it closes the channel and posts resolution statistics to Town Hall.
Note: This code is mainly used to illustrate how to access the Mattermost API and connect it to a webhook. It should be considered a guide more than a production-ready application.
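To make the flow concrete, here is a rough sketch of how a handler script could decide what to do with each notification. It is not the recipe's actual code. It assumes the payload has a top-level messages array whose entries name the event under event (or type in older payloads), and the action names are hypothetical labels.

```ruby
require 'json'

# Pick the action the recipe describes for a PagerDuty event name.
# The returned symbols are hypothetical labels used only in this sketch.
def action_for(event)
  case event
  when 'incident.trigger'     then :create_channel_and_invite
  when 'incident.acknowledge' then :update_channel_header
  when 'incident.resolve'     then :close_channel_and_report
  else :ignore
  end
end

# Parse a webhook body and return [event, action] pairs, one per message.
# Assumes a top-level "messages" array (an assumption about the payload shape).
def plan_actions(raw_body)
  messages = JSON.parse(raw_body).fetch('messages', [])
  messages.map do |message|
    event = message['event'] || message['type']
    [event, action_for(event)]
  end
end
```

Keeping the decision separate from the API calls makes it easy to test the mapping with the sample payloads before touching a real server.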
0. Set up a Mattermost server
Instructions are available here.
1. Set up the code
The code for this is open source and available here. It includes a webhook configuration, a Ruby script called from the webhook and a sample config file as well as a small Mattermost API library to handle talking to your Mattermost server.
To create your own config files, make a copy of sample.hooks.json and rename it to hooks.json. Then edit the execute-command and command-working-directory settings to use the correct path.
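If you'd rather script that edit than do it by hand, something like the following could work. It assumes sample.hooks.json is a JSON array of hook objects, which is the layout the webhook tool reads, and the method names and paths are placeholders for this example.

```ruby
require 'json'

# Return a copy of the parsed hooks with both path settings replaced.
# The input array is left unchanged.
def with_local_paths(hooks, script_path, working_dir)
  hooks.map do |hook|
    hook.merge(
      'execute-command' => script_path,
      'command-working-directory' => working_dir
    )
  end
end

# Read the sample file, fill in the paths and write the result.
# Assumes the sample file contains a JSON array of hook objects.
def write_hooks_file(sample_path, output_path, script_path, working_dir)
  hooks = JSON.parse(File.read(sample_path))
  updated = with_local_paths(hooks, script_path, working_dir)
  File.write(output_path, JSON.pretty_generate(updated))
end
```

For example, write_hooks_file('sample.hooks.json', 'hooks.json', '/path/to/script.rb', '/path/to/recipe') would produce a hooks.json that points at your own checkout.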
Next, make a copy of sample.conf.yaml and rename it to conf.yaml. Then edit the configuration to authenticate to your Mattermost server and specify the team name and users who should be notified.
Finally, make sure you have webhook installed and run webhook -verbose to start listening for the notifications.
To test that the webhook is working correctly, run this command from inside the PagerDuty Recipe directory:
$ curl -vX POST http://127.0.0.1:9000/hooks/pagerduty_hook -d @./test_data/incident.trigger.js --header "Content-Type: application/json"
Change trigger in the file name to either acknowledge or resolve to test those incident states.
2. Configure PagerDuty webhook settings
PagerDuty is a widely used alerting system. But any system that can send an outgoing webhook when an event is triggered could be used as a replacement.
Outgoing webhooks in PagerDuty are linked to monitoring services, and you add one as an extension to the service it should watch. To get to your services, go to Configuration > Services. Then click the service name, click the Integrations tab, click New Extension, and enter a name for your webhook and the URL to call, which should end with pagerduty_hook to match the hooks.json file.
3. Test it out
Next, trigger an alert in PagerDuty and acknowledge and resolve it. When the issue is created, you’ll see a private channel created for the incident, with a header that shows the status of the ticket:
When the ticket status is updated, like with an acknowledgement, it updates the channel header to indicate the new status:
And when the issue is resolved, users are removed from the channel and the resolution is posted in Town Hall with some information to let the whole team know the problem is fixed:
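For readers who want to see roughly what these three steps look like as API calls, here is a sketch against the Mattermost v4 REST API. It is not the small API library that ships with the recipe, which may use different endpoints. The method names, the incident-NUMBER naming scheme and the header text are assumptions made for this example.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Request that creates a private ("P") channel for an incident.
def create_incident_channel_request(team_id, incident_number, status)
  {
    method: :post,
    path: '/api/v4/channels',
    body: {
      'team_id' => team_id,
      'name' => "incident-#{incident_number}",
      'display_name' => "Incident #{incident_number}",
      'type' => 'P',
      'header' => "Status: #{status}"
    }
  }
end

# Request that patches only the channel header when the status changes.
def update_header_request(channel_id, status)
  {
    method: :put,
    path: "/api/v4/channels/#{channel_id}/patch",
    body: { 'header' => "Status: #{status}" }
  }
end

# Request that posts the resolution notice to another channel.
def resolution_post_request(channel_id, incident_number)
  {
    method: :post,
    path: '/api/v4/posts',
    body: {
      'channel_id' => channel_id,
      'message' => "Incident #{incident_number} has been resolved."
    }
  }
end

# Send one of the requests above using a personal access or session token.
def send_request(base_url, token, request)
  uri = URI.join(base_url, request[:path])
  klass = request[:method] == :put ? Net::HTTP::Put : Net::HTTP::Post
  http_request = klass.new(uri)
  http_request['Authorization'] = "Bearer #{token}"
  http_request['Content-Type'] = 'application/json'
  http_request.body = JSON.generate(request[:body])
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(http_request)
  end
end
```

Building each request as plain data first means you can print or test it without a server, and only send_request touches the network.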
Discussion
This recipe shows just a couple of ways you can use a webhook and Mattermost to improve incident notification and organization, and there is plenty of room to extend it. For example, on resolution of a ticket, a script could get all the posts in the incident channel, as well as any files that were uploaded, and put them in an archive that’s attached to the incident resolution.
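As a starting point for the archive idea, this sketch turns the JSON returned by the v4 endpoint GET /api/v4/channels/{channel_id}/posts into a plain-text transcript. It assumes the response has a posts object keyed by post ID, with create_at in milliseconds. The method name is made up, file attachments are not handled, and user IDs are printed as-is instead of being resolved to usernames.

```ruby
require 'json'

# Convert a "get posts for channel" response body into a transcript,
# oldest post first. Sorting by create_at avoids relying on the order
# in which the server lists the posts.
def transcript_from_posts(response_body)
  data = JSON.parse(response_body)
  posts = data.fetch('posts', {}).values
  lines = posts.sort_by { |post| post['create_at'] }.map do |post|
    # create_at is in milliseconds since the Unix epoch.
    time = Time.at(post['create_at'] / 1000).utc.strftime('%Y-%m-%d %H:%M:%S')
    "[#{time}] #{post['user_id']}: #{post['message']}"
  end
  lines.join("\n")
end
```

The resulting string could be written to a file and uploaded alongside the resolution message.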
Because Mattermost supports interactive message buttons and slash commands, you can also send hooks out of Mattermost. You could acknowledge a PagerDuty incident or create a Jira ticket with the relevant incident information without leaving your Mattermost client.
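Here is a sketch of what a message with an Acknowledge button might look like when sent through an incoming webhook. The attachments, actions and integration fields follow Mattermost's interactive message format. The method name, the callback URL and the contents of context are hypothetical, and the service behind that URL, which would call PagerDuty to acknowledge the incident, is not shown.

```ruby
require 'json'

# Build a webhook payload containing a single "Acknowledge" button.
# When the button is clicked, Mattermost sends the context to callback_url.
def acknowledge_button_payload(incident_id, callback_url)
  {
    'text' => "Incident #{incident_id} was triggered.",
    'attachments' => [
      {
        'text' => 'Respond without leaving Mattermost:',
        'actions' => [
          {
            'id' => 'acknowledge',
            'name' => 'Acknowledge',
            'integration' => {
              'url' => callback_url,
              # Hypothetical context the receiving service would act on.
              'context' => { 'action' => 'acknowledge', 'incident_id' => incident_id }
            }
          }
        ]
      }
    ]
  }
end
```

JSON.generate(acknowledge_button_payload('ABC123', 'https://hooks.example.com/ack')) gives the body to post to the incoming webhook.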
PagerDuty also supports other event types that you may want to handle differently, such as adding users to the incident channel when an incident is assigned to them.
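The assignment case could be sketched like this. It assumes the event is named incident.assign and that you maintain your own lookup from PagerDuty email addresses to Mattermost user IDs. Both method names and the lookup are hypothetical; the endpoint is the v4 call for adding a channel member.

```ruby
# Build the request that adds one user to the incident channel.
def add_member_request(channel_id, user_id)
  {
    method: :post,
    path: "/api/v4/channels/#{channel_id}/members",
    body: { 'user_id' => user_id }
  }
end

# Return add-member requests for an assignment event, or [] for any other event.
# `user_ids_by_email` is a hypothetical lookup from PagerDuty email to
# Mattermost user ID; assignees missing from it are skipped.
def requests_for_assignment(event, assignee_emails, channel_id, user_ids_by_email)
  return [] unless event == 'incident.assign'

  assignee_emails
    .map { |email| user_ids_by_email[email] }
    .compact
    .map { |user_id| add_member_request(channel_id, user_id) }
end
```

Skipping unknown assignees keeps the handler from failing when someone on call has no Mattermost account yet.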
Resources
Here’s where you can find everything you need to write your own Mattermost incident management system based on PagerDuty:
(Editor’s note: This post was written by Paul Rothrock, Customer Community Manager at Mattermost, Inc. If you have any feedback or questions about Mattermost Recipe: Handling Incidents with Mattermost and PagerDuty, please let us know.)