The idea behind Incident Management is to be ready. Not ready for anything, as that can be an unrealistic expectation, but ready to respond when the unexpected inevitably happens. DevOps teams often create incident playbooks in order to ensure they are as ready as possible to handle situations as they arise.
Luckily, there is some amazing documentation on how to do just that from our friends at PagerDuty. Using some of the examples from their Incident Response documentation, let’s create an incident playbook in Mattermost.
How to create a new playbook in Mattermost
We start by clicking the Playbooks & Incidents link in the Mattermost navigator.
This will bring us to the Playbooks & Incidents page where we can see previously created playbooks, an incident management playbook template, and a link to create new playbooks. This shows the flexibility of playbook creation. While we are focusing on Incident Management in this article, you can create a playbook for almost anything: CI/CD automation, chaos engineering planning, even onboarding new team members!
For this run through, we’ll select Create a Playbook.
We’ll need to fill out a few fields to ensure we are clear when executing our plan. As pointed out in PagerDuty’s documentation, we need to set expectations and ensure we give people actionable tasks and roles that are clearly defined. All this should be established before creating a playbook.
For our example, we will use a basic template and begin filling in our fields:
We begin with our Playbook Title, Cloud Outage Incident. This lets all appropriate parties know this is specifically related to cloud outages as opposed to an issue with GitHub or the Mattermost Docker image. In our Playbook Description, we will outline what we mean by the title and define terms if necessary.
Next, we define our stages. We start with a Default Stage (which can be renamed if we so desire). These are the steps we use to get an incident started and to establish points of contact that will flow through the incident. For some Playbooks, roles may be played by a rotating list of responsible parties. Ensure anyone who might start an incident knows where to find that information.
For our example incident, we will mark our first stage Discovery Stage and add a few tasks and assign a few roles. Some of the tasks for a stage may require further description and Mattermost offers that capability.
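To make the structure concrete, it can help to picture the playbook as nested data: a playbook holds stages, and each stage holds tasks. The Python sketch below is illustrative only. The class and field names (Playbook, Stage, Task, owner_role) are assumptions made for this article, not Mattermost's actual data model or API, and the task and role names are made-up examples.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    title: str
    description: str = ""  # optional longer explanation of the task
    owner_role: str = ""   # hypothetical field: the role expected to handle it


@dataclass
class Stage:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Playbook:
    title: str
    description: str
    stages: List[Stage] = field(default_factory=list)


# Example content is invented for illustration.
playbook = Playbook(
    title="Cloud Outage Incident",
    description="Steps for responding to an outage at our cloud provider.",
    stages=[
        Stage(
            name="Discovery Stage",
            tasks=[
                Task("Confirm the outage", owner_role="Incident Commander"),
                Task(
                    "Notify stakeholders",
                    description="Post a short status update.",
                    owner_role="Communications Lead",
                ),
            ],
        )
    ],
)

# Print an outline of the playbook, one stage at a time.
for stage in playbook.stages:
    print(stage.name)
    for task in stage.tasks:
        print(f"  - {task.title} ({task.owner_role})")
```

Writing the outline down like this before opening the playbook editor is one way to confirm that every task has a clear owner.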
Customizing your playbook to streamline incident response
Additionally, you can add slash commands to associate with certain tasks. Not every task will require that, but it does help streamline the incident response.
If necessary, we can set up additional stages for each step in the playbook. These are added the same way as our first stage. Ensure you have entered every step needed to properly resolve the incident.
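Before saving, it can be worth reviewing the draft for gaps such as a stage with no tasks or a slash command entered without its leading slash. The following is a minimal sketch of that review in Python, assuming the draft has been written out as plain dictionaries. The dictionary keys, the stage and task names, and the command text are all hypothetical; none of this reflects a Mattermost export format or a real Mattermost command.

```python
def find_gaps(stages):
    """Return a list of human-readable problems found in the draft stages."""
    problems = []
    for stage in stages:
        tasks = stage.get("tasks", [])
        if not tasks:
            problems.append(f"Stage '{stage['name']}' has no tasks")
        for task in tasks:
            command = task.get("command", "")
            # Slash commands are optional, but when present they start with "/".
            if command and not command.startswith("/"):
                problems.append(
                    f"Task '{task['title']}' has a malformed command: {command}"
                )
    return problems


# Hypothetical draft with two deliberate mistakes.
draft_stages = [
    {
        "name": "Discovery Stage",
        "tasks": [
            {"title": "Confirm the outage"},
            {"title": "Post a status update", "command": "status update"},
        ],
    },
    {"name": "Resolution Stage", "tasks": []},
]

for problem in find_gaps(draft_stages):
    print(problem)
```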
Once you have your stages and tasks established, you can select whether this is a Public or Private playbook (private is the default). In this area you can also decide to share the playbook with others in Mattermost.
When we are finished with Settings, we can click Save in the upper right-hand corner. We will be taken back to our Playbook list screen and see that our new playbook is available and can be edited if our response techniques evolve.
Executing an incident playbook
Now that we have the playbook, we need to execute it. To invoke an incident and select a playbook, use the /incident start slash command. This will bring up the Incident Details screen.
In the Playbook drop-down, we will select the playbook we just created, then add some information about our incident to make the situation clear to our incident responders. Once we have all our information, we can click Start Incident.
Starting an incident will open a channel specifically for this outage. You can step through the stages and tasks you’ve designated and have all the moving parts in one place.
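Stepping through stages and tasks amounts to working down an ordered checklist. As a rough illustration of that idea, the Python sketch below finds the next unfinished task across all stages. The data layout, the "done" flag, and the stage and task names are assumptions for this example; Mattermost tracks this state for you in the incident channel.

```python
def next_open_task(stages):
    """Return (stage name, task title) for the first unfinished task, or None."""
    for stage in stages:
        for task in stage["tasks"]:
            if not task["done"]:
                return stage["name"], task["title"]
    return None


# Hypothetical snapshot of an incident in progress.
incident = [
    {
        "name": "Discovery Stage",
        "tasks": [
            {"title": "Confirm the outage", "done": True},
            {"title": "Notify stakeholders", "done": False},
        ],
    },
    {
        "name": "Resolution Stage",
        "tasks": [{"title": "Apply the fix", "done": False}],
    },
]

print(next_open_task(incident))
```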
With this playbook we can work to resolve the incident and worry less about logistics, as they are all arranged for the team before we start. It’s important to note that playbooks must be kept up to date; their readiness should not be taken for granted.
Get started with Incident Management Playbooks in Mattermost
Try setting up your Incident Management Playbook in Mattermost today! It’s likely you already have playbooks in some form, so give the Incident Management playbook creator from Mattermost a spin and see how you can make incident response better for all the first responders in your organization.