Managing security incidents can be a stressful job. You are dealing with many questions all at once. What’s the scope? Who do I need to engage? How do I manage all of this?
As an Incident Commander (IC), you have many responsibilities. You’re responsible for driving an incident to resolution as quickly as possible, creating the resources necessary to document, collaborate, and communicate while helping identify, engage, and orient the right people. On top of that, you also need to manage the life cycle of the incident and help bring the incident lessons back to the organization at the end. Trying to do all these things in a consistent way while managing an incident is not only very challenging, but it’s also likely to increase the mean time to assemble (MTTA), the mean time to resolution (MTTR), and ultimately the cost of the incident.
At Netflix, we like to automate ourselves out of problems. Automation allows us to run lean, continue to meet the growing needs of the business, and focus our time on what really matters: resolving security incidents. With this in mind, we set out to create an orchestration framework for crisis management. We called this framework “Dispatch.”
Okay, but what is Dispatch? Put simply, Dispatch is:
All of the ad-hoc things you’re doing to manage incidents today, done for you, and a bunch of other things you should’ve been doing but have not had the time!
Dispatch helps us effectively manage security incidents by deeply integrating with existing tools used throughout the organization (e.g. Slack, G Suite, PagerDuty, Jira). Because people already know these tools, Dispatch can provide orchestration through them instead of introducing another tool that people need to learn.
This means you can let Dispatch focus on creating resources, assembling participants, sending out notifications, tracking tasks, and assisting with post-incident reviews, allowing you to focus on actually resolving the issue.
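To make the idea of orchestration a little more concrete, here is a minimal sketch of what "opening an incident" could look like when those steps are automated. This is not Dispatch's actual code or API: the `Incident` class, the `ON_CALL` mapping, and the `open_incident` function are all hypothetical names invented for illustration, and the integrations with real tools are reduced to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical record of an incident and the metadata about its response."""
    title: str
    severity: str
    resources: list[str] = field(default_factory=list)
    participants: list[str] = field(default_factory=list)
    notifications: list[str] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)


# Hypothetical engagement rules keyed by severity. A real system would
# look these up in a paging or directory service instead of a dict.
ON_CALL = {
    "high": ["incident-commander", "security-oncall", "comms-lead"],
    "low": ["incident-commander"],
}


def open_incident(title: str, severity: str) -> Incident:
    """Run the routine steps an IC would otherwise perform by hand."""
    incident = Incident(title, severity)

    # 1. Create the resources used to document, collaborate, and communicate.
    incident.resources += [
        f"chat-channel:{title}",
        f"incident-doc:{title}",
        f"ticket:{title}",
    ]

    # 2. Assemble participants based on incident context (here, only severity).
    #    Unknown severities fall back to the "low" engagement list.
    incident.participants += ON_CALL.get(severity, ON_CALL["low"])

    # 3. Record a notification for every participant.
    for person in incident.participants:
        incident.notifications.append(f"notified {person} about '{title}'")

    # 4. Track follow-up work so the lessons make it back to the organization.
    incident.tasks.append("schedule post-incident review")

    return incident


if __name__ == "__main__":
    incident = open_incident("phishing-campaign", "high")
    print(incident.resources)
    print(incident.participants)
```

The point of the sketch is only that every step is deterministic and repeatable, so it can run the same way for every incident while the IC concentrates on the response itself.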
There are four main components to Crisis Management that we are attempting to address with Dispatch:
- Resource Management — The management of not only data collected about the incident itself but all of the metadata about the response.
- Individual Engagement — Understanding the best way to engage individuals and teams and doing so based on incident context.
- Life Cycle Management — Providing the Incident Commander (IC) tools to easily manage the life cycle of the incident.
- Incident Learning — Building on past incidents in order to speed up the resolution of future incidents.
Sounds interesting? Want to learn more? We will be presenting Dispatch at the upcoming BSidesSF conference on February 24th at 4:3pm (https://sched.co/YbhR). You can learn more about the event here.
Follow the Netflix TechBlog for more news about Dispatch.
About the Authors: Kevin Glisson, Forest Monsen, and Marc Vilanova are Senior Security Engineers at Netflix, where they help drive security incidents to resolution and design and develop automation for crisis management and digital forensics.
Editor’s Note: The opinions expressed in this guest author article are solely those of the contributor, and do not necessarily reflect those of Tripwire, Inc.