Linked runbooks

Goal

To create a hierarchy between runbooks, so that a single parent (master) runbook can control multiple child runbooks.

Problems to solve

#Visibility

For large events, a single runbook often contains too much noise. This tends to happen because each team merges its tasks into the same runbook.

#Performance

A single runbook containing every team's tasks becomes very large, which can result in poor performance.

#Security

All teams need access to the single runbook to complete their tasks. They may see tasks that aren't related to their activities.

Solutions

Create linkages

We wanted to establish relationships between tasks and other entities within the application. We created a polymorphic table that would allow us to create links between a given task and other objects. Initially, we would use this mechanism to create a link between a parent task and a child runbook.
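A minimal sketch of what such a polymorphic link table could look like, using SQLite from the Python standard library. The table name `links`, the columns `linkable_type` and `linkable_id`, and the helper functions are all hypothetical and chosen for illustration; they are not the application's actual schema.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical polymorphic link table."""
    conn.execute(
        """
        CREATE TABLE links (
            id            INTEGER PRIMARY KEY,
            task_id       INTEGER NOT NULL,
            linkable_type TEXT    NOT NULL,  -- e.g. 'Runbook' or 'Template'
            linkable_id   INTEGER NOT NULL
        )
        """
    )


def link_task(conn: sqlite3.Connection, task_id: int,
              linkable_type: str, linkable_id: int) -> int:
    """Link a task to any kind of object; returns the new link's id."""
    cur = conn.execute(
        "INSERT INTO links (task_id, linkable_type, linkable_id) VALUES (?, ?, ?)",
        (task_id, linkable_type, linkable_id),
    )
    return cur.lastrowid


def links_for_task(conn: sqlite3.Connection, task_id: int) -> list:
    """Return (type, id) pairs for everything a task is linked to."""
    return conn.execute(
        "SELECT linkable_type, linkable_id FROM links WHERE task_id = ? ORDER BY id",
        (task_id,),
    ).fetchall()
```

Because the target is stored as a type plus an id, the same table can later hold links to objects other than runbooks without a schema change.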

Link task creation

We had to avoid the pitfall of a single runbook being linked to multiple tasks. To achieve this, we introduced a system whereby a linked task would initially be linked to a template. If the linked task belonged to a runbook, the linked template would be converted into a new runbook and the link would be updated to point at it. However, if the linked task belonged to a template, the link would remain against a separate template.
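The rule above can be sketched roughly as follows. This is an illustration under assumptions, not the real implementation: `Link`, `instantiate_template`, and `finalise_link` are hypothetical names, and copying a template is reduced to handing out a fresh runbook id.

```python
from dataclasses import dataclass
from itertools import count

_runbook_ids = count(1000)  # hypothetical id generator for new runbooks


@dataclass
class Link:
    task_id: int
    target_type: str  # "Template" or "Runbook"
    target_id: int


def instantiate_template(template_id: int) -> int:
    """Stand-in for copying a template into a brand-new runbook."""
    return next(_runbook_ids)


def finalise_link(link: Link, task_parent_type: str) -> Link:
    """Convert a template link into a runbook link when the task is in a runbook.

    Every call creates a fresh runbook, so no runbook is shared by two tasks.
    """
    if task_parent_type == "Runbook" and link.target_type == "Template":
        return Link(link.task_id, "Runbook", instantiate_template(link.target_id))
    # The task lives in a template, so the link stays against a template.
    return link
```

Two tasks that start from the same template therefore end up with two distinct child runbooks, while links inside a template are left untouched.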

Template selection

Users needed the ability to import a large volume of linked tasks, so bulk selection of templates needed to be as frictionless as possible. To achieve this, we introduced a new template selection modal with search and filtering across all available templates.

Timing sync

This was the most technically challenging aspect of the project. The requirement was that all timing information would sync bidirectionally, so that, for example, changes to start times in the parent would become visible in all children. We used pre-existing events to trigger sync jobs that kept timing information up to date.
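A minimal sketch of what one sync job might do for a single parent task and child runbook. Which fields flow in which direction is an assumption made for illustration (start time from parent to child, duration from child to parent), and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LinkedTask:
    start: datetime
    duration: timedelta


@dataclass
class ChildRunbook:
    start: datetime
    duration: timedelta


def sync_timings(task: LinkedTask, child: ChildRunbook) -> bool:
    """Bring both sides into agreement; return True if anything changed.

    Assumed direction of flow: the parent task owns the start time and the
    child runbook owns the duration.
    """
    changed = False
    if child.start != task.start:
        child.start = task.start
        changed = True
    if task.duration != child.duration:
        task.duration = child.duration
        changed = True
    return changed
```

Writing only when a value actually differs makes the job idempotent, so an event raised by the sync itself does not trigger an endless chain of further syncs.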

Run coordination

When a linked task becomes ready to start, either side can be started: starting the task starts the child runbook, and starting the child runbook starts the parent task. We used the event architecture to register listeners that knew whether to start the runbook or the task. Likewise, when a child runbook was complete, the parent task listener would know to finish the in-progress linked task.
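The coordination described above can be sketched with a small in-process event bus. This is a simplified illustration: the real event architecture was asynchronous, whereas this bus calls listeners synchronously, and the event names, states, and classes are hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Tiny synchronous stand-in for the application's event architecture."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event, listener):
        self._listeners[event].append(listener)

    def emit(self, event, payload):
        for listener in list(self._listeners[event]):
            listener(payload)


class Coordinator:
    def __init__(self, bus, links):
        # links maps parent task id -> child runbook id (one runbook per task)
        self.bus = bus
        self.task_to_runbook = dict(links)
        self.runbook_to_task = {r: t for t, r in links.items()}
        self.task_state = {t: "startable" for t in links}
        self.runbook_state = {r: "planned" for r in links.values()}
        bus.on("task_started", lambda t: self.start_runbook(self.task_to_runbook[t]))
        bus.on("runbook_started", lambda r: self.start_task(self.runbook_to_task[r]))
        bus.on("runbook_completed", self._finish_parent_task)

    def start_task(self, task_id):
        if self.task_state[task_id] != "startable":
            return  # already started: ignore the echo from the other side
        self.task_state[task_id] = "in_progress"
        self.bus.emit("task_started", task_id)

    def start_runbook(self, runbook_id):
        if self.runbook_state[runbook_id] != "planned":
            return  # already started: ignore the echo from the other side
        self.runbook_state[runbook_id] = "in_progress"
        self.bus.emit("runbook_started", runbook_id)

    def complete_runbook(self, runbook_id):
        if self.runbook_state[runbook_id] != "in_progress":
            return
        self.runbook_state[runbook_id] = "complete"
        self.bus.emit("runbook_completed", runbook_id)

    def _finish_parent_task(self, runbook_id):
        task_id = self.runbook_to_task[runbook_id]
        if self.task_state[task_id] == "in_progress":
            self.task_state[task_id] = "complete"
```

The state checks at the top of each method are what stop the two listeners from triggering each other forever, whichever side is started first.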

Takeaways

Due to the asynchronous nature of the event architecture, we experienced a few race conditions. This wasn’t something I’d come across a lot beforehand. I learnt about the complexities of dealing with timing calculations, and how to avoid potential performance issues by polling as opposed to recalculating after every action.
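One way to read "polling as opposed to recalculating after every action" is a dirty-flag approach: actions only mark a runbook as needing recalculation, and a periodic poll does the expensive work once per runbook. The sketch below illustrates that reading; the class name and the `recalculate` callback are hypothetical.

```python
class TimingRecalculator:
    """Collects runbooks that need recalculating and processes them on poll."""

    def __init__(self, recalculate):
        self._recalculate = recalculate  # expensive timing calculation
        self._dirty = set()

    def mark_dirty(self, runbook_id):
        """Cheap: called after every action instead of recalculating."""
        self._dirty.add(runbook_id)

    def poll(self):
        """Called on a timer; recalculates each dirty runbook exactly once."""
        pending, self._dirty = self._dirty, set()
        for runbook_id in sorted(pending):
            self._recalculate(runbook_id)
        return len(pending)
```

With this shape, a burst of actions on the same runbook between two polls costs a single recalculation rather than one per action.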