#note

## Abstract

- [https://landing.google.com/sre/sre-book/chapters/postmortem-culture/](https://landing.google.com/sre/sre-book/chapters/postmortem-culture/)
- The primary goals of writing a postmortem are to ensure that the incident is documented, that all contributing root cause(s) are well understood, and, especially, that effective preventive actions are put in place to reduce the likelihood and/or impact of recurrence.
- Writing a postmortem is not punishment; it is a learning opportunity for the entire company.
- Common postmortem triggers include:
  - User-visible downtime or degradation beyond a certain threshold
  - Data loss of any kind
  - On-call engineer intervention (release rollback, rerouting of traffic, etc.)
  - A resolution time above some threshold
  - A monitoring failure (which usually implies manual incident discovery)
- Writing a postmortem also involves formal review and publication. In practice, teams share the first postmortem draft internally and solicit a group of senior engineers to assess the draft for completeness. Review criteria might include:
  - Was key incident data collected for posterity?
  - Are the impact assessments complete?
  - Was the root cause sufficiently deep?
  - Is the action plan appropriate and are resulting bug fixes at appropriate priority?
  - Did we share the outcome with relevant stakeholders?
- Best Practice
  - Avoid Blame and Keep It Constructive
  - No Postmortem Left Unreviewed
    - An unreviewed postmortem might as well never have existed. To ensure that each completed draft is reviewed, we encourage regular review sessions for postmortems. In these meetings, it is important to close out any ongoing discussions and comments, to capture ideas, and to finalize the state.
    - Once those involved are satisfied with the document and its action items, the postmortem is added to a team or organization repository of past incidents. Transparent sharing makes it easier for others to find and learn from the postmortem.
  - Visibly Reward People for Doing the Right Thing
  - Ask for Feedback on Postmortem Effectiveness
- Introducing a Postmortem Culture
  - One of the biggest challenges of introducing postmortems to an organization is that some may question their value given the cost of their preparation. The following strategies can help in facing this challenge:
    - Ease postmortems into the workflow. A trial period with several complete and successful postmortems may help prove their value, in addition to helping to identify what criteria should initiate a postmortem.
    - Make sure that writing effective postmortems is a rewarded and celebrated practice, both publicly through the social methods mentioned earlier, and through individual and team performance management.
    - Encourage senior leadership's acknowledgment and participation. Even Larry Page talks about the high value of postmortems!

## Template

[https://landing.google.com/sre/sre-book/chapters/postmortem/](https://landing.google.com/sre/sre-book/chapters/postmortem/)

```plain
Title: Summary of the incident
Date:
Status:
Summary: Summary of what happened
Impact: The effect on users, revenue, etc.
Root Causes: The 5 Whys (asking "why" five times) may be a good method
Trigger: Why this occurred
Resolution: How it was fixed
Detection: How it was detected
Action Items:
  Action Item: Bug fix, documentation, etc.
  Type:
  Owner:
  Bug:
Lessons Learned:
  What went well:
  What went wrong:
  Where we got lucky:
Timeline:
  {Time} What happened
Supporting information: Links to references
```

## Ref

- [https://research.google/pubs/pub45906/](https://research.google/pubs/pub45906/)
- [https://landing.google.com/sre/sre-book/chapters/postmortem-culture/](https://landing.google.com/sre/sre-book/chapters/postmortem-culture/)
- [https://landing.google.com/sre/sre-book/chapters/postmortem/](https://landing.google.com/sre/sre-book/chapters/postmortem/)
- [https://www.portent.com/blog/project-management/tips-for-a-successful-post-mortem.htm](https://www.portent.com/blog/project-management/tips-for-a-successful-post-mortem.htm)
- [https://www.atlassian.com/incident-management/handbook/postmortems#what-is-post-mortem](https://www.atlassian.com/incident-management/handbook/postmortems#what-is-post-mortem)
- [https://postmortems.pagerduty.com/](https://postmortems.pagerduty.com/)

> Systems Analysis (postmortems) are a powerful tool to understand where our architectural (including processes) model differs from reality, surfacing risks and shortfalls.

A postmortem is one method of analyzing software, and an opportunity to get feedback from reality. No architecture or process is ever 100% correct when compared against reality, so we learn where it falls short from real examples.

from Increment, Issue 12 (Software Architecture), "Systems analysis through postmortems" by Andrew Howden