Retrospectives

Retrospectives are structured learning sessions where teams identify what’s working, what isn’t, and what to change. The format matters less than the follow-through. A team that runs mediocre retros but implements every action item will outperform a team with sophisticated formats and no execution.

The Facilitation Challenge

Running effective retrospectives requires balancing structure (so you make progress) with openness (so people speak honestly). Too much structure and you get performative participation. Too little and you get meandering conversations that waste time.

Core Facilitation Principles (Esther Derby & Diana Larsen)

Set the stage: Create psychological safety before diving into problems
Gather data: Collect what actually happened, not opinions about what happened
Generate insights: Look for patterns and root causes
Decide what to do: Specific, actionable changes with owners
Close the retrospective: End with clarity about next steps

The facilitator’s job is to guide the process, not control the outcomes. You’re not there to solve problems - you’re there to help the team solve their own problems.

Retrospective Formats and When to Use Them

Different formats work better for different situations. Pick based on what you’re trying to learn.

Start/Stop/Continue

When to use: Regular sprint retros, general-purpose reflection
Time: 30-45 minutes
Best for: Teams that need clear action items

Structure:

Start: What should we begin doing?
Stop: What should we quit doing?
Continue: What’s working that we should keep?

Why it works: Forces prioritization. You can’t start 10 new things - the format pushes you toward 1-2 concrete changes.

Trade-off: Can feel repetitive if you use it every time. Rotate with other formats.

Timeline Retrospective

When to use: After a major release, incident post-mortem, when you need to establish shared understanding
Time: 45-60 minutes
Best for: Complex situations with unclear timelines

Structure:

Draw a timeline of the period (on whiteboard or shared doc)
Each person adds events they remember
Mark emotional reactions (highs and lows)
Discuss patterns: What caused the lows? What created the highs?
Decide what to change

Why it works: Reveals different perspectives on the same events. For example, the backend team may think the deploy went smoothly while the frontend team was fighting CORS errors.

Trade-off: Takes longer and requires visual space, which is harder to provide remotely.

Five Whys (Sakichi Toyoda/Toyota Production System)

When to use: After incidents, when you keep hitting the same problem
Time: 20-30 minutes for a single issue
Best for: Getting to root causes instead of symptoms

Structure:

State the problem clearly
Ask “Why did this happen?”
Answer factually
Ask “Why?” about that answer
Repeat until you hit systemic causes (usually 3-5 iterations)

Example:

Problem: Production deploy failed Friday at 4pm
Why? Database migration timed out
Why? Migration locked a table with 10M rows
Why? We didn’t test migration on production-sized data
Why? Staging database only has 1000 rows
Why? No process for keeping staging data volume realistic
Action: Create process to refresh staging with production-sized sample data monthly

Why it works: Gets past “someone made a mistake” to “what systemic issue allowed this mistake?”

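Trade-off: Can feel like interrogation if not facilitated carefully. Restate the Retrospective Prime Directive often (Norm Kerth's ground rule that everyone did the best job they could given what they knew at the time).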

Sailboat/Speedboat Retrospective

When to use: When team morale is low, when you want metaphorical distance
Time: 30-45 minutes
Best for: Discussing sensitive topics without direct confrontation

Structure: Draw a sailboat heading toward an island:

Island (goal): Where are we trying to go?
Wind: What’s helping us get there?
Anchors: What’s slowing us down?
Rocks: What risks are we worried about?

Why it works: Metaphor creates psychological distance. Easier to say “legacy code is an anchor” than “this codebase is terrible.”

Trade-off: Some teams find metaphors silly. Read the room.

Lean Coffee Format

When to use: When team has lots of topics to discuss, limited time
Time: 45-60 minutes
Best for: Self-organizing teams, varied discussion topics

Structure:

Everyone writes topics on cards (5 min)
Brief explanation of each topic (5 min)
Dot voting to prioritize (2 min)
Discuss top topic for 8 minutes
Vote to continue (3 min more) or move to next topic
Repeat until time runs out
Last 10 min: decide action items
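
The timings in this structure imply a rough limit on how many topics a session can cover. A minimal sketch of that arithmetic follows. It assumes every topic uses its full 8 minutes and ignores the time the continue-vote itself takes; the function name and parameters are illustrative, not part of the Lean Coffee format.

```python
def lean_coffee_topic_budget(session_min: int, extend_every_topic: bool = False) -> int:
    """Estimate how many topics fit in a Lean Coffee session.

    Fixed overhead taken from the structure above:
    5 min writing cards + 5 min explaining + 2 min dot voting + 10 min action items.
    """
    overhead = 5 + 5 + 2 + 10
    # Each topic gets 8 minutes, plus 3 more if the team votes to continue.
    per_topic = 8 + (3 if extend_every_topic else 0)
    discussion = session_min - overhead
    # Floor division can go negative for very short sessions, so clamp at zero.
    return max(discussion // per_topic, 0)


if __name__ == "__main__":
    for minutes in (45, 60):
        print(minutes, "min:", lean_coffee_topic_budget(minutes), "topics,",
              lean_coffee_topic_budget(minutes, extend_every_topic=True), "if all are extended")
```

Under these assumptions a 45-minute session covers about 2 topics and a 60-minute session 3 to 4. Sharing that number before dot voting helps the team vote with the limit in mind.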

Why it works: Democratic, responsive to what team cares about most.

Trade-off: Can skip important-but-uncomfortable topics if team votes for easy wins.

Remote vs. In-Person Trade-offs

Remote advantages:

Written notes by default (better documentation)
Anonymous input possible (Miro, Retrium, Google Jamboard)
Easier for quiet team members to contribute
Can use breakout rooms for small group discussions

Remote challenges:

Harder to read body language
Technical issues derail momentum
Less spontaneous discussion
Screen fatigue affects engagement

Hybrid challenges:

Remote participants feel like second-class citizens
Whiteboard work excludes remote folks
Audio quality issues
Harder to facilitate equal participation

Making hybrid work:

Use digital tools even for in-person participants (everyone on laptops)
Dedicated camera pointed at whiteboard
Facilitator actively solicits remote input
Test technology before the meeting

Frequency and Timing

Sprint retrospectives: End of each sprint (every 1-2 weeks)

Pros: Regular rhythm, issues are fresh
Cons: Can feel like obligation, less time to see if changes work

Release retrospectives: After major deploys

Pros: Focused on significant events, clear scope
Cons: May miss ongoing process problems

Incident retrospectives: Within 24-48 hours of incidents

Pros: Details are fresh, urgency drives action
Cons: Emotions may still be running high

Quarterly retrospectives: Broader process and team health

Pros: Time to see patterns, strategic conversations
Cons: Too infrequent to catch tactical issues

Best practice: Layer them. Sprint retros for tactical issues, quarterly for strategic, incident retros when needed.

The Action Item Problem

Most retrospectives fail at follow-through. You have great conversations, identify real problems, agree on changes, then nothing happens.

Why Action Items Fail

Too many: Team commits to 7 changes, completes 0
Too vague: “Improve communication” (how?)
No owner: Everyone’s responsible = no one’s responsible
No deadline: Will get to it eventually (never does)
No authority: Team identifies problem they can’t fix (need budget/headcount/management decision)

Making Action Items Stick

SMART framework:

Specific: Not “better tests” but “add integration test for checkout flow”
Measurable: How will we know it’s done?
Achievable: Can we actually do this with current resources?
Relevant: Will this solve the problem we identified?
Time-bound: By when? (Next retro is a good default)

Limit to 1-3 action items per retro: You will complete 1-3 small changes. You will not complete 10.

Track publicly: Put action items somewhere everyone sees them (team board, Slack channel, JIRA epic). Check status at next retro.

Start each retro by reviewing last retro’s actions: Did we do what we said? Did it help? If not, why not?
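
Some of the failure modes and SMART criteria above (too many items, no owner, no deadline, no definition of done) can be checked mechanically before the retro closes. The following is a minimal sketch, not a real tool's API: `ActionItem`, its fields, `review_action_items`, and the limit constant are all hypothetical names chosen for illustration. Achievable and Relevant still need human judgment.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MAX_ITEMS_PER_RETRO = 3  # upper end of the 1-3 guideline above


@dataclass
class ActionItem:
    description: str              # Specific: e.g. "add integration test for checkout flow"
    done_when: str = ""           # Measurable: how we will know it is done
    owner: Optional[str] = None   # one named person, not "the team"
    due: Optional[date] = None    # Time-bound: the next retro is a good default


def review_action_items(items: list[ActionItem]) -> list[str]:
    """Return human-readable warnings; an empty list means the items pass."""
    warnings = []
    if len(items) > MAX_ITEMS_PER_RETRO:
        warnings.append(f"{len(items)} items committed; limit is {MAX_ITEMS_PER_RETRO}")
    for item in items:
        if not item.owner:
            warnings.append(f"'{item.description}' has no owner")
        if item.due is None:
            warnings.append(f"'{item.description}' has no deadline")
        if not item.done_when.strip():
            warnings.append(f"'{item.description}' has no definition of done")
    return warnings


if __name__ == "__main__":
    vague = ActionItem("Improve communication")
    for warning in review_action_items([vague]):
        print(warning)
```

Running the check on a vague item such as "Improve communication" surfaces the missing owner, deadline, and definition of done while the team is still in the room to fix them.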

Measuring Retrospective Effectiveness

Retrospective effectiveness is hard to measure directly, but these proxy metrics help:

Leading indicators:

Attendance rate (are people showing up?)
Participation rate (are people contributing?)
Action item completion rate (are we following through?)
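
These three rates are simple ratios. The sketch below shows one way to compute them per retro. The definitions are assumptions, since the text above does not fix them: attendance is attendees over invitees, participation is contributors over attendees, and completion is completed over committed actions. The function and key names are illustrative.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a fraction, or 0.0 when there is nothing to count."""
    return numerator / denominator if denominator else 0.0


def leading_indicators(invited: int, attended: int, contributed: int,
                       actions_committed: int, actions_completed: int) -> dict[str, float]:
    """Compute the three leading indicators for a single retrospective."""
    return {
        "attendance_rate": rate(attended, invited),
        "participation_rate": rate(contributed, attended),
        "action_completion_rate": rate(actions_completed, actions_committed),
    }


if __name__ == "__main__":
    # Hypothetical numbers for one sprint retro.
    metrics = leading_indicators(invited=8, attended=6, contributed=3,
                                 actions_committed=3, actions_completed=2)
    for name, value in metrics.items():
        print(f"{name}: {value:.0%}")
```

A single retro's numbers mean little on their own; the trend across several retros is what signals whether engagement and follow-through are improving.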

Lagging indicators:

DORA metrics improving over time (deploy frequency, lead time, MTTR, change failure rate)
Same problems stop appearing in incident post-mortems
Team satisfaction scores (Spotify squad health check model)

Qualitative signals:

People bring up difficult topics without fear
Constructive disagreement happens
Solutions come from team, not facilitator
Actions address systemic issues, not just symptoms

Common Dysfunctions and Fixes

Dysfunction: Blamestorming

Symptom: Retro turns into “whose fault was this?”
Fix: Restate Prime Directive. Focus on “what systemic issue allowed this?” not “who did this?”

Dysfunction: Rehashing

Symptom: Same problems discussed every retro, nothing changes
Fix: Make action items more specific. If action isn’t complete by next retro, escalate as impediment to management.
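
Rehashing is easier to spot if retro notes are tagged with consistent topic labels. The following sketch counts how often each label appears across past retros. It assumes the team records a set of labels per retro; the function name and the threshold of three retros are hypothetical choices, not a standard.

```python
from collections import Counter


def recurring_topics(retro_topics: list[set[str]], min_retros: int = 3) -> list[str]:
    """Return topic labels raised in at least `min_retros` retrospectives, sorted by name."""
    # Each retro contributes a label at most once because its topics are a set.
    counts = Counter(topic for topics in retro_topics for topic in topics)
    return sorted(topic for topic, n in counts.items() if n >= min_retros)


if __name__ == "__main__":
    # Hypothetical history of three retros.
    history = [
        {"flaky tests", "slow reviews"},
        {"flaky tests"},
        {"flaky tests", "slow reviews"},
    ]
    print(recurring_topics(history))
```

Topics returned here are candidates for a more specific action item or for escalation as an impediment.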

Dysfunction: Toxic positivity

Symptom: Everything’s great, no problems identified
Fix: Anonymous input methods. Explicit invitation: “What almost went wrong?” or “What got harder this sprint?”

Dysfunction: Missing the empowered

Symptom: Team identifies problems only management can fix
Fix: Invite decision-makers periodically. Track impediments separately and escalate. Focus team retros on things team can control.

Dysfunction: Facilitator as problem-solver

Symptom: Facilitator proposes all solutions
Fix: Ask questions instead of giving answers. “What could we try?” not “Here’s what we should do.”

Integration with Other Practices

Retrospectives don’t exist in isolation. Connect them to:

Incident post-mortems: Feed systemic issues into retros
Sprint planning: Action items become backlog items
Security reviews: Security findings become retro topics
Monitoring dashboards: Use data to ground discussions
Feature planning: User feedback informs what to build next

Next Steps

Deep Water: Building organizational learning systems, cultural transformation, measuring improvement velocity, psychological safety at scale
Related: Incident Response, Feature Planning

Framework Summary

Format	Best For	Time	Complexity
Start/Stop/Continue	General-purpose	30-45 min	Low
Timeline	Complex events	45-60 min	Medium
Five Whys	Root cause analysis	20-30 min	Medium
Sailboat	Low morale, sensitive topics	30-45 min	Low
Lean Coffee	Democratic topic selection	45-60 min	Medium

When in doubt: Start with Start/Stop/Continue. It works for most situations.
