Kiana had been on the team for three years. She knew the review process the way most people know their commute, not because she'd read a document, but because she'd done it hundreds of times and absorbed the unwritten rules along the way. She knew which approver needed to see a draft before it went to legal, even though that wasn't documented anywhere. She knew the SME in the engineering org who was fast and accurate, and the one who'd give you vague feedback three weeks late. She knew which corner of the shared drive the approved brand templates actually lived in, as opposed to where they were supposed to live.
Kiana went on maternity leave in October. By November, two projects were stalled at the review stage because nobody knew the right routing order. By December, a module shipped with an off-brand visual that would have been caught if anyone had known where the current template file was. Her knowledge was real, it was valuable, and it lived entirely in her head.
Most L&D teams I've seen operate this way. The institutional knowledge is distributed across a handful of experienced people, and the team functions as long as those people are present. The moment someone leaves (for maternity leave, for a new job, for FMLA, for any reason at all), a process gap opens. Some gaps are small. Some are career-defining problems that end up in front of a VP at the worst possible moment.
TL;DR
- Most L&D teams run on tribal knowledge. Cognota research identifies it as the #1 scaling blocker for learning operations teams.
- The fix isn't a 200-page process manual. It's a minimum viable playbook: the smallest set of documented decisions that makes a global team function consistently.
- The playbook pays for itself the first time a key person is unexpectedly unavailable. And when you're managing a 26-person team across three countries, "unexpectedly unavailable" is a when, not an if.
🔍 What Tribal Knowledge Actually Costs
PMI's 2022 Pulse of the Profession study on project failures found scope creep cited as a cause of severe cost and schedule overruns in a significant share of failed projects. What that research doesn't capture is how much of that scope creep originates not from expanding requirements, but from undocumented process: work that gets redone because nobody knew the right way, decisions that get relitigated because the original rationale was never written down, and quality standards that drift because they were never explicit in the first place.
In L&D, the specific form this takes is the 6-month backlog that quietly becomes a 12-month backlog. A team that's operationally capable but not operationally documented will always be slower than it needs to be, because a meaningful portion of every project involves reinventing decisions that were already made on the last project.
I ran a 26-person team across three countries. The time zone spread alone (US, Canada, and France) meant that at any given moment, a third of the team was asleep. When a process question came up during the Paris team's working hours, the answer couldn't be "ask Kiana." It had to be written down somewhere Kiana-in-France could find at 9am without waiting for Kiana-in-Utah to wake up.
That operational reality forced us to build the playbook. Not because we were organized or disciplined about documentation. Honestly, we weren't, not naturally. We built it because the alternative was a team that functionally couldn't work without specific people present. At 26 people across three time zones, that's not a team. That's a fragile dependency network.
⚠️ The Bureaucracy Trap
Before I describe what the playbook is, I want to be explicit about what it's not. Because the minute you say "we need to document our processes," a certain kind of organizational gravity kicks in and tries to turn that into a 200-page procedure manual nobody will ever read.
That instinct is understandable. If we're going to document, shouldn't we document everything? Shouldn't we be thorough? The answer is no. Not because thoroughness is bad, but because a document that nobody uses is worse than no document at all. It creates the illusion of process clarity while the actual knowledge still lives in people's heads, plus now you have to maintain an unused manual.
We operated with a guiding principle I'd stolen from the product world: for every process step added, at least one removed. The playbook couldn't grow without also getting leaner. Every time we documented a new decision, we asked whether it was replacing something we'd been doing informally, and if so, we deleted the informal version rather than running both in parallel.
The result is what I think of as a minimum viable playbook. Not minimum in quality. Minimum in scope: the smallest set of documented decisions that makes a team function consistently without requiring the presence of any single person.
📋 What the Playbook Actually Contains
Ours had four sections. I'll describe each one because the structure matters. The sections aren't arbitrary; they map to the four places tribal knowledge most commonly causes problems.
Intake and triage. How does a training request enter the system? Who decides if it's a training problem or something else? What's the format decision tree: when do we build a microlearning module vs. a full eLearning course vs. a job aid vs. a VILT session? What information has to be in a brief before development begins? This section existed entirely in our most senior IDs' heads before we wrote it down. Afterward, a brand-new ID could triage an incoming request with acceptable consistency in their first week.
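To show what "written down" can look like, here's a minimal sketch of a format decision tree expressed as executable logic. The field names and thresholds are illustrative assumptions, not our actual triage rules:

```python
def choose_format(request: dict) -> str:
    """Pick a delivery format for an incoming training request.
    All field names and thresholds here are hypothetical examples."""
    # First gate: is this actually a training problem?
    if not request.get("skill_or_knowledge_gap"):
        return "not training: route to the process or tooling owner"

    # Reference content people look up mid-task -> job aid.
    if request.get("lookup_on_demand"):
        return "job aid"

    # Needs practice, discussion, or role-play -> instructor-led.
    if request.get("requires_practice_or_discussion"):
        return "VILT session"

    # Short, single-objective content -> microlearning.
    # Unknown seat time falls through to a full course by default.
    if request.get("estimated_seat_time_minutes", 60) <= 10:
        return "microlearning module"

    return "full eLearning course"

# A request for quick-reference content routes to a job aid.
print(choose_format({"skill_or_knowledge_gap": True, "lookup_on_demand": True}))
```

The point isn't to automate triage. It's that once the logic is explicit enough to write as branches, a new ID can follow it without a senior ID in the room.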
Development standards. What does a storyboard need to include before it moves to production? What are the quality standards for a module at each review stage? Where do the current approved templates live? What's the naming convention for files so they can be found by someone who didn't create them? This is the section that prevents the "shipped with the wrong brand template" problem. It sounds like tedious housekeeping. It is tedious housekeeping. It also saves significant rework time every quarter.
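A naming convention is the easiest of these standards to make mechanical. As a purely hypothetical illustration (this pattern is an assumption, not our actual convention), a documented pattern can be checked rather than remembered:

```python
import re

# Hypothetical convention: <project>_<module>_v<major>.<minor>_<YYYYMMDD>.<ext>
FILENAME_PATTERN = re.compile(
    r"^[a-z0-9-]+_[a-z0-9-]+_v\d+\.\d+_\d{8}\.(storyline|pptx|docx)$"
)

def follows_convention(name: str) -> bool:
    """True if a file name matches the documented pattern."""
    return bool(FILENAME_PATTERN.match(name))

print(follows_convention("onboarding_expense-reports_v2.1_20240315.storyline"))  # True
print(follows_convention("FINAL_final_v2_USE_THIS_ONE.storyline"))               # False
```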
Review and approval routing. Who reviews what, in what order, with what scope? What are the SLAs for each review stage? What happens if a review exceeds the SLA? When does content escalate vs. move forward? This is where most of the tribal knowledge lives on most teams, and it's the section that causes the most damage when it's undocumented. The routing logic that lives in Kiana's head becomes a project-stopping question when Kiana isn't available.
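Documented routing can be as small as an ordered table with SLAs attached, which turns "what happens when a review exceeds the SLA" into a checkable condition instead of a judgment call. A minimal sketch, with hypothetical stages and SLA values:

```python
from datetime import date, timedelta

# Hypothetical review chain: order, owner, and SLA. Not our actual routing.
REVIEW_CHAIN = [
    {"stage": "peer review",  "owner": "senior ID",       "sla_days": 3},
    {"stage": "SME review",   "owner": "engineering SME", "sla_days": 3},
    {"stage": "brand review", "owner": "brand approver",  "sla_days": 2},
    {"stage": "legal review", "owner": "legal",           "sla_days": 5},
]

def is_overdue(stage_name: str, submitted: date, today: date) -> bool:
    """Flag a stage for escalation once its SLA has elapsed.
    Simplified to calendar days rather than business days."""
    stage = next(s for s in REVIEW_CHAIN if s["stage"] == stage_name)
    return today > submitted + timedelta(days=stage["sla_days"])

print(is_overdue("SME review", date(2024, 3, 1), date(2024, 3, 6)))  # True -> escalate
```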
Iteration and deprecation. How does a published module get updated? Who initiates a content refresh? What's the threshold for a deprecation review? How do we decide whether to refresh, rebuild, or retire? This section is the one teams skip most often, and it's also the section that slowly destroys capacity over time. Undocumented iteration processes mean content quietly goes stale. Undocumented deprecation means the library grows without limit until maintenance costs crowd out new development.
🔀 The Version Control Insight
One of the most useful frameworks we applied to the playbook came from an unexpected place: software engineering's GitOps model.
In software development, "version control" means that every change to the codebase is tracked, attributed, and reversible. You know who made a change, when, and why. You can roll back a decision if it turns out to be wrong. And critically, there's a single source of truth, not fifteen copies of the same file floating around different Slack channels.
Our playbook operated on similar principles. There was one version. Changes were deliberate and explicit, not silent. When we updated the review routing because a new approver joined the review chain, we documented the change and noted the date. When we changed the SLA for SME review from five days to three, we wrote down why, so six months later, when someone asked "why is it three days?" the reasoning was available.
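The change log itself can be very lightweight. A sketch of the idea, with invented entries for illustration:

```python
# Each playbook change records what, when, and why. Entries are invented.
playbook_changelog = [
    {
        "date": "2024-02-12",
        "section": "Review and approval routing",
        "change": "Added brand approver ahead of legal review",
        "rationale": "Off-brand visuals were reaching legal uncaught",
    },
    {
        "date": "2024-04-03",
        "section": "Review and approval routing",
        "change": "SME review SLA reduced from 5 days to 3",
        "rationale": "The 5-day SLA was the largest contributor to cycle time",
    },
]

def why(section: str, fragment: str) -> str:
    """Answer 'why is it this way?' from the log instead of from memory."""
    for entry in reversed(playbook_changelog):
        if entry["section"] == section and fragment in entry["change"]:
            return entry["rationale"]
    return "No recorded rationale: that's a gap, so document it."

print(why("Review and approval routing", "SLA reduced"))
```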
This sounds like more overhead. In practice it saved us from a recurring problem that plagues every operations document: the document slowly becoming outdated because updates happen in conversation but not in writing. Three months later, nobody's sure whether the current process is the one in the document or the one someone mentioned in a meeting in February.
The single-source-of-truth principle also forced us to be disciplined about where the playbook lived. One location. Not "I think it's in the shared drive but also there might be a copy in Notion." One place, known to everyone on the team, findable in under 30 seconds. That's the practical test: can any member of your team find the answer to a process question without asking another human? If the answer is no, the playbook isn't working yet.
🌍 Why Global Teams Break Without This
Everything I've described becomes dramatically more complicated with a distributed team. And here's where I want to be honest about the scale of the problem before I make it sound like the solution is simple.
With a team in the US, Canada, and France, we were spanning three time zones and up to eight hours of offset. An ID in Paris who hit a process question at 8am local time was looking at roughly a seven-hour wait before the East Coast team was fully online, and closer to nine hours for the Mountain Time team.
Every undocumented decision in our process was a potential blocker. Not a minor inconvenience. It was a potential half-day delay on a project because someone was waiting for a clarification that should have been in the playbook. Multiply that by a team of 26 and a library of 637 modules, and you're looking at a meaningful fraction of the team's productive time spent waiting for answers that should have been written down.
The playbook wasn't just a documentation exercise for us. It was infrastructure for asynchronous decision-making. When a question is answerable without a meeting, it doesn't need a meeting. When a standard is explicit, it doesn't need to be negotiated per project. When the routing is documented, it doesn't require a senior person's tribal knowledge to execute.
Cross-timezone standardization is also, incidentally, where L&D teams discover they have process inconsistencies they didn't know about. When you try to write down the review routing and three different senior IDs describe three slightly different versions, that's not a documentation problem. That's a signal that the process isn't actually consistent. It's just been consistent enough to avoid visible problems so far. The playbook-building process surfaces those inconsistencies before they become client-facing failures.
💡 Building the Playbook Without Grinding Everything to a Halt
The reason most teams don't have a playbook isn't that they don't see the value. It's that building one feels like a massive project requiring significant time away from the actual work of producing content. And when your team is already running at or above capacity, there's no obvious moment to say "let's pause everything and document our processes."
We used an MVP methodology borrowed from product development. Instead of trying to document everything before the playbook went "live," we identified the three highest-value sections (the ones where undocumented tribal knowledge was causing the most visible pain) and built those first. Review routing. File naming and template locations. Intake triage criteria. Those three sections took about two weeks of distributed effort, not two months.
We then ran those sections for 30 days and collected failure cases: places where the playbook didn't have the answer someone needed, or where the documented process turned out to be wrong or incomplete. Those failure cases became the backlog for the next iteration. The playbook grew in response to real operational gaps, not in response to someone's idea of what comprehensive documentation should look like.
This also meant the team trusted it more quickly than they would have trusted a document someone had spent months crafting in isolation. When people see their own reported gaps getting addressed in the next update, they start treating it as a living resource rather than a policy document gathering digital dust.
A word on the anti-bureaucracy principle in practice: we had one rule. Every time we added a process step, we identified one that could be removed or simplified. This kept the playbook from becoming the policy manual it wanted to become. It also forced us to ask, every time we considered adding something, whether we were solving a real operational problem or just documenting a preference.
🔄 The Sprint Retrospective as Playbook Maintenance
A playbook that isn't updated becomes a liability faster than you'd expect. Outdated documentation is in some ways worse than no documentation. It sends people confidently in the wrong direction.
We used sprint retrospectives (short, structured team check-ins every two weeks) to do two things simultaneously: surface what was working and what was blocked on current projects, and flag process gaps or outdated playbook content that had caused friction in the last sprint. The retrospective became the maintenance mechanism for the playbook. Problems that showed up twice got a playbook update. Problems that showed up once got a note.
This is the software engineering sprint retrospective structure applied to content operations, and it works for the same reason it works in engineering: small, regular updates are dramatically less painful than big quarterly overhauls. And the team that surfaces the problems is also the team most motivated to see them fixed, which means updates get written and actually used, not deferred to a backlog nobody looks at.
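The "twice means a playbook update" rule is simple to operationalize. A minimal sketch, using hypothetical friction reports from two retros:

```python
from collections import Counter

# Hypothetical friction reports flagged across recent retrospectives.
friction_reports = [
    "SME review routing unclear",
    "template location outdated",
    "SME review routing unclear",
]

for problem, count in Counter(friction_reports).items():
    # Showed up twice or more -> playbook update; once -> note and watch.
    action = "update the playbook" if count >= 2 else "note for next retro"
    print(f"{problem}: {action}")
```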
The other discipline we built in: an annual full review of the playbook against actual practice. Not "is the document current?" but "do we actually operate the way this document describes?" Those two questions have different answers more often than you'd expect. The annual review was where we found the sections that had been quietly bypassed, the workarounds that had become the real process, and the documentation that had never matched reality in the first place.
⚡ What the LMS Migrations Taught Us
We ran three LMS migrations: Litmos to Moodle, and then both Moodle instances to Skilljar. Each migration was a process portability test. Could our content operations survive a complete change in the underlying platform? Could we keep producing and deploying content during the transition? Could a new platform be onboarded without the team needing to relearn every workflow?
The answer, on the first migration, was: not as well as we'd hoped. Too many of our workflows were platform-specific rather than platform-agnostic. We had processes that depended on Litmos-specific features that Moodle didn't have. We had naming conventions tied to Litmos folder structures. We had QA checklists that referenced Litmos interface elements.
The second and third migrations went better, because by then we'd rebuilt our processes to be portable. The playbook documented the logic of what we were doing, not just the mechanical steps in a specific platform. When Skilljar was the new platform, we updated the platform-specific sections of the playbook (the steps, the interface references) without needing to rebuild the underlying decision logic.
This is the practical value of the single-source-of-truth principle: when the platform changes, you know exactly what in the documentation needs to change, because the documentation is organized by decision, not by screen. A playbook that says "click the orange button in the top-right corner" is fragile. A playbook that says "verify the module metadata is complete before publishing (see the metadata checklist)" is portable.
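Here's a minimal sketch of what decision-organized documentation looks like when it's expressed as a check. The required fields are assumptions for illustration; the point is that nothing in the logic mentions any platform's interface:

```python
# Hypothetical required metadata for any module, regardless of LMS.
REQUIRED_METADATA = ["title", "owner", "audience", "version", "next_review_date"]

def missing_metadata(module: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_METADATA if not module.get(field)]

module = {"title": "Expense Reports 101", "owner": "kiana", "version": "2.1"}
print(missing_metadata(module))  # ['audience', 'next_review_date']
```

When the platform changes, a check like this doesn't; only the screen-level steps around it need rewriting.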
🎯 The One Thing to Do This Week
Identify the one process on your team that lives entirely in one person's head. The routing, the template location, the approval sequence, whatever it is that would cause a project to stall if that person weren't available. Write it down in two paragraphs. Share it with the team. Ask if the two paragraphs are accurate. That's step one of a playbook, and it takes less than an hour.
If you've been through the tribal knowledge problem (the moment someone leaves and the process gaps appear), I'd like to hear how you handled it. What did you document first? What took longest? Find me on LinkedIn.
-- Eian
Sources
- Cognota. (2026). How long does it take instructional designers to create one hour of learning? cognota.com
- Project Management Institute. (2022). Pulse of the profession 2022: Ahead of the curve: Forging a future-focused culture. pmi.org
- Humble, J., & Farley, D. (2010). Continuous delivery: Reliable software releases through build, test, and deployment automation. Addison-Wesley.
- Ries, E. (2011). The lean startup: How today's entrepreneurs use continuous innovation to create radically successful businesses. Crown Business.
- Thalheimer, W. (2024). LTEM Version 13. Work-Learning Research. worklearning.com
- ATD. (2025). Benchmarks and trends from the 2025 State of the Industry report. td.org
- Malamed, C. (2024). Methods for capturing SME knowledge. The eLearning Coach. theelearningcoach.com