
The 12-module AI literacy curriculum


The full sequence of modules in the program — prompting fundamentals, evaluation, agentic workflows, and governance — with the artifact each module produces and the rubric it's evaluated against.

Most AI literacy curricula are course catalogues. A list of topics, an estimated time per module, a quiz at the end. Ours isn’t. Each of the 12 modules below produces an artifact — a reusable prompt, a working agent, a documented evaluation rubric, a published policy — that the learner takes back into their actual work. The artifact is the proof that the module landed. If a module ends with no artifact, the module didn’t happen.

The modules cluster into four sequences: prompting fundamentals, evaluation, agentic workflows, and governance + operations. The clusters are dependency-ordered — you don’t build agents before you can evaluate single prompts — but within a cluster modules can run in parallel.

Cluster 1 — Prompting fundamentals (Modules 01–04)

The base layer. Most teams plateau here forever; the modules below are designed to push past the plateau into prompts that get reviewed, reused, and shipped.

Module 01 — Prompting as design, not typing

Dimension served: Capability. Prerequisites: None. Artifact: A documented prompt for a recurring task in the learner’s actual work, with three variations and a one-sentence rationale per variation. Evaluation rubric: Does the chosen variation produce reliably better output than the learner’s previous “type a question” baseline? Yes / no, with examples.

Module 02 — Context, examples, and the cost of vagueness

Dimension served: Capability. Prerequisites: Module 01. Artifact: Two versions of the same prompt — one minimal, one with appropriate context and examples — plus a short note on which one wins for which kind of task. Evaluation rubric: Can the learner explain why added context helps for one task category and hurts for another? In one paragraph.

Module 03 — Prompts that scale across people

Dimension served: Capability. Prerequisites: Modules 01–02. Artifact: A prompt published in a shared location (Slack canvas, Notion page, internal prompt library) with usage notes that let a colleague run it without the learner present. Evaluation rubric: Does the colleague produce comparable output on first try? Yes / no, witnessed by the colleague.

Module 04 — When not to use AI

Dimension served: Capability + Governance. Prerequisites: Modules 01–03. Artifact: A short list — three to five entries — of the learner’s recurring tasks where AI is not the right tool, with one-sentence reasons each. Evaluation rubric: Are the reasons defensible and specific? Reviewed by a peer, not a self-grade.

By the end of Cluster 1, the learner has prompts they reuse, prompts they share, and prompts they deliberately don’t write. That’s the foundation.

Cluster 2 — Evaluation (Modules 05–07)

Most AI rollouts trip here. People can write prompts; they just can’t tell whether the output is any good. The modules below close that gap.

Module 05 — From eyeball to rubric

Dimension served: Capability. Prerequisites: Cluster 1. Artifact: A simple evaluation rubric (3–5 criteria, plain language) for one of the learner’s recurring AI use cases. Used to grade five real outputs. Evaluation rubric: Does the rubric produce stable grades when applied by two different reviewers? Inter-rater agreement above 70%.
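The inter-rater check in the rubric above can be run with nothing more than a list comparison. A minimal sketch, assuming two reviewers grade the same five outputs pass/fail (reviewer names and grades are illustrative); this is simple percent agreement, not a chance-corrected statistic like Cohen's kappa:

```python
def percent_agreement(grades_a, grades_b):
    """Share of outputs on which the two reviewers gave the same grade."""
    assert len(grades_a) == len(grades_b) > 0
    return sum(a == b for a, b in zip(grades_a, grades_b)) / len(grades_a)

# Five real outputs, graded independently by two reviewers.
reviewer_1 = ["pass", "pass", "fail", "pass", "fail"]
reviewer_2 = ["pass", "fail", "fail", "pass", "fail"]

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"{agreement:.0%}")  # prints "80%" — above the 70% bar
```

If the number comes in under 70%, the fix is usually to tighten the rubric's wording, not to retrain the reviewers.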

Module 06 — Spotting failure modes you can’t see

Dimension served: Capability + Governance. Prerequisites: Module 05. Artifact: A documented “failure modes” list for the learner’s domain — three to five categories of error the learner now actively watches for, with examples and detection heuristics. Evaluation rubric: When given fresh outputs, can the learner correctly flag examples that fall into each documented failure mode?
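A failure-modes list becomes much more useful once each category carries a cheap detection heuristic. A hypothetical sketch for a support-summary use case — the categories, the ticket-ID pattern, and the hedging phrase are all invented for illustration, not taken from the program:

```python
import re

# Hypothetical failure-mode registry: each documented category is paired
# with a cheap heuristic that flags outputs worth a closer human look.
FAILURE_MODES = {
    "fabricated ticket IDs": lambda text: bool(re.search(r"#\d{6,}", text)),
    "hedging instead of an answer": lambda text: "it depends" in text.lower(),
}

def flag_failures(output: str) -> list[str]:
    """Return the documented failure modes this output appears to hit."""
    return [name for name, detect in FAILURE_MODES.items() if detect(output)]

print(flag_failures("See ticket #1234567 — it depends on the plan."))
# -> ['fabricated ticket IDs', 'hedging instead of an answer']
```

The heuristics don't need to be precise; they only need to route suspect outputs to the human who owns the rubric.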

Module 07 — Versioning and review for prompts

Dimension served: Capability. Prerequisites: Modules 05–06. Artifact: A version-controlled prompt (in a Git repo, a Notion changelog, or a dedicated prompt-management tool) with at least two iterations and a written rationale for each change. Evaluation rubric: Can a teammate look at the change history and reconstruct why each iteration happened?

By the end of Cluster 2, the team has the evaluation muscle that distinguishes a serious AI program from a culture of vibes.

Cluster 3 — Agentic workflows (Modules 08–10)

This is where most teams discover whether their evaluation discipline holds at scale. Multi-step systems amplify both capability and failure.

Module 08 — Decomposing a task into steps

Dimension served: Capability. Prerequisites: Cluster 2. Artifact: A documented decomposition of one of the learner’s longer workflows into atomic steps, with a note on which steps are good AI candidates and which aren’t. Evaluation rubric: Does the decomposition match how a senior person in the learner’s role would describe the workflow? Reviewed by that senior person.
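One convenient way to record this artifact is as data rather than prose, so the "which steps are AI candidates" judgment is explicit and reviewable. A sketch with invented steps and reasons (the program itself doesn't mandate any particular format):

```python
# The Module 08 decomposition as a checked-in data structure:
# each atomic step, whether it's an AI candidate, and why.
WORKFLOW = [
    {"step": "Export this week's support tickets", "ai": False,
     "why": "deterministic data pull; a script is cheaper and safer"},
    {"step": "Summarize each ticket",              "ai": True,
     "why": "repetitive language task that is easy to spot-check"},
    {"step": "Label each summary by category",     "ai": True,
     "why": "bounded output space makes evaluation straightforward"},
    {"step": "Approve refunds over $100",          "ai": False,
     "why": "irreversible decision; stays with a human"},
]

ai_steps = [s["step"] for s in WORKFLOW if s["ai"]]
```

The reviewer in the rubric then reads the `why` column, not a paragraph, which makes disagreements specific.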

Module 09 — Building a simple multi-step system

Dimension served: Capability. Prerequisites: Module 08. Artifact: A working multi-step system — a chain of prompts, an n8n / Zapier flow, a simple custom script — that automates a real task end-to-end. Evaluation rubric: Does the system run end-to-end on five real inputs without manual intervention? Counted, not estimated.
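Whatever the tool, the shape of the artifact is the same: each atomic step from the Module 08 decomposition becomes one function, and the pipeline composes them. A minimal sketch in plain Python — `call_model` is a hypothetical stand-in for whatever model API the team actually uses, and here it echoes deterministically so the sketch runs offline:

```python
def call_model(prompt: str) -> str:
    # Stand-in for a real model API call; echoes the prompt's first line
    # so the chain is runnable without credentials.
    return f"[model output for: {prompt.splitlines()[0]}]"

def summarize(ticket: str) -> str:
    return call_model("Summarize this support ticket in one sentence:\n" + ticket)

def classify(summary: str) -> str:
    return call_model("Label this summary as billing, bug, or other:\n" + summary)

def run_pipeline(ticket: str) -> dict:
    # Each function maps to one atomic step from the Module 08 decomposition.
    summary = summarize(ticket)
    label = classify(summary)
    return {"summary": summary, "label": label}

result = run_pipeline("I was charged twice this month.")
```

The same structure translates directly into an n8n or Zapier flow: one node per function, data passed between them.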

Module 10 — Evaluating systems, not just prompts

Dimension served: Capability + Governance. Prerequisites: Modules 09 + Cluster 2. Artifact: An evaluation harness for the system from Module 09 — a small set of test inputs, expected outputs, and pass/fail criteria — used to detect regressions when the system is changed. Evaluation rubric: Does running the harness reliably catch a deliberate breaking change? Demonstrated, not asserted.
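The harness itself can be very small. A sketch assuming a `pipeline(text)` callable that returns a dict with a `"label"` key (matching the shape of the Module 09 example); the inputs and expected labels are illustrative:

```python
# Curated regression cases, checked into the repo alongside the system.
TEST_CASES = [
    ("I was charged twice this month", "billing"),
    ("The export button crashes the app", "bug"),
    ("What are your office hours?", "other"),
]

def run_harness(pipeline) -> bool:
    """Return True iff every curated case passes; print each failure."""
    ok = True
    for text, expected in TEST_CASES:
        got = pipeline(text)["label"]
        if got != expected:
            print(f"FAIL {text!r}: expected {expected!r}, got {got!r}")
            ok = False
    return ok

# The rubric's "deliberate breaking change" test: a pipeline that
# labels everything "other" must make the harness fail.
broken = lambda text: {"label": "other"}
assert run_harness(broken) is False
```

Running this after every prompt change is what turns "I think it still works" into a counted answer.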

By the end of Cluster 3, the learner can ship a system, watch it for regressions, and explain it to a manager who’s never seen it before.

Cluster 4 — Governance and operations (Modules 11–12)

The two modules below close the loop from individual capability back to organizational program.

Module 11 — Writing a usable AI policy

Dimension served: Governance. Prerequisites: Cluster 1. Artifact: A draft AI usage policy for the learner’s team or department, using the governance starter template as a base, in under 800 words. Evaluation rubric: Has the draft survived a review by a security or legal stakeholder with at most surface-level edits?

Module 12 — Running a recurring program

Dimension served: Governance + Adoption. Prerequisites: Cluster 1, Module 11. Artifact: A documented monthly program rhythm for the team — what gets reviewed, who runs the meeting, what artifacts get tracked — plus the first month’s review notes. Evaluation rubric: Did the second month’s meeting happen without the learner having to chase anyone? The test of a real program.

How the curriculum is calibrated

Three rules govern the way we update the modules:

  1. Modules are versioned to specific tools. A prompting module written for the GPT-3 era is not the same module today. We re-author each module against the current generation of tools at least quarterly. If your curriculum says “as of April 2026”, that’s why.
  2. Artifacts are language-agnostic but tool-specific. Module 09 might use Cursor in an engineering team and n8n in an operations team. The artifact — “a working multi-step system” — is the same; the tool is matched to the team.
  3. Evaluation rubrics are public. Every rubric above is something a learner can read in advance. There are no surprise gates. The point of an evaluation rubric is to clarify the bar, not to grade in secret.

If you’re considering the curriculum for your organization, the assessment will tell you which clusters to start with — Emerging Capability points at Cluster 1, Developing Capability at Clusters 2–3, Mature Capability at Cluster 3 deep-dive plus Cluster 4. Mature Governance plus Emerging Capability is the inverse: the program is in place, the skill is not.

The point of publishing the full sequence is the same as publishing the rubric. If we won’t show you what’s actually in the curriculum, we shouldn’t ask you to enroll your team in it.


Where does your org actually stand?

Ten minutes. Three dimensions. A leadership-shareable baseline.