Changelog
Hand-maintained for Wave 1, reverse-chronological. A changelog entry is added in the same pull request as any spec-affecting backend change.
Decision-loop guides — fidelity, decision packets, data room — 2026-06-20
No API change. Three new developer guides under Guides, grounded in
the shipped module code, give the RTI decision wave its API-tier coverage:
Intervention Fidelity Tracker (the escalation gate: rate, rating,
verdict, and escalationBlocked), Decision Packets (the assemble,
approve, and reject workflow that calls the RTI engine to move the tier), and
Student Data Room (the dossier + evidence index + packet shell). Each
documents its real routes, permissions, and field shapes, with the fail-closed, append-only,
version-pinned, controlled-reason, and never-student-facing invariants. English + Arabic parity.
Platform Guide — external documentation portal — 2026-06-20
No API change. A second documentation flavor for a non-developer audience, served under the
/external path prefix. The navbar carries two tabs, Developer Docs and Platform Guide, and
each portal shows only its own menu (the sidebar switches per portal). Developer-doc URLs are
unchanged.
- 27 deep pages in English and Arabic, grounded in the actual module code, organized into six sections: Measurement & Decisions, Intervention & Teaching, Monitoring & Response, Reports & Dashboards, Platform Foundations, plus Overview, Trust & Privacy, Roadmap, and an FAQ. Each page explains a real capability end to end with its true rules and thresholds, in plain product language (no endpoints, schemas, or internal identifiers).
- Growth language only: no clinical or program labels, no single overall score, and an honest split between what is available now and what is planned.
- Both portals share one app, theme, and search; full EN/AR parity (60/60) holds.
Portal — Phase 1 comprehensive guides — 2026-06-20
No API change. A documentation-only pass that brings the portal current with the ~21 RTI-engine work packages shipped since the portal was first built. The provisioning docs previously stood alone; the measurement-led RTI loop now has full conceptual coverage.
- New “How Phase 1 Works” concept section — The RTI Loop, Measurement & Determinism, The Decision Engine, Intervention Design, and Safety Rules.
- 16 new subsystem guides under Guides — diagnostic & practice, skills taxonomy, skill-status engine, student profiles, intervention bundles, smart clustering, teacher activation, progress monitoring, CBM & ORF fluency, RTI decisions (preview), standards & benchmarks, task delivery, reporting & parent PDF, cognitive warm-up, language safety, and system QA checks.
- New Glossary and three stale-page fixes (Overview, Errors decision-engine “safe-stop” outcomes, Managers & Principals rollup now populated).
- Every page ships in English + Arabic (33/33 parity); all internal links resolve.
Parent Web Portal — 2026-06-20
Four new read-only operations under a new parent-portal tag (the count moves to 171) — an authenticated
parent’s view of their own children.
- GET /api/parent/children (linked children), …/:childId/diagnostics (status-only history), …/:childId/growth (macro-domain status trajectory, Arabic labels only), …/:childId/recommendations (≤5 at-home suggestions from the D2 pre-generated narrative; never an LLM at runtime).
- Own-child only. The parent is the authenticated user (no :parentId in any path → no IDOR); each :childId is checked against the ParentStudent link: an unlinked in-org child → 403, a cross-org/non-existent child → 404. Multi-child parents see all their linked children.
- No-leak by construction (two belts). A typed allow-list DTO + the canonical parent-stricter string filter: no θ, no numbers, no percentages, no internal labels ever reach a parent (a child with a real θ/profile/percentage in their data shows none of it). Fail-closed (503) if the filter rules are missing.
- No WhatsApp digest, no phone storage (D1) — every route is GET, read-only; the portal stores nothing about the parent.
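The own-child rule above can be sketched as a small decision function. This is an illustrative Python sketch, not the shipped code: `child_access_status`, the `Child` record, and the `links` set are hypothetical names; only the 200/403/404 outcomes come from the entry.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Child:
    """Hypothetical stand-in for a student row."""
    id: str
    org_id: str


def child_access_status(
    parent_id: str,
    parent_org_id: str,
    child: Optional[Child],
    links: Set[Tuple[str, str]],
) -> int:
    """Return the HTTP status for a parent's request about one child.

    `child` is None when the id does not exist; `links` holds
    (parent_id, child_id) pairs standing in for the ParentStudent table.
    """
    # Cross-org and non-existent children look the same to the caller: 404.
    if child is None or child.org_id != parent_org_id:
        return 404
    # Same organization but no ParentStudent link: 403.
    if (parent_id, child.id) not in links:
        return 403
    return 200
```

Returning 404 for both the cross-org and the non-existent case means a parent cannot probe which ids exist in another organization.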
Multi-School Admin Overview — 2026-06-20
Three new operations under a new multi-school tag (the count moves to 167) — an org-level rollup for a
network manager, built on top of the school-level board view.
- GET /api/orgs/:orgId/overview: a rollup card per school (name + total students + class count + how many need support).
- GET /api/orgs/:orgId/comparisons: the 6 reading-domain tiles × N schools, side-by-side, as status counts.
- GET /api/orgs/:orgId/trends: an org time-series, one line per school × tile, over month windows.
- The comparison is a neutral status rollup, not a ranking: no best/worst ordering, no school leaderboard. Cross-Cutting shows the “coming soon” placeholder.
- Aggregate only, by construction. It shows class names but never a student name or id in any of the three responses, enforced by the same lint guard (extended to this module) + tests that seed a recognizable name and assert its absence. No θ, no single overall percentage (V-3); Not_Assessed excluded from rollups.
- A manager sees all their org’s schools by default (no partial-scoping UI in Wave 1); a cross-org id returns 404. Manager / Admin only; never teacher- or student-facing.
- Reuses the shipped school-aggregate engine + readers (no new tables, no measurement logic). Batched reads meet the ≤5 s budget for a 10-school org.
Parent Narrative Coverage (AI-generated, provisional) — 2026-06-20
One new admin operation (the count moves to 164) plus a content + data change so the parent report and portal show a written growth explanation for every profile, not just the two partner-authored ones.
- 56 Arabic parent narratives (the 12 primary + 5 modifier profiles × grade bands) are generated from the profile specs and seeded as a provisional layer. Each is stamped ai_generated + needs_sme_review and passes the parent-stricter filter with zero changes: no numbers, no θ, no internal labels, growth language only (the same V-12/V-3 discipline as everything parent-facing).
- They ship live, flagged (Mohammad’s call): a parent sees the explanation now; the partner reviews and replaces each at the June-25 working session. Only the parent render is overridden; the teacher, admin, and student narratives stay partner-authored. The safe “not enough data yet” page remains for genuine no-profile cases, not for “narrative not written yet.”
- GET /api/admin/profile-narratives/pending-review lists the provisional rows for the SME review queue (RUN_BACKOFFICE only).
School Overview + Board PDF — 2026-06-20
Two new operations under a new board-pdf tag (the count moves to 163) — a school-level view for
principals and network managers.
- School overview JSON + a one-page A4 Arabic-RTL board PDF for a board/management meeting: per-class macro-status counts + the school-wide distribution across the 6 reading domains (+ the “coming soon” Cross-Cutting placeholder).
- Aggregate only, by construction. It shows class names (so a manager can act) but never a student name or id — a dedicated lint guards every read path, and a test seeds a recognizable name and asserts it’s absent from both the PDF bytes and the JSON. No θ, no single overall percentage (V-3).
- Reuses the shipped reporting readers + the shared lib/pdf primitive (embedded Arabic font, V-6 byte-determinism) that the parent PDF now also builds on: one PDF recipe, two layouts.
- Manager / Principal / Admin only (a principal sees their own school; a manager sees their org’s schools); never teacher- or student-facing. ≤500 KB, ≤30 s, fail-closed.
v5 vocabulary alignment — 2026-06-20
No behavior change. The RTI decision layer now stores the partner’s canonical v5 status-code names directly,
removing an interim translation layer. Data-sufficiency reads sufficient / insufficient / contextual; tier
decisions read escalate / de_escalate / maintain / defer; RTI alerts use the 5-kind catalog Skill_Alert /
Student_Alert / Intensive_Review / acute_regression_review / tier_review_due; the approval role set adds
specialist; the decision packet reads draft / under_review / approved / rejected / deferred. Only the spelling
of the stored tokens changed: decision logic, guardrails, and determinism (V-6) are untouched, and the OpenAPI
schemas reflect the v5 names.
Decision Packet Workflow — 2026-06-20
Four new operations under the new decision-packet tag (the count moves to 161). The closing piece of the
RTI decision wave: the review-and-approval workflow that turns a tier-movement recommendation into an
audited decision — and only then moves the child’s tier.
- Assemble. POST /api/decision-packet-workflows/assemble opens a rich decision packet (evidence snapshotted by reference, every rule/config version pinned for replay), computes the deterministic tier recommendation, and records a workflow row awaiting review.
- Review + approve. GET /api/decision-packet-workflows/{id} is the reviewer’s surface; POST …/{id}/approve routes the recommendation through the approval gate (a Tier-3 move requires the high-intensity specialist, admin-seated for Wave 1) and on approval executes the tier movement through the RTI engine’s single tier-write path, then closes the packet. POST …/{id}/reject records a controlled-reason decline; no movement.
- No movement without approval; the decision reason is a controlled code, never free text. A sentinel “need more evidence” halt, an unapproved Tier-3 move, or an incomplete dossier all block the movement (fail-closed). A closed packet + an applied decision are immutable forever (replay reproduces them).
- This workflow adds NO tier write of its own — it calls the RTI engine to move the tier and the Data Room to open/close the packet. Append-only, tenant-isolated, version-pinned (V-6), and never shown to a student.
Intervention Fidelity Tracker — 2026-06-20
Five new operations under the new fidelity tag (the count moves to 152). The gate that distinguishes
“the plan didn’t work” from “the plan wasn’t done,” so the platform never escalates a child whose intervention
was never actually run.
- The escalation gate. GET /api/fidelity/{bundleAssignmentId}/check returns the synchronous verdict the RTI engine calls before any Tier 2 → Tier 3 move: the fidelity rate (sessions done ÷ planned), a 5-value rating, a 4-state verdict, and escalationBlocked. Faithful delivery (≥80%) with weak growth MAY escalate; under-80% with weak growth is BLOCKED; no session data is “cannot evaluate yet”, never a guessed pass.
- Recompute + tracker + cohort. POST …/recompute appends a fresh immutable snapshot (advisory-locked, idempotent); GET …/{bundleAssignmentId} reads the latest; GET /api/students/{id}/fidelity-tracker feeds the teacher’s Fidelity tab; GET /api/fidelity/admin/low-fidelity-cohort lists the at-risk plans (admin).
- A growth-language paragraph. Each verdict auto-drafts a short Arabic review-meeting paragraph (rule-written, never AI), filtered for clinical/framework/deficit language.
- Append-only + version-pinned (V-6); teacher-scoped (B.4); tenant-isolated; never shown to a student. The persisted verdict is what the RTI fidelity gate now reads — no guessed up-move can slip past a plan that wasn’t delivered.
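The gate logic described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function name and the verdict strings are hypothetical (the entry does not list the real 4-state codes), the caller is assumed to have already established weak growth, and treating “cannot evaluate yet” as blocked is an assumption drawn from the fail-closed wording. Only the ≥80% threshold and the rate formula come from the entry.

```python
from typing import Optional

FIDELITY_FLOOR = 0.80  # the >=80% threshold quoted in the entry above


def escalation_gate(sessions_done: Optional[int], sessions_planned: Optional[int]) -> dict:
    """Gate a Tier 2 -> Tier 3 move for a plan that already shows weak growth.

    Verdict strings are illustrative placeholders, not the real codes.
    """
    # No usable session data: refuse to guess, so the move stays blocked.
    if sessions_done is None or not sessions_planned:
        return {"rate": None, "verdict": "cannot_evaluate_yet", "escalationBlocked": True}

    rate = sessions_done / sessions_planned  # sessions done / planned
    if rate >= FIDELITY_FLOOR:
        # The plan was delivered faithfully and still did not work.
        return {"rate": rate, "verdict": "may_escalate", "escalationBlocked": False}
    # The plan was not delivered; escalating would punish the child for it.
    return {"rate": rate, "verdict": "blocked_low_fidelity", "escalationBlocked": True}
```

For example, `escalation_gate(7, 10)` reports a blocked move, while `escalation_gate(8, 10)` allows the escalation to proceed to review.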
Student Data Room — 2026-06-20
Five new operations under the new data-room tag. The Data Room is the decision-documentation layer: it
assembles a child’s evidence into a reviewable dossier so that big decisions are documented, evidence-based,
and replayable years later. It never decides — the rules still decide; the Data Room makes them show their work.
- One dossier per child. GET …/data-room returns the eight summary rollups (benchmark, CBM trend, practice, paper checks, maintenance, error patterns, fidelity, behavior flags) + a data-sufficiency status. The summaries are system-rendered rollups of referenced evidence: never free text, never an input to a rule.
- Evidence is indexed, never copied. GET …/data-room/evidence pages the evidence index; every row points back to its real source record. Invalid-administration evidence stays indexed (nothing is deleted) but is excluded from the sufficiency count.
- Decision-packet shell. POST …/decision-packets opens a packet, snapshots the evidence by reference, and pins the rule versions it read, so the packet replays byte-for-byte years later. A second open of the same review type returns 409; a closed packet is immutable forever.
- Weak data blocks a strong decision. When a skill is do_not_decide_yet, or the evidence is below the comparable floor, the dossier reports insufficient and recommends the next data action, never a movement.
- No tier movement here. The Data Room reads and composes the tier-decision record; it never writes one. The only tier write-path stays in the RTI engine.
- Students never see any of this — there is no student endpoint; teacher (their own classes) / admin only.
RTI Decision Engine — 2026-06-20
Ten new operations under the new rti tag. This is the central decision layer: it raises the right hand on
the teacher’s dashboard, and it is the single gate every RTI Tier change goes through.
- Two alert layers. Layer A is the per-skill state (on_track / monitoring / maintenance, math-driven from the skill-status engine); Layer B is the per-student alert: Skill_Alert (a single-skill nudge), Student_Alert (this student needs a Targeted Support Plan), Intensive_Review (the deepest review). At most one alert is open per student; a more-severe alert auto-closes the lower one.
- The 4 tier-movement rules. Every tier change runs TM_ESC_01 (escalate) / TM_MAINT_01 (maintain) / TM_DEC_01 (de-escalate) / TM_BLOCK_01 (block on low fidelity). The engine recommends; the teacher approves; the specialist (admin-seated Wave-1) gates TIER_3. GET …/recommendation returns the deterministic verdict; POST …/decision commits it through the single audited write path.
- Fidelity before escalation. TM_BLOCK_01 runs before TM_ESC_01: “the plan didn’t work” and “the plan wasn’t done” are different verdicts. Until the fidelity tracker ships, escalation fails closed (blocked), never a guessed up-move.
- do_not_decide_yet blocks everything. When the evidence isn’t sufficient, no alert opens and no tier moves, for any role, with no override. The teacher sees a quiet “need more evidence” hint, never a red alarm.
- No direct tier set, ever. A tier changes only through the gated decision path that writes an append-only decision record in the same transaction. Dismissals use a controlled reason code, never free text. Students never see any of this.
CBM / ORF Progress-Monitoring Engine — 2026-06-20
Ten new operations under two new tags — orf + cbm (the count moves to 129). The platform now
measures reading fluency and watches it improve over time.
- Teacher-scored ORF, no AI. POST /api/orf/assessments/{start, :id/mark, :id/stop} records a 60-second oral read by teacher error-taps (no speech recognition, no LLM; audio archive-only) → words-correct-per-minute + accuracy → an append-only cbm_scores row.
- Aim line + 4-point trend. POST /api/cbm/probes writes a probe; the aim line activates after three baseline probes (median baseline), and the last four comparable points yield a typed trend (trend_above_goal / trend_below_goal / trend_mixed / insufficient_points). GET /api/cbm/trend/... returns the series + aim line; GET /api/cbm/alerts returns the open alerts.
- Emit-only (V-12). The engine emits the signal + a raw alert; it does not apply the fidelity gate or write the teacher message (those are WP-RTI-BE + the fidelity tracker, which read the alert behind an ≥80% gate). Acute regression raises the signal, never auto-fails. Dismissal uses a controlled reason code, never free text.
- Missing norms are honest. Grades with no benchmark norm yet (G1/G4 partner-owed; G2–3 seeded) score and trend on raw WCPM but report “not assessed” rather than judging against an invented cut.
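The baseline and trend rules above can be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, the aim-line values are passed in rather than computed, and the all-above / all-below classification is the conventional four-point decision rule, which the entry names but does not spell out. The trend codes and the three-probe median baseline come from the entry.

```python
from statistics import median
from typing import Optional, Sequence


def baseline_wcpm(probes: Sequence[float]) -> Optional[float]:
    """Median of the first three baseline probes; None until three exist."""
    if len(probes) < 3:
        return None
    return median(probes[:3])


def four_point_trend(scores: Sequence[float], aim_values: Sequence[float]) -> str:
    """Classify the last four comparable points against the aim line.

    `aim_values[i]` is the aim-line value at the date of `scores[i]`.
    """
    pairs = list(zip(scores, aim_values))[-4:]
    if len(pairs) < 4:
        return "insufficient_points"
    if all(score > aim for score, aim in pairs):
        return "trend_above_goal"
    if all(score < aim for score, aim in pairs):
        return "trend_below_goal"
    # Any mix of above, below, or exactly-on-line points.
    return "trend_mixed"
```

With three or fewer comparable points the function reports `insufficient_points` rather than guessing a direction, which mirrors the typed-trend contract described in the entry.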
Cognitive Warm-Up Engine — 2026-06-20
Seven new operations under the new warmup tag; the count moves to 119. The backend for the 3–5 minute,
rule-based, deterministic, non-linguistic warm-up a child plays before a session (shapes, colors, paths —
never words, letters, or numbers).
- Deterministic, seeded generation. 10 Tier-1 activity schemas + 24 per-grade configs through one central generator; the same (template, grade, difficulty, seed) always yields the identical exercise (V-6 replayable, validated by POST /api/admin/warmup/dry-run-replay).
- Resumable + capped. POST /api/warmup/sessions → …/events → …/complete, with …/resume restoring the child at the exact exercise step. A silent 5-minute cap; a skip is penalty-free (no retry, no warmup_abandoned flag).
- Isolated from measurement, by construction. The warm-up never touches θ, never emits a mastery event, never mints XP, and has no foreign key into any measurement/decision table. A CI import-graph guard + a schema-FK test prove it; telemetry is a readiness signal stored in the warm-up’s own tables only.
- Teacher/admin-gated, ClassTeacher-scoped; no student endpoint (the exercise is handed to the renderer).
- Partner-owed (June-25): COG-05 (Matching) + COG-06 (Visual Tracking) per-grade configs ship inactive; per-grade bounds + the SVG asset library are transcribed working defaults.
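One common way to obtain the “same inputs, identical exercise” property is to derive a private random generator from the input tuple. The sketch below illustrates that technique only; `generate_exercise`, the shape and color lists, and the exercise layout are all hypothetical and do not describe the shipped generator.

```python
import hashlib
import random
from typing import List, Tuple

# Hypothetical non-linguistic building blocks (no words, letters, or numbers).
SHAPES = ["circle", "square", "triangle", "star"]
COLORS = ["red", "blue", "green", "yellow"]


def generate_exercise(template: str, grade: int, difficulty: int, seed: int) -> List[Tuple[str, str]]:
    """Build a shape/color sequence that depends only on the four inputs."""
    # Hash the full input tuple so every field influences the stream.
    key = f"{template}|{grade}|{difficulty}|{seed}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # A private Random instance avoids any dependence on global RNG state.
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    length = 3 + difficulty  # illustrative: harder exercises are longer
    return [(rng.choice(SHAPES), rng.choice(COLORS)) for _ in range(length)]
```

A replay check in this style regenerates the exercise from the stored tuple and compares it with what was originally served.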
Student Diagnostic + Practice APIs — 2026-06-20
Ten new operations under two new tags — diagnostic-session (6) + practice (4); the count moves to
112. This is the student side of the RTI loop: a child sits a diagnostic, then practices adaptively.
- Diagnostic (WP-04). POST /api/diagnostic-sessions/start → serve item → …/responses → …/finish, per grade (Grade-1 audio-first, keyed on the student’s own grade, never the class’s). 24-hour save/resume (a signed state token whose trust root is a server-side ownership re-check), a 15-minute active-time cap (pauses + audio replays don’t count). The server grades each answer and chains θ itself, so a student cannot fake correctness or steer their own score. Scoring runs through the engine; items through task-delivery.
- Practice + adaptive CAT (WP-05). POST /api/practice/queues/seed/:diagnosticSessionId builds a queue from the diagnostic; …/current/next-item picks the most informative item at θ±0.5; …/submit-response scores it. Per-block auto-difficulty (80/60) is on, adjusting a durable per-(student × sub-skill) difficulty band that carries across sessions (a student resumes at the level they reached). SOLO hint on a wrong answer.
- Student-safe by construction. Every student-facing response is allow-list serialized: no θ / status-label / profile / bundle / scaffold-tier / accuracy ever reaches the student. Student-own-scoped (a student can only touch its own session); a teacher reads session summaries for its own class only.
- The browser cache + 5-disruption restore is a later front-end change; the backend ships + tests the resume contract. Error-category vocabulary + the exact 80/60 thresholds are partner-owed (June-25), built behind documented working defaults.
Parent PDF Service — 2026-06-20
One new operation under the new parent-pdf tag (the count moves to 113). A teacher can export a
parent-facing growth report as an A4 Arabic right-to-left PDF — read-only composition, no measurement or
decision logic, no writes, no new table.
- The endpoint. POST /api/parent/pdf/:childId returns the generated PDF directly (A4, Arabic RTL, body font ≥ 12 pt, ≤ 500 KB, well under 30 s). Teacher- or admin-triggered; never student-facing.
- Strictest safety in the product. Every string passes the parent-stricter filter + a typed field strip: parents see no numbers, no percentages, no scores, no internal codes; growth language only (V-3 / V-12).
- Fail-closed. No resolved report → an explicit “needs more data” page, never a fabricated profile. No active safety rule set → 503 rather than ship unfiltered text.
- Arabic that travels. The Arabic font (Noto Naskh, embedded) ships inside the file, so it renders correctly when shared (e.g. via WhatsApp) regardless of the recipient’s device fonts. Deterministic content (V-6); no LLM (V-5).
Sub-Skill Report APIs — 2026-06-20
Three new operations under the new reporting tag (the count moves to 112). The read-side surface that
turns persisted engine outputs into teacher/admin-facing reports — read-only composition, no measurement
or decision logic, no writes.
- Per-student report. GET /api/reports/students/:id/sub-skill-report composes five sections in one payload: the educational profile (pre-written teacher narrative), the macro-domain tiles (plus GLOBAL meta), the per-skill statuses sorted “needs intervention” first, the active intervention plan, and any open alerts. A student with no resolved profile returns a dataIncomplete section (a 200, not an error).
- Per-class summary. GET /api/reports/classes/:id/class-summary returns per-skill count distributions and the class macro status (worst-case-wins, never an average). A teacher sees per-student rows for their own class; ?audience=manager switches to aggregate-only (no student names or ids ever).
- Band reference. GET /api/reports/band-descriptions returns the 5-band Mizan catalog (cached 24h); Arabic labels are working defaults pending partner sign-off.
- Safety. No global percentage anywhere (V-3); every string passes the language-safety filter (V-12); every query is org-scoped (cross-org → 404); no LLM (V-5). Teacher (ClassTeacher-scoped) + admin only.
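The “worst-case-wins, never an average” rollup for the class macro status can be sketched as a maximum over an ordered status scale. This is an illustration only: the function name is hypothetical, and the status vocabulary and its ordering are borrowed from the Standards Spine entry as an assumption; the report endpoints may use different labels.

```python
from typing import Iterable

# Assumed severity order, mildest to most severe (illustrative only).
SEVERITY = {"meets": 0, "approaching": 1, "below": 2, "severe": 3}


def class_macro_status(statuses: Iterable[str]) -> str:
    """Return the most severe assessed status; never an average."""
    # not_assessed is excluded from the rollup rather than counted as zero.
    assessed = [status for status in statuses if status != "not_assessed"]
    if not assessed:
        return "not_assessed"
    return max(assessed, key=lambda status: SEVERITY[status])
```

A class with many “meets” rows and a single “severe” row therefore reports “severe”, so one struggling group is never averaged out of view.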
Diagnostic Engine (θ-scoring) — 2026-06-20
Four new operations under the new engine tag (the count moves to 96). This is the measurement core:
it turns a student’s diagnostic answers into an ability estimate (θ) and picks the next best question.
- Score. POST /api/engine/score takes the student’s responses + a prior θ and returns {newTheta, newSE, calibrationVersion, nextItemId, calibrationProvisional}. Every scoring run is recorded as an append-only DiagnosticSession. Reads/writes are org-scoped; gated by the USE_DIAGNOSTIC_ENGINE permission, and a student token can only score its own responses.
- Deterministic + replayable (V-6). Same inputs → byte-identical θ to 4 decimal places. Each session freezes the exact item difficulties it used (resolvedDeltas) so POST /api/engine/replay/:sessionId reproduces the original score forever, even after the calibration pipeline later updates those difficulties.
- Two-track difficulty. A question’s measured difficulty is trusted once it has ≥300 real answers (and matches the session’s pinned calibration version); otherwise the deterministic estimate is used and the row is flagged provisional.
- No LLM, ever (V-5). The whole scoring path is a pure deterministic function, covered by the no-LLM CI lint and an integration spy. Insufficient data → newTheta: null (a 200, the engine’s “don’t decide yet”).
- Reads: GET /api/engine/sessions/:id + GET /api/engine/sessions/by-student/:studentId. The Wave-1 score is a deterministic placeholder, not real Rasch psychometrics; the real statistical engine swaps in later (calibration pipeline) with zero contract change.
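The two-track difficulty rule can be sketched as a single selection function. The `Item` record, its field names, and `resolve_delta` are hypothetical; only the ≥300-answer threshold, the version-match condition, and the provisional flag come from the entry.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MIN_REAL_ANSWERS = 300  # threshold quoted in the entry above


@dataclass(frozen=True)
class Item:
    """Hypothetical view of one item-bank row."""
    estimated_delta: float            # deterministic estimate, always present
    measured_delta: Optional[float]   # calibrated difficulty, if any
    answer_count: int                 # real answers behind the measurement
    calibration_version: Optional[str]


def resolve_delta(item: Item, pinned_version: str) -> Tuple[float, bool]:
    """Return (difficulty, provisional) for one item in one session."""
    trusted = (
        item.measured_delta is not None
        and item.answer_count >= MIN_REAL_ANSWERS
        and item.calibration_version == pinned_version
    )
    if trusted:
        return item.measured_delta, False
    # Fall back to the estimate and flag the result as provisional.
    return item.estimated_delta, True
```

Storing the returned difficulty with the session is what would let a later replay reproduce the score even after the measured value changes.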
System QA Checks Registry — 2026-06-20
Four operations under the new qa tag (the count moves to 92). A 32-check registry records the
cross-cutting safety invariants the decision, intervention, and content engines already enforce — this layer
registers and routes them; it adds no new check logic.
- Observability for admins/engineering only. GET /api/qa/checks returns the 32-row registry; GET /api/qa/runtime-alerts lists the routed runtime alerts for your organization (filter by severity, time window, or qaCheckId); GET /api/admin/qa/audit-summary rolls them up by check + severity. All require VIEW_QA_REGISTRY (or super-admin/backoffice); never reachable by a teacher or student token.
- Deterministic routing. A runtime check failure is routed purely by its registered severity: critical surfaces to the teacher dashboard + pings the alert seam, high pings engineering, medium records the alert, low logs only. Every routed alert pins the registry + rule version for replay.
- No new alert sink. Routed alerts append to the existing platform alert log (append-only); the QA read never returns non-QA alert rows.
- New: POST /api/admin/qa/checks-set (registry version-publish). Super-admin / backoffice only. Mirrors the bundle and decision-engine config-publish: it takes a candidate check-set with partner approval and an acknowledged diff, runs a completeness gate (all 32 checks present, each command runnable, each Arabic message safe), and returns the prospective new version, rejecting gaps, missing approval, or an unacknowledged diff.
- New: QA_032 over-assessment guard. A medium-severity alert (never a block) that flags when a student is measured too often in a short window. The probe is always recorded; the alert just nudges the teacher to teach before testing again. The over-testing threshold and window are partner-tunable (working defaults pending the June partner session).
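Severity-only routing amounts to a fixed lookup table. The sketch below shows that shape; the target names, the `route_alert` function, and the returned field names other than `qaCheckId` and `severity` are hypothetical, and the table covers only the four severities named in the entry.

```python
from typing import Dict, Tuple

# Severity -> routing targets, following the wording of the entry above.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "critical": ("teacher_dashboard", "alert_seam"),
    "high": ("engineering",),
    "medium": ("alert_log",),
    "low": ("log_only",),
}


def route_alert(qa_check_id: str, severity: str, registry_version: str, rule_version: str) -> dict:
    """Route a failed runtime check using nothing but its registered severity."""
    if severity not in ROUTES:
        # An unregistered severity is a registry bug; do not guess a route.
        raise ValueError(f"unknown severity: {severity!r}")
    return {
        "qaCheckId": qa_check_id,
        "severity": severity,
        "targets": ROUTES[severity],
        # Pinned versions let a replay reproduce the same routing later.
        "registryVersion": registry_version,
        "ruleVersion": rule_version,
    }
```

Because the route is a pure function of the registered severity, the same failure always lands in the same place for a given registry version.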
Language-Safety Layer — 2026-06-20
Six new operations under the new language-safety tag (the count moves to 102). The deterministic,
rule-based rewriting layer that sits in front of every string the platform renders, saves, or prints — it
converts clinical/deficit phrasing to growth language, strips international-framework labels, and applies a
stricter filter for parent audiences. It is never a language model (V-5) and never changes a measurement
value — only how it is described (V-3/V-12).
- Validate any string. POST /api/language-rules/validate runs sanitize(text, appliesTo, profileKey?) and returns the safe text + the rules that fired + the rule-set version (V-6 audit replay). Audience tiers: ui / reports / parent / parent_stricter / teacher_note / student / admin. The parent-stricter tier also blocks numeric scores and internal IDs; a per-profile do_not_say overlay applies at render time.
- Admin rule management. GET /api/language-rules returns the active rule set (consumed at copy-load by the app + PDF renderer); POST /api/language-rules inserts a new version (append-only: the prior row is deactivated, never edited in place); PATCH /api/language-rules/{id}/deactivate retires a row.
- Controlled context flags only. POST /api/students/{id}/context-flag attaches a controlled flag from the closed dictionary; no free-text body field exists. A blocked/forbidden flag returns 422 context_flag_blocked and records a length-only security-audit entry (no text about a child is ever stored). GET /api/students/{id}/context-flags returns the recorded flags + their safe labels (teacher/admin only).
- Fail-closed. A render that cannot load an active rule set refuses rather than emit unfiltered text.
Standards Spine + Dynamic Benchmark Management — 2026-06-20
Eight new operations under the new standards tag (the spec now serves 110 operations). This is how a
raw score becomes “on track / approaching / below / severe” — and how those thresholds are managed without
touching code.
- One verdict vocabulary, never a single score (V-3). Every measure resolves to one of meets / approaching / below / severe / not_assessed. not_assessed is never treated as zero and is excluded from every rollup. There is no “overall reading score” anywhere.
- Thresholds live in data, versioned, swappable. GET /api/standards/benchmark-profiles/resolve walks a deterministic 3-step chain (exact → country-default → global fallback) and never returns null: a seeded global row guarantees a resolution. Super-admins manage the rows under /api/admin/benchmark-profiles; an activate is an atomic swap (the old version deactivates, the new one activates, both audited).
- History never shifts (V-6). A session pins its benchmark versions at the start; a later swap does not change in-flight verdicts, and every stored result re-derives its original status forever.
- Zero framework labels (FR-STD-1). International yardsticks (used only to calibrate the Arabic thresholds) are stripped at the serialization layer — they never reach a teacher, principal, parent, or student surface. The principal dashboard speaks only Curriculum-KPI Arabic labels.
- Principal reads + a Ministry-report stub. GET /api/principal/schools/:id/{overview,heatmap} return Curriculum-KPI gauges + a computed School Literacy Health Index (no PII). The one-click Ministry report is registered as a Wave-1 contract stub (501); the PDF lands in Wave 2.
Benchmark numbers are partner-owned (extracted from the v3.3 workbook — Jordan + Palestine); the global
fallback rows + Curriculum-KPI label wording are flagged working defaults pending the June-25 partner session.
Migration: 20260620150000_wp_std_be. PRD: 12-benchmarks-and-country-config (FR-STD-1..6).
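The 3-step resolve chain can be sketched as an ordered lookup. This is a sketch under assumptions: the key shape (measure, country, grade), the step labels, and the function name are hypothetical; the entry specifies only the order exact → country-default → global fallback and that a seeded global row guarantees a result.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (measure, country, grade); None marks a wildcard.
Key = Tuple[str, Optional[str], Optional[int]]


def resolve_benchmark_profile(rows: Dict[Key, dict], measure: str, country: str, grade: int) -> Tuple[str, dict]:
    """Walk exact -> country-default -> global fallback and return the first hit."""
    chain = [
        ("exact", (measure, country, grade)),
        ("country_default", (measure, country, None)),
        ("global_fallback", (measure, None, None)),
    ]
    for step, key in chain:
        if key in rows:
            return step, rows[key]
    # The seeded global row should make this unreachable; fail loudly if not.
    raise LookupError(f"no global fallback row seeded for measure {measure!r}")
```

Returning the step label alongside the row makes it visible to the caller when a verdict rests on a fallback rather than an exact profile.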
Teacher Activation Workflow — 2026-06-20
Seven new operations under the new workflow tag (the count moves to 88). A 17-step workflow walks a
teacher through review → group → activate → monitor → decide, composing the bundle, cluster, and monitoring
engines into one guided flow.
- The teacher decides; the system blocks unsafe moves (V-12). POST /api/workflow/activate will not activate a do_not_decide_yet plan, and an invalid lifecycle move is rejected (409). The RTI tier is never changed by any workflow action.
- Guardrails at the boundary. Insufficient evidence → a prompt, not an activation; an open acute-regression alert forces the review path (POST /api/workflow/review resolves it by an explicit teacher disposition); a DATA_INCOMPLETE student is held out of bulk activation.
- Per-member adjustments are deliberate. Any scaffold-tier / delivery / schedule change off the recommendation requires explicit confirmation (else 409) and a controlled reason code, never free text.
- No new data path. Workflow effects route through the existing activation / cluster / monitoring write paths; the triggering workflow step is stamped on the assignment history.
- POST /api/workflow/save-draft persists an in-progress workflow (7-day), restored via GET /api/workflow/draft/:teacherId/:classId.
- Also: the acute-regression alert response no longer includes contextFlagId; a dismissal’s context flag is recorded only in the context-flag log (the single sanctioned home).
Logging & Observability — 2026-06-20
Two new operations under the new logging tag (the count moves to 81). Completes the platform
logging contract on top of the logger core + request correlation shipped earlier.
- POST /api/logs/client: the browser ships its WARN/ERROR logs to the server (warn/error only; rate-limited to 1 batch per user per 5s). The server stamps userId / orgId / ts from the authenticated identity, never from the request body, so a client cannot mislabel who or which tenant a line belongs to. The intake endpoint deliberately does not log itself (infinite-loop guard).
- GET /api/health/log-level: an unauthenticated ops probe returning the active global log level and any per-module overrides, so operators can confirm what a deployment is running without a redeploy.
- No schema change. Logging is a process concern; client-log persistence is a Wave-2 question.
Task Delivery Service — 2026-06-20
Four new operations under the new delivery tag (the count moves to 79). This is what happens at
student time: the platform picks the right pre-authored item and serves it exactly as stored — it
never composes content at runtime.
- Select + serve. POST /api/delivery/task filters the operational item bank on (sub-skill × scaffold tier × task_mode × difficulty × dialect), picks deterministically, and returns the stored item. It is the single serve chokepoint every consumer (diagnostic, practice, probes, ISM) calls per task.
- No improvisation. An empty candidate pool returns 200 { served: null, log.reason: "no_eligible_item" }, never an error and never a silent substitute. The caller decides the UX; the gap aggregates to GET /api/admin/delivery/gap-summary (by sub-skill × tier × mode) for content-volume planning.
- Tier + mode are filters, not render switches. Per-tier variants are distinct stored rows; probes ignore tier and show no hints or mid-probe feedback (probe integrity). The student never sees a profile/bundle/tier label.
- Deterministic + auditable. Every serve writes an append-only served_task_instance keyed by a UUID taskInstanceId (the QR-worksheet join token), pinning its selection inputs; POST /api/admin/delivery/dry-run-replay re-asserts the same pick (V-6) and writes nothing. No LLM on the serve path (V-5).
- The recently-served window (Wave-1 default 7 days) and per-mode feedback copy are partner-owed (June-25); teacher-enable lower-tier substitution is deferred (default is no_eligible_item).
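The select-and-serve behavior described above can be illustrated with a small sketch. It is not the platform's selector: the item field names, the seed argument, and the hash-based tie-break are all assumptions, and the real pick rule is not described in this entry. What it does show is the shape of the contract: filter on the five keys, pick the same item for the same inputs, and return served: null with no_eligible_item when the pool is empty.

```python
import hashlib
from typing import Any, Dict, List

# Hypothetical field names standing in for the five filter dimensions.
FILTER_KEYS = ("sub_skill", "scaffold_tier", "task_mode", "difficulty", "dialect")


def select_item(bank: List[Dict[str, Any]], criteria: Dict[str, Any], seed: str) -> Dict[str, Any]:
    """Filter the bank on all five keys, then pick one candidate deterministically."""
    candidates = sorted(
        (item for item in bank if all(item.get(k) == criteria.get(k) for k in FILTER_KEYS)),
        key=lambda item: item["id"],   # stable order, independent of bank order
    )
    if not candidates:
        # Empty pool: a normal 200-style result, never a substitute item.
        return {"served": None, "log": {"reason": "no_eligible_item"}}
    # Assumed tie-break: index derived from a hash of the caller-supplied seed.
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return {"served": candidates[index], "log": {}}
```

Because the pick depends only on the candidate set and the seed, replaying the same inputs yields the same item, which is the property a dry-run replay would check.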
Spec reconciliation + fixes — 2026-06-20
Maintenance pass: no new operations (the count stays at 75). Two consumer-visible API changes plus spec-corpus and test corrections.
- Organizations now return createdBy. Every GET /api/organizations/…/{id} response includes the provisioning super-admin’s createdBy (a User id); the value was already stored at creation and is now exposed (FR-TEN-1). The OrgResponse schema in the API reference is updated accordingly.
- Domain-status sentinel correctness. GET /api/sse/students/{id}/domain-status now returns the stored doNotDecideYet sentinel instead of re-deriving it at read time, honoring the FR-SSE-16 “stored, not re-derived” consumer contract (it already behaved this way on the skill-status endpoint). Consumers must still branch on doNotDecideYet before acting on a domain status.
- Taxonomy routes document 401. The four taxonomy reads (GET /api/sub-skills, …/{id}, /api/domains, /api/skill-dependencies) now declare the 401 (unauthenticated) response they already returned at runtime.
- Internal only. gateway_priority_config gains a published_at audit column (migration 20260620120000); spec-corpus reconciliations in 11-skills-taxonomy (15 active item types, the 4-route taxonomy API surface, SENT/SYN count 12, plain-view DDL), two corrected FR-ID cross-references, and added test coverage for the request-id and append-only / context-flag lint fail-branches.
Progress Monitoring Rules — 2026-06-20
Three new operations under the new monitoring tag (the count moves to 75). The platform now tracks
whether an intervention is working and raises a review when a student regresses.
- Record progress evidence. POST /api/monitoring/evidence records the non-CBM evidence types (quick check, rubric, teacher observation, worksheet, digital trend) with administration conditions; the record is append-only and triggers a deterministic recompute pinned to the rule version (V-6).
- Five safety rules govern every decision. Insufficient or non-comparable evidence → do_not_decide_yet; no single-point decisions; only comparable probes are compared. Acute regression (≥20% drop over two comparable sessions) raises an Immediate Review, never auto-fail, auto-exit, or an RTI-tier change (the teacher decides).
- Status + alerts. GET /api/monitoring/status/{bundleAssignmentId} returns the current verdict + the pinned rule versions; GET /api/monitoring/alerts lists open acute-regression alerts. An alert resolves only by an explicit teacher disposition via a controlled reason code, never free text.
- Smart Cluster safety seam closed. Cluster now reads the real acute-regression alert (was fail-open), so a regressing student is routed to individual review and kept out of group clusters.
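The acute-regression rule can be sketched as a small pure function. This is one reading of "≥20% drop over two comparable sessions", namely a relative drop between the two most recent sessions that share a comparability key. The Session type, the probe_form key, and the returned labels other than do_not_decide_yet are hypothetical; the shipped rule set may define comparability and the drop differently.

```python
from typing import List, NamedTuple

ACUTE_DROP_THRESHOLD = 0.20  # from the entry: a drop of 20% or more


class Session(NamedTuple):
    probe_form: str   # hypothetical comparability key
    score: float


def check_acute_regression(sessions: List[Session]) -> str:
    """Compare the two most recent sessions; refuse to decide on thin or mismatched evidence."""
    if len(sessions) < 2:
        return "do_not_decide_yet"            # no single-point decisions
    previous, latest = sessions[-2], sessions[-1]
    if previous.probe_form != latest.probe_form:
        return "do_not_decide_yet"            # only comparable probes are compared
    if previous.score <= 0:
        return "do_not_decide_yet"            # assumption: no relative drop from a zero baseline
    drop = (previous.score - latest.score) / previous.score
    # The outcome is a review flag for the teacher, not a tier change.
    return "immediate_review" if drop >= ACUTE_DROP_THRESHOLD else "no_acute_regression"
```

The function only ever raises a review flag; nothing in it fails, exits, or re-tiers a student, which mirrors the "teacher decides" rule.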
Smart Cluster Sequencing + Bulk Activation — 2026-06-19
Three new operations under the new cluster tag (the count moves to 72). Teachers group students who
need the same support into small clusters and activate the whole group in one confirmed action.
- Deterministic clustering. POST /api/cluster/recompute/:classId groups a class on a 5-key match (bundle + anchor + scaffold tier + compatible delivery mode + sufficient data); Level 1 + Level 2 merge for teacher-led delivery only; minimum group size 2 (school knob 3–5). Results are append-only and pinned to the sequencing-rule version (V-6 deterministic), 60-day retention.
- Safety exclusions are hard. A DATA_INCOMPLETE / do_not_decide_yet student → a more-data-needed list; an acute-regression student → individual review. Neither is ever clustered or bulk-activated.
- Bulk activation (V-12, reuses the single write path). POST /api/cluster/bulk-activate requires explicit teacher confirmation and creates each member’s assignment through the same writer individual activation uses (stamped with the cluster id); all-or-nothing (an excluded member → 422), idempotent, 409 on a stale cluster. Differentiation lives only in each student’s scaffold assignment.
- Preview. GET /api/cluster/student/:studentId/preview returns the student’s current cluster + the per-tier scaffold distribution; no student-facing internals.
- The acute_regression exclusion reads a fail-open seam until WP-PROG-MON-BE produces the signal; the nightly recompute cron is a seam (the idempotent recompute function ships now).
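A simplified sketch of the grouping and exclusion logic follows. It treats "compatible delivery mode" as plain equality, omits the Level 1 + Level 2 merge, and assumes the acute-regression check takes precedence over the data check; all field names are hypothetical. It is meant to show why the two exclusion lists never overlap with the clusters, not to reproduce the shipped rule.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

MIN_GROUP_SIZE = 2  # from the entry; a school may raise it (3 to 5)

ClusterKey = Tuple[str, str, str, str]


def cluster_class(students: List[Dict[str, Any]], min_size: int = MIN_GROUP_SIZE) -> Dict[str, Any]:
    """Group eligible students by matching keys; route excluded students to their own lists."""
    more_data_needed: List[str] = []
    individual_review: List[str] = []
    buckets: Dict[ClusterKey, List[str]] = defaultdict(list)

    # Sorting by id keeps the output identical for the same input set.
    for s in sorted(students, key=lambda row: row["id"]):
        if s.get("acute_regression"):
            individual_review.append(s["id"])      # never clustered
        elif not s.get("sufficient_data"):
            more_data_needed.append(s["id"])       # never clustered
        else:
            key = (s["bundle"], s["anchor"], s["scaffold_tier"], s["delivery_mode"])
            buckets[key].append(s["id"])

    # Buckets below the minimum size form no cluster; their members are simply left out.
    clusters = [ids for _, ids in sorted(buckets.items()) if len(ids) >= min_size]
    return {
        "clusters": clusters,
        "more_data_needed": more_data_needed,
        "individual_review": individual_review,
    }
```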
Intervention Bundle Catalog — 2026-06-19
The bundle recommender ships. Eleven new operations under /api/bundles, /api/assignments,
/api/students/:id, and /api/admin/bundle-* (the count moves to 69). It turns a student’s profile
into one recommended intervention bundle from a closed 20-bundle catalog, assigns scaffold support, and
hands the teacher a one-tap activation.
- Deterministic recommendation. POST /api/bundles/recommend maps the active profile (primary code + driver/modifier) through the 21-row profile→bundle map to exactly one bundle: anchor skill + 1–2 supporting threads + a data-supported asset bridge, with the anchor always the largest dosage share.
- A bundle is a recommendation, never automatic (V-12). POST /api/students/:id/activate-bundle requires the teacher; the 14-status assignment lifecycle is append-only + history-shadowed. Adjustments (PATCH …/scaffold-assignment | …/delivery | …/schedule, POST …/context-flag) use controlled reason codes only, never free text, and RTI tier can never be changed by any adjustment.
- Fail-closed. No active profile / DATA_INCOMPLETE / do_not_decide_yet ⇒ blocked_insufficient_evidence + a recommended next data action, never a bundle. No role can override.
- Student never sees a bundle id/name, profile code, or scaffold-tier label; these are stripped at serialization.
- Admin. POST /api/admin/bundle-config-sets (publish + completeness gate); POST /api/admin/bundle-replay (V-6 dry-run diff, zero writes). The Smart Cluster + bulk-activation endpoints arrive with WP-CLUSTER-BE.
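Stripping internal labels at serialization, as described above, can be sketched as a recursive filter over the outgoing payload. The key names below are hypothetical stand-ins for the bundle id/name, profile code, and scaffold-tier label; the actual field names are not given in this entry.

```python
from typing import Any

# Hypothetical key names for the fields a student must never receive.
STUDENT_HIDDEN_KEYS = frozenset({"bundleId", "bundleName", "profileCode", "scaffoldTier"})


def strip_for_student(payload: Any) -> Any:
    """Return a copy of the payload with hidden keys removed at every nesting level."""
    if isinstance(payload, dict):
        return {
            key: strip_for_student(value)
            for key, value in payload.items()
            if key not in STUDENT_HIDDEN_KEYS
        }
    if isinstance(payload, list):
        return [strip_for_student(value) for value in payload]
    return payload
```

Filtering at the serialization boundary, rather than in each handler, means a newly added endpoint cannot leak a label by forgetting to remove it.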
Student Profile Resolution — 2026-06-19
The profile engine ships. Six new operations under /api/profile + /api/admin/profile-* (the count
moves to 58). It composes the Skill-Status Engine’s outputs into one named educational profile per
student per assessment window, with rule-written Arabic narratives per audience.
- Read a student’s profile. GET /api/profile/{studentId}?window= returns the active OR pending assignment: primary code, drivers, modifiers, confidence, the rule-version pins, and assignmentStatus (so a client can render the pending → confirm flow). When evidence is incomplete it returns a structured DATA_INCOMPLETE guard (200), not a profile.
- Narratives are per-audience and safe. GET /api/profile/{studentId}/narrative?audience=teacher|parent|admin returns the pre-rendered, language-safety-filtered text. audience=student is rejected (422): a student never receives a profile code or any numeric score; the parent narrative carries no internal IDs.
- Teacher decides (V-12). A fresh resolution starts pending_teacher_review; POST /api/profile/{studentId}/review-resolution with { outcome: confirm } promotes it to active. Exactly one active profile per window; every change is a new append-only row.
- Fail-closed. If any contributing skill is do_not_decide_yet or a required domain lacks coverage, no profile is written, and no role can override.
- Admin. POST /api/admin/profile-catalog/publish runs the catalog-completeness gate (every primary has its CDP/PGA/SPOT coverage); POST /api/admin/profile-replay is a V-6 dry-run diff (zero writes).
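The pending-to-active promotion over an append-only history can be sketched as follows. The row type and the "latest row wins" reading of the history are assumptions for illustration; the entry only states that a resolution starts as pending_teacher_review, that a confirm promotes it to active, and that every change is a new row.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

PENDING = "pending_teacher_review"
ACTIVE = "active"


@dataclass(frozen=True)
class ProfileRow:
    """Hypothetical shape of one append-only profile-assignment row."""
    student_id: str
    window: str
    primary_code: str
    status: str


def latest_row(log: List[ProfileRow], student_id: str, window: str) -> Optional[ProfileRow]:
    """The most recently appended row for this student and window, if any."""
    for row in reversed(log):
        if row.student_id == student_id and row.window == window:
            return row
    return None


def confirm(log: List[ProfileRow], student_id: str, window: str) -> List[ProfileRow]:
    """Append an active row for a pending resolution; earlier rows are never edited."""
    row = latest_row(log, student_id, window)
    if row is None or row.status != PENDING:
        raise ValueError("nothing pending to confirm for this student and window")
    return log + [replace(row, status=ACTIVE)]
```

Frozen rows and a returned new list make it hard to rewrite history by accident, which is the point of the append-only rule.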
Skills taxonomy: VOCAB-C06 added (79 skills) — 2026-06-19
Partner confirmation of the sub-skills ID mapping. The skill catalogue grows from 78 to 79:
GET /api/sub-skills now returns VOCAB-C06 Synonym Awareness (فهم المرادفات) in the VOCAB domain.
No new operations (count unchanged at 52).
- New skill VOCAB-C06 Synonym Awareness. The VOC_REL concept now covers both antonyms (VOCAB-C05) and synonyms (VOCAB-C06).
- Sub-flag naming locked. The writing/spelling sub-flag is WR_SP_ALERT (was WRITING_SPELLING_ALERT); AR_LETTER_CONFUSION_FLAG is scoped to visual/orthographic confusion only.
- Arabic-feature scope (current phase). Madd / short-vowel signals come only from written/visual items (never oral-reading error type); hamza / shadda / tanween open no bundle or alert this phase.
Skill-Status Engine — 2026-06-19
The decision engine ships. Seven new operations under /api/sse (the count moves to 52): five
read-only consumer endpoints and two admin endpoints.
- Read a student’s interpretive status. GET /api/sse/students/{id}/skill-status and /domain-status return the latest per-(subSkill|domain, evidenceWindow) snapshot, each carrying the matched rule ids + versions (for replay) and the doNotDecideYet sentinel. /pattern-classifications and /behavior-log back the “why this status” explainer + the teacher review surface. All tenant-scoped, read-only, VIEW_STUDENT_PROGRESS; no status value is ever exposed to a student.
- do_not_decide_yet is a hard stop. When evidence is thin, split, or low-quality the engine refuses to commit and emits the sentinel; consumers must short-circuit (no profile, no bundle, no alert, no mastery event), and no role can override it.
- Status is computed server-side, never on demand. There is no client “compute now” endpoint; status recomputes when new evidence arrives. Reads return the stored snapshot; consumers never re-derive.
- Admin (rule publishing). POST /api/sse/admin/rules/{layer}/publish (schema + Arabic-accuracy carve-out + overlapping-row validation + dry-run diff, partner-approval gated) and POST /api/sse/admin/dry-run-replay. Publishing a new rule version never retroactively changes existing snapshots; they stay pinned to their compute-time version.
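On the consumer side, the short-circuit on the sentinel might be sketched like this. The doNotDecideYet field name comes from this changelog; the status field name and the choice to treat a missing flag as "do not decide" are assumptions.

```python
from typing import Any, Dict, Optional


def actionable_status(snapshot: Dict[str, Any]) -> Optional[str]:
    """Return the stored status, or None when the snapshot must not be acted on."""
    # Fail-closed assumption: a snapshot without the flag is treated as undecided.
    if snapshot.get("doNotDecideYet", True):
        return None
    # The status is read as stored; nothing is re-derived from evidence here.
    return snapshot.get("status")
```

A caller that receives None would skip every downstream step (profile, bundle, alert, mastery event) rather than fall back to a default status.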
Request correlation id — 2026-06-19
Every response now carries an X-Request-Id correlation id, and the server accepts one inbound. No
new operations; the count is unchanged (45).
- X-Request-Id is now accepted inbound. If you send an X-Request-Id header matching ^[A-Za-z0-9_-]{8,128}$, the server honors it verbatim on the response; otherwise it generates one (req_<hex>). Send your own to correlate a client trace with the server’s logs for that request.
- X-Request-Id is CORS-exposed. Browser clients on an allowed origin can now read the header off the response (Access-Control-Expose-Headers: X-Request-Id); log it client-side and quote it in a bug report.
- Internal (no contract change): the id auto-binds to every server log line and is persisted on the activity/audit rows a request produces (auth events, super-admin bypass, item-import + δ_prior logs), so support can pivot a reported id → server logs → DB changes for one request.
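The honor-or-generate rule can be sketched in a few lines. The pattern is the one quoted in this entry; the function name and the length of the generated hex suffix are assumptions, since the entry only gives the req_<hex> form.

```python
import re
import secrets
from typing import Optional

# Pattern quoted in the changelog entry.
REQUEST_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]{8,128}$")


def resolve_request_id(inbound: Optional[str]) -> str:
    """Echo a well-formed inbound id verbatim; otherwise mint a fresh one."""
    # fullmatch also rejects a value with a trailing newline, which "$" alone would allow.
    if inbound is not None and REQUEST_ID_PATTERN.fullmatch(inbound):
        return inbound
    # Assumed length: 16 random bytes rendered as 32 hex characters.
    return "req_" + secrets.token_hex(16)
```

A client can apply the same pattern before sending, so that an id it plans to quote in a bug report is not silently replaced by a server-generated one.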
Content — 2026-06-19
Item-bank content + schema additions ahead of pilot. No new published operations; the count is unchanged (45) — the item-bank/import surfaces remain pre-publication.
- Seed content corpus. A validated offline-authored item bank now exists: 645 items + 2 reading passages across all 7 digital domains, every item-eligible skill at 8–12 items, all passing the import gate. Items are source='offline_generated', status='draft' (never served until SME approval). Difficulty priors are fully real (SAMER v2 lexicon + CAMeL).
- Schema additions (item bank). A new optional image_url import field + imageUrl column for picture items, and the distractor_type controlled vocabulary expanded 13 → 19 (added agreement_error, wrong_part_of_speech, wrong_function_word, in_text_wrong_detail, wrong_syllable_count, wrong_sound_position) with the import compatibility matrix extended to match. No change to any published route, request shape, or response shape.
Maintenance — 2026-06-18
Routine-driven correctness pass — closed a batch of code-review and spec-drift findings. No new operations; the count is unchanged (45).
- Class deletion is now a soft-delete. DELETE /api/classes/:id is unchanged for callers (same request + same { ok: true } response), but the class and its teacher/student assignments are now preserved for audit rather than removed. A deleted class no longer appears in GET /api/classes, GET /api/classes/:id (404), the organization rollup, or roster/assignment lookups. Deleting a class with active students is still rejected (409). → See Core Concepts.
- Internal hardening (no API-contract change): tighter manager/teacher scope enforcement on class, student, and principal access; rate-limit lockout-escalation correctness; item-import integrity (atomic write, per-row validation, global-item write protection); and log-field redaction. Error codes, request shapes, and response shapes are unchanged.
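The soft-delete behavior can be modeled with an in-memory store, as in the sketch below. The row fields (deletedAt, activeStudents) and the 404 on deleting an already-deleted class are assumptions; the entry itself only fixes the 409 for active students, the 404 on reads, and the fact that rows are preserved.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

ClassStore = Dict[str, Dict[str, Any]]


def delete_class(classes: ClassStore, class_id: str) -> int:
    """Mark a class deleted instead of removing it; returns an HTTP-style status code."""
    row = classes.get(class_id)
    if row is None or row["deletedAt"] is not None:
        return 404                      # assumed: a deleted class looks absent to callers
    if row["activeStudents"] > 0:
        return 409                      # still rejected while students are active
    row["deletedAt"] = datetime.now(timezone.utc)   # the row itself stays for audit
    return 200


def get_class(classes: ClassStore, class_id: str) -> Optional[Dict[str, Any]]:
    """Read path: a soft-deleted class is reported as missing (None stands in for 404)."""
    row = classes.get(class_id)
    return row if row is not None and row["deletedAt"] is None else None


def list_classes(classes: ClassStore) -> List[Dict[str, Any]]:
    """List path: soft-deleted classes are filtered out."""
    return [row for row in classes.values() if row["deletedAt"] is None]
```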
Sub-skills taxonomy — 2026-06-18
The Arabic-literacy skills taxonomy ships as global reference data — 4 read operations under the
new sub-skills tag, raising the published total to 45 operations.
- GET /api/sub-skills: paginated list of all 78 sub-skills with domain, grade range, assessment type, and gateway/key-skill flags (raised to 79 by the VOCAB-C06 entry above).
- GET /api/sub-skills/{id}: single sub-skill row by CUID.
- GET /api/domains: all 9 Arabic-literacy domains (Phonological Awareness, Decoding, Fluency, Comprehension, Vocabulary, and more).
- GET /api/skill-dependencies: prerequisite links between skills.
All four endpoints require authentication and tenant context; the VIEW_OWN_PROFILE permission is
held by every persona (Teacher, Admin, Principal, Manager, Parent, Student), so the taxonomy is
available to all authenticated users.
Item bank foundation — 2026-06-18
The item-bank API surface is available — 12 operations under the items tag. These are
pre-publication routes gated to RUN_BACKOFFICE (backoffice/SME-only); they do not affect the
published-operation count of 45.
- Read: GET /api/items, GET /api/items/{id} (?view=authoring for the full authoring row), GET /api/item-types, GET /api/item-types/{code}/distractor-compat.
- Import pipeline: POST /api/items/import, GET /api/items/import, GET /api/items/import/{batchId}, POST /api/items/import/{batchId}/approve.
- Lifecycle: POST /api/items/{id}/sme-approve, POST /api/items/{id}/promote, POST /api/items/{id}/retire.
- Difficulty priors: GET /api/expected-difficulty/log (deterministic δ_prior audit trail).
1.0.0-phase0
Initial portal. Documents WP-A1 authentication + WP-A2 tenancy + WP-A3 user management — 41
operations across the tags auth, tenancy, users, and system.
- auth: register (bootstrap-only), login, refresh (rotating), me, logout, password reset + confirm, student quickCode login. → See Authentication, Getting Started, and Errors.
- tenancy: organizations, schools, classes (CRUD), organization rollup. → See Core Concepts and Manage classes & rosters.
- users: provisioning (discriminated POST /api/users), persona profiles, class rosters, ClassTeacher assignment, principals, managers, parent ↔ child links, student quickCode regen. → See Provision users, Managers & principals, and Parent ↔ child links.
- system: health check. → See the API Reference.