Changelog

Hand-maintained for Wave 1, reverse-chronological. A changelog entry is added in the same pull request as any spec-affecting backend change.

Decision-loop guides — fidelity, decision packets, data room — 2026-06-20

No API change. Three new developer guides under Guides, grounded in the shipped module code, give the RTI decision wave its API-tier coverage: Intervention Fidelity Tracker (the escalation gate: rate, rating, verdict, and escalationBlocked), Decision Packets (the assemble, approve, and reject workflow that calls the RTI engine to move the tier), and Student Data Room (the dossier + evidence index + packet shell). Each documents its real routes, permissions, and field shapes, with the fail-closed, append-only, version-pinned, controlled-reason, and never-student-facing invariants. English + Arabic parity.

Platform Guide — external documentation portal — 2026-06-20

No API change. A second documentation flavor for a non-developer audience, served under the /external path prefix. The navbar carries two tabs, Developer Docs and Platform Guide, and each portal shows only its own menu (the sidebar switches per portal). Developer-doc URLs are unchanged.

27 deep pages in English and Arabic, grounded in the actual module code, organized into six sections: Measurement & Decisions, Intervention & Teaching, Monitoring & Response, Reports & Dashboards, Platform Foundations, plus Overview, Trust & Privacy, Roadmap, and an FAQ. Each page explains a real capability end to end with its true rules and thresholds, in plain product language (no endpoints, schemas, or internal identifiers).
Growth language only: no clinical or program labels, no single overall score, and an honest split between what is available now and what is planned.
Both portals share one app, theme, and search; full EN/AR parity (60/60) holds.

Portal — Phase 1 comprehensive guides — 2026-06-20

No API change. A documentation-only pass that brings the portal current with the ~21 RTI-engine work packages shipped since the portal was first built. The provisioning docs previously stood alone; the measurement-led RTI loop now has full conceptual coverage.

New “How Phase 1 Works” concept section — The RTI Loop, Measurement & Determinism, The Decision Engine, Intervention Design, and Safety Rules.
16 new subsystem guides under Guides — diagnostic & practice, skills taxonomy, skill-status engine, student profiles, intervention bundles, smart clustering, teacher activation, progress monitoring, CBM & ORF fluency, RTI decisions (preview), standards & benchmarks, task delivery, reporting & parent PDF, cognitive warm-up, language safety, and system QA checks.
New Glossary and three stale-page fixes (Overview, Errors decision-engine “safe-stop” outcomes, Managers & Principals rollup now populated).
Every page ships in English + Arabic (33/33 parity); all internal links resolve.

Parent Web Portal — 2026-06-20

Four new read-only operations under a new parent-portal tag (the count moves to 171) — an authenticated parent’s view of their own children.

GET /api/parent/children (linked children), …/:childId/diagnostics (status-only history), …/:childId/growth (macro-domain status trajectory, Arabic labels only), …/:childId/recommendations (≤5 at-home suggestions from the D2 pre-generated narrative — never an LLM at runtime).
Own-child only. The parent is the authenticated user (no :parentId in any path → no IDOR); each :childId is checked against the ParentStudent link — an unlinked in-org child → 403, a cross-org/non-existent child → 404. Multi-child parents see all their linked children.
No-leak by construction (two belts). A typed allow-list DTO + the canonical parent-stricter string filter — no θ, no numbers, no percentages, no internal labels ever reach a parent (a child with a real θ/profile/percentage in their data shows none of it). Fail-closed (503) if the filter rules are missing.
No WhatsApp digest, no phone storage (D1) — every route is GET, read-only; the portal stores nothing about the parent.
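
The own-child rule above can be sketched as a small status resolver. This is an illustration only: the function name, the argument shapes, and the in-memory dictionaries are assumptions standing in for the real ParentStudent lookup, not the shipped module code.

```python
from typing import Dict, Set


def child_access_status(
    parent_org_id: str,
    linked_child_ids: Set[str],
    child_org_by_id: Dict[str, str],
    child_id: str,
) -> int:
    """Return the HTTP status a child-scoped parent route would answer with.

    All names here are hypothetical; only the 200/403/404 split comes from
    the changelog entry.
    """
    child_org = child_org_by_id.get(child_id)
    # A non-existent child and a cross-org child look the same to the caller.
    if child_org is None or child_org != parent_org_id:
        return 404
    # The child exists in the parent's org but has no ParentStudent link.
    if child_id not in linked_child_ids:
        return 403
    return 200


children = {"c1": "org-a", "c2": "org-a", "c9": "org-b"}
linked = {"c1"}
for cid in ("c1", "c2", "c9", "missing"):
    print(cid, child_access_status("org-a", linked, children, cid))
```

Because the parent identity comes from the authenticated session rather than a path parameter, the resolver never needs a parent id from the request.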

Multi-School Admin Overview — 2026-06-20

Three new operations under a new multi-school tag (the count moves to 167) — an org-level rollup for a network manager, built on top of the school-level board view.

GET /api/orgs/:orgId/overview — a rollup card per school (name + total students + class count + how many need support). GET /api/orgs/:orgId/comparisons — the 6 reading-domain tiles × N schools, side-by-side, as status counts. GET /api/orgs/:orgId/trends — an org time-series, one line per school × tile, over month windows.
The comparison is a neutral status rollup, not a ranking — no best/worst ordering, no school leaderboard. Cross-Cutting shows the “coming soon” placeholder.
Aggregate only, by construction. It shows class names but never a student name or id in any of the three responses — the same lint guard (extended to this module) + tests that seed a recognizable name and assert its absence. No θ, no single overall percentage (V-3); Not_Assessed excluded from rollups.
A manager sees all their org’s schools by default (no partial-scoping UI in Wave 1); a cross-org id returns 404. Manager / Admin only — never teacher- or student-facing.
Reuses the shipped school-aggregate engine + readers (no new tables, no measurement logic). Batched reads meet the ≤5 s budget for a 10-school org.

Parent Narrative Coverage (AI-generated, provisional) — 2026-06-20

One new admin operation (the count moves to 164) plus a content + data change so the parent report and portal show a written growth explanation for every profile, not just the two partner-authored ones.

56 Arabic parent narratives (the 12 primary + 5 modifier profiles × grade bands) are generated from the profile specs and seeded as a provisional layer. Each is stamped ai_generated + needs_sme_review and passes the parent-stricter filter with zero changes — no numbers, no θ, no internal labels, growth language only (the same V-12/V-3 discipline as everything parent-facing).
They ship live, flagged (Mohammad’s call): a parent sees the explanation now; the partner reviews and replaces each at the June-25 working session. Only the parent render is overridden — the teacher, admin, and student narratives stay partner-authored. The safe “not enough data yet” page remains for genuine no-profile cases, not for “narrative not written yet.”
GET /api/admin/profile-narratives/pending-review lists the provisional rows for the SME review queue (RUN_BACKOFFICE only).

School Overview + Board PDF — 2026-06-20

Two new operations under a new board-pdf tag (the count moves to 163) — a school-level view for principals and network managers.

School overview JSON + a one-page A4 Arabic-RTL board PDF for a board/management meeting: per-class macro-status counts + the school-wide distribution across the 6 reading domains (+ the “coming soon” Cross-Cutting placeholder).
Aggregate only, by construction. It shows class names (so a manager can act) but never a student name or id — a dedicated lint guards every read path, and a test seeds a recognizable name and asserts it’s absent from both the PDF bytes and the JSON. No θ, no single overall percentage (V-3).
Reuses the shipped reporting readers + the shared lib/pdf primitive (embedded Arabic font, V-6 byte-determinism) that the parent PDF now also builds on — one PDF recipe, two layouts.
Manager / Principal / Admin only (a principal sees their own school; a manager sees their org’s schools); never teacher- or student-facing. ≤500 KB, ≤30 s, fail-closed.

v5 vocabulary alignment — 2026-06-20

No behavior change. The RTI decision layer now stores the partner’s canonical v5 status-code names directly, removing an interim translation layer. Data-sufficiency reads sufficient / insufficient / contextual; tier decisions escalate / de_escalate / maintain / defer; RTI alerts the 5-kind catalog Skill_Alert / Student_Alert / Intensive_Review / acute_regression_review / tier_review_due; the approval role set adds specialist; the decision packet reads draft / under_review / approved / rejected / deferred. Only the spelling of the stored tokens changed — decision logic, guardrails, and determinism (V-6) are untouched, and the OpenAPI schemas reflect the v5 names.
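
For consumers that want to validate the stored tokens, the v5 names listed above can be modelled as closed enumerations. The class and function names below are assumptions for illustration; only the token spellings are taken from this entry.

```python
from enum import Enum


class DataSufficiency(str, Enum):
    SUFFICIENT = "sufficient"
    INSUFFICIENT = "insufficient"
    CONTEXTUAL = "contextual"


class TierDecision(str, Enum):
    ESCALATE = "escalate"
    DE_ESCALATE = "de_escalate"
    MAINTAIN = "maintain"
    DEFER = "defer"


class RtiAlertKind(str, Enum):
    SKILL_ALERT = "Skill_Alert"
    STUDENT_ALERT = "Student_Alert"
    INTENSIVE_REVIEW = "Intensive_Review"
    ACUTE_REGRESSION_REVIEW = "acute_regression_review"
    TIER_REVIEW_DUE = "tier_review_due"


class PacketStatus(str, Enum):
    DRAFT = "draft"
    UNDER_REVIEW = "under_review"
    APPROVED = "approved"
    REJECTED = "rejected"
    DEFERRED = "deferred"


def parse_tier_decision(token: str) -> TierDecision:
    """Reject any spelling that is not a v5 token (raises ValueError)."""
    return TierDecision(token)


print(parse_tier_decision("de_escalate"))
```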

Decision Packet Workflow — 2026-06-20

Four new operations under the new decision-packet tag (the count moves to 161). The closing piece of the RTI decision wave: the review-and-approval workflow that turns a tier-movement recommendation into an audited decision — and only then moves the child’s tier.

Assemble. POST /api/decision-packet-workflows/assemble opens a rich decision packet (evidence snapshotted by reference, every rule/config version pinned for replay), computes the deterministic tier recommendation, and records a workflow row awaiting review.
Review + approve. GET /api/decision-packet-workflows/{id} is the reviewer’s surface; POST …/{id}/approve routes the recommendation through the approval gate — a Tier-3 move requires the high-intensity specialist (admin-seated for Wave 1) — and on approval executes the tier movement through the RTI engine’s single tier-write path, then closes the packet. POST …/{id}/reject records a controlled-reason decline; no movement.
No movement without approval; the decision reason is a controlled code, never free text. A sentinel “need more evidence” halt, an unapproved Tier-3 move, or an incomplete dossier all block the movement (fail-closed). A closed packet + an applied decision are immutable forever (replay reproduces them).
This workflow adds NO tier write of its own — it calls the RTI engine to move the tier and the Data Room to open/close the packet. Append-only, tenant-isolated, version-pinned (V-6), and never shown to a student.
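
A minimal sketch of the fail-closed approval gate described above, assuming a simplified packet view. The dataclass, its field names, and the blocker codes are hypothetical; the three blocking conditions and the specialist requirement for Tier 3 are the ones this entry lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PacketView:
    # Hypothetical reviewer-side summary of a decision packet.
    target_tier: int
    needs_more_evidence: bool  # the sentinel halt
    dossier_complete: bool
    approver_role: str  # e.g. "teacher" or "specialist"


def approval_blockers(packet: PacketView) -> List[str]:
    """Collect every reason the tier movement must not run."""
    blockers = []
    if packet.needs_more_evidence:
        blockers.append("need_more_evidence")
    if packet.target_tier == 3 and packet.approver_role != "specialist":
        blockers.append("tier3_requires_specialist")
    if not packet.dossier_complete:
        blockers.append("dossier_incomplete")
    return blockers


def may_move_tier(packet: PacketView) -> bool:
    # Fail closed: any single blocker stops the movement.
    return not approval_blockers(packet)


print(approval_blockers(PacketView(3, False, True, "teacher")))
```

Returning the full list of blockers rather than the first one found is a design choice of this sketch; it lets a reviewer surface show everything that still needs attention.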

Intervention Fidelity Tracker — 2026-06-20

Five new operations under the new fidelity tag (the count moves to 152). The gate that distinguishes “the plan didn’t work” from “the plan wasn’t done,” so the platform never escalates a child whose intervention was never actually run.

The escalation gate. GET /api/fidelity/{bundleAssignmentId}/check returns the synchronous verdict the RTI engine calls before any Tier 2 → Tier 3 move: the fidelity rate (sessions done ÷ planned), a 5-value rating, a 4-state verdict, and escalationBlocked. Faithful delivery (≥80%) with weak growth MAY escalate; under-80% with weak growth is BLOCKED; no session data is “cannot evaluate yet” — never a guessed pass.
Recompute + tracker + cohort. POST …/recompute appends a fresh immutable snapshot (advisory-locked, idempotent); GET …/{bundleAssignmentId} reads the latest; GET /api/students/{id}/fidelity-tracker feeds the teacher’s Fidelity tab; GET /api/fidelity/admin/low-fidelity-cohort lists the at-risk plans (admin).
A growth-language paragraph. Each verdict auto-drafts a short Arabic review-meeting paragraph (rule-written, never AI), filtered for clinical/framework/deficit language.
Append-only + version-pinned (V-6); teacher-scoped (B.4); tenant-isolated; never shown to a student. The persisted verdict is what the RTI fidelity gate now reads — no guessed up-move can slip past a plan that wasn’t delivered.
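
The escalation gate can be illustrated with a short function. This is a sketch under stated assumptions: the verdict strings are hypothetical placeholders for the real 4-state verdict, the 5-value rating is omitted, and treating "no session data" and "growth is not weak" the way the code does is an assumption. Only the rate formula, the 80% threshold, and the three described outcomes come from this entry.

```python
from dataclasses import dataclass
from typing import Optional

MIN_FAITHFUL_RATE = 0.80  # "faithful delivery" threshold from the entry


@dataclass(frozen=True)
class FidelityCheck:
    rate: Optional[float]
    verdict: str  # hypothetical labels, not the shipped vocabulary
    escalationBlocked: bool


def check_fidelity(
    sessions_done: Optional[int], sessions_planned: int, weak_growth: bool
) -> FidelityCheck:
    # No session data: never a guessed pass (assumed here to block).
    if sessions_done is None or sessions_planned <= 0:
        return FidelityCheck(None, "cannot_evaluate_yet", True)
    rate = sessions_done / sessions_planned
    if not weak_growth:
        # Assumption: with adequate growth there is nothing to escalate.
        return FidelityCheck(rate, "no_escalation_needed", False)
    if rate >= MIN_FAITHFUL_RATE:
        return FidelityCheck(rate, "may_escalate", False)
    return FidelityCheck(rate, "blocked_low_fidelity", True)


print(check_fidelity(7, 10, weak_growth=True))
```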

Student Data Room — 2026-06-20

Five new operations under the new data-room tag. The Data Room is the decision-documentation layer: it assembles a child’s evidence into a reviewable dossier so that big decisions are documented, evidence-based, and replayable years later. It never decides — the rules still decide; the Data Room makes them show their work.

One dossier per child. GET …/data-room returns the eight summary rollups (benchmark, CBM trend, practice, paper checks, maintenance, error patterns, fidelity, behavior flags) + a data-sufficiency status. The summaries are system-rendered rollups of referenced evidence — never free text, never an input to a rule.
Evidence is indexed, never copied. GET …/data-room/evidence pages the evidence index; every row points back to its real source record. Invalid-administration evidence stays indexed (nothing is deleted) but is excluded from the sufficiency count.
Decision-packet shell. POST …/decision-packets opens a packet, snapshots the evidence by reference, and pins the rule versions it read — so the packet replays byte-for-byte years later. A second open of the same review type returns 409; a closed packet is immutable forever.
Weak data blocks a strong decision. When a skill is do_not_decide_yet, or the evidence is below the comparable floor, the dossier reports insufficient and recommends the next data action — never a movement.
No tier movement here. The Data Room reads and composes the tier-decision record; it never writes one. The only tier write-path stays in the RTI engine.
Students never see any of this — there is no student endpoint; teacher (their own classes) / admin only.
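
The sufficiency rule can be sketched as follows. The row shape and the floor value passed in are assumptions (the changelog does not state the comparable floor); the behavior shown is that invalid-administration rows stay in the index but do not count, and that do_not_decide_yet or a below-floor count yields insufficient.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EvidenceRow:
    # Hypothetical index row; it points at a source record, it never copies it.
    source_id: str
    valid_administration: bool


def data_sufficiency(
    rows: Iterable[EvidenceRow], comparable_floor: int, do_not_decide_yet: bool
) -> str:
    # Invalid administrations remain indexed but are excluded from the count.
    countable = sum(1 for row in rows if row.valid_administration)
    if do_not_decide_yet or countable < comparable_floor:
        return "insufficient"
    return "sufficient"


index = [
    EvidenceRow("probe-1", True),
    EvidenceRow("probe-2", False),  # kept in the index, not counted
    EvidenceRow("probe-3", True),
]
print(data_sufficiency(index, comparable_floor=3, do_not_decide_yet=False))
```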

RTI Decision Engine — 2026-06-20

Ten new operations under the new rti tag. This is the central decision layer: it raises the right hand on the teacher’s dashboard, and it is the single gate every RTI Tier change goes through.

Two alert layers. Layer A is the per-skill state (on_track / monitoring / maintenance, math-driven from the skill-status engine); Layer B is the per-student alert — Skill_Alert (a single-skill nudge), Student_Alert (this student needs a Targeted Support Plan), Intensive_Review (the deepest review). At most one alert is open per student; a more-severe alert auto-closes the lower one.
The 4 tier-movement rules. Every tier change runs TM_ESC_01 (escalate) / TM_MAINT_01 (maintain) / TM_DEC_01 (de-escalate) / TM_BLOCK_01 (block on low fidelity). The engine recommends; the teacher approves; the specialist (admin-seated Wave-1) gates TIER_3. GET …/recommendation returns the deterministic verdict; POST …/decision commits it through the single audited write path.
Fidelity before escalation. TM_BLOCK_01 runs before TM_ESC_01: “the plan didn’t work” and “the plan wasn’t done” are different verdicts. Until the fidelity tracker ships, escalation fails closed (blocked) — never a guessed up-move.
do_not_decide_yet blocks everything. When the evidence isn’t sufficient, no alert opens and no tier moves — for any role, no override. The teacher sees a quiet “need more evidence” hint, never a red alarm.
No direct tier set, ever. A tier changes only through the gated decision path that writes an append-only decision record in the same transaction. Dismissals use a controlled reason code — never free text. Students never see any of this.
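
The "at most one open alert per student" rule for Layer B might look like the sketch below. The severity ordering follows the order in which the three kinds are described; the function name, the dictionary store, and the choice to ignore a less severe alert while a more severe one is open are assumptions.

```python
from typing import Dict

# Assumed ordering: Skill_Alert < Student_Alert < Intensive_Review.
SEVERITY = {"Skill_Alert": 1, "Student_Alert": 2, "Intensive_Review": 3}


def raise_alert(
    open_alerts: Dict[str, str], student_id: str, kind: str, do_not_decide_yet: bool
) -> Dict[str, str]:
    """Return the new student -> open-alert map without mutating the input."""
    updated = dict(open_alerts)
    # Insufficient evidence: no alert opens, for any role.
    if do_not_decide_yet:
        return updated
    current = updated.get(student_id)
    # A more severe alert replaces (auto-closes) the lower one.
    if current is None or SEVERITY[kind] > SEVERITY[current]:
        updated[student_id] = kind
    return updated


alerts = raise_alert({}, "s1", "Skill_Alert", do_not_decide_yet=False)
alerts = raise_alert(alerts, "s1", "Intensive_Review", do_not_decide_yet=False)
print(alerts)
```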

CBM / ORF Progress-Monitoring Engine — 2026-06-20

Ten new operations under two new tags — orf + cbm (the count moves to 129). The platform now measures reading fluency and watches it improve over time.

Teacher-scored ORF, no AI. POST /api/orf/assessments/{start, :id/mark, :id/stop} records a 60-second oral read by teacher error-taps (no speech recognition, no LLM; audio archive-only) → words-correct-per-minute + accuracy → an append-only cbm_scores row.
Aim line + 4-point trend. POST /api/cbm/probes writes a probe; the aim line activates after three baseline probes (median baseline), and the last four comparable points yield a typed trend (trend_above_goal / trend_below_goal / trend_mixed / insufficient_points). GET /api/cbm/trend/... returns the series + aim line; GET /api/cbm/alerts the open alerts.
Emit-only (V-12). The engine emits the signal + a raw alert; it does not apply the fidelity gate or write the teacher message (those are WP-RTI-BE + the fidelity tracker, which read the alert behind an ≥80% gate). Acute regression raises the signal, never auto-fails. Dismissal uses a controlled reason code — never free text.
Missing norms are honest. Grades with no benchmark norm yet (G1/G4 partner-owed; G2–3 seeded) score and trend on raw WCPM but report “not assessed” rather than judging against an invented cut.
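
The arithmetic behind the ORF score and the 4-point trend can be sketched in a few lines. The function names are hypothetical, and the classification rule (all four points above the aim line, all four below, otherwise mixed) is the conventional CBM 4-point rule, assumed here rather than confirmed by this entry. The median baseline over three probes and the four trend tokens are from the text above.

```python
from statistics import median
from typing import Optional, Sequence, Tuple


def score_orf(words_attempted: int, errors: int, seconds: float = 60.0) -> Tuple[float, float]:
    """Return (words correct per minute, accuracy) for one timed read."""
    correct = words_attempted - errors
    wcpm = correct * 60.0 / seconds
    accuracy = correct / words_attempted if words_attempted else 0.0
    return wcpm, accuracy


def baseline(probes: Sequence[float]) -> Optional[float]:
    # The aim line only activates once three baseline probes exist.
    if len(probes) < 3:
        return None
    return median(probes[:3])


def four_point_trend(scores: Sequence[float], aim: Sequence[float]) -> str:
    pairs = list(zip(scores, aim))[-4:]  # last four comparable points
    if len(pairs) < 4:
        return "insufficient_points"
    if all(s > a for s, a in pairs):
        return "trend_above_goal"
    if all(s < a for s, a in pairs):
        return "trend_below_goal"
    return "trend_mixed"


print(score_orf(52, 4)[0], baseline([40, 44, 42]))
print(four_point_trend([50, 52, 54, 56], [45, 46, 47, 48]))
```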

Cognitive Warm-Up Engine — 2026-06-20

Seven new operations under the new warmup tag; the count moves to 119. The backend for the 3–5 minute, rule-based, deterministic, non-linguistic warm-up a child plays before a session (shapes, colors, paths — never words, letters, or numbers).

Deterministic, seeded generation. 10 Tier-1 activity schemas + 24 per-grade configs through one central generator; the same (template, grade, difficulty, seed) always yields the identical exercise (V-6 replayable, validated by POST /api/admin/warmup/dry-run-replay).
Resumable + capped. POST /api/warmup/sessions → …/events → …/complete, with …/resume restoring the child at the exact exercise step. A silent 5-minute cap; a skip is penalty-free (no retry, no warmup_abandoned flag).
Isolated from measurement — by construction. The warm-up never touches θ, never emits a mastery event, never mints XP, and has no foreign key into any measurement/decision table. A CI import-graph guard + a schema-FK test prove it; telemetry is a readiness signal stored in the warm-up’s own tables only.
Teacher/admin-gated, ClassTeacher-scoped; no student endpoint (the exercise is handed to the renderer).
Partner-owed (June-25): COG-05 (Matching) + COG-06 (Visual Tracking) per-grade configs ship inactive; per-grade bounds + the SVG asset library are transcribed working defaults.
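
Seeded, replayable generation of this kind is commonly built by deriving a random stream from the full input tuple. The sketch below assumes a hash-derived seed, an invented shape/color vocabulary, and a made-up length rule; it is not the central generator, only an example of how the same (template, grade, difficulty, seed) can always yield the same exercise.

```python
import hashlib
import random
from typing import List, Tuple

# Illustrative, non-linguistic vocabulary (assumed, not the real asset library).
SHAPES = ("circle", "square", "triangle", "star")
COLORS = ("red", "blue", "green", "yellow")


def generate_exercise(
    template: str, grade: int, difficulty: int, seed: int
) -> List[Tuple[str, str]]:
    # Derive one stable integer seed from every input that defines the exercise.
    key = f"{template}|{grade}|{difficulty}|{seed}".encode("utf-8")
    derived = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = random.Random(derived)  # private stream; no global random state
    length = 3 + difficulty  # hypothetical length rule
    return [(rng.choice(SHAPES), rng.choice(COLORS)) for _ in range(length)]


first = generate_exercise("COG-01", grade=2, difficulty=1, seed=42)
replay = generate_exercise("COG-01", grade=2, difficulty=1, seed=42)
print(first == replay)
```

Using a private `random.Random` instance keeps the exercise independent of any other code that touches the global random state, which is what a dry-run replay needs.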

Student Diagnostic + Practice APIs — 2026-06-20

Ten new operations under two new tags — diagnostic-session (6) + practice (4); the count moves to 112. This is the student side of the RTI loop: a child sits a diagnostic, then practices adaptively.

Diagnostic (WP-04). POST /api/diagnostic-sessions/start → serve item → …/responses → …/finish, per grade (Grade-1 audio-first, keyed on the student’s own grade — never the class’s). 24-hour save/resume (a signed state token whose trust root is a server-side ownership re-check), a 15-minute active-time cap (pauses + audio replays don’t count). The server grades each answer and chains θ itself — a student cannot fake correctness or steer their own score. Scoring runs through the engine; items through task-delivery.
Practice + adaptive CAT (WP-05). POST /api/practice/queues/seed/:diagnosticSessionId builds a queue from the diagnostic; …/current/next-item picks the most informative item at θ±0.5; …/submit-response scores it. Per-block auto-difficulty (80/60) is on, adjusting a durable per-(student × sub-skill) difficulty band that carries across sessions (a student resumes at the level they reached). SOLO hint on a wrong answer.
Student-safe by construction. Every student-facing response is allow-list serialized — no θ / status-label / profile / bundle / scaffold-tier / accuracy ever reaches the student. Student-own-scoped (a student can only touch its own session); a teacher reads session summaries for its own class only.
The browser cache + 5-disruption restore is a later front-end change; the backend ships + tests the resume contract. Error-category vocabulary + the exact 80/60 thresholds are partner-owed (June-25), built behind documented working defaults.
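
Two pieces of the practice loop lend themselves to a small sketch: the per-block difficulty adjustment and the item pick inside the θ±0.5 window. Everything concrete below is an assumption: the direction of the 80/60 rule (step up at 80% or better, step down under 60%), the 1 to 5 band range, and "closest difficulty to θ" as a stand-in for "most informative". The thresholds themselves are described above as working defaults.

```python
from typing import Dict, Optional


def next_band(band: int, correct: int, total: int, lo: int = 1, hi: int = 5) -> int:
    """Adjust the durable per-(student x sub-skill) band after one block."""
    if total == 0:
        return band
    accuracy = correct / total
    if accuracy >= 0.80:
        return min(band + 1, hi)
    if accuracy < 0.60:
        return max(band - 1, lo)
    return band


def pick_item(theta: float, difficulty_by_item: Dict[str, float]) -> Optional[str]:
    """Pick the item nearest to theta within +/-0.5, or None if the window is empty."""
    window = {i: d for i, d in difficulty_by_item.items() if abs(d - theta) <= 0.5}
    if not window:
        return None
    # Sorting the ids first makes ties resolve the same way on every run.
    return min(sorted(window), key=lambda i: abs(window[i] - theta))


print(next_band(3, correct=8, total=10))
print(pick_item(0.0, {"a": 0.4, "b": -0.1, "c": 1.2}))
```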

Parent PDF Service — 2026-06-20

One new operation under the new parent-pdf tag (the count moves to 113). A teacher can export a parent-facing growth report as an A4 Arabic right-to-left PDF — read-only composition, no measurement or decision logic, no writes, no new table.

The endpoint. POST /api/parent/pdf/:childId returns the generated PDF directly (A4, Arabic RTL, body font ≥ 12 pt, ≤ 500 KB, well under 30 s). Teacher- or admin-triggered; never student-facing.
Strictest safety in the product. Every string passes the parent-stricter filter + a typed field strip: parents see no numbers, no percentages, no scores, no internal codes — growth language only (V-3 / V-12).
Fail-closed. No resolved report → an explicit “needs more data” page, never a fabricated profile. No active safety rule set → 503 rather than ship unfiltered text.
Arabic that travels. The Arabic font (Noto Naskh, embedded) ships inside the file, so it renders correctly when shared (e.g. via WhatsApp) regardless of the recipient’s device fonts. Deterministic content (V-6); no LLM (V-5).

Sub-Skill Report APIs — 2026-06-20

Three new operations under the new reporting tag (the count moves to 112). The read-side surface that turns persisted engine outputs into teacher/admin-facing reports — read-only composition, no measurement or decision logic, no writes.

Per-student report. GET /api/reports/students/:id/sub-skill-report composes five sections in one payload: the educational profile (pre-written teacher narrative), the macro-domain tiles (plus GLOBAL meta), the per-skill statuses sorted “needs intervention” first, the active intervention plan, and any open alerts. A student with no resolved profile returns a dataIncomplete section (a 200, not an error).
Per-class summary. GET /api/reports/classes/:id/class-summary returns per-skill count distributions and the class macro status (worst-case-wins, never an average). A teacher sees per-student rows for their own class; ?audience=manager switches to aggregate-only (no student names or ids ever).
Band reference. GET /api/reports/band-descriptions returns the 5-band Mizan catalog (cached 24h); Arabic labels are working defaults pending partner sign-off.
Safety. No global percentage anywhere (V-3); every string passes the language-safety filter (V-12); every query is org-scoped (cross-org → 404); no LLM (V-5). Teacher (ClassTeacher-scoped) + admin only.
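
"Worst-case-wins, never an average" can be made concrete with a short rollup. The status vocabulary used here is borrowed from the standards entry (meets / approaching / below / severe / not_assessed) and is assumed, not confirmed, to be what the class summary uses; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable

WORST_TO_BEST = ("severe", "below", "approaching", "meets")


def status_counts(statuses: Iterable[str]) -> Counter:
    # not_assessed is never treated as zero and is left out of the rollup.
    return Counter(s for s in statuses if s != "not_assessed")


def macro_status(statuses: Iterable[str]) -> str:
    counts = status_counts(statuses)
    # The first status found, scanning from worst to best, is the verdict.
    for status in WORST_TO_BEST:
        if counts[status]:
            return status
    return "not_assessed"


skills = ["meets", "meets", "below", "not_assessed"]
print(dict(status_counts(skills)), macro_status(skills))
```

Note that two "meets" and one "below" resolve to "below": a single weak skill is not averaged away.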

Diagnostic Engine (θ-scoring) — 2026-06-20

Four new operations under the new engine tag (the count moves to 96). This is the measurement core: it turns a student’s diagnostic answers into an ability estimate (θ) and picks the next best question.

Score. POST /api/engine/score takes the student’s responses + a prior θ and returns {newTheta, newSE, calibrationVersion, nextItemId, calibrationProvisional}. Every scoring run is recorded as an append-only DiagnosticSession. Reads/writes are org-scoped; gated by the USE_DIAGNOSTIC_ENGINE permission, and a student token can only score its own responses.
Deterministic + replayable (V-6). Same inputs → byte-identical θ to 4 decimal places. Each session freezes the exact item difficulties it used (resolvedDeltas) so POST /api/engine/replay/:sessionId reproduces the original score forever — even after the calibration pipeline later updates those difficulties.
Two-track difficulty. A question’s measured difficulty is trusted once it has ≥300 real answers (and matches the session’s pinned calibration version); otherwise the deterministic estimate is used and the row is flagged provisional.
No LLM, ever (V-5). The whole scoring path is a pure deterministic function — covered by the no-LLM CI lint and an integration spy. Insufficient data → newTheta:null (a 200, the engine’s “don’t decide yet”).
Reads: GET /api/engine/sessions/:id + GET /api/engine/sessions/by-student/:studentId. The Wave-1 score is a deterministic placeholder, not real Rasch psychometrics — the real statistical engine swaps in later (calibration pipeline) with zero contract change.
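
The two-track difficulty rule reduces to one comparison. In this sketch the function and parameter names are assumptions; the ≥300-answer threshold, the calibration-version match, and the provisional flag are taken from the entry.

```python
from typing import Optional, Tuple

MIN_REAL_ANSWERS = 300


def resolve_difficulty(
    measured: Optional[float],
    answer_count: int,
    item_calibration_version: str,
    pinned_calibration_version: str,
    deterministic_estimate: float,
) -> Tuple[float, bool]:
    """Return (difficulty to freeze into the session, provisional flag)."""
    trusted = (
        measured is not None
        and answer_count >= MIN_REAL_ANSWERS
        and item_calibration_version == pinned_calibration_version
    )
    if trusted:
        return measured, False
    # Otherwise fall back to the deterministic estimate and flag the row.
    return deterministic_estimate, True


print(resolve_difficulty(0.35, 412, "cal-3", "cal-3", 0.10))
print(resolve_difficulty(0.35, 120, "cal-3", "cal-3", 0.10))
```

Whatever value is returned is what a session would freeze, which is why a later calibration update cannot change a replayed score.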

System QA Checks Registry — 2026-06-20

Four operations under the new qa tag (the count moves to 92). A 32-check registry records the cross-cutting safety invariants the decision, intervention, and content engines already enforce — this layer registers and routes them; it adds no new check logic.

Observability for admins/engineering only. GET /api/qa/checks returns the 32-row registry; GET /api/qa/runtime-alerts lists the routed runtime alerts for your organization (filter by severity, time window, or qaCheckId); GET /api/admin/qa/audit-summary rolls them up by check + severity. All require VIEW_QA_REGISTRY (or super-admin/backoffice) — never reachable by a teacher or student token.
Deterministic routing. A runtime check failure is routed purely by its registered severity — critical surfaces to the teacher dashboard + pings the alert seam, high pings engineering, medium records the alert, low logs only. Every routed alert pins the registry + rule version for replay.
No new alert sink. Routed alerts append to the existing platform alert log (append-only); the QA read never returns non-QA alert rows.
New — POST /api/admin/qa/checks-set (registry version-publish). Super-admin / backoffice only. Mirrors the bundle and decision-engine config-publish: it takes a candidate check-set with partner approval and an acknowledged diff, runs a completeness gate (all 32 checks present, each command runnable, each Arabic message safe), and returns the prospective new version — rejecting gaps, missing approval, or an unacknowledged diff.
New — QA_032 over-assessment guard. A medium-severity alert (never a block) that flags when a student is measured too often in a short window — the probe is always recorded; the alert just nudges the teacher to teach before testing again. The over-testing threshold and window are partner-tunable (working defaults pending the June partner session).
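
Severity-only routing can be expressed as a lookup table. The target names, the record shape, and the version strings below are hypothetical; the four severities, what each one does, and QA_032 being medium are from this entry.

```python
from typing import Dict, Tuple

# One fixed route per registered severity; nothing else influences routing.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "critical": ("teacher_dashboard", "alert_seam"),
    "high": ("engineering",),
    "medium": ("alert_log",),
    "low": ("log_only",),
}


def route_failure(
    qa_check_id: str,
    severity_by_check: Dict[str, str],
    registry_version: str,
    rule_version: str,
) -> dict:
    severity = severity_by_check[qa_check_id]
    return {
        "qaCheckId": qa_check_id,
        "severity": severity,
        "targets": ROUTES[severity],
        # Pinned so the routing decision can be replayed later.
        "registryVersion": registry_version,
        "ruleVersion": rule_version,
    }


registry = {"QA_032": "medium"}
print(route_failure("QA_032", registry, "registry-v1", "rules-v1"))
```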

Language-Safety Layer — 2026-06-20

Six new operations under the new language-safety tag (the count moves to 102). The deterministic, rule-based rewriting layer that sits in front of every string the platform renders, saves, or prints — it converts clinical/deficit phrasing to growth language, strips international-framework labels, and applies a stricter filter for parent audiences. It is never a language model (V-5) and never changes a measurement value — only how it is described (V-3/V-12).

Validate any string. POST /api/language-rules/validate runs sanitize(text, appliesTo, profileKey?) and returns the safe text + the rules that fired + the rule-set version (V-6 audit replay). Audience tiers: ui / reports / parent / parent_stricter / teacher_note / student / admin. The parent-stricter tier also blocks numeric scores and internal IDs; a per-profile do_not_say overlay applies at render time.
Admin rule management. GET /api/language-rules returns the active rule set (consumed at copy-load by the app + PDF renderer); POST /api/language-rules inserts a new version (append-only — the prior row is deactivated, never edited in place); PATCH /api/language-rules/{id}/deactivate retires a row.
Controlled context flags only. POST /api/students/{id}/context-flag attaches a controlled flag from the closed dictionary — no free-text body field exists; a blocked/forbidden flag returns 422 context_flag_blocked and records a length-only security-audit entry (no text about a child is ever stored). GET /api/students/{id}/context-flags returns the recorded flags + their safe labels (teacher/admin only).
Fail-closed. A render that cannot load an active rule set refuses rather than emit unfiltered text.
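
A rule-based sanitizer of the kind described might be sketched like this. The two example rules, the rule ids, and the response keys are invented for illustration and are not the shipped rule set; the audience-tier names, the "rules that fired plus version" response, the parent_stricter numeric block, and the fail-closed behavior are from the entry. The optional profileKey overlay is omitted.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence

ALL_TIERS = frozenset(
    {"ui", "reports", "parent", "parent_stricter", "teacher_note", "student", "admin"}
)


@dataclass(frozen=True)
class Rule:
    rule_id: str
    pattern: str
    replacement: str
    applies_to: FrozenSet[str]


# Hypothetical rules: one growth-language rewrite, one numeric strip.
RULES = (
    Rule("R1", r"\bdeficit\b", "growth area", ALL_TIERS),
    Rule("R2", r"\s*\d+(?:\.\d+)?%?", "", frozenset({"parent_stricter"})),
)


def sanitize(text: str, applies_to: str, rules: Sequence[Rule], version: str) -> dict:
    if not rules:
        # Fail closed: refuse rather than emit unfiltered text.
        raise RuntimeError("no active rule set")
    fired: List[str] = []
    for rule in rules:
        if applies_to in rule.applies_to:
            text, hits = re.subn(rule.pattern, rule.replacement, text)
            if hits:
                fired.append(rule.rule_id)
    return {"safeText": text, "rulesFired": fired, "ruleSetVersion": version}


print(sanitize("reading deficit 42%", "parent_stricter", RULES, "v1"))
```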

Standards Spine + Dynamic Benchmark Management — 2026-06-20

Eight new operations under the new standards tag (the spec now serves 110 operations). This is how a raw score becomes “on track / approaching / below / severe” — and how those thresholds are managed without touching code.

One verdict vocabulary, never a single score (V-3). Every measure resolves to one of meets / approaching / below / severe / not_assessed. not_assessed is never treated as zero and is excluded from every rollup. There is no “overall reading score” anywhere.
Thresholds live in data, versioned, swappable. GET /api/standards/benchmark-profiles/resolve walks a deterministic 3-step chain (exact → country-default → global fallback) and never returns null — a seeded global row guarantees a resolution. Super-admins manage the rows under /api/admin/benchmark-profiles; an activate is an atomic swap (the old version deactivates, the new one activates, both audited).
History never shifts (V-6). A session pins its benchmark versions at the start; a later swap does not change in-flight verdicts, and every stored result re-derives its original status forever.
Zero framework labels (FR-STD-1). International yardsticks (used only to calibrate the Arabic thresholds) are stripped at the serialization layer — they never reach a teacher, principal, parent, or student surface. The principal dashboard speaks only Curriculum-KPI Arabic labels.
Principal reads + a Ministry-report stub. GET /api/principal/schools/:id/{overview,heatmap} return Curriculum-KPI gauges + a computed School Literacy Health Index (no PII). The one-click Ministry report is registered as a Wave-1 contract stub (501); the PDF lands in Wave 2.

Benchmark numbers are partner-owned (extracted from the v3.3 workbook — Jordan + Palestine); the global fallback rows + Curriculum-KPI label wording are flagged working defaults pending the June-25 partner session. Migration: 20260620150000_wp_std_be. PRD: 12-benchmarks-and-country-config (FR-STD-1..6).
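
The 3-step resolve chain can be sketched as an ordered lookup. The key shape (country, grade), the profile payloads, and the step labels are assumptions made for the example; the order exact → country-default → global fallback and the guarantee of a seeded global row are from this entry. No real benchmark numbers are shown.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (country, grade); None acts as a wildcard.
Key = Tuple[Optional[str], Optional[int]]


def resolve_profile(
    profiles: Dict[Key, dict], country: str, grade: int
) -> Tuple[dict, str]:
    chain = (
        ((country, grade), "exact"),
        ((country, None), "country_default"),
        ((None, None), "global_fallback"),
    )
    for key, step in chain:
        profile = profiles.get(key)
        if profile is not None:
            return profile, step
    # Unreachable when the global row is seeded, as the entry states it is.
    raise LookupError("global fallback row is missing")


profiles = {
    ("JO", 2): {"version": "jo-g2-v1"},
    ("JO", None): {"version": "jo-default-v1"},
    (None, None): {"version": "global-v1"},
}
print(resolve_profile(profiles, "JO", 2))
print(resolve_profile(profiles, "JO", 4))
print(resolve_profile(profiles, "PS", 1))
```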

Teacher Activation Workflow — 2026-06-20

Seven new operations under the new workflow tag (the count moves to 88). A 17-step workflow walks a teacher through review → group → activate → monitor → decide, composing the bundle, cluster, and monitoring engines into one guided flow.

The teacher decides; the system blocks unsafe moves (V-12). POST /api/workflow/activate will not activate a do_not_decide_yet plan, and an invalid lifecycle move is rejected (409). The RTI tier is never changed by any workflow action.
Guardrails at the boundary. Insufficient evidence → a prompt, not an activation; an open acute-regression alert forces the review path (POST /api/workflow/review resolves it by an explicit teacher disposition); a DATA_INCOMPLETE student is held out of bulk activation.
Per-member adjustments are deliberate. Any scaffold-tier / delivery / schedule change off the recommendation requires explicit confirmation (else 409) and a controlled reason code — never free text.
No new data path. Workflow effects route through the existing activation / cluster / monitoring write paths; the triggering workflow step is stamped on the assignment history. POST /api/workflow/save-draft persists an in-progress workflow (7-day), restored via GET /api/workflow/draft/:teacherId/:classId.
Also: the acute-regression alert response no longer includes contextFlagId — a dismissal’s context flag is recorded only in the context-flag log (the single sanctioned home).

Logging & Observability — 2026-06-20

Two new operations under the new logging tag (the count moves to 81). Completes the platform logging contract on top of the logger core + request correlation shipped earlier.

POST /api/logs/client — the browser ships its WARN/ERROR logs to the server (warn/error only; rate-limited to 1 batch per user per 5s). The server stamps userId/orgId/ts from the authenticated identity — never from the request body — so a client cannot mislabel who or which tenant a line belongs to. The intake endpoint deliberately does not log itself (infinite-loop guard).
GET /api/health/log-level — an unauthenticated ops probe returning the active global log level and any per-module overrides, so operators can confirm what a deployment is running without a redeploy.
No schema change. Logging is a process concern; client-log persistence is a Wave-2 question.
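
The two intake rules (one batch per user per 5 s, identity stamped from the session rather than the body) can be sketched with an in-memory limiter. The class, its method, and the entry shape are hypothetical, and a real deployment would need shared state across processes; `now` is passed in explicitly so the behavior is easy to test.

```python
from typing import Dict, List, Optional


class ClientLogIntake:
    def __init__(self, window_seconds: float = 5.0) -> None:
        self._window = window_seconds
        self._last_accepted: Dict[str, float] = {}

    def accept(
        self, auth_user_id: str, auth_org_id: str, batch: List[dict], now: float
    ) -> Optional[List[dict]]:
        """Return the stamped lines, or None when the batch is rate-limited."""
        last = self._last_accepted.get(auth_user_id)
        if last is not None and now - last < self._window:
            return None
        self._last_accepted[auth_user_id] = now
        return [
            {
                "level": entry["level"],
                "message": entry.get("message", ""),
                # Stamped from the authenticated identity, never from the body.
                "userId": auth_user_id,
                "orgId": auth_org_id,
                "ts": now,
            }
            for entry in batch
            if entry.get("level") in ("warn", "error")
        ]


intake = ClientLogIntake()
batch = [
    {"level": "error", "message": "render failed", "userId": "spoofed"},
    {"level": "info", "message": "dropped"},
]
accepted = intake.accept("u1", "org-a", batch, now=100.0)
print(accepted)
print(intake.accept("u1", "org-a", batch, now=102.0))
```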

Task Delivery Service — 2026-06-20

Four new operations under the new delivery tag (the count moves to 79). This is what happens at student time: the platform picks the right pre-authored item and serves it exactly as stored — it never composes content at runtime.

Select + serve. POST /api/delivery/task filters the operational item bank on (sub-skill × scaffold tier × task_mode × difficulty × dialect), picks deterministically, and returns the stored item. It is the single serve chokepoint every consumer (diagnostic, practice, probes, ISM) calls per task.
No improvisation. An empty candidate pool returns 200 { served: null, log.reason: "no_eligible_item" } — never an error, never a silent substitute. The caller decides the UX; the gap aggregates to GET /api/admin/delivery/gap-summary (by sub-skill × tier × mode) for content-volume planning.
Tier + mode are filters, not render switches. Per-tier variants are distinct stored rows; probes ignore tier and show no hints or mid-probe feedback (probe integrity). The student never sees a profile/bundle/tier label.
Deterministic + auditable. Every serve writes an append-only served_task_instance keyed by a UUID taskInstanceId (the QR-worksheet join token), pinning its selection inputs; POST /api/admin/delivery/dry-run-replay re-asserts the same pick (V-6) and writes nothing. No LLM on the serve path (V-5).
The recently-served window (Wave-1 default 7 days) and per-mode feedback copy are partner-owed (June-25); teacher-enabled lower-tier substitution is deferred (the default is no_eligible_item).
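
The select-and-serve behaviour above can be illustrated with a small sketch. It assumes a hypothetical Item shape and a hypothetical select_task function; the lowest-id tie-break and the nested log object are assumptions made for the example, since the changelog states only that the pick is deterministic. The recently-served window and the probe rule that ignores tier are not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A stored item (hypothetical shape; field types are assumed)."""
    item_id: str
    sub_skill: str
    scaffold_tier: int
    task_mode: str
    difficulty: int
    dialect: str


FILTER_KEYS = ("sub_skill", "scaffold_tier", "task_mode", "difficulty", "dialect")


def select_task(bank: list[Item], request: dict) -> dict:
    """Filter the bank on the five selection inputs, then pick deterministically."""
    pool = [
        item for item in bank
        if all(getattr(item, key) == request[key] for key in FILTER_KEYS)
    ]
    if not pool:
        # Empty pool: a normal response, never an error or a substitute item.
        return {"served": None, "log": {"reason": "no_eligible_item"}}
    # Assumed tie-break: lowest item_id. Any stable total order yields the
    # same pick for the same inputs, which is what a dry-run replay checks.
    chosen = min(pool, key=lambda item: item.item_id)
    return {"served": chosen, "log": {"inputs": dict(request)}}
```

The pick does not depend on the order of the bank, so replaying the pinned inputs against the same bank reproduces the same item.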

Spec reconciliation + fixes — 2026-06-20

Maintenance pass — no new operations (the count stays at 75). Two consumer-visible API changes plus spec-corpus and test corrections.

Organizations now return createdBy. Every GET /api/organizations / …/{id} response includes the provisioning super-admin’s createdBy (a User id) — the value was already stored at creation, now it is exposed (FR-TEN-1). The OrgResponse schema in the API reference is updated accordingly.
Domain-status sentinel correctness. GET /api/sse/students/{id}/domain-status now returns the stored doNotDecideYet sentinel instead of re-deriving it at read time, honoring the FR-SSE-16 “stored, not re-derived” consumer contract (it already behaved this way on the skill-status endpoint). Consumers must still branch on doNotDecideYet before acting on a domain status.
Taxonomy routes document 401. The four taxonomy reads (GET /api/sub-skills, …/{id}, /api/domains, /api/skill-dependencies) now declare the 401 (unauthenticated) response they already returned at runtime.
Internal only. gateway_priority_config gains a published_at audit column (migration 20260620120000); spec-corpus reconciliations in 11-skills-taxonomy (15 active item types, the 4-route taxonomy API surface, SENT/SYN count 12, plain-view DDL), two corrected FR-ID cross-references, and added test coverage for the request-id and append-only / context-flag lint fail-branches.

Progress Monitoring Rules — 2026-06-20

Three new operations under the new monitoring tag (the count moves to 75). The platform now tracks whether an intervention is working and raises a review when a student regresses.

Record progress evidence. POST /api/monitoring/evidence records the non-CBM evidence types (quick check, rubric, teacher observation, worksheet, digital trend) with administration conditions; the record is append-only and triggers a deterministic recompute pinned to the rule version (V-6).
Five safety rules govern every decision. Insufficient or non-comparable evidence → do_not_decide_yet; no single-point decisions; only comparable probes are compared. Acute regression (≥20% drop over two comparable sessions) raises an Immediate Review — never auto-fail, auto-exit, or an RTI-tier change (the teacher decides).
Status + alerts. GET /api/monitoring/status/{bundleAssignmentId} returns the current verdict + the pinned rule versions; GET /api/monitoring/alerts lists open acute-regression alerts. An alert resolves only by an explicit teacher disposition via a controlled reason code — never free text.
Smart Cluster safety seam closed. Cluster now reads the real acute-regression alert (was fail-open), so a regressing student is routed to individual review and kept out of group clusters.
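
A rough sketch of how the monitoring safety rules combine, under assumptions: the evaluate function and the verdict strings immediate_review and no_acute_regression are hypothetical, and the 20% threshold is read here as a relative drop between the two most recent comparable sessions. The shipped rule set may define the comparison differently.

```python
from __future__ import annotations

ACUTE_DROP = 0.20  # >= 20% drop over two comparable sessions


def evaluate(sessions: list[tuple[float, bool]]) -> str:
    """Return a verdict for (score, is_comparable) pairs, oldest first."""
    # Only comparable probes are compared.
    comparable = [score for score, is_comparable in sessions if is_comparable]
    if len(comparable) < 2:
        # Insufficient evidence, and no single-point decisions.
        return "do_not_decide_yet"
    previous, latest = comparable[-2], comparable[-1]
    if previous > 0 and (previous - latest) / previous >= ACUTE_DROP:
        # Raises a review for the teacher; it never changes the RTI tier.
        return "immediate_review"
    return "no_acute_regression"
```

Note that the function only ever returns a signal. Any tier change stays a teacher decision outside this code path.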

Smart Cluster Sequencing + Bulk Activation — 2026-06-19

Three new operations under the new cluster tag (the count moves to 72). Teachers group students who need the same support into small clusters and activate the whole group in one confirmed action.

Deterministic clustering. POST /api/cluster/recompute/:classId groups a class on a 5-key match (bundle + anchor + scaffold tier + compatible delivery mode + sufficient data); Level 1 + Level 2 merge for teacher-led delivery only; minimum group size 2 (school knob 3–5). Results are append-only and pinned to the sequencing-rule version (V-6 deterministic), 60-day retention.
Safety exclusions are hard. A DATA_INCOMPLETE / do_not_decide_yet student → a more-data-needed list; an acute-regression student → individual review. Neither is ever clustered or bulk-activated.
Bulk activation (V-12, reuses the single write path). POST /api/cluster/bulk-activate requires explicit teacher confirmation and creates each member’s assignment through the same writer individual activation uses (stamped with the cluster id); all-or-nothing (an excluded member → 422), idempotent, 409 on a stale cluster. Differentiation lives only in each student’s scaffold assignment.
Preview. GET /api/cluster/student/:studentId/preview returns the student’s current cluster + the per-tier scaffold distribution; no student-facing internals.
The acute_regression exclusion reads a fail-open seam until WP-PROG-MON-BE produces the signal; the nightly recompute cron is a seam (the idempotent recompute function ships now).
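
The grouping and exclusion rules can be sketched as below. The cluster function, the student dictionary fields, and the "ok" status value are hypothetical; "compatible delivery mode" is simplified to equality, the sufficient-data key is modelled as an exclusion gate, and the Level 1 + Level 2 merge is not modelled.

```python
from __future__ import annotations

from collections import defaultdict

MIN_GROUP_SIZE = 2  # default; a school may configure 3 to 5
INSUFFICIENT = ("DATA_INCOMPLETE", "do_not_decide_yet")


def cluster(students: list[dict], min_size: int = MIN_GROUP_SIZE) -> dict:
    """Group students on the match key, applying the hard safety exclusions first."""
    more_data, individual_review = [], []
    groups: dict[tuple, list[str]] = defaultdict(list)
    for s in students:
        if s["status"] in INSUFFICIENT:
            more_data.append(s["id"])          # never clustered
        elif s["acute_regression"]:
            individual_review.append(s["id"])  # never clustered
        else:
            key = (s["bundle"], s["anchor"], s["scaffold_tier"], s["delivery_mode"])
            groups[key].append(s["id"])
    # Sorting makes the output independent of input order.
    clusters = sorted(sorted(ids) for ids in groups.values() if len(ids) >= min_size)
    return {
        "clusters": clusters,
        "more_data_needed": sorted(more_data),
        "individual_review": sorted(individual_review),
    }
```

Running the exclusions before the grouping step means an excluded student can never count toward a group's minimum size.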

Intervention Bundle Catalog — 2026-06-19

The bundle recommender ships. Eleven new operations under /api/bundles, /api/assignments, /api/students/:id, and /api/admin/bundle-* (the count moves to 69). It turns a student’s profile into one recommended intervention bundle from a closed 20-bundle catalog, assigns scaffold support, and hands the teacher a one-tap activation.

Deterministic recommendation. POST /api/bundles/recommend maps the active profile (primary code + driver/modifier) through the 21-row profile→bundle map to exactly one bundle — anchor skill + 1–2 supporting threads + a data-supported asset bridge, anchor always the largest dosage share.
A bundle is a recommendation, never automatic (V-12). POST /api/students/:id/activate-bundle requires the teacher; the 14-status assignment lifecycle is append-only + history-shadowed. Adjustments (PATCH …/scaffold-assignment | …/delivery | …/schedule, POST …/context-flag) use controlled reason codes only — never free text — and RTI tier can never be changed by any adjustment.
Fail-closed. No active profile / DATA_INCOMPLETE / do_not_decide_yet ⇒ blocked_insufficient_evidence plus a recommended next data action, never a bundle. No role can override.
Student never sees a bundle id/name, profile code, or scaffold-tier label — stripped at serialization.
Admin. POST /api/admin/bundle-config-sets (publish + completeness gate); POST /api/admin/bundle-replay (V-6 dry-run diff, zero writes). The Smart Cluster + bulk-activation endpoints arrive with WP-CLUSTER-BE.
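
The fail-closed recommendation path might look like the following sketch. Everything named here is illustrative: recommend, the response keys, the nextDataAction value, and the two-row PROFILE_TO_BUNDLE excerpt are hypothetical stand-ins, and the real 21-row map is not reproduced.

```python
from __future__ import annotations

# Hypothetical two-row excerpt standing in for the published profile-to-bundle map.
PROFILE_TO_BUNDLE = {
    ("P-EXAMPLE-1", "driver-a"): "bundle-example-1",
    ("P-EXAMPLE-2", "driver-b"): "bundle-example-2",
}

BLOCKING = {"DATA_INCOMPLETE", "do_not_decide_yet"}


def recommend(profile):
    """Map an active profile to exactly one bundle, or fail closed."""
    if profile is None or profile.get("status") in BLOCKING:
        # No bundle is ever returned on this branch, whatever the caller's role.
        return {
            "outcome": "blocked_insufficient_evidence",
            "nextDataAction": "collect_more_evidence",  # assumed value
            "bundle": None,
        }
    # A missing row raises KeyError in this sketch; how the shipped service
    # treats an unmapped profile is not stated in this changelog.
    bundle = PROFILE_TO_BUNDLE[(profile["primary"], profile["driver"])]
    return {"outcome": "recommended", "bundle": bundle}
```

The blocking check runs before any map lookup and takes no role argument, which is one simple way to make "no role can override" hold by construction.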

Student Profile Resolution — 2026-06-19

The profile engine ships. Six new operations under /api/profile + /api/admin/profile-* (the count moves to 58). It composes the Skill-Status Engine’s outputs into one named educational profile per student per assessment window, with rule-written Arabic narratives per audience.

Read a student’s profile. GET /api/profile/{studentId}?window= returns the active OR pending assignment — primary code, drivers, modifiers, confidence, the rule-version pins, and assignmentStatus (so a client can render the pending → confirm flow). When evidence is incomplete it returns a structured DATA_INCOMPLETE guard (200), not a profile.
Narratives are per-audience and safe. GET /api/profile/{studentId}/narrative?audience=teacher|parent|admin returns the pre-rendered, language-safety-filtered text. audience=student is rejected (422) — a student never receives a profile code or any numeric score; the parent narrative carries no internal IDs.
Teacher decides (V-12). A fresh resolution starts pending_teacher_review; POST /api/profile/{studentId}/review-resolution with { outcome: confirm } promotes it to active. Exactly one active profile per window; every change is a new append-only row.
Fail-closed. If any contributing skill is do_not_decide_yet or a required domain lacks coverage, no profile is written — no role can override.
Admin. POST /api/admin/profile-catalog/publish runs the catalog-completeness gate (every primary has its CDP/PGA/SPOT coverage); POST /api/admin/profile-replay is a V-6 dry-run diff (zero writes).
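
One way to picture the pending-to-active promotion over an append-only history is sketched below. The row fields other than assignmentStatus, the helper names current_row and confirm, and the convention that the newest row for a student and window is the current one are all assumptions; the changelog says only that every change is a new append-only row and that one profile is active per window.

```python
from __future__ import annotations


def current_row(rows: list[dict], student_id: str, window: str):
    """Return the newest row for a student and window, or None."""
    matching = [
        r for r in rows
        if r["studentId"] == student_id and r["window"] == window
    ]
    return matching[-1] if matching else None


def confirm(rows: list[dict], student_id: str, window: str) -> list[dict]:
    """Append an active row for a pending resolution; never edit in place."""
    current = current_row(rows, student_id, window)
    if current is None or current["assignmentStatus"] != "pending_teacher_review":
        raise ValueError("no pending resolution to confirm")
    # A new list with one extra row: the pending row stays in the history.
    return rows + [{**current, "assignmentStatus": "active"}]
```

Under this convention the pending row remains available for audit, and reading the newest row gives at most one active profile per window.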

Skills taxonomy: VOCAB-C06 added (79 skills) — 2026-06-19

Partner confirmation of the sub-skills ID mapping. The skill catalogue grows from 78 to 79: GET /api/sub-skills now returns VOCAB-C06 Synonym Awareness (فهم المرادفات) in the VOCAB domain. No new operations (count unchanged at 52).

New skill VOCAB-C06 Synonym Awareness. The VOC_REL concept now covers both antonyms (VOCAB-C05) and synonyms (VOCAB-C06).
Sub-flag naming locked. The writing/spelling sub-flag is WR_SP_ALERT (was WRITING_SPELLING_ALERT); AR_LETTER_CONFUSION_FLAG is scoped to visual/orthographic confusion only.
Arabic-feature scope (current phase). Madd / short-vowel signals come only from written/visual items (never oral-reading error type); hamza / shadda / tanween open no bundle or alert this phase.

Skill-Status Engine — 2026-06-19

The decision engine ships. Seven new operations under /api/sse (the count moves to 52): five read-only consumer endpoints and two admin endpoints.

Read a student’s interpretive status. GET /api/sse/students/{id}/skill-status and /domain-status return the latest per-(subSkill|domain, evidenceWindow) snapshot, each carrying the matched rule ids + versions (for replay) and the doNotDecideYet sentinel. /pattern-classifications and /behavior-log back the “why this status” explainer + the teacher review surface. All tenant-scoped, read-only, VIEW_STUDENT_PROGRESS — no status value is ever exposed to a student.
do_not_decide_yet is a hard stop. When evidence is thin, split, or low-quality the engine refuses to commit and emits the sentinel; consumers must short-circuit (no profile, no bundle, no alert, no mastery event) — and no role can override it.
Status is computed server-side, never on demand. There is no client “compute now” endpoint; status recomputes when new evidence arrives. Reads return the stored snapshot — consumers never re-derive.
Admin (rule publishing). POST /api/sse/admin/rules/{layer}/publish (schema + Arabic-accuracy carve-out + overlapping-row validation + dry-run diff, partner-approval gated) and POST /api/sse/admin/dry-run-replay. Publishing a new rule version never retroactively changes existing snapshots — they stay pinned to their compute-time version.
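
On the consumer side, the short-circuit on the sentinel can be as small as the sketch below. The actionable function and the status field name are hypothetical, and treating a missing doNotDecideYet field as blocking is a defensive choice made for this example rather than documented behaviour.

```python
from __future__ import annotations


def actionable(snapshots: list[dict]) -> list[dict]:
    """Keep only snapshots a consumer may act on."""
    # A snapshot passes only when the sentinel is explicitly False. A missing
    # or malformed field is treated as do-not-decide (assumed, fail-closed).
    return [s for s in snapshots if s.get("doNotDecideYet", True) is False]
```

A consumer would run this filter before building a profile, bundle, alert, or mastery event, and would use the stored status as returned rather than recomputing it.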

Request correlation id — 2026-06-19

Every response now carries an X-Request-Id correlation id, and the server accepts one inbound. No new operations; the count is unchanged (45).

X-Request-Id is now accepted inbound. If you send an X-Request-Id header matching ^[A-Za-z0-9_-]{8,128}$, the server honors it verbatim on the response; otherwise it generates one (req_<hex>). Send your own to correlate a client trace with the server’s logs for that request.
X-Request-Id is CORS-exposed. Browser clients on an allowed origin can now read the header off the response (Access-Control-Expose-Headers: X-Request-Id) — log it client-side and quote it in a bug report.
Internal (no contract change): the id auto-binds to every server log line and is persisted on the activity/audit rows a request produces (auth events, super-admin bypass, item-import + δ_prior logs), so support can pivot a reported id → server logs → DB changes for one request.
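
The inbound rule for X-Request-Id can be sketched as follows. The pattern is the one quoted above; resolve_request_id is a hypothetical name, and the 16-byte length of the generated hex suffix is an assumption, since the changelog gives only the req_<hex> shape.

```python
import re
import secrets

# Same character class and length bounds as ^[A-Za-z0-9_-]{8,128}$.
# fullmatch is used so a trailing newline cannot slip past the anchor.
_REQUEST_ID = re.compile(r"[A-Za-z0-9_-]{8,128}")


def resolve_request_id(inbound):
    """Honor a well-formed inbound id verbatim, otherwise generate one."""
    if inbound is not None and _REQUEST_ID.fullmatch(inbound):
        return inbound
    return "req_" + secrets.token_hex(16)  # assumed suffix length
```

A client that wants end-to-end correlation would send its own id in the X-Request-Id request header and log the value echoed on the response.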

Content — 2026-06-19

Item-bank content + schema additions ahead of pilot. No new published operations; the count is unchanged (45) — the item-bank/import surfaces remain pre-publication.

Seed content corpus. A validated offline-authored item bank now exists — 645 items + 2 reading passages across all 7 digital domains, every item-eligible skill at 8–12 items, all passing the import gate. Items are source='offline_generated', status='draft' (never served until SME approval). Difficulty priors are fully real (SAMER v2 lexicon + CAMeL).
Schema additions (item bank). A new optional image_url import field + imageUrl column for picture items, and the distractor_type controlled vocabulary expanded 13 → 19 (added agreement_error, wrong_part_of_speech, wrong_function_word, in_text_wrong_detail, wrong_syllable_count, wrong_sound_position) with the import compatibility matrix extended to match. No change to any published route, request shape, or response shape.

Maintenance — 2026-06-18

Routine-driven correctness pass — closed a batch of code-review and spec-drift findings. No new operations; the count is unchanged (45).

Class deletion is now a soft-delete. DELETE /api/classes/:id is unchanged for callers (same request + same { ok: true } response), but the class and its teacher/student assignments are now preserved for audit rather than removed. A deleted class no longer appears in GET /api/classes, GET /api/classes/:id (404), the organization rollup, or roster/assignment lookups. Deleting a class with active students is still rejected (409). → See Core Concepts.
Internal hardening (no API-contract change): tighter manager/teacher scope enforcement on class, student, and principal access; rate-limit lockout-escalation correctness; item-import integrity (atomic write, per-row validation, global-item write protection); and log-field redaction. Error codes, request shapes, and response shapes are unchanged.

Sub-skills taxonomy — 2026-06-18

The Arabic-literacy skills taxonomy ships as global reference data — 4 read operations under the new sub-skills tag, raising the published total to 45 operations.

GET /api/sub-skills — paginated list of all 78 sub-skills with domain, grade range, assessment type, and gateway/key-skill flags. (raised to 79 by the VOCAB-C06 entry above)
GET /api/sub-skills/{id} — single sub-skill row by CUID.
GET /api/domains — all 9 Arabic-literacy domains (Phonological Awareness, Decoding, Fluency, Comprehension, Vocabulary, and more).
GET /api/skill-dependencies — prerequisite links between skills.

All four endpoints require authentication and tenant context. They check the VIEW_OWN_PROFILE permission, which every persona (Teacher, Admin, Principal, Manager, Parent, Student) holds, so the taxonomy is available to all authenticated users.

Item bank foundation — 2026-06-18

The item-bank API surface is available — 12 operations under the items tag. These are pre-publication routes gated to RUN_BACKOFFICE (backoffice/SME-only); they do not affect the published-operation count of 45.

Read — GET /api/items, GET /api/items/{id} (?view=authoring for the full authoring row), GET /api/item-types, GET /api/item-types/{code}/distractor-compat.
Import pipeline — POST /api/items/import, GET /api/items/import, GET /api/items/import/{batchId}, POST /api/items/import/{batchId}/approve.
Lifecycle — POST /api/items/{id}/sme-approve, POST /api/items/{id}/promote, POST /api/items/{id}/retire.
Difficulty priors — GET /api/expected-difficulty/log (deterministic δ_prior audit trail).

1.0.0-phase0

Initial portal. Documents WP-A1 authentication + WP-A2 tenancy + WP-A3 user management — 41 operations across the tags auth, tenancy, users, and system.

auth — register (bootstrap-only), login, refresh (rotating), me, logout, password reset + confirm, student quickCode login. → See Authentication, Getting Started, and Errors.
tenancy — organizations, schools, classes (CRUD), organization rollup. → See Core Concepts and Manage classes & rosters.
users — provisioning (discriminated POST /api/users), persona profiles, class rosters, ClassTeacher assignment, principals, managers, parent ↔ child links, student quickCode regen. → See Provision users, Managers & principals, and Parent ↔ child links.
system — health check. → See the API Reference.