Metricis Project Plan

Status: Active hub document. Single source of truth for project vision, status, roadmap, and progress.

Last updated: 2026-05-08 (§6 #21 resolved: REDCapEventSyncService.sync_events_to_visit_windows now wraps its per-event loop in a db.begin_nested() SAVEPOINT and raises an internal _AtomicSyncFailure sentinel when any event errors, so partial state can no longer be persisted. The service no longer calls db.commit() directly; the FastAPI get_db dependency owns the outer transaction boundary, mirroring §6 #6. 7 atomicity tests in tests/test_redcap_event_sync_atomicity.py.)

Replaces: docs/project-plan/project-plan.md (archived 2026-05-06)

This is the hub. Deep specifications live in linked spoke documents and should be edited there; this document tracks status, sequencing, and progress.

1. Vision

Metricis is an AI-native, standards-first Electronic Data Capture (EDC) platform for regulated clinical trials, with particular strength in rare disease and pediatric research. It is ODM-aligned and assessment-centric, integrating jsPsych cognitive assessments, eConsent, REDCap interoperability, and governed AI co-pilots in a single backend that serves both site-facing (researcher) and participant-facing (patient/caregiver) experiences.

Core principles

Single authoritative backend with versioned metadata and immutable audit trail
Distinct frontends for site staff vs. participants/caregivers, sharing one API
Strict role-based access control enforced at UI, API, and service layers
ODM-informed data model with explicit version binding (battery, metadata, consent)
AI agents as governed co-pilots — drafts only, human approval required
Compliance: ICH-GCP, Health Canada, FDA, EMA, 21 CFR Part 11, HIPAA

2. Architecture at a Glance

┌──────────────────────────────────────────────────────────────────┐
│ Portal (React) │ Client (jsPsych) │ Patient Portal (React+Capac.)│
└────────┬───────────────┬────────────────────┬────────────────────┘
         └───────────────┼────────────────────┘
                         ▼
                 FastAPI Backend
   ┌──────────┬──────────┬──────────┬──────────┬───────────────┐
   │ Study    │ Metadata │ Consent  │ Assess-  │ RBAC / Audit  │
   │ Runtime  │ Version  │ Gate     │ ments    │ (21 CFR Part11)│
   └──────────┴──────────┴──────────┴──────────┴───────────────┘
   ┌──────────┬──────────┬──────────┬──────────┬───────────────┐
   │ Schedule │ REDCap   │ Forms /  │ AI Agent │ Registries /  │
   │ (Unified)│ Sync     │ ODM      │ (Phase 2)│ Phenotypes    │
   └──────────┴──────────┴──────────┴──────────┴───────────────┘
                         ▼
                 PostgreSQL + Redis + Celery

For detailed architecture, see CLAUDE.md and the spec spokes in §9.

3. Status by Subsystem

Legend: ✅ complete · 🔄 in progress · 🔜 planned · ⚠️ has known issues

#	Subsystem	Status	Evidence
1	Study runtime (enrollment, visits, sessions, data entry)	✅	`server/app/routers/`, `portal/src/pages/`
2	Metadata governance (versioned design, draft→approve→publish)	✅	`server/app/services/metadata_service.py`
3	jsPsych assessments (battery versioning, queue, reconciliation)	✅	`server/app/services/assessment_queue_service.py`
4	eConsent (versioning, signatures, gating, re-consent)	✅	`consent_service.py`, `reconsent_service.py`
5	RBAC + Audit (21 CFR Part 11)	✅	`audit_integrity.py` (hash chain), CODEOWNERS
6	SDTM/ODM/Define-XML export	✅	`routers/regulatory.py`, `sdtm_validation_service.py`
7	Validation services (Pinnacle 21-style)	✅	`sdtm_validation_service.py`
8	Researcher Portal (38+ pages)	✅	`portal/src/pages/`, docs/PAGES.md
9	Patient/Caregiver Portal (magic link, mobile-first)	✅	`patient-portal/src/`, `routers/portal.py` — REDCap-managed data ingestion landed 2026-05-07: `task_type="survey"` with `external_url` launcher, `is_stale` flag on visit schedule, `GET /api/portal/randomization` (§6 #4 + #20)
10	Rare disease & pediatric (Phases 1–4)	✅	rare-disease-integration.md
11	AI Agent Phase 1 (foundation: models, LLM, ChangeSet)	✅	`llm_service.py`, `agent_service.py`, `changeset_validator.py`
12	AI Agent Phase 2 (protocol ingestion, StudyAssistant UI)	✅	`agent_tasks/protocol_ingestion.py`, `routers/ai_assistant.py`, `pages/StudyAssistant.tsx`
13	Patient Registry (Phases 1–4: foundation, linkage, analytics, UI)	✅	participant-registry.md
14	Unified Scheduler (mode-aware: legacy vs EDC)	✅	`unified_scheduler.py`, `consent_gate.py`
15	Anchor date / enrollment date system	✅	`enrollment_date_service.py`, `schedule_versioning_service.py`
16	Form workflow state machine (NOT_STARTED → LOCKED)	✅	`form_workflow_service.py`
17	Quality flags & validation rules	✅	`ValidationRule`, `ValidationResult` models
18	REDCap sync (fail-safe, no-fallback policy)	✅	`redcap_sync.py` — invariant test enforces no-fallback; failed-sync surfaced in patient portal banner + coordinator dashboard alert (§6 #14, fixed 2026-05-07)
19	REDCap DET webhooks	✅	`routers/webhooks.py` — HMAC, RBAC/audit/rate-limit, production test-endpoint gate, idempotency + persistence (`WebhookEvent`) all landed 2026-05-07
20	Randomization module (full stack, Metricis-managed mode)	✅ ⚠️ deferred	`randomization_.py`, `pages/Randomization.tsx` — bugs in §6; deferred (M9) since REDCap-managed studies use REDCap randomization
21	Business Day Service (weekend/holiday)	✅ ⚠️ deferred	`business_day_service.py` — see §6, deferred to M9
22	Token encryption (REDCap tokens)	✅	`token_encryption.py` — production refuses missing key + plaintext reads; `enc:v2:` + key rotation + `migrate_redcap_tokens.py`; 15 tests (§6 #2 + #7, fixed 2026-05-07)
23	Time simulation (test mode)	✅	`TimeSimulationBanner.tsx`, `routers/testing.py` — service + router gates inert in production; admin + dev-mode required on every mutation with `AuditLog`; 3 invariant tests (§6 #1, fixed 2026-05-07)
24	Dev mode service & bypasses	✅	`dev_mode.py`, `DevModeContext.tsx`
25	Compliance invariant tests	✅	`tests/test_compliance_invariants.py`
26	Audit log integrity (SHA-256 hash chain)	✅	`audit_integrity.py`
27	Capacitor mobile (iOS/Android, push)	✅	`capacitor.config.ts`, FCM/APNs setup
28	CI/CD (GitHub Actions, path-based, nightly)	✅	`.github/workflows/`
29	AI Agent Phase 3 (Battery & Assessment Configuration)	🔄	ai-agent-implementation-plan.md — `battery_config` task shipped 2026-05-10 (`agent_tasks/battery_config.py`, 9 tests); `event_linking` task + portal inline assist follow-ups
30	AI Agent Phase 4 (Amendment Impact Analysis)	🔜	ai-agent-implementation-plan.md

4. Active Workstreams

4.1 REDCap-managed study production readiness (M7)

Status (2026-05-10): every M7-blocking §6 item is resolved, and the cutover checklist landed at docs/guides/sponsor-study-cutover.md. The engineering portion of M7 is closed. The remaining gate is operational: a sponsor-study cutover walked through the checklist, with sign-off recorded against the audit log. Until that happens, this workstream is in cutover-review limbo, not in active development.

Goal: ship a production-grade Metricis deployment for a study where REDCap is the source of truth for randomization, data collection forms, events, and visit windows, and Metricis delivers the patient/caregiver experience (assessments, eConsent, task queue, reminders) on top.

What landed (cross-reference §6 entry numbers):

Gating functions — consent gate, dev-mode gates, time-simulation gate all inspection-grade with router-level production 404s, audit-logged mutations, and structural invariants (§6 #1, #3).
Token encryption for REDCap tokens AND webhook secrets — enc:v2: write format, no JWT-secret fallback in production, MultiFernet rotation, one-shot migration script (§6 #2, #7, #19).
REDCap sync robustness — fail-safe no-fallback policy enforced by structural invariant; transactional boundaries on event sync and schedule versioning; per-study circuit breaker; single retry layer; DET webhook idempotency + replay protection (§6 #14, #6, #21, #22, #23, #17).
Patient portal REDCap ingestion — survey task type with external launcher, stale-data flag on visit schedule, randomization read-only display with masked-label preference for blinded studies (§6 #4, #20).
Test coverage — 10-test E2E lifecycle plus dedicated suites for retry/breaker, anchor-shift reconciliation, project_id index, webhook idempotency, RBAC, token encryption, and compliance invariants.

4.2 AI Agent Phase 3 (active, M8)

Phase 2 (protocol ingestion) shipped on 2026-01-22. Phase 3 (battery configuration assistance) is the active workstream now that M7 engineering is closed. Spec: ai-agent-implementation-plan.md.

Status (2026-05-10):

- ✅ Task 1 of 2 (battery_config AI task): reads a battery plan + optional protocol, proposes Battery rows with ordered Module lists and event_battery_assignment items, anchors event names to existing VisitWindow.name values when available, and surfaces burden warnings in the ChangeSet summary. Wired into AgentService.execute_task dispatch. 9 unit tests in tests/test_agent_battery_config.py covering happy path, visit anchoring on/off, missing required document, malformed LLM output, unexpected exceptions, and dispatch wiring. Read-only: produces ChangeSet items with artifact_type="battery_version" and "event_battery_assignment"; no mutation of live Battery/BatteryModule tables (the apply path is shared with Phase 2 and remains a TODO in routers/ai_assistant.py::apply_changeset).
- 🔜 Task 2 of 2 (event_linking task): takes existing visits + batteries, proposes optimal pairings, identifies timing conflicts and burden concerns. Builds on the same prompt + ChangeSet scaffolding as battery_config.
- 🔜 Portal inline assists: "Suggest module selection" button in BatteryBuilder.tsx; tab additions in StudyAssistant.tsx for the new task type.

4.3 Metricis-managed mode hardening (deferred, M9)

Randomization (full-stack), Business Day Service, and Schedule Versioning have outstanding bugs documented in §6 #9–13. They are deferred until Metricis-managed studies become a target. The first sponsor study runs in REDCap-managed mode where REDCap handles randomization and scheduling, so these issues do not block M7 cutover.

4.4 Documentation consolidation (ongoing)

Part of M6. CLAUDE.md condensed 2026-05-06; this hub document is the planning consolidation deliverable. M7-specific operator documentation landed 2026-05-10: sponsor-study-cutover.md.

5. Roadmap

Horizon	Milestone	Focus
Now (current)	M8 — AI Agent Phase 3 (Battery Configuration)	Inline assists in BatteryBuilder, event-linking proposals, validation pipeline
Operational gate	M7 — REDCap-managed study production readiness (engineering complete; sponsor sign-off pending)	Walk the cutover checklist with the first sponsor study; record sign-off in the audit log
Deferred	M9 — Metricis-managed mode hardening	Randomization fixes, Business Day Service, schedule versioning transaction boundary — unblocked once Metricis-managed studies become a priority
Later	M10 — AI Agent Phase 4 (Amendment Impact Analysis)	Cutover planning, narrative generation per affected participant
Backlog	First sponsor study, multi-site rollout, expanded SDTM domains, additional languages	—

6. Known Issues & Risks

Findings from the 2026-05-06 code review. Severity prefixes: 🔴 critical, 🟠 significant, 🟡 minor. Status as of 2026-05-10: every item that blocks M7 is resolved. Items are grouped by whether they blocked M7 (REDCap-managed study production readiness) or are deferred to M9 (Metricis-managed mode hardening).

Blocked M7: REDCap-managed production readiness (✅ all resolved)

🔴 Critical (gating + safety/security)

1. Time simulation is not gated to dev mode and has no audit log. ✅ Fixed 2026-05-07. Before the fix, server/app/services/scheduler.py:40-63 shifted the scheduler's "today" by Study.test_mode_config.time_simulation_offset_days regardless of ENVIRONMENT. The mutation endpoints in server/app/routers/testing.py:178,219 required only get_current_user: no admin role, no DevModeService.is_dev_mode_available() gate. A production user could have shifted any study's clock and broken participant-facing scheduling. Resolution:
Service-layer gate — services.scheduler._time_simulation_allowed() consulted by get_effective_date(study). Honours the offset only when ENVIRONMENT != "production" or DEV_MODE is explicitly set, so a stale persisted offset cannot affect production participants. The routers/portal.py test-mode-info computation is gated by the same predicate.
Router-layer gate — every mutation in routers/testing.py (update_test_mode_config, update_time_simulation, reset_time_simulation, update_study_visit_statuses, generate_synthetic_data, clear_synthetic_data, set_bulk_anchor_dates, generate_longitudinal_data) requires _require_dev_mode_admin — admin role and dev-mode availability — and emits an AuditLog row via new _audit_testing_action helper (with audit_metadata.source = "dev_mode_bypass").
Router-level production gate — _testing_router_production_gate 404s every /api/testing/* call in production, applied as a router-level dependency. 404 (not 403) avoids disclosing route existence.
Invariant tests (tests/test_compliance_invariants.py::TestTimeSimulationProductionGate): (a) get_effective_date returns date.today() in production regardless of persisted offset; (b) sanity check that dev environments still observe the offset; (c) structural test walking the FastAPI dependency tree to assert every mutation in routers/testing.py is gated by _require_dev_mode_admin, not the looser get_current_user.
Test coverage: 3 invariant tests, all green.
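
The service-layer gate described above can be illustrated with a minimal sketch. It is not the project's implementation: the function signatures, the environment mapping passed in as an argument, the fail-closed default for an unset ENVIRONMENT, and the literal "true" value for DEV_MODE are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Mapping, Optional


def time_simulation_allowed(env: Mapping[str, str]) -> bool:
    """Allow simulated time only outside production, or when DEV_MODE is set."""
    # Assumption: an unset ENVIRONMENT is treated as production (fail closed).
    environment = env.get("ENVIRONMENT", "production")
    # Assumption: DEV_MODE is enabled by the literal string "true".
    dev_mode = env.get("DEV_MODE", "").lower() == "true"
    return environment != "production" or dev_mode


def get_effective_date(
    offset_days: int,
    env: Mapping[str, str],
    today: Optional[date] = None,
) -> date:
    """Return the scheduler's "today", applying the offset only when allowed."""
    real_today = today or date.today()
    if offset_days and time_simulation_allowed(env):
        return real_today + timedelta(days=offset_days)
    # In production a stale persisted offset is ignored entirely.
    return real_today
```

With this shape, a persisted offset of 30 days has no effect when the mapping contains ENVIRONMENT=production and no DEV_MODE, which is the property invariant test (a) pins.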
2. Token encryption silently falls back to plaintext and to JWT_SECRET_KEY. ✅ Fixed 2026-05-07. Before the fix, server/app/services/token_encryption.py:31-37 derived the Fernet key from the JWT secret when REDCAP_ENCRYPTION_KEY was unset (warning only, not error), and :62-65 returned plaintext when the stored value lacked the enc:v1: prefix. There was no migration path for legacy plaintext tokens and no tests, despite a commit message claiming coverage. Resolution:
Production refuses to operate without an explicit key — encrypt_token() raises TokenEncryptionConfigError when REDCAP_ENCRYPTION_KEY is unset and ENVIRONMENT=production. The JWT-secret fallback remains for non-production only (with a single WARNING per process). Reusing one key across two domains was a latent risk: a leak of either compromised both.
Plaintext reads rejected by default in production — decrypt_token() raises TokenEncryptionError for any stored value lacking an enc: prefix. A one-time bridge during migration is available via REDCAP_ENCRYPTION_ALLOW_PLAINTEXT_READS=true, which emits CRITICAL logs with requires_investigation: True so it cannot be quietly forgotten.
Format versioning — enc:v2: is the new write format; enc:v1: ciphertexts remain readable (same KDF, version reserved for a future format change). Idempotency: re-encrypting an already-prefixed input is a no-op.
Key rotation — REDCAP_ENCRYPTION_KEY_PREVIOUS adds a second key to the decrypt chain via cryptography.fernet.MultiFernet. Encryption always uses the current key; the optional previous key bridges the rotation window. New rotate_token(stored) re-encrypts onto the current key for migration runs.
One-shot migration script — server/migrate_redcap_tokens.py scans every Study.config['redcap']['api_token'], classifies via needs_rotation(), and rewrites legacy/plaintext/enc:v1: values to enc:v2:. Defaults to dry-run; --apply writes.
Test coverage: 15 tests in tests/test_token_encryption.py covering round-trip, idempotency, production-without-key raises, plaintext-read rejected/allowed paths, enc:v1: back-compat, key rotation (previous-key fallback + rotate_token upgrade + wrong-key error), and needs_rotation classifier (v2/v1/plaintext/empty).
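
The prefix-based classification that drives the migration can be sketched with the standard library alone. The names classify_stored_token and needs_rotation_sketch are hypothetical, and treating an empty value as "nothing to rotate" is an assumption; the source only states that v2, v1, plaintext, and empty inputs are each covered by the classifier tests.

```python
from typing import Optional

V1_PREFIX = "enc:v1:"  # legacy write format, still readable
V2_PREFIX = "enc:v2:"  # current write format


def classify_stored_token(stored: Optional[str]) -> str:
    """Label a stored value as "empty", "v2", "v1", or "plaintext"."""
    if not stored:
        return "empty"
    if stored.startswith(V2_PREFIX):
        return "v2"
    if stored.startswith(V1_PREFIX):
        return "v1"
    # Anything without an enc: prefix is treated as legacy plaintext.
    return "plaintext"


def needs_rotation_sketch(stored: Optional[str]) -> bool:
    """True when a value should be rewritten to the enc:v2: format.

    Assumption: empty values are skipped rather than rotated.
    """
    return classify_stored_token(stored) in {"v1", "plaintext"}
```

A dry-run migration would only report the values for which this returns True; an apply run would re-encrypt them onto the current key.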
3. All gating functions need a unified production-readiness audit. ✅ Fixed 2026-05-07. Beyond #1 and #2, every dev-mode bypass surface (dev_mode.py, routers/dev.py, routers/testing.py, force-consent-bypass endpoints) needed a single test that proves no path bypasses production guards. Resolution:
Router-level production gates added to routers/dev.py (_dev_router_production_gate) and routers/testing.py (_testing_router_production_gate). Both 404 the entire surface in production unless DEV_MODE=true is explicitly set. 404 (not 403) so a leaked URL list cannot confirm route existence.
Unified test suite — tests/test_production_gating.py (25 tests):
- DEV_TEST_ROUTES table enumerates every /api/dev/*, /api/testing/*, and /api/webhooks/redcap/det/test endpoint. The router-level test parametrises over the table so every new dev/test endpoint must be added there to pass — making the table the single point of update.
- Service-level tests assert defence in depth: DevModeService.is_dev_mode_available()/bypass_consent_check() return False; set_anchor_date_override/trigger_scheduled_message_now raise DevModeError; services.scheduler._time_simulation_allowed() returns False; get_effective_date ignores any persisted offset.
- Token-encryption tests confirm encrypt_token() raises TokenEncryptionConfigError without REDCAP_ENCRYPTION_KEY and decrypt_token() rejects plaintext by default in production.
Autouse fixture _force_production_environment sets ENVIRONMENT=production, unsets DEV_MODE/REDCAP_ENCRYPTION_KEY/REDCAP_ENCRYPTION_ALLOW_PLAINTEXT_READS, and calls get_settings.cache_clear() so each test sees a clean production environment.
Test coverage: 25 tests, all green.

🟠 Significant (REDCap path)

4. Patient portal needs REDCap data ingestion expansion. ✅ Fixed 2026-05-07 (with #20). The portal previously delivered only Metricis-side schedules; for REDCap-managed studies it now also surfaces REDCap-driven event schedules, REDCap survey invitations, completion state, and randomization assignment, with the REDCap fail-safe policy extended to the portal (stale-data indicators, never invented state). Resolution:
REDCap event schedules — VisitScheduleItem (routers/portal.py) carries redcap_event_name, redcap_sync_status, last_redcap_sync_at, is_stale. is_stale derives from _is_visit_stale(): true when redcap_sync_status="failed", never reconciled, OR last successful sync >24h ago. Computed only for studies where is_redcap_managed(study) is true; Metricis-mode visits never carry these fields.
REDCap survey invitations — new task_type="survey" on PortalTask with external_url, redcap_event_name, redcap_instrument_name, redcap_repeat_instance, redcap_record_id, last_synced_at, is_stale, and a generic extra_data JSONB (migration e3f4a5b6c7d8). PortalTaskService.create_survey_task() is idempotent on (participant, instrument, event_name, repeat_instance) so re-issuing an invitation refreshes the row. start_task redirects to the stored external_url; if absent the endpoint returns 503 Service Unavailable rather than a broken navigation. Wired to consume URLs from redcap.py:925 generate_survey_link (REDCapService.generate_survey_link via the REST endpoint).
Randomization read-only display — new GET /api/portal/randomization endpoint. Source priority: RandomizationAllocation (preferred, includes masked_label for blinding) → Participant.arm fallback (typical for REDCap-managed studies where REDCap performs randomization). Returns randomized=False when no assignment — portal must not render a placeholder. The masked label is preferred when present so blinded studies never leak the underlying arm name to participants. No edit path.
Fail-safe extension — survey task without an external_url returns 503 (not redirect_url=null). Stale flag is sticky until next successful sync. The portal renders stale indicators rather than substituting Metricis-computed state for REDCap state.
Test coverage: 14 tests in tests/test_portal_redcap_ingestion.py covering survey task creation/idempotency/start, start_task 503 fail-safe, is_stale for failed/never-synced/recent/old sync, Metricis-mode visits never marked stale, randomization with RandomizationAllocation (blinded label preferred), Participant.arm fallback for REDCap-managed studies, unauthenticated rejection. All green.
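
The staleness rule for REDCap-managed visits is simple enough to sketch in a few lines. This is an illustration under assumptions, not the code in routers/portal.py: the function name, its parameters, and the explicit now argument are hypothetical, and "synced" is used in the test values only as an example of a non-failed status.

```python
from datetime import datetime, timedelta
from typing import Optional

STALE_AFTER = timedelta(hours=24)  # threshold stated in the plan


def is_visit_stale_sketch(
    redcap_managed: bool,
    sync_status: Optional[str],
    last_sync_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when REDCap-derived visit data should be shown as stale."""
    if not redcap_managed:
        # Metricis-mode visits are never marked stale.
        return False
    if sync_status == "failed":
        return True
    if last_sync_at is None:
        # Never reconciled against REDCap.
        return True
    # Last successful sync is older than the 24-hour window.
    return now - last_sync_at > STALE_AFTER
```

Passing now explicitly keeps the rule deterministic in tests; a caller would typically supply the current UTC time.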
5. No end-to-end test for REDCap-managed study lifecycle. ✅ Fixed 2026-05-07. redcap_sync.py and the redcap_*.py services had unit coverage, but no integration test exercised the full pipeline together. The first run of the E2E test surfaced two pre-existing critical defects in redcap_sync._build_submission_from_responses that had been latent since a model refactor; neither would have been caught without this test. Resolution:
tests/test_redcap_managed_lifecycle.py — 10 tests in 5 stages mirroring the production lifecycle:
1. Event sync — REDCap export_events + event-instrument mappings → VisitWindow rows via REDCapEventSyncService. Covers idempotency on re-sync.
2. Visit scheduling — UnifiedSchedulerService.schedule_visits_for_participant dispatches REDCap-managed studies to SchedulingMode.LEGACY and creates ScheduledVisit rows with REDCap event provenance.
3. Patient-portal delivery — /api/portal/schedule carries redcap_event_name/redcap_sync_status/is_stale; /api/portal/randomization shows Participant.arm fallback with source="redcap"; /api/portal/tasks surfaces survey tasks with external_url.
4. Completion sync (success path) — patched REDCapService._get_project returns a MagicMock whose import_records returns success; Session.sync_status flips to "synced".
5. Completion sync (failure path) — same mock raises RuntimeError on every retry; sync_status="failed", success=False, no Metricis fallback computed (§6 #14 invariant). Portal data-status then surfaces has_failed_sync=True; visit row shows is_stale=True.
REDCap I/O is mocked at two boundaries so the test runs offline in CI: app.services.redcap_event_sync.get_redcap_events / get_redcap_event_instruments (module-level fetchers) and REDCapService._get_project (lazy PyCap project handle).
Pre-existing defects discovered and fixed:
- redcap_sync._build_submission_from_responses imported AssessmentMetadata from app.models.cognitive_data, but that class had been renamed to SessionMetadata. Any caller of REDCapSyncService.sync_session() would have raised ImportError in production. Fixed.
- The same function constructed a default SimpleRTSummary without min_rt/max_rt, both of which became required in a later schema revision. Pydantic would have raised ValidationError and the sync path would have masked it as sync_status="failed". Fixed.
Test coverage: 10 tests, all green. Exercises every production codepath the upcoming sponsor study will travel.
6. Schedule versioning lacks transaction boundary. ✅ Fixed 2026-05-08. schedule_versioning_service.py issued multiple flush() calls without an atomic boundary. Worse, the legacy SchedulerService.schedule_visits_for_participant and EDC VisitService.schedule_visits_for_participant both ran await self.db.commit() mid-flow, so the just-inserted ScheduleVersion row was persisted before the subsequent linking, anchor back-reference, and audit log steps could complete. A failure in any later step left an orphan version pointing at zero visits or an audit log row with no version. Resolution:
scheduler.py and visit_service.py — schedule_visits_for_participant accepts auto_commit: bool = True. When False, the inner db.commit() is replaced with a db.flush() (and the post-commit refresh loop is skipped — IDs are already populated by the Python-side UUID default at flush time). Default behaviour is unchanged for the dozen-plus existing call sites in routers/schedules.py, workers/reminder_worker.py, etc.
unified_scheduler.py — schedule_visits_for_participant, _schedule_via_legacy, and _schedule_via_edc propagate the flag.
schedule_versioning_service.py — both public methods (create_schedule_version, regenerate_schedule) wrap their bodies in async with self.db.begin_nested(): (SAVEPOINT) and call the unified scheduler with auto_commit=False. A reported scheduling failure raises an internal _AtomicVersioningFailure so the SAVEPOINT rolls back without leaking the exception to the caller; the method then returns a structured VersioningResult(success=False, ...). Behaviour change: a failed scheduling result no longer persists a status="failed" ScheduleVersion row — orphan versions with zero visits were exactly the partial state §6 #6 set out to prevent. Code-wide scan found no consumer of ScheduleVersion.status == 'failed'.
The caller's outer commit boundary is preserved: FastAPI's get_db dependency still owns the top-level commit()/rollback(). Versioning runs inside the request transaction; the SAVEPOINT only protects intra-method atomicity.
Test coverage: tests/test_schedule_versioning_atomicity.py — 8 tests in 3 classes:
- TestCreateScheduleVersionAtomicity — success path (version + visits + anchor link), scheduler-failure rollback (assert no orphan version, no visits, anchor current_schedule_version_id stays None), unexpected-exception rollback (RuntimeError propagates AND no rows persist), auto_commit=False contract assertion.
- TestRegenerateScheduleAtomicity — success path (v1 superseded → v2 active, audit log written), scheduler-failure rollback (old v1 remains is_current=True/status="active", no new version, no audit log, anchor still points at v1), auto_commit=False contract assertion.
- TestSchedulerAutoCommitContract — SchedulerService.schedule_visits_for_participant(auto_commit=False) invokes flush ≥ 1 and commit 0 times.
All 8 tests green. Existing scheduler/participant/dev-mode test suites unaffected (49 tests verified).
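
The SAVEPOINT pattern above can be demonstrated outside the codebase. The sketch below swaps in the standard-library sqlite3 module and raw SQL in place of SQLAlchemy's begin_nested(); the table layout, the create_version function, and the AtomicVersioningFailure class are hypothetical stand-ins for illustration only.

```python
import sqlite3
from typing import Dict, List


class AtomicVersioningFailure(Exception):
    """Hypothetical stand-in for the internal rollback sentinel."""


def create_version(conn: sqlite3.Connection, visits: List[str], fail: bool = False) -> Dict:
    """Insert a version and its visits atomically inside a SAVEPOINT."""
    conn.execute("SAVEPOINT versioning")
    try:
        cur = conn.execute("INSERT INTO schedule_version (status) VALUES ('active')")
        version_id = cur.lastrowid
        if fail:
            # Simulates the scheduler reporting a failure after the insert.
            raise AtomicVersioningFailure("scheduler reported failure")
        for name in visits:
            conn.execute(
                "INSERT INTO visit (version_id, name) VALUES (?, ?)",
                (version_id, name),
            )
    except AtomicVersioningFailure:
        # Undo only this method's work; the sentinel is not re-raised.
        conn.execute("ROLLBACK TO SAVEPOINT versioning")
        conn.execute("RELEASE SAVEPOINT versioning")
        return {"success": False, "version_id": None}
    conn.execute("RELEASE SAVEPOINT versioning")
    return {"success": True, "version_id": version_id}


# isolation_level=None lets this script issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE schedule_version (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE visit (id INTEGER PRIMARY KEY, version_id INTEGER, name TEXT)")

conn.execute("BEGIN")  # outer transaction, owned by the caller
ok = create_version(conn, ["baseline", "week4"])
failed = create_version(conn, ["baseline"], fail=True)
conn.execute("COMMIT")  # only the successful version is persisted
```

The failed call leaves no orphan version row, while the outer transaction and the earlier successful call are unaffected. That is the same split of responsibilities as in the plan: the SAVEPOINT protects intra-method atomicity and the caller owns the commit.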

🟡 Minor

7. No key versioning for token encryption. ✅ Fixed 2026-05-07 (folded into #2): enc:v2: is the current write format and shares the KDF with enc:v1: so existing ciphertexts remain readable; rotation via REDCAP_ENCRYPTION_KEY_PREVIOUS + rotate_token() is now supported.
8. No reconciliation regression test for completed visits when anchor date shifts mid-study. ✅ Fixed 2026-05-10. ScheduleVersioningService.regenerate_schedule already implemented the right contract (_reconcile_completed_visits preserves completed/missed visits with anchor_reconciled=True + original_target_date snapshot; _delete_pending_visits soft-cancels still-pending visits; _link_visits_to_version only links the freshly-generated visits to v2; old completed visits remain on v1), but no regression test pinned the contract. Drift in any one of _reconcile_completed_visits / _delete_pending_visits / supersession / audit-metadata wiring would silently corrupt an in-flight study's audit trail without raising. Resolution:
tests/test_anchor_shift_reconciliation.py — 12 tests across 4 classes:
- TestCompletedVisitsPreservedAcrossAnchorShift (4) — completed visit keeps status="completed" + actual_visit_date + scheduled_date; gets anchor_reconciled=True + anchor_reconciled_at + original_target_date snapshot of the pre-shift target_date; stays linked to v1 (NOT moved to v2 — that would rewrite history); status="missed" visits get the same treatment (parity).
- TestPendingVisitsCancelledOnAnchorShift (3) — pending rows soft-deleted to status="cancelled" and remain queryable on v1 for audit; overdue visits also cancelled (and NOT stamped with the reconciled flag — overdue is a pending state, not a historical one); v2 visits' earliest scheduled_date ≥ new anchor date.
- TestAnchorShiftVersionAndAuditWiring (2) — v1 marked is_current=False / status="superseded" / superseded_by_id=v2.id; anchor's current_schedule_version_id moves to v2; schedule_regenerated audit row written with audit_metadata.reconciled_visits count + old_values.anchor_date + new_values.anchor_date.
- TestAnchorShiftIdempotencyAndEdgeCases (3) — second anchor shift does NOT overwrite original_target_date (the very first protocol-intended date is the source of truth, the audit trail of what the participant was originally scheduled for); zero-completed edge case yields result.visits_reconciled=0 and no visit carries the flag; result.visits_reconciled count matches DB-state count (guards against the count-vs-state divergence that would make audit metadata lie).
All 12 tests green; no regressions across test_schedule_versioning_atomicity.py (8), test_redcap_managed_lifecycle.py (10), test_redcap_det_sync_versioning.py (7), test_redcap_event_sync_atomicity.py (7), test_redcap_retry_and_circuit_breaker.py (11) — 55 tests total verified.
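
The reconciliation contract pinned by these tests can be summarised in a small in-memory sketch. VisitStub and apply_anchor_shift are hypothetical names, and the status values are taken from the test descriptions above. The real service works on database rows and also handles version linkage and audit logging, which this sketch omits.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

HISTORICAL = {"completed", "missed"}  # preserved and stamped as reconciled
PENDING = {"pending", "overdue"}      # soft-cancelled, never stamped


@dataclass
class VisitStub:
    status: str
    target_date: date
    anchor_reconciled: bool = False
    original_target_date: Optional[date] = None


def apply_anchor_shift(visits: List[VisitStub]) -> int:
    """Reconcile historical visits and cancel pending ones; return the reconciled count."""
    reconciled = 0
    for visit in visits:
        if visit.status in HISTORICAL:
            # Snapshot only once, so a second shift keeps the first date.
            if visit.original_target_date is None:
                visit.original_target_date = visit.target_date
            visit.anchor_reconciled = True
            reconciled += 1
        elif visit.status in PENDING:
            visit.status = "cancelled"
    return reconciled
```

The guard on original_target_date is what makes a repeated anchor shift idempotent with respect to the snapshot: the first protocol-intended date survives every later shift.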

🔴 Critical (REDCap path, discovered 2026-05-06 review #2)

14. sync_status="failed" is set but never read by downstream consumers. ✅ Fixed 2026-05-07. server/app/services/redcap_sync.py:341 marked failed sessions, but a repo-wide scan found zero readers in notification, scheduling, portal, or worker code. The "no fallback" invariant held only because no fallback had ever been written, which left it fragile against future contributions. Resolution:
- Structural invariant (tests/test_compliance_invariants.py::TestREDCapSyncNoFallbackInvariant) — three tests: (a) AST-style scan rejects any unsanctioned reader of Session.sync_status outside an explicit allowlist (writers + display-only passthroughs); (b) confirms redcap_sync.py still emits fallback_used: False and requires_investigation: True on failure; (c) refuses any module in the REDCap sync failure path that flips fallback_used: True.
- Patient portal banner — new GET /api/portal/data-status returns {has_failed_sync, failed_sync_count, last_successful_sync_at} scoped to the authenticated participant. New SyncFailureBanner component renders a calm, non-actionable amber banner on Home (wording: "Your responses were saved on this device. The study team has been notified … No action is needed from you."). Polls every 5 min so coordinator re-sync clears the banner automatically.
- Coordinator dashboard alert — DashboardStats.failed_sync exposes the count of failed-sync sessions; portal dashboard renders a red failed-sync-alert tile (only when count > 0) labeled "Failed Sync — needs investigation".
- Test coverage: 3 invariant tests + 7 portal endpoint tests (incl. cross-participant isolation, unauthenticated rejection, pending vs failed disambiguation) + 3 dashboard tests = 13 tests, all green.
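
A structural scan of the kind described in (a) can be built on the standard-library ast module. The sketch below is an assumption about the general approach, not the project's test: find_attribute_reads and unsanctioned_readers are hypothetical helpers, and sources are passed in as strings rather than read from the repository.

```python
import ast
from typing import Dict, List, Set


def find_attribute_reads(source: str, attr: str) -> List[int]:
    """Return sorted line numbers where `<expr>.attr` is read (not assigned)."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute)
        and node.attr == attr
        and isinstance(node.ctx, ast.Load)  # Store/Del contexts are writers
    )


def unsanctioned_readers(
    files: Dict[str, str], attr: str, allowlist: Set[str]
) -> Dict[str, List[int]]:
    """Map each non-allowlisted path to the lines that read the attribute."""
    violations: Dict[str, List[int]] = {}
    for path, source in files.items():
        if path in allowlist:
            continue
        lines = find_attribute_reads(source, attr)
        if lines:
            violations[path] = lines
    return violations
```

An invariant test would assert that unsanctioned_readers returns an empty mapping for sync_status, so any new reader has to be added to the allowlist deliberately.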
15. Study.integration_mode is checked only in unified_scheduler.py. ✅ Fixed 2026-05-07. server/app/db/models.py:131 defines the column; unified_scheduler.py was the only service that branched on it. Every other service (submit.py:212, redcap_sync.py:114,192, redcap_det_sync.py:402, webhooks.py:469, all 7 gates in routers/redcap.py) gated on redcap_enabled instead, leaving the architectural promise unenforced. Resolution:
- Helper (server/app/services/study_classification.py::is_redcap_managed) is the canonical gate. Precedence: integration_mode == "redcap" → True; integration_mode == "metricis" → False; else fall back to redcap_enabled with a structured WARNING log so legacy/unset rows are visible. Also exports is_metricis_managed as the negation.
- Call sites migrated: submit.py:212, redcap_sync.py:114,192, redcap_det_sync.py:402, all 7 gating sites in routers/redcap.py. routers/webhooks.py::_find_study_by_redcap_project now filters on Study.integration_mode == "redcap" directly. API passthroughs (routers/studies.py, response models in routers/redcap.py) keep surfacing redcap_enabled for backward compat — these are display-only and allowlisted.
- Structural invariant (tests/test_compliance_invariants.py::TestIntegrationModeRedcapEnabledConsistency): scans server/app/**/*.py for redcap_enabled references and refuses any reader outside an explicit allowlist (column def, helper, two API-surface routers). Adding a new entry requires a one-line justification recorded in the dict.
- Row-level invariant (same class): asserts every Study row satisfies integration_mode == "redcap" ⇔ redcap_enabled == True. Two existing test fixtures (test_redcap_rbac.py, test_webhook_idempotency.py) updated to set integration_mode="redcap" alongside redcap_enabled=True to satisfy the invariant.
- Helper unit test (same class): documents precedence by example for all 8 combinations of (integration_mode, redcap_enabled) plus the None study case.
- Test coverage: 3 new tests + 2 fixture migrations, all green.
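
The precedence rule for the canonical gate reads naturally as code. This is a minimal sketch: StudyStub is a hypothetical stand-in for the Study model, the logger name and log fields are illustrative, and returning False for a None study is an assumption (the source only says that case is covered by the helper test).

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("study_classification_sketch")


@dataclass
class StudyStub:
    integration_mode: Optional[str] = None
    redcap_enabled: bool = False


def is_redcap_managed(study: Optional[StudyStub]) -> bool:
    """Classify a study, preferring integration_mode over the legacy flag."""
    if study is None:
        return False  # assumption: no study means not REDCap-managed
    if study.integration_mode == "redcap":
        return True
    if study.integration_mode == "metricis":
        return False
    # Legacy or unset rows: fall back to redcap_enabled, but make it visible.
    logger.warning(
        "integration_mode unset or unrecognised; falling back to redcap_enabled",
        extra={
            "integration_mode_value": study.integration_mode,
            "redcap_enabled_value": study.redcap_enabled,
        },
    )
    return bool(study.redcap_enabled)


def is_metricis_managed(study: Optional[StudyStub]) -> bool:
    """Negation of is_redcap_managed."""
    return not is_redcap_managed(study)
```

Note that an explicit integration_mode always wins: a row with integration_mode="metricis" and redcap_enabled=True is classified as Metricis-managed, which is exactly the disagreement the row-level invariant is meant to prevent.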
16. REDCap router has zero RBAC and zero rate limiting. ✅ Fully resolved 2026-05-07. Before the fix, all 40+ endpoints in server/app/routers/redcap.py used only Depends(get_current_user). Any authenticated user could rotate API tokens (:290), push data dictionaries (:612, :1410), delete REDCap forms (:1515), or rewrite webhook secrets (:2443) for any study, without a study-membership check or admin role. No endpoint carried a @limiter.limit() decorator, including the unauthenticated DET webhook in routers/webhooks.py. Resolution:
- Reads (22 endpoints) gated to require_role("admin", "researcher", "coordinator")
- Mutations (12 endpoints) gated to require_role("admin", "researcher")
- High-risk endpoints (7: token rotation, init, dict push x2, form delete, participant import, webhook secret update) require explicit per-study admin/owner UserStudy membership via new _require_study_admin_membership helper. Stricter than the platform default verify_study_access — does NOT honour the global User.role == "admin" bypass, so a typoed study_id by a system admin cannot rotate the wrong project's REDCap token.
- All 7 high-risk endpoints emit AuditLog rows via new _audit_redcap_action helper; secret values (api_token, webhook_secret) never logged — only *_rotated: bool flags
- DET webhook rate-limited via per-router webhook_limiter (configurable, default 120/min)
- DET test endpoint rate-limited (10/min default) and returns 404 when ENVIRONMENT=production
- Test coverage: 25 tests in tests/test_redcap_rbac.py covering RBAC matrix (read/mutation/high-risk), per-study membership (admin/owner allowed; coordinator/data_entry/researcher/viewer blocked; admin-on-other-study blocked), audit log emission with secret-redaction assertions, DET production gate.
DET webhook lacks idempotency and replay protection. ✅ Fixed 2026-05-07. Before the fix, server/app/routers/webhooks.py:311 had no WebhookEvent table and no nonce/timestamp check. REDCap retries on 5xx re-fired _process_det_webhook; the visit creation path redcap_det_sync.py:266-322 deduplicated only on (participant_id, redcap_event_name) and would double-create on schedule edits. Resolution:
- New WebhookEvent model (server/app/db/models.py) with composite dedup index over (source, project_id, record_id, instrument, event_name, redcap_repeat_instance, payload_hash, received_at). Migration d2e3f4a5b6c7_add_webhook_events.py.
- DET endpoint computes a SHA-256 payload hash (canonicalized via sorted-key JSON) and looks for an existing event with matching dedup keys whose received_at is within a configurable TTL (redcap_det_idempotency_ttl_seconds, default 24h) AND status ∈ (pending, processing, processed). If found, the duplicate is persisted as status="duplicate" with duplicate_of_id pointing at the original — duplicates are auditable, not silently dropped.
- WebhookEvent is persisted BEFORE queuing the background task: it is the canonical record for retry/DLQ. The BackgroundTasks call is best-effort; a crash leaves the row in processing, and a future retry worker can resume by status.
- _process_det_webhook now takes webhook_event_id and writes terminal status (processed/failed) with processed_at, error_message, retry_count. Three independent transactions: mark processing → run sync → record terminal status — so partial failure leaves observable state.
- Outstanding: background processing still uses FastAPI BackgroundTasks. Moving to Celery for true at-least-once delivery and a coordinator-facing retry endpoint are separate enhancements.
- Test coverage: 7 tests in tests/test_webhook_idempotency.py (first-receipt persistence, replay→duplicate, different record/instrument processed separately, different payload hash processed separately, payload-hash determinism under key reorder, payload-hash sensitivity to value changes).
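
The payload hashing and the TTL-bounded replay check described above can be illustrated with a small standard-library sketch. The function names, the compact JSON separators, and the inclusive TTL boundary are assumptions made for illustration; only the sorted-key JSON canonicalisation, SHA-256, the three matching statuses, and the 24h default come from the text.

```python
import hashlib
import json
from datetime import datetime, timedelta
from typing import Any, Mapping

# Statuses that count as "already seen" for dedup purposes (from the text above).
ACTIVE_STATUSES = {"pending", "processing", "processed"}


def compute_payload_hash(payload: Mapping[str, Any]) -> str:
    """SHA-256 hex digest of the payload serialised as sorted-key JSON."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_replay(
    existing_received_at: datetime,
    existing_status: str,
    now: datetime,
    ttl_seconds: int = 24 * 3600,
) -> bool:
    """True when a matching earlier event is still inside the idempotency window."""
    within_ttl = (now - existing_received_at) <= timedelta(seconds=ttl_seconds)
    return within_ttl and existing_status in ACTIVE_STATUSES
```

Because keys are sorted before hashing, two deliveries of the same DET payload hash identically regardless of field order, while any changed value produces a different digest.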

🟠 Significant (REDCap path — discovered 2026-05-06 review #2)

redcap_det_sync._create_visit_schedule bypasses UnifiedScheduler and ConsentGate. ✅ Fixed 2026-05-08 The dead-code path was even broken: _create_visit_schedule instantiated ScheduledVisit(window_open=..., window_close=...) against columns that don't exist on the model — any call would have raised TypeError. The "fallback" _create_visits_from_windows used the right field names but skipped the ConsentGate, business-day rules, and schedule versioning entirely. The default _get_default_field_mapping also mapped first_name/last_name into the Participant(**participant_data) constructor, but those columns don't exist on Participant either — every DET sync would have TypeError'd on participant construction before ever reaching the visit code. Resolution:
- Removed both _create_visit_schedule and _create_visits_from_windows. Field-name drift can no longer recur — there is no second writer of ScheduledVisit.
- Added _seed_anchor_date_and_schedule(participant, redcap_config) which: (a) idempotently creates a ParticipantAnchorDate row (status=finalized, source_type="redcap_det", unique on participant_id); (b) delegates to ScheduleVersioningService.create_schedule_version() which routes through UnifiedSchedulerService → SchedulerService → SAVEPOINT-atomic versioning (§6 #6). Visits inherit consent gate enforcement, business-day rules, REDCap event-name provenance from the synced VisitWindow rows, and audit trail.
- Legacy event_battery_mapping config is detected and logged as a structured deprecation warning (extra={"deprecation": "event_battery_mapping"}) but no longer bypasses scheduling — visits come from VisitWindow rows seeded by REDCapEventSyncService (§6 #5).
- Default field mapping fixed: removed first_name/last_name (not Metricis Participant columns; names live in extra_data). The mapping is now strictly columns Metricis recognises.
- Webhook return contract enriched: sync_participant_from_det now surfaces a scheduling: {success, visits_created, warnings, error} block on the create path so the DET background processor and any future retry surface have observable outcomes. Update path is unchanged (no scheduling on update).
- Test coverage: tests/test_redcap_det_sync_versioning.py — 7 tests in 6 classes:
- TestDETSyncRoutesThroughVersioning — happy path: ParticipantAnchorDate (source_type="redcap_det", status="finalized"), ScheduleVersion (is_current=True, status="active"), ScheduledVisit rows with canonical schema field names (scheduled_date, window_start, window_end), all linked to the version.
- TestDETSyncRespectsConsentGate — consent_mode="digital": scheduling returns success=False with consent error; participant still created (so coordinators see them); no orphan ScheduleVersion.
- TestDETSyncIdempotency — second DET fire updates participant fields but creates no second anchor / version / visits.
- TestDETSyncLegacyConfigDeprecated — event_battery_mapping injected: deprecation warning logged AND visits still come from VisitWindow rows (count matches the no-shortcut path).
- TestDETSyncStructuralCleanup — asserts _create_visit_schedule and _create_visits_from_windows are gone; asserts _seed_anchor_date_and_schedule and _ensure_anchor_date exist.
- TestDETSyncCallsUnifiedScheduler — spies UnifiedSchedulerService.schedule_visits_for_participant to verify the canonical pipeline is invoked with the correct participant.
- All 7 tests green; no regressions across §6 #5/#6/#17/#18 + unified scheduler suites (45 tests verified).
Webhook secret stored plaintext in study.config. ✅ Fixed 2026-05-08 server/app/routers/redcap.py:2727 previously wrote webhook_secret straight into the JSONB config without going through encrypt_token() — asymmetric with api_token (:405). Any operator with read access to the studies table or a database backup could recover the HMAC secret used to authenticate REDCap DET callbacks; rotating the secret left the previous value in plaintext until overwritten. Resolution:
- Write site (routers/redcap.py:update_webhook_config) — update.webhook_secret, when set, is wrapped through encrypt_token() before being stored. An empty payload string (treated as "clear the secret") falls through unchanged so the stored value never becomes enc:v2:<empty-ciphertext>. encrypt_token() is idempotent on enc:v1:/enc:v2: inputs, so a re-PATCH carrying an already-encrypted value (theoretical caller path) does not double-wrap.
- Read site (routers/webhooks.py::_process_det_webhook) — the stored webhook_secret is now decrypted via decrypt_token() before being passed to _validate_webhook_signature. A TokenEncryptionError (e.g. key rotated without REDCAP_ENCRYPTION_KEY_PREVIOUS) is logged and the handler falls through to the "no secret configured" branch — which rejects in production via the existing gate. Failing closed beats a 500 on a public endpoint.
- Audit-log redaction unchanged — the existing had_webhook_secret / has_webhook_secret / webhook_secret_rotated keys remain bool-only; secret values (plaintext or ciphertext) never enter old_values/new_values. New regression test asserts neither plaintext nor enc:v2: strings appear in either blob.
- Migration script (server/migrate_redcap_tokens.py) — refactored around an explicit _ROTATABLE_FIELDS = ("api_token", "webhook_secret") allowlist. The same needs_rotation / rotate_token pipeline applies to both fields, dry-run by default; --apply writes. The allowlist is asserted by a unit test so a future contributor cannot silently drop webhook_secret and regress the invariant.
- Test coverage: 7 tests in tests/test_webhook_secret_encryption.py covering (a) PATCH writes enc:v2:, plaintext absent from stored value, decrypt round-trips; (b) explicit-empty PATCH does not encrypt the empty string; (c) DET request with HMAC computed over the plaintext secret is accepted (proves decrypt-before-compare); (d) wrong-secret signature is rejected; (e) audit log contains neither plaintext nor enc:v2: ciphertext; (f) rotate_token upgrades a plaintext webhook_secret in place; (g) _ROTATABLE_FIELDS includes both fields. All green; no regressions across test_redcap_rbac.py, test_webhook_idempotency.py, test_token_encryption.py (47 tests verified).
Patient portal needs survey task type and stale-data indicator. ✅ Fixed 2026-05-07 (folded into #4) Resolved with the same patch set as §6 #4. Specifically: task_type="survey" is now a first-class portal task (model + migration e3f4a5b6c7d8), start_task returns the stored external_url (or 503 if missing — fail-safe), VisitScheduleItem exposes redcap_event_name/redcap_sync_status/last_redcap_sync_at/is_stale, and GET /api/portal/randomization surfaces the assignment as read-only with masked-label preference for blinded studies. See §6 #4 above for the full resolution.
redcap_event_sync.sync_events_to_visit_windows commits on partial failure. ✅ Fixed 2026-05-08 server/app/services/redcap_event_sync.py:222 previously ran db.commit() unconditionally at the end of the per-event loop, even when result.errors[] was populated. A failure mid-batch — for example, REDCap returning a malformed event mapping or a manual battery override pointing at a deleted battery — would leave a half-synced set of VisitWindow rows persisted, with no rollback option for the operator. Resolution:
- Outer SAVEPOINT — the per-event loop now runs inside async with self.db.begin_nested():. If any event raises in the inner try/except, result.errors is populated and the method raises an internal _AtomicSyncFailure so the SAVEPOINT discards every row added/updated by the events that did succeed. Mirrors the §6 #6 versioning service pattern.
- No service-level commit — await self.db.commit() removed from the body. The FastAPI get_db dependency owns the outer transaction commit, so a partial failure cannot leak past the request boundary; tests and routes that previously relied on the in-service commit instead see the rows via the session's dirty set until the request completes.
- Counter reset on rollback — result.created and result.updated are zeroed when the SAVEPOINT rolls back, since the actual rows were discarded. result.errors and result.details are preserved so the caller can see what was attempted and what failed.
- All-or-nothing semantics chosen over per-event SAVEPOINTs — operator mental model is "sync this batch of REDCap events"; partial success would leave window A consistent against a stale view of event B and force the operator to reason about which subset committed. Re-running the sync after fixing the underlying issue is the cleaner workflow. Per-event SAVEPOINTs remain available as a future enhancement if a study with hundreds of events ever needs partial progress.
- Test coverage: 7 tests in tests/test_redcap_event_sync_atomicity.py:
- TestSyncEventsAtomicity — success path persists all 3 windows; mid-batch failure (forced via monkeypatched _match_instruments_to_battery raising on event 2) returns success=False with the failed event's name in result.errors, result.created == 0, and zero VisitWindow rows in the DB; mid-batch failure during update_existing leaves a pre-seeded window's sentinel target_day=999 and name unchanged (asserts SAVEPOINT undoes mutations to existing rows, not just inserts); contract test counts db.commit() invocations and asserts zero.
- TestSyncEventsExitsCleanlyOnPreFlightFailure — REDCap fetch failure (early return path) does not open a SAVEPOINT and leaves the session usable for subsequent queries.
- TestAtomicSyncFailureSentinelExists — structural guard: _AtomicSyncFailure is importable as an Exception subclass; inspect.getsource(sync_events_to_visit_windows) contains both begin_nested and _AtomicSyncFailure, so an accidental revert of the SAVEPOINT pattern fails the test rather than silently regressing the §6 #21 contract.
- All 7 tests green; no regressions across test_redcap_managed_lifecycle.py (10 tests), test_redcap_det_sync_versioning.py (7 tests), test_schedule_versioning_atomicity.py (8 tests), test_webhook_secret_encryption.py (7 tests) — 39 tests total verified.
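
The all-or-nothing SAVEPOINT pattern can be demonstrated with the standard-library `sqlite3` module. This is a stand-in for illustration only: the service described above uses SQLAlchemy's async `begin_nested()`, whereas this sketch issues `SAVEPOINT` statements by hand. The `visit_windows` table, the `SyncResult` shape, and the "missing battery" failure are hypothetical, and collecting all errors before raising (rather than stopping at the first) is an assumption.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class _AtomicSyncFailure(Exception):
    """Internal sentinel raised to force the SAVEPOINT to roll back."""


@dataclass
class SyncResult:
    success: bool = True
    created: int = 0
    errors: List[str] = field(default_factory=list)


def sync_events(conn: sqlite3.Connection, events: List[Dict[str, Optional[str]]]) -> SyncResult:
    """Insert one row per event; discard every row if any event fails."""
    result = SyncResult()
    conn.execute("SAVEPOINT event_sync")
    try:
        for event in events:
            try:
                if not event.get("battery"):
                    raise ValueError("no battery mapped")
                conn.execute(
                    "INSERT INTO visit_windows (name, battery) VALUES (?, ?)",
                    (event["name"], event["battery"]),
                )
                result.created += 1
            except Exception as exc:
                result.errors.append(f"{event['name']}: {exc}")
        if result.errors:
            raise _AtomicSyncFailure()
        conn.execute("RELEASE SAVEPOINT event_sync")
    except _AtomicSyncFailure:
        # Undo rows written by the events that did succeed, then drop the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT event_sync")
        conn.execute("RELEASE SAVEPOINT event_sync")
        result.success = False
        result.created = 0  # counters reset; errors are preserved for the caller
    return result
```

Note that `sync_events` never commits. In this sketch the caller opens the outer transaction with `BEGIN` on a connection created with `isolation_level=None` and commits afterwards, which plays the role the text assigns to the FastAPI `get_db` dependency.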

🟡 Minor (REDCap path — discovered 2026-05-06 review #2)

Double retry layering in REDCap sync. ✅ Fixed 2026-05-10 redcap_sync._sync_session ran a 3× outer retry loop wrapping REDCapService.import_cognitive_data, which itself retried via _retry_with_backoff (3×) — worst case 9 PyCap calls per user-visible sync attempt. The outer loop also masked validation errors behind transient retries. Resolution:
- Outer loop removed — _sync_session now calls import_cognitive_data exactly once per sync. The inner _retry_with_backoff (server/app/services/redcap.py:54, MAX_RETRIES=3) is the sole retry layer; transient errors are absorbed there before the result reaches _sync_session.
- Validation vs reachability split — when REDCap returns a validation error ("validation" / "invalid" substring on the error message), the session is still marked failed per the §6 #14 fail-safe, but the circuit breaker (#23) treats the response as a reachability success — REDCap is up and rejecting bad data, so it would be wrong to disable the project while operators fix the payload.
- Structural guard — tests/test_redcap_retry_and_circuit_breaker.py::TestSingleRetryLayer::test_outer_retry_loop_removed_in_source greps _sync_session source for for attempt in range(. A future contributor restacking retries fails this test before a regression ships.
- Test coverage: 2 tests in the new suite (call-count assertion + structural guard).
No circuit breaker on REDCap API. ✅ Fixed 2026-05-10 A study with a misconfigured URL or persistently unreachable project would retry on every submission indefinitely, amplifying load and burning the retry budget without operator visibility. Resolution:
- New service app/services/redcap_circuit_breaker.py — per-study breaker keyed by study_id, in-process state, async-safe via per-study locks. State machine: closed → open after failure_threshold consecutive reachability failures (default 5) → half_open after cooldown_seconds (default 60) → closed on a successful trial or back to open on a failed trial. Singleton circuit_breaker instance is what redcap_sync imports; thresholds are env-tunable via REDCAP_CIRCUIT_BREAKER_FAILURE_THRESHOLD / REDCAP_CIRCUIT_BREAKER_COOLDOWN_SECONDS / REDCAP_CIRCUIT_BREAKER_ENABLED.
- Wired into _sync_session — await circuit_breaker.allow(study_id) runs before the REDCap call. When closed, the call proceeds and the breaker is updated with record_success / record_failure based on the result. When open, the session is marked sync_status="failed" with error_type="circuit_open" exactly as a real REDCap failure would be — preserving the §6 #14 no-fallback invariant. The breaker MUST NOT compute or substitute Metricis-side data; the module never reads session contents.
- Validation errors are not reachability failures — see #22 split. The breaker counter is unaffected by payload-rejection responses.
- Per-study isolation — opening the breaker for Study A leaves Study B unaffected (independent _BreakerState entries).
- Test coverage: 11 tests in tests/test_redcap_retry_and_circuit_breaker.py covering the full state machine (open after threshold, short-circuit without PyCap call, half-open success closes, half-open failure re-opens, per-study isolation, validation errors don't trip), plus standalone breaker unit tests (disabled-mode, threshold validation, reset). Manual-clock injection avoids real sleep in tests.
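
The closed → open → half_open state machine can be sketched in a few lines. This is a simplified, synchronous illustration and not the contents of redcap_circuit_breaker.py: the real service is described as async-safe with per-study locks, which are omitted here, and the method bodies and the `clock` parameter are assumptions. Only the state names, the `_BreakerState` name, the `allow` / `record_success` / `record_failure` calls, and the defaults (5 failures, 60 seconds) come from the text.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class _BreakerState:
    state: str = "closed"  # "closed" | "open" | "half_open"
    consecutive_failures: int = 0
    opened_at: float = 0.0


class CircuitBreaker:
    """Per-study breaker; each study_id gets an independent _BreakerState."""

    def __init__(
        self,
        failure_threshold: int = 5,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be >= 1")
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so tests never sleep
        self._states: Dict[int, _BreakerState] = {}

    def _get(self, study_id: int) -> _BreakerState:
        return self._states.setdefault(study_id, _BreakerState())

    def allow(self, study_id: int) -> bool:
        """Return False while open; move to half_open once the cooldown has elapsed."""
        s = self._get(study_id)
        if s.state == "open" and self._clock() - s.opened_at >= self.cooldown_seconds:
            s.state = "half_open"
        return s.state != "open"

    def record_success(self, study_id: int) -> None:
        s = self._get(study_id)
        s.state = "closed"
        s.consecutive_failures = 0

    def record_failure(self, study_id: int) -> None:
        """Count a reachability failure; a failed half-open trial re-opens at once."""
        s = self._get(study_id)
        s.consecutive_failures += 1
        if s.state == "half_open" or s.consecutive_failures >= self.failure_threshold:
            s.state = "open"
            s.opened_at = self._clock()
```

In this sketch a caller would check `allow(study_id)` before contacting REDCap and report the outcome afterwards; validation rejections would be reported through `record_success`, since they show that REDCap is reachable.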
_find_study_by_redcap_project is O(N studies) scan. ✅ Fixed 2026-05-10 routers/webhooks.py::_find_study_by_redcap_project previously read every REDCap-mode study, decoded each row's JSONB config in Python, and scanned for a matching config['redcap']['project_id']. Run on every DET webhook fire, the cost grew linearly with the number of REDCap-managed studies in the deployment. Resolution:
- Partial functional B-tree index added via Alembic migration f0a1b2c3d4e5: CREATE INDEX ix_studies_redcap_project_id ON app.studies ((config -> 'redcap' ->> 'project_id')) WHERE integration_mode = 'redcap'. Partial because the only consumer also filters on integration_mode='redcap', so indexing metricis-mode rows would be wasted space.
- Query rewritten as a single SQL equality lookup: WHERE integration_mode = 'redcap' AND config['redcap']['project_id'].astext = :project_id. Returns the single matching row directly — no Python-side iteration, no JSONB-decode-per-row.
- Symmetry with §6 #15 preserved: the partial index's WHERE integration_mode = 'redcap' predicate matches the is_redcap_managed(study) gate exactly. A metricis-managed study carrying a stray config.redcap.project_id is excluded from both the index and the query.
- Test coverage: 7 tests in tests/test_redcap_project_id_lookup.py covering correctness (matching project_id resolved, unknown returns None, metricis-mode same-id rejected, redcap+metricis coexistence picks redcap, missing project_id key returns None, int → str coercion) and a structural DDL invariant that pg_indexes still contains the partial functional index with the JSONB expression and WHERE integration_mode = 'redcap' predicate. The DDL check guards against a future migration silently dropping the index and regressing to a sequential scan only visible under load.
Public test endpoint /webhooks/redcap/det/test reflects payloads back unauthenticated. ✅ Fixed 2026-05-07 (folded into #16): the endpoint now returns 404 in production (routers/webhooks.py:430 checks settings.environment == "production" before any reflection) and is rate-limited at 10/min via webhook_limiter. Asserted by tests/test_production_gating.py and tests/test_redcap_rbac.py::TestDETTestEndpointProductionGate.

Deferred to M9 — Metricis-managed mode hardening

These are real bugs but the first production study runs in REDCap-managed mode where REDCap handles randomization and scheduling. They block Metricis-managed mode from being production-ready, not the upcoming REDCap-managed launch.

🟠 Significant (Metricis-managed only)

Randomization stratum form lookup misses form filter. randomization_service.py:198-210 queries by participant_id + workflow_status only — form_oid is read but not filtered in the SQL where. Wrong form → wrong stratum.
Randomization uses non-cryptographic RNG. randomization_service.py:299 uses random.choice. For interventional trials, allocation should use secrets.SystemRandom to prevent prediction from a known seed.
Minimization silently degrades to simple randomization. randomization_service.py:425 falls back with only a warning. For studies that selected minimization, this is a quiet protocol deviation.
Ad-hoc RBAC in randomization router. routers/randomization.py:363,435,529,582,678 — role checks are inline instead of using a require_role() dependency. Risk of inconsistent enforcement around blinded data.
Business Day Service: no timezone awareness, no holiday import path. business_day_service.py operates on naive date objects. _holiday_cache field is declared but never populated → N+1 queries in bulk_calculate. Studies start with empty Holiday tables unless manually seeded — easy to silently miss statutory holidays.
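
For the non-cryptographic RNG item above, the suggested direction can be sketched with the standard library. This is a hedged illustration of simple (unstratified, unblocked) allocation only; `allocate_arm` is a hypothetical name and does not reflect the structure of randomization_service.py.

```python
import secrets
from typing import Sequence

# SystemRandom draws from the OS entropy source; it has no seed that could be replayed.
_rng = secrets.SystemRandom()


def allocate_arm(arms: Sequence[str]) -> str:
    """Pick one arm uniformly at random using the OS CSPRNG."""
    if not arms:
        raise ValueError("at least one arm is required")
    return _rng.choice(arms)
```

`secrets.SystemRandom` exposes the same `choice` interface as the `random` module, so a swap of this kind would be local to the call site.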

Test coverage gaps

| Area | Status | Priority |
|---|---|---|
| `token_encryption.py` | ✅ 15 tests (2026-05-07) | M7 critical |
| `routers/testing.py` (time simulation) | ✅ Asserted via structural invariant + service-layer tests (2026-05-07) | M7 critical |
| Production gating (unified) | ✅ `tests/test_production_gating.py` — 25 tests (2026-05-07) | M7 critical |
| `sync_status` no-fallback invariant | ✅ Asserted (2026-05-07) | M7 critical |
| `integration_mode` ⇄ `redcap_enabled` consistency | ✅ Asserted (2026-05-07) | M7 critical |
| DET webhook idempotency / replay | ✅ 7 tests (2026-05-07) | M7 critical |
| REDCap router RBAC | ✅ 25 tests (2026-05-07) | M7 critical |
| REDCap-managed study integration (E2E) | ✅ 10 tests across 5 stages — `tests/test_redcap_managed_lifecycle.py` (2026-05-07) | M7 significant |
| `schedule_versioning_service.py` | ✅ 8 atomicity tests — `tests/test_schedule_versioning_atomicity.py` (2026-05-08) | M7 |
| REDCap retry + circuit breaker | ✅ 11 tests — `tests/test_redcap_retry_and_circuit_breaker.py` (2026-05-10) | M7 minor |
| Anchor-shift reconciliation regression | ✅ 12 tests — `tests/test_anchor_shift_reconciliation.py` (2026-05-10) | M7 minor |
| REDCap project_id lookup + index DDL | ✅ 7 tests — `tests/test_redcap_project_id_lookup.py` (2026-05-10) | M7 minor |
| `business_day_service.py` | Zero tests | M9 |

7. Milestones

Phased delivery record. Each milestone groups related deliverables with a date span and links to the resulting subsystems. Cross-reference §3 for current state of each subsystem.

M0 — Platform foundation (through 2026-01-04)

Status: ✅ Shipped · Span: ~ → 2026-01-04
Deliverables: initial repo scaffolding, jsPsych client, FastAPI server, PostgreSQL schema, base Researcher Portal, REDCap site config, magic-link patient auth.
Closing milestone: scheduler service for visit schedules and reminders (1bccbe7, 2026-01-04).
Subsystems landed: #1, #2, #3, #5, #8, #18.

M1 — Major foundational push (2026-01-16)

Status: ✅ Shipped · Span: 2026-01-16 (single commit consolidation, dbc144c)
Deliverables: WebSocket real-time, PDF reports (ReportLab), participant CSV/Excel import, study templates, access control, mobile services, comprehensive E2E (Playwright) and server (pytest) test suites, GitHub Actions CI/CD, baseline documentation.
Subsystems landed: #5 (audit groundwork), #25 (test infra), #28 (CI/CD), Capacitor base for #27.

M2 — EDC core feature build (2026-01-17 → 2026-01-22)

Status: ✅ Shipped · Span: 6 days, ~20 commits
Deliverables:
- Cognitive assessment module registry expansion
- Capacitor native config + unified error handling + session security
- Notification templates, REDCap event sync, participant link service
- REDCap survey link generation and participant management endpoints
- VisitService for EDC visit management; SDTM validation + export; Assessment ODM Exporter
- Portal task service (unified task queue)
- Patient/Caregiver Portal completes (magic link, mobile-first, i18n)
- Form workflow state machine + form templates management
- Registry Follow-up Cohort service

Closing milestone: AI Agent Phase 1 (LLM service + foundation) — 51b0c4f, 2026-01-21.
Subsystems landed: #3, #6, #7, #9, #11, #13, #16, #19, #27.

M3 — AI Agent Phase 2 (2026-01-22)

Status: ✅ Shipped · Span: single-day milestone (30653f7, 2026-01-22)
Deliverables: PDF document processing service, protocol ingestion task, system prompt templates, ChangeSet creation from extracted study structure (events + forms), StudyAssistant portal page (711 LOC) with upload + run + ChangeSet review workflow.
Subsystem landed: #12.

M4 — EDC operations hardening (2026-01-23 → 2026-01-28)

Status: ✅ Shipped (with known issues) · Span: 6 days
Deliverables:
- Source Data Verification (SDV) and Form Validation services
- Quality flags & validation rule infrastructure
- Comprehensive documentation refresh; CI improvements (path filters, ESLint v9)
- Auto-logout on token expiry; portal_base_url config
- Business Day Service for scheduling (weekend/holiday) — ⚠️ §6 #8
- Schedule Versioning + Unified Scheduling services — versioned schedule snapshots tied to anchor date
- Comprehensive tests for migrations, rate limits, security headers, unified scheduler
- Randomization module — full stack (DB, services, API, portal UI, alembic) — ⚠️ §6 #3, #4, #5, #6

Subsystems landed: #14, #15, #17, #20.

M5 — Infrastructure & polish (2026-02-06 → 2026-02-12)

Status: ✅ Shipped (with known issues) · Span: 1 week
Deliverables:
- Multi-stage Docker builds for Nginx + FastAPI server (da0792b)
- TimeSimulationBanner for test-mode simulated date — ⚠️ §6 #1 (not gated to dev mode)
- Digit Symbol Matching Task expanded to 9 symbols
- Development server port refresh across configs

Subsystems landed: #23 (with caveats).

M6 — Security & docs hygiene (2026-04 → 2026-05)

Status: ✅ Partially shipped · Span: ongoing
Deliverables:
- Token encryption service for REDCap tokens (Fernet); tests for ConsentSign, Login, Schedule (4e3a55d, 2026-04-21) — ⚠️ §6 #2 (token encryption itself has no tests; falls back to plaintext)
- CLAUDE.md condensed 1530 → 326 lines (17056a8, 2026-05-06)
- This consolidated project plan hub (2026-05-06)
- REDCap router RBAC + audit logging + rate limiting + DET test endpoint production gate (2026-05-07) — fully resolves §6 #16. Two-stage rollout same day:
  1. Initial commit (b3d022b): 22 reads gated to admin/researcher/coordinator, 12 mutations to admin/researcher, 7 high-risk endpoints to admin global-role; 7 audit-logged high-risk operations with secret redaction; slowapi rate limiting on the unauthenticated DET surface; 404 gate on /webhooks/redcap/det/test in production; 16 tests.
  2. Hardening commit: replaced global-admin gate with explicit per-study admin/owner UserStudy membership check (_require_study_admin_membership). Even system admins must be explicit study members to rotate REDCap tokens, push data dictionaries, delete forms, sync participants, or update webhook secrets. Test coverage expanded to 25 tests including per-study isolation (admin on study A can't mutate study B).
- DET webhook idempotency + persistence + DLQ scaffolding (2026-05-07) — resolves §6 #17. Adds WebhookEvent model and migration d2e3f4a5b6c7; payload hashing with deterministic canonicalization; dedup window via configurable TTL; duplicates persisted as audit rows pointing back to the original; receipt persisted before background processing so a crash leaves observable state for retry. 7 tests covering persistence, idempotency, and hash properties.

Subsystems landed: #22 (with caveats), #19 (RBAC and DET idempotency complete; see §6 #16 and #17), documentation hub.

Status: ✅ §6 closed (every 🔴/🟠/🟡 resolved) · cutover checklist landed (sponsor-study-cutover.md) · M7 ships when the first sponsor study walks the checklist and records sign-off in the audit log · Span: Q2 2026 · Priority: OPERATIONAL GATE — active development moves to M8
Recent progress: §6 #16 (REDCap router RBAC + audit + rate limit + DET test gate), §6 #17 (DET webhook idempotency + WebhookEvent persistence), §6 #14 (sync_status no-fallback invariant + portal banner + dashboard alert), §6 #15 (integration_mode unification via is_redcap_managed(study) helper + dual structural/row-level invariant), §6 #1 (time simulation gating + audit), §6 #2 + #7 (token encryption hardening with enc:v2: + key rotation), §6 #3 (unified production gating test suite), §6 #4 + #20 (patient-portal REDCap data ingestion: task_type="survey", stale-data flag on visit schedule, randomization read-only display), §6 #5 (REDCap-managed study lifecycle E2E test, 10 tests across 5 stages — fixed two pre-existing defects in redcap_sync._build_submission_from_responses) all shipped 2026-05-07.
§6 #6 (schedule versioning atomic via begin_nested SAVEPOINT + auto_commit=False plumbed through unified/legacy/EDC schedulers; 8 tests), §6 #18 (DET sync routed through ScheduleVersioningService + UnifiedScheduler + ConsentGate; removed broken _create_visit_schedule + bypassing _create_visits_from_windows; fixed default field mapping that referenced non-existent Participant columns; 7 tests), §6 #19 (webhook_secret encryption parity — encrypt_token() symmetric with api_token, DET handler decrypts before HMAC validation, migration script extended to both fields via _ROTATABLE_FIELDS; 7 tests), and §6 #21 (REDCap event sync transactional boundary — per-event loop wrapped in db.begin_nested() SAVEPOINT, partial failure raises _AtomicSyncFailure to roll back the entire batch, service-level commit removed in favour of the FastAPI request lifecycle; 7 tests including a structural guard) shipped 2026-05-08. §6 #22 (collapsed REDCap retry layers — single retry layer at the REDCapService boundary, outer 3× loop removed; structural guard prevents re-stacking) and §6 #23 (per-study REDCap circuit breaker with closed/open/half_open state machine; opens after threshold reachability failures, short-circuits without touching PyCap, recovers via half-open trial; validation errors don't count as reachability; per-study isolation) shipped together 2026-05-10 with 11 tests in tests/test_redcap_retry_and_circuit_breaker.py. §6 #25 also marked closed (folded into #16). §6 #8 (anchor-shift reconciliation regression suite — 12 tests in tests/test_anchor_shift_reconciliation.py pinning completed-visit preservation, pending-visit soft-cancellation, version supersession, audit-metadata wiring, and original_target_date idempotency across multiple anchor shifts) shipped 2026-05-10.
§6 #24 (partial functional index ix_studies_redcap_project_id via Alembic f0a1b2c3d4e5 + SQL-level project_id lookup in _find_study_by_redcap_project; 7 tests including a pg_indexes DDL invariant) shipped 2026-05-10 — all M7 §6 items now closed.
Goal: ship a production-grade Metricis deployment for the first sponsor study, where REDCap is the source of truth for randomization, data collection forms, events, and visit windows and Metricis delivers the patient/caregiver experience on top.

Critical deliverables (block production):
- ✅ Gating audit + test suite (§6 #3, fixed 2026-05-07) — tests/test_production_gating.py (25 tests) proves every /api/dev/*, /api/testing/*, and /api/webhooks/redcap/det/test endpoint returns 404/403 in production; service-level + token-encryption defence-in-depth assertions included
- ✅ Time simulation gated (§6 #1, fixed 2026-05-07) — service gate (_time_simulation_allowed) + router-level production 404 + _require_dev_mode_admin on every mutation + AuditLog row + 3 invariant tests including dependency-tree walk over routers/testing.py
- ✅ Token encryption hardened (§6 #2 + #7, fixed 2026-05-07) — TokenEncryptionConfigError on missing key in production; plaintext reads rejected by default (one-time bridge via REDCAP_ENCRYPTION_ALLOW_PLAINTEXT_READS with CRITICAL log); enc:v2: write format with enc:v1: back-compat; key rotation via REDCAP_ENCRYPTION_KEY_PREVIOUS + MultiFernet + rotate_token(); one-shot migration script migrate_redcap_tokens.py; 15 unit tests
- ✅ sync_status consumption invariant (§6 #14, fixed 2026-05-07) — invariant test asserts no service reads Session.sync_status for fallback decisions; patient portal banner + coordinator dashboard alert
- ✅ integration_mode unification (§6 #15, fixed 2026-05-07) — is_redcap_managed(study) is the canonical gate; every gating call site migrated; structural reader-allowlist invariant + row-level consistency invariant + helper unit tests
- ✅ REDCap router RBAC + audit + rate limiting (§6 #16, fixed 2026-05-07) — require_role per role-class; per-study admin/owner check on 7 high-risk endpoints; AuditLog rows with secret redaction; DET webhook rate-limited; /webhooks/redcap/det/test 404 in production; 25 tests
- ✅ DET webhook idempotency + replay protection (§6 #17, fixed 2026-05-07) — WebhookEvent model + composite dedup index + payload-hash canonicalisation + duplicate-row persistence + 7 tests; Celery transport remains a future enhancement

Significant deliverables (REDCap path):

- ✅ Patient portal REDCap data ingestion (§6 #4, #20, fixed 2026-05-07): task_type="survey" with external_url launcher (idempotent on (participant, instrument, event, repeat_instance)), start_task returns 503 if URL missing (fail-safe); VisitScheduleItem carries redcap_event_name/redcap_sync_status/last_redcap_sync_at/is_stale (stale = failed sync OR never reconciled OR >24h); new GET /api/portal/randomization with masked-label preference for blinded studies; migration e3f4a5b6c7d8; 14 tests
- ✅ REDCap-managed study E2E test (§6 #5, fixed 2026-05-07): tests/test_redcap_managed_lifecycle.py has 10 tests across 5 stages (event sync, visit scheduling, portal delivery, completion sync success+failure, portal stale-data signalling). REDCap I/O mocked at module-level fetchers and REDCapService._get_project. Surfaced and fixed two pre-existing critical defects in redcap_sync._build_submission_from_responses (AssessmentMetadata import drift + missing min_rt/max_rt defaults); both would have raised at production runtime and silently classified syncs as failed.
- ✅ Route DET sync through UnifiedScheduler + ConsentGate (§6 #18, fixed 2026-05-08): _create_visit_schedule and _create_visits_from_windows removed (the former referenced non-existent columns and would have TypeError'd at runtime); replaced with _seed_anchor_date_and_schedule that creates a ParticipantAnchorDate (source_type=redcap_det) and delegates to ScheduleVersioningService.create_schedule_version. Default field mapping pruned of non-existent Participant columns. Legacy event_battery_mapping logged as deprecation. 7 tests in tests/test_redcap_det_sync_versioning.py.
- ✅ Webhook secret encryption parity (§6 #19, fixed 2026-05-08): encrypt_token() symmetric with api_token at the write site (routers/redcap.py::update_webhook_config); DET handler decrypts via decrypt_token() before HMAC validation, with graceful fallback to "no secret" → production reject on rotation/key errors; migrate_redcap_tokens.py extended via _ROTATABLE_FIELDS = ("api_token", "webhook_secret"); 7 tests in tests/test_webhook_secret_encryption.py.
- ✅ REDCap event sync transactional boundary (§6 #21, fixed 2026-05-08): sync_events_to_visit_windows wraps its per-event loop in a db.begin_nested() SAVEPOINT and raises _AtomicSyncFailure on partial failure so the SAVEPOINT discards every row added/updated by the events that did succeed; service no longer calls db.commit() directly (FastAPI get_db owns the outer commit, mirroring §6 #6). 7 tests in tests/test_redcap_event_sync_atomicity.py including a structural guard that fails if begin_nested or _AtomicSyncFailure is removed from the implementation.
- ✅ Schedule versioning transaction boundary (§6 #6, fixed 2026-05-08): create_schedule_version and regenerate_schedule wrapped in db.begin_nested() SAVEPOINTs; auto_commit=False plumbed through UnifiedScheduler/SchedulerService/VisitService so inner commits no longer break the boundary; orphan versions on partial failure now impossible. 8 atomicity tests in tests/test_schedule_versioning_atomicity.py.
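
The event-sync boundary described above (§6 #21) pairs a SAVEPOINT with a sentinel exception so that one failing event discards the rows written by the events that succeeded. The sketch below illustrates that pattern with the standard-library sqlite3 module rather than the SQLAlchemy db.begin_nested() call the service uses. The visit_windows table and column, the AtomicSyncFailure class, and the sync_events function are hypothetical names chosen for illustration.

```python
import sqlite3


class AtomicSyncFailure(Exception):
    """Sentinel raised after the loop so the SAVEPOINT is rolled back as a unit."""


def sync_events(conn: sqlite3.Connection, events: list) -> int:
    """Insert every event or none of them; the caller owns BEGIN and COMMIT."""
    errors = []
    conn.execute("SAVEPOINT sync_events")
    try:
        for event in events:
            try:
                if not event.get("name"):
                    raise ValueError("event has no name")
                conn.execute(
                    "INSERT INTO visit_windows (event_name) VALUES (?)",
                    (event["name"],),
                )
            except (ValueError, sqlite3.Error) as exc:
                # Keep going so every failing event is reported together.
                errors.append(str(exc))
        if errors:
            raise AtomicSyncFailure("; ".join(errors))
    except AtomicSyncFailure:
        # Discard rows written by the events that did succeed.
        conn.execute("ROLLBACK TO SAVEPOINT sync_events")
        conn.execute("RELEASE SAVEPOINT sync_events")
        raise
    conn.execute("RELEASE SAVEPOINT sync_events")
    return len(events)


if __name__ == "__main__":
    # isolation_level=None leaves transaction control to the explicit statements.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE visit_windows (event_name TEXT NOT NULL)")
    conn.execute("BEGIN")
    try:
        sync_events(conn, [{"name": "baseline"}, {"name": ""}])
    except AtomicSyncFailure as exc:
        print("sync rejected:", exc)
    conn.execute("COMMIT")  # the outer transaction commits with no partial rows
```

The function never commits; as in the item above, the outer transaction boundary stays with the caller, so the commit after a rejected sync persists nothing from the failed batch.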

Subsystems affected: #5, #9, #14, #15, #18, #19, #22, #23, #24, #25.

M8 — AI Agent Phase 3 (active)¶

Status: 🔄 In progress · Priority: PRIMARY (active development) · Spec: ai-agent-implementation-plan.md

Recent progress: battery_config AI task shipped 2026-05-10, with the task class at server/app/services/agent_tasks/battery_config.py, the prompt template + JSON output schema at server/app/services/agent_prompts/battery_config.py, and dispatch wired in AgentService.execute_task. 9 unit tests in tests/test_agent_battery_config.py cover happy path (5-item ChangeSet with 2 batteries + 3 event assignments), existing-VisitWindow anchoring, missing-document error, non-dict LLM output rejection, unexpected exception wrapping, and AgentService dispatch contract.

Goal: AI assistance for jsPsych battery and assessment configuration.

Deliverables:

- ✅ Battery configuration agent task (battery_config)
- 🔜 Event-linking agent task (event_linking)
- 🔜 Portal inline assists ("Suggest module selection" in BatteryBuilder)
- 🔜 Validation pipeline for battery proposals
- 🔜 Apply path implementation (shared with Phase 2)

M9 — Metricis-managed mode hardening (deferred)¶

Status: 🔜 Deferred · Priority: LOW (until Metricis-managed studies become a target)

Rationale: the first production study runs in REDCap-managed mode where REDCap handles randomization and scheduling. The bugs in §6 #9–13 do not block the M7 launch. They will block any future Metricis-managed mode study, so they remain on the roadmap.

Deliverables:

- Randomization fixes: stratum form lookup (§6 #9), secrets.SystemRandom (#10), minimization integrity (#11), require_role() dependency (#12)
- Business Day Service: timezone awareness, holiday import path, populate _holiday_cache (§6 #13)
- Tests for randomization, business day service
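
The secrets.SystemRandom item (§6 #10) refers to drawing allocations from the operating system's CSPRNG instead of the seedable default generator. As a rough illustration only, the sketch below applies it to a simple permuted block. This hub does not describe the actual allocation algorithm (it mentions strata and minimization), so the permuted_block function, its parameters, and the block design are assumptions made for the example.

```python
import secrets

# OS-backed generator: it cannot be seeded, so allocation sequences are not replayable.
_rng = secrets.SystemRandom()


def permuted_block(arms: list, per_arm: int) -> list:
    """Return one shuffled block containing each arm exactly per_arm times."""
    block = [arm for arm in arms for _ in range(per_arm)]
    _rng.shuffle(block)  # in-place shuffle driven by the system CSPRNG
    return block
```

Because the generator cannot be seeded, tests for code like this would assert structural properties (arm balance, block length) rather than a specific sequence.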

M10 — AI Agent Phase 4 (later)¶

Status: 🔜 Planned · Priority: LATER

Goal: Amendment Impact Analysis & Cutover Planning agent capabilities.

8. Document Conventions¶

This hub stays high-level: status, sequencing, progress, known issues. Detailed designs go in spoke documents.
Update §3 Status table when a subsystem changes state.
Add or update a milestone in §7 Milestones whenever a coherent body of work ships. Group related commits under a single dated milestone rather than one entry per commit.
Add to §6 Known Issues whenever a code review surfaces something — link to the file:line. Move resolved items to a "Resolved" subsection or delete if minor.
Archive superseded plans to docs/archive/completed-plans/ with a date-stamped filename.

9. Spec Spokes¶

Specifications referenced from this hub. Edit the spec, not the hub, for design changes.

Active specs¶

| Spec | Purpose |
| --- | --- |
| ai-assistant.md | Full AI Agent Facilitation specification (1333 lines) |
| ai-agent-implementation-plan.md | Phased AI Agent implementation plan (Phases 1–4) |
| patient-caregiver-portal.md | Patient/Caregiver Portal specification (✅ implemented; spec header is stale) |
| participant-registry.md | Patient registry framing and architecture |
| rare-disease-integration.md | Rare disease & pediatric adaptations (✅ all phases shipped) |
| edc-implementation-plan.md | EDC core implementation detail |
| redcap-det-webhook-implementation.md | REDCap Data Entry Trigger integration |
| ux-reviewer.md | UX review process for the portal |

Reference¶

CLAUDE.md — engineering context and architectural concepts
docs/PAGES.md — page-by-page reference for both portals
docs/api/ — API reference (Swagger UI live at /docs)
docs/architecture/ — architecture diagrams and design records
docs/guides/ — developer and administrator guides
docs/features/ — user-facing feature documentation

Archived¶

docs/archive/completed-plans/project-plan-2026-05-06.md — previous top-level project plan, superseded by this hub
Other completed/archived plans in docs/archive/completed-plans/

10. How to Update This Document¶

When you ship something:

Update the relevant row in §3 Status by Subsystem (state, evidence link).
Either extend an active milestone in §7 or open a new one if it's a distinct phase of work.
Update the "Last updated" stamp at the top.

When you find a bug or risk during review:

Add it to §6 Known Issues with severity, file:line reference, and a one-line "Fix:" stub.
Once fixed, move it to a "Resolved" subsection with the resolving commit hash, or delete if minor.

When you write a new spec:

Place it in docs/project-plan/ (feature specs) or docs/plans/ (implementation plans).
Add a row to §9 Spec Spokes pointing to it.
Cross-link from the spec back to this hub via a **Parent Document:** line.