
AI Agent Facilitation Implementation Plan

Status: Phase 1 Complete, Phase 2 Next
Created: 2026-01-21
Last Updated: 2026-01-21
Parent Document: ai-assistant.md


Executive Summary

This document provides the concrete implementation plan for AI Agent Facilitation in Metricis. The goal is to build governed AI co-pilots that accelerate study setup and amendments while maintaining regulatory defensibility.

Key Architecture Decision: AI agents produce typed, reviewable ChangeSets, not free-form outputs. Every AI action is auditable, requires human approval, and integrates with the existing metadata versioning workflow.


Phase Overview

| Phase | Scope | Status |
|---|---|---|
| Phase 1 | Foundation - Database models, LLM service, ChangeSet infrastructure | ✅ Complete |
| Phase 2 | Protocol Ingestion & Study Structure Generation | 🔜 Next |
| Phase 3 | Battery & Assessment Configuration Assistance | 🔜 Planned |
| Phase 4 | Amendment Impact Analysis & Cutover Planning | 🔜 Planned |

Phase 1: Foundation Infrastructure ✅ COMPLETE

Implemented: 2026-01-21

Files Created:
- server/alembic/versions/025_ai_agent_foundation.py - Database migration
- server/app/services/llm_service.py - LLM integration
- server/app/services/agent_service.py - Task orchestration
- server/app/services/changeset_validator.py - Validation service
- server/app/routers/ai_assistant.py - REST API endpoints

Files Modified:
- server/app/db/models.py - Added AgentDocument, AgentRun, ChangeSet, ChangeSetItem, AuditTrailEvent
- server/app/config.py - Added AI settings (anthropic_api_key, ai_model_id, etc.)
- server/app/main.py - Registered AI assistant router

1.1 Database Models

File: server/alembic/versions/025_ai_agent_foundation.py

AgentRun Model

Tracks a single AI task execution.

class AgentRun(Base):
    """Tracks a single AI agent task execution."""
    __tablename__ = "agent_runs"

    id: Mapped[uuid.UUID]  # Primary key
    study_id: Mapped[uuid.UUID]  # Foreign key to studies

    # Task identification
    task_type: Mapped[str]  # "protocol_ingestion", "battery_config", "amendment_analysis"
    task_name: Mapped[str]  # Human-readable task name

    # Execution status
    status: Mapped[str]  # "pending", "running", "completed", "failed", "cancelled"
    started_at: Mapped[Optional[datetime]]
    completed_at: Mapped[Optional[datetime]]
    error_message: Mapped[Optional[str]]

    # Input tracking
    input_artifacts: Mapped[dict]  # JSONB - references to uploaded docs, parameters
    input_constraints: Mapped[Optional[dict]]  # JSONB - "do not create visits", etc.

    # Output tracking
    output_changeset_id: Mapped[Optional[uuid.UUID]]  # Foreign key to change_sets

    # LLM usage tracking
    model_id: Mapped[str]  # e.g. "claude-sonnet-4-20250514"
    usage_stats: Mapped[Optional[dict]]  # JSONB - tokens_in, tokens_out, cost

    # Audit integration
    created_by_id: Mapped[uuid.UUID]  # Foreign key to users
    created_at: Mapped[datetime]

ChangeSet Model

Container for proposed configuration changes.

class ChangeSet(Base):
    """Container for AI-generated configuration proposals."""
    __tablename__ = "change_sets"

    id: Mapped[uuid.UUID]
    study_id: Mapped[uuid.UUID]
    agent_run_id: Mapped[Optional[uuid.UUID]]  # Nullable for manual changesets

    # Lifecycle
    status: Mapped[str]  # "draft", "validating", "ready", "applied", "rejected"

    # Validation results
    validation_status: Mapped[Optional[str]]  # "passed", "warnings", "failed"
    validation_results: Mapped[Optional[dict]]  # JSONB - detailed validation output

    # Application tracking
    target_metadata_version_id: Mapped[Optional[uuid.UUID]]  # Where changes will apply
    applied_at: Mapped[Optional[datetime]]
    applied_by_id: Mapped[Optional[uuid.UUID]]
    rejection_reason: Mapped[Optional[str]]
    rejected_at: Mapped[Optional[datetime]]
    rejected_by_id: Mapped[Optional[uuid.UUID]]

    # Summary
    summary: Mapped[Optional[str]]  # AI-generated summary of changes
    risk_flags: Mapped[Optional[dict]]  # JSONB - breaking changes, consent impacts

    created_by_id: Mapped[uuid.UUID]
    created_at: Mapped[datetime]

    # Relationships
    items: Mapped[list["ChangeSetItem"]]
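The status field above implies a small state machine. A minimal sketch of the allowed transitions, where the transition table itself is an assumption inferred from the statuses listed, not part of the implemented code:

```python
# Hypothetical guard for ChangeSet status transitions, inferred from the
# statuses listed above; the actual service may allow different paths.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "draft": {"validating", "rejected"},
    "validating": {"ready", "draft"},  # validation failure returns to draft
    "ready": {"applied", "rejected"},
    "applied": set(),                  # terminal
    "rejected": set(),                 # terminal
}

def can_transition(current: str, target: str) -> bool:
    """Return True if moving a ChangeSet from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Making the terminal states explicit keeps "applied" and "rejected" ChangeSets immutable, which matches the audit posture of the design.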

ChangeSetItem Model

Individual artifact proposal within a ChangeSet.

class ChangeSetItem(Base):
    """Individual artifact proposal within a ChangeSet."""
    __tablename__ = "change_set_items"

    id: Mapped[uuid.UUID]
    changeset_id: Mapped[uuid.UUID]

    # Artifact identification
    artifact_type: Mapped[str]  # "study_event_def", "battery_version", "consent_version", "form_def", "rule"
    artifact_id: Mapped[Optional[uuid.UUID]]  # Existing ID if update, null if create
    artifact_oid: Mapped[Optional[str]]  # ODM OID for the artifact

    # Change specification
    action: Mapped[str]  # "create", "update", "deprecate"
    before_state: Mapped[Optional[dict]]  # JSONB - state before (for updates)
    after_state: Mapped[dict]  # JSONB - proposed new state

    # AI metadata
    rationale: Mapped[Optional[str]]  # Why the AI proposed this
    confidence: Mapped[Optional[float]]  # 0.0-1.0 confidence score
    source_references: Mapped[Optional[dict]]  # JSONB - protocol section, page, etc.

    # Review tracking
    status: Mapped[str]  # "pending", "accepted", "rejected", "needs_input"
    reviewer_notes: Mapped[Optional[str]]
    reviewed_by_id: Mapped[Optional[uuid.UUID]]
    reviewed_at: Mapped[Optional[datetime]]

    created_at: Mapped[datetime]

AgentDocument Model

Tracks uploaded documents for AI context.

class AgentDocument(Base):
    """Uploaded document for AI context (protocols, SoAs, etc.)."""
    __tablename__ = "agent_documents"

    id: Mapped[uuid.UUID]
    study_id: Mapped[uuid.UUID]

    # Document metadata
    document_type: Mapped[str]  # "protocol", "soa", "crf_spec", "consent", "battery_plan"
    filename: Mapped[str]
    mime_type: Mapped[str]
    file_size: Mapped[int]

    # Storage
    storage_path: Mapped[str]  # Local path or S3 key

    # Processing status
    processing_status: Mapped[str]  # "pending", "processing", "ready", "failed"
    extracted_text: Mapped[Optional[str]]  # Full text extraction
    extracted_sections: Mapped[Optional[dict]]  # JSONB - parsed sections

    uploaded_by_id: Mapped[uuid.UUID]
    created_at: Mapped[datetime]

1.2 LLM Service Layer

File: server/app/services/llm_service.py

class LLMService:
    """Service for LLM interactions with Anthropic Claude."""

    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)
        self.model = "claude-sonnet-4-20250514"

    async def generate_structured_output(
        self,
        system_prompt: str,
        user_prompt: str,
        output_schema: dict,  # JSON schema for structured output
        context_documents: Optional[list[AgentDocument]] = None,
        max_tokens: int = 4096,
    ) -> tuple[dict, dict]:
        """
        Generate structured output from Claude.

        Returns:
            tuple: (parsed_output, usage_stats)
        """
        pass

    async def stream_response(
        self,
        system_prompt: str,
        user_prompt: str,
        on_chunk: Callable[[str], None],
    ) -> dict:
        """Stream response for real-time UI updates."""
        pass

    def estimate_tokens(self, text: str) -> int:
        """Estimate token count for context management."""
        pass
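One plausible body for estimate_tokens, shown as a standalone sketch: a characters-per-token heuristic. The 4-characters-per-token ratio is a rough rule of thumb for English text, not a guarantee of what the service will actually use:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for context-window budgeting.

    Assumes ~4 characters per token (a common English-text heuristic);
    the real service may substitute a proper tokenizer.
    """
    return max(1, len(text) // 4)
```

This is good enough for deciding whether an extracted protocol fits in the context window before paying for an API call.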

1.3 Agent Service (Orchestrator)

File: server/app/services/agent_service.py

class AgentService:
    """Orchestrates AI agent tasks and manages the execution lifecycle."""

    # Task type definitions with required permissions
    TASK_TYPES = {
        "protocol_ingestion": {
            "name": "Protocol Ingestion",
            "required_permission": "manage_study_design",
            "produces": ["study_event_def", "form_def"],
        },
        "battery_config": {
            "name": "Battery Configuration",
            "required_permission": "manage_batteries",
            "produces": ["battery_version", "event_linking"],
        },
        "consent_config": {
            "name": "Consent Configuration",
            "required_permission": "manage_consent",
            "produces": ["consent_version", "consent_trigger"],
        },
        "amendment_analysis": {
            "name": "Amendment Impact Analysis",
            "required_permission": "manage_amendments",
            "produces": ["impact_report", "cutover_plan"],
        },
    }

    async def start_task(
        self,
        study_id: uuid.UUID,
        task_type: str,
        user_id: uuid.UUID,
        input_artifacts: dict,
        constraints: Optional[dict] = None,
    ) -> AgentRun:
        """Start a new AI task and return the run record."""
        pass

    async def get_run_status(self, run_id: uuid.UUID) -> AgentRun:
        """Get current status of an agent run."""
        pass

    async def cancel_run(self, run_id: uuid.UUID, user_id: uuid.UUID) -> bool:
        """Cancel a running task."""
        pass
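Before creating an AgentRun, start_task presumably checks the requested task type against the TASK_TYPES registry and the caller's permissions. A self-contained sketch, where the permission-set shape and helper name are assumptions:

```python
# Mirrors the TASK_TYPES registry above, trimmed to the field needed here.
TASK_TYPES = {
    "protocol_ingestion": {"required_permission": "manage_study_design"},
    "battery_config": {"required_permission": "manage_batteries"},
    "consent_config": {"required_permission": "manage_consent"},
    "amendment_analysis": {"required_permission": "manage_amendments"},
}

def authorize_task(task_type: str, user_permissions: set[str]) -> None:
    """Raise if the task type is unknown or the user lacks its permission."""
    spec = TASK_TYPES.get(task_type)
    if spec is None:
        raise ValueError(f"Unknown task type: {task_type}")
    if spec["required_permission"] not in user_permissions:
        raise PermissionError(f"Missing permission: {spec['required_permission']}")
```

Failing fast here, before any LLM call, keeps unauthorized requests out of the audit trail's "run started" events and avoids wasted tokens.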

1.4 ChangeSet Service

File: server/app/services/changeset_service.py

class ChangeSetService:
    """Manages ChangeSet lifecycle: create, validate, apply, reject."""

    # Validation rules per artifact type
    VALIDATORS = {
        "study_event_def": StudyEventDefValidator,
        "battery_version": BatteryVersionValidator,
        "consent_version": ConsentVersionValidator,
        "form_def": FormDefValidator,
    }

    async def create_changeset(
        self,
        study_id: uuid.UUID,
        agent_run_id: uuid.UUID,
        items: list[dict],
        summary: str,
    ) -> ChangeSet:
        """Create a new ChangeSet from AI outputs."""
        pass

    async def validate_changeset(self, changeset_id: uuid.UUID) -> ValidationResult:
        """Run all validators against the changeset items."""
        pass

    async def apply_changeset(
        self,
        changeset_id: uuid.UUID,
        target_version_id: uuid.UUID,
        user_id: uuid.UUID,
    ) -> bool:
        """Apply approved items to a draft metadata version."""
        pass

    async def reject_changeset(
        self,
        changeset_id: uuid.UUID,
        user_id: uuid.UUID,
        reason: str,
    ) -> bool:
        """Reject a changeset with reason."""
        pass

    async def get_changeset_diff(
        self,
        changeset_id: uuid.UUID,
    ) -> dict:
        """Generate diff view of proposed changes."""
        pass
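get_changeset_diff can be assembled from each item's before_state/after_state pair. A minimal field-level diff sketch; the output shape here is an assumption, not the implemented format:

```python
from typing import Any, Optional

def diff_item(before: Optional[dict[str, Any]], after: dict[str, Any]) -> dict[str, dict]:
    """Field-level diff of one ChangeSetItem: added, removed, and changed keys."""
    before = before or {}  # before_state is null for "create" actions
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: {"before": before[k], "after": after[k]}
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

A structure like this maps directly onto a side-by-side diff view in the Portal, with "create" items rendering as all-added.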

1.5 API Router

File: server/app/routers/ai_assistant.py

# Endpoints:

# Documents
POST   /api/studies/{study_id}/ai/documents              # Upload document
GET    /api/studies/{study_id}/ai/documents              # List documents
DELETE /api/studies/{study_id}/ai/documents/{doc_id}     # Delete document

# Agent Runs
POST   /api/studies/{study_id}/ai/runs                   # Start AI task
GET    /api/studies/{study_id}/ai/runs                   # List runs
GET    /api/studies/{study_id}/ai/runs/{run_id}          # Get run details
POST   /api/studies/{study_id}/ai/runs/{run_id}/cancel   # Cancel run

# ChangeSets
GET    /api/studies/{study_id}/ai/changesets             # List changesets
GET    /api/studies/{study_id}/ai/changesets/{id}        # Get changeset
GET    /api/studies/{study_id}/ai/changesets/{id}/diff   # Get diff view
POST   /api/studies/{study_id}/ai/changesets/{id}/validate  # Run validation
POST   /api/studies/{study_id}/ai/changesets/{id}/apply  # Apply to metadata version
POST   /api/studies/{study_id}/ai/changesets/{id}/reject # Reject changeset

# Individual Items
GET    /api/studies/{study_id}/ai/changesets/{id}/items  # List items
PATCH  /api/studies/{study_id}/ai/changesets/{id}/items/{item_id}  # Update item status

1.6 Audit Event Types ✅

Implemented via AuditLog entries with the following action codes:

- ai_document_upload - Document uploaded for AI processing
- ai_document_archive - Document archived
- ai_task_start - AI task execution started
- ai_task_cancel - AI task execution cancelled
- ai_changeset_apply - ChangeSet applied to metadata version
- ai_changeset_reject - ChangeSet rejected
- ai_item_accepted - ChangeSet item accepted
- ai_item_rejected - ChangeSet item rejected
- ai_item_modified - ChangeSet item modified by user
- ai_item_needs_input - ChangeSet item needs user input


Phase 2: Protocol Ingestion & Study Structure 🔜

2.1 Protocol Ingestion Task

File: server/app/services/agent_tasks/protocol_ingestion.py

This task:

1. Parses uploaded protocol PDF and SoA
2. Extracts visit schedule, timing windows, forms
3. Generates StudyEventDef proposals
4. Generates FormDef stubs
5. Creates ChangeSet with all proposals

Input Schema:

{
  "protocol_document_id": "uuid",
  "soa_document_id": "uuid",
  "study_metadata": {
    "phase": "II",
    "arms": ["Treatment", "Control"],
    "countries": ["CA", "US"],
    "estimated_enrollment": 100
  }
}

Output Schema:

{
  "study_events": [
    {
      "artifact_type": "study_event_def",
      "action": "create",
      "after_state": {
        "event_oid": "SE.SCREENING",
        "name": "Screening Visit",
        "type": "scheduled",
        "mandatory": true,
        "target_day": -14,
        "window_before": 7,
        "window_after": 0
      },
      "rationale": "Extracted from Protocol Section 6.1 - Visit Schedule",
      "source_references": {"section": "6.1", "page": 42}
    }
  ],
  "forms": [...],
  "summary": "Generated 8 visits and 12 form stubs from protocol",
  "risk_flags": []
}
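Items in this output can be pre-validated before they ever become ChangeSetItems. A lightweight sketch, assuming the "SE.{NAME}" OID rule stated in the prompt rules of section 2.2; the exact character class and required-key set are assumptions:

```python
import re

# Assumed concrete form of the "SE.{NAME}" pattern from the prompt rules.
OID_PATTERN = re.compile(r"^SE\.[A-Z0-9_]+$")
REQUIRED_KEYS = {"artifact_type", "action", "after_state", "rationale", "source_references"}

def check_event_proposal(item: dict) -> list[str]:
    """Return a list of problems with one study_event proposal; empty means OK."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - item.keys())]
    oid = item.get("after_state", {}).get("event_oid", "")
    if not OID_PATTERN.match(oid):
        problems.append(f"non-conforming event_oid: {oid!r}")
    return problems
```

Running a check like this immediately after parsing the LLM response lets the agent flag bad proposals as "needs_input" instead of surfacing them for review.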

2.2 System Prompt Template

File: server/app/services/agent_prompts/protocol_ingestion.md

You are a clinical research configuration assistant for the Metricis EDC platform.

## Your Task
Extract study structure from the provided protocol and Schedule of Activities (SoA).

## Output Requirements
You MUST output valid JSON matching the provided schema. Do not include any text outside the JSON.

## Rules
1. Every visit must have an ODM-compliant OID (pattern: SE.{NAME})
2. Target days are relative to enrollment (Day 0)
3. Windows must be clinically reasonable
4. Flag any ambiguities as "needs_input"
5. Include source references (section, page) for every proposal

## Protocol Context
{protocol_text}

## Schedule of Activities
{soa_text}

## Study Metadata
{study_metadata}
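Note that the template mixes literal braces ("SE.{NAME}" in rule 1) with placeholders, so naive str.format would raise a KeyError on {NAME}. A sketch of a substitution that only touches known placeholders; the helper name is hypothetical:

```python
def render_prompt(template: str, **context: str) -> str:
    """Fill {placeholder} slots by name, leaving other brace sequences
    (e.g. the literal "SE.{NAME}" in the rules) untouched."""
    for key, value in context.items():
        template = template.replace("{" + key + "}", value)
    return template
```

Usage: `render_prompt(template, protocol_text=..., soa_text=..., study_metadata=...)`, keeping the template file free of escaping concerns.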

Phase 3: Battery & Assessment Configuration

3.1 Battery Configuration Task

File: server/app/services/agent_tasks/battery_config.py

This task:

1. Reads battery specification (domains, timing, constraints)
2. Proposes BatteryVersion with modules
3. Proposes event linking (which battery at which visit)
4. Generates ODM ItemDef mappings for outputs

3.2 Event Linking Task

File: server/app/services/agent_tasks/event_linking.py

This task:

1. Takes existing visits and batteries
2. Proposes optimal linkings based on protocol
3. Identifies timing conflicts
4. Flags burden concerns (too many assessments per visit)


Phase 4: Amendment Impact Analysis

4.1 Amendment Analysis Task

File: server/app/services/agent_tasks/amendment_analysis.py

This task leverages the existing AmendmentImpactService:

1. Takes proposed amendment description
2. Identifies affected visits, forms, batteries, consents
3. Generates per-participant impact narratives
4. Proposes queue reconciliation policy
5. Creates cutover checklist

Integration with existing services:

# Uses existing amendment_impact_service.py
impact_service = AmendmentImpactService(db)
preview = await impact_service.preview_amendment_impact(
    study_id, proposed_changes
)

# AI enhances with narratives
narratives = await llm_service.generate_narratives(preview)


Portal UI Components

Study Assistant Page

File: portal/src/pages/StudyAssistant.tsx

Tabs:

1. Overview - Dashboard with recent runs, quick actions
2. Documents - Upload and manage protocol docs
3. Runs - List of AI task executions
4. ChangeSets - Review and apply proposals
5. Audit - AI activity log

Inline Assists

Add "Assist with AI" buttons to:

- StudyDesign.tsx - "Draft visits from protocol"
- BatteryBuilder.tsx - "Suggest module selection"
- ConsentDesigner.tsx - "Draft consent triggers"
- MetadataVersions.tsx - "Explain differences"

Components

| Component | Purpose |
|---|---|
| ArtifactCard.tsx | Display single proposed artifact |
| ChangeSetDiffViewer.tsx | Side-by-side diff display |
| ValidationPanel.tsx | Show validation results |
| DocumentUploader.tsx | Protocol/SoA upload |
| AgentRunProgress.tsx | Real-time task progress |
| AIRationalePanel.tsx | Show AI reasoning |

Configuration & Environment

Environment Variables

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional
AI_MODEL_ID=claude-sonnet-4-20250514
AI_MAX_TOKENS=4096
AI_TEMPERATURE=0.2
AI_ENABLED=true  # Feature flag

Feature Flags

# In study config
study.config = {
    "ai_assistant_enabled": True,
    "ai_allowed_tasks": ["protocol_ingestion", "battery_config"],
    "ai_require_approval": True,  # Always true for now
}
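A sketch of the per-study gate these flags imply; the function name and config accessor are assumptions:

```python
def is_task_allowed(study_config: dict, task_type: str) -> bool:
    """True if the AI assistant is enabled for the study and the task is allowlisted.

    Both flags default to disabled, so a study with no AI config gets no AI.
    """
    if not study_config.get("ai_assistant_enabled", False):
        return False
    return task_type in study_config.get("ai_allowed_tasks", [])
```

Defaulting closed means AI access must be opted into per study, which aligns with the approval-first posture described below.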

Security & Compliance

Access Control

| Role | Permissions |
|---|---|
| Data Manager | Start tasks, apply changesets, manage documents |
| PI | View changesets, approve metadata versions (existing flow) |
| CRC | No AI access |
| Monitor | View AI audit logs only |

Audit Requirements

Every AI interaction creates audit events:

1. Document upload → AI_DOCUMENT_UPLOADED
2. Task start → AI_RUN_STARTED
3. Task complete → AI_RUN_COMPLETED (with usage stats)
4. ChangeSet create → AI_CHANGESET_CREATED
5. Item accept/reject → AI_ITEM_ACCEPTED/REJECTED
6. Apply to version → AI_CHANGESET_APPLIED

Data Handling

  • Protocol documents are stored locally (not sent to 3rd party storage)
  • Extracted text may be sent to Claude API
  • No PHI/PII should be in protocol documents
  • Usage stats (tokens, cost) are tracked for governance

Implementation Progress

Phase 1 Foundation ✅ COMPLETE

  1. ✅ Create migration 025_ai_agent_foundation.py
  2. ✅ Implement llm_service.py (with optional anthropic dependency)
  3. ✅ Implement agent_service.py
  4. ✅ Implement changeset_validator.py
  5. ✅ Create ai_assistant.py router (documents, runs, changesets)
  6. ✅ Add audit event types

Phase 2 Protocol Ingestion 🔜 NEXT

  1. ⬜ Implement protocol_ingestion.py task
  2. ⬜ Create prompt templates
  3. ⬜ Add document processing (PDF extraction)
  4. ⬜ Build basic Portal UI (StudyAssistant page)
  5. ⬜ End-to-end testing

Phase 3 Battery Configuration 🔜 PLANNED

  1. ⬜ Implement battery_config.py task
  2. ⬜ Implement event_linking.py task
  3. ⬜ Add inline assists to BatteryBuilder
  4. ⬜ Validation pipeline for battery proposals

Phase 4 Amendment Analysis 🔜 PLANNED

  1. ⬜ Integrate with existing AmendmentImpactService
  2. ⬜ Implement amendment_analysis.py task
  3. ⬜ Narrative generation for impacts
  4. ⬜ Cutover planning assistance

Success Metrics

| Metric | Target |
|---|---|
| Study setup time reduction | 50% |
| Configuration errors caught by validation | >90% |
| Audit coverage | 100% of AI actions |
| Human approval rate | 100% (by design) |

Risks & Mitigations

| Risk | Mitigation |
|---|---|
| LLM hallucinations | Strict JSON schema validation, source references required |
| Regulatory concerns | All outputs are drafts, human approval required |
| Cost overruns | Token tracking, usage limits per study |
| Prompt injection | Input sanitization, structured prompts only |
| Breaking changes | Validation pipeline catches incompatible proposals |