
AI Agent Facilitation Implementation Plan

Status: Phase 1 Complete, Phase 2 Next
Created: 2026-01-21
Last Updated: 2026-01-21
Parent Document: ai-assistant.md


Executive Summary

This document provides the concrete implementation plan for AI Agent Facilitation in Metricis. The goal is to build governed AI co-pilots that accelerate study setup and amendments while maintaining regulatory defensibility.

Key Architecture Decision: AI agents produce typed, reviewable ChangeSets, not free-form outputs. Every AI action is auditable, requires human approval, and integrates with the existing metadata versioning workflow.


Phase Overview

| Phase | Scope | Status |
|---|---|---|
| Phase 1 | Foundation - Database models, LLM service, ChangeSet infrastructure | ✅ Complete |
| Phase 2 | Protocol Ingestion & Study Structure Generation | 🔜 Next |
| Phase 3 | Battery & Assessment Configuration Assistance | 🔜 Planned |
| Phase 4 | Amendment Impact Analysis & Cutover Planning | 🔜 Planned |

Phase 1: Foundation Infrastructure ✅ COMPLETE

Implemented: 2026-01-21

Files Created:
- server/alembic/versions/025_ai_agent_foundation.py - Database migration
- server/app/services/llm_service.py - LLM integration
- server/app/services/agent_service.py - Task orchestration
- server/app/services/changeset_validator.py - Validation service
- server/app/routers/ai_assistant.py - REST API endpoints

Files Modified:
- server/app/db/models.py - Added AgentDocument, AgentRun, ChangeSet, ChangeSetItem, AuditTrailEvent
- server/app/config.py - Added AI settings (anthropic_api_key, ai_model_id, etc.)
- server/app/main.py - Registered AI assistant router

1.1 Database Models

File: server/alembic/versions/025_ai_agent_foundation.py

AgentRun Model

Tracks a single AI task execution.

class AgentRun(Base):
    """Tracks a single AI agent task execution."""
    __tablename__ = "agent_runs"

    id: Mapped[uuid.UUID]  # Primary key
    study_id: Mapped[uuid.UUID]  # Foreign key to studies

    # Task identification
    task_type: Mapped[str]  # "protocol_ingestion", "battery_config", "amendment_analysis"
    task_name: Mapped[str]  # Human-readable task name

    # Execution status
    status: Mapped[str]  # "pending", "running", "completed", "failed", "cancelled"
    started_at: Mapped[Optional[datetime]]
    completed_at: Mapped[Optional[datetime]]
    error_message: Mapped[Optional[str]]

    # Input tracking
    input_artifacts: Mapped[dict]  # JSONB - references to uploaded docs, parameters
    input_constraints: Mapped[Optional[dict]]  # JSONB - "do not create visits", etc.

    # Output tracking
    output_changeset_id: Mapped[Optional[uuid.UUID]]  # Foreign key to change_sets

    # LLM usage tracking
    model_id: Mapped[str]  # e.g. "claude-sonnet-4-20250514"
    usage_stats: Mapped[Optional[dict]]  # JSONB - tokens_in, tokens_out, cost

    # Audit integration
    created_by_id: Mapped[uuid.UUID]  # Foreign key to users
    created_at: Mapped[datetime]

ChangeSet Model

Container for proposed configuration changes.

class ChangeSet(Base):
    """Container for AI-generated configuration proposals."""
    __tablename__ = "change_sets"

    id: Mapped[uuid.UUID]
    study_id: Mapped[uuid.UUID]
    agent_run_id: Mapped[Optional[uuid.UUID]]  # Nullable for manual changesets

    # Lifecycle
    status: Mapped[str]  # "draft", "validating", "ready", "applied", "rejected"

    # Validation results
    validation_status: Mapped[Optional[str]]  # "passed", "warnings", "failed"
    validation_results: Mapped[Optional[dict]]  # JSONB - detailed validation output

    # Application tracking
    target_metadata_version_id: Mapped[Optional[uuid.UUID]]  # Where changes will apply
    applied_at: Mapped[Optional[datetime]]
    applied_by_id: Mapped[Optional[uuid.UUID]]
    rejection_reason: Mapped[Optional[str]]
    rejected_at: Mapped[Optional[datetime]]
    rejected_by_id: Mapped[Optional[uuid.UUID]]

    # Summary
    summary: Mapped[Optional[str]]  # AI-generated summary of changes
    risk_flags: Mapped[Optional[dict]]  # JSONB - breaking changes, consent impacts

    created_by_id: Mapped[uuid.UUID]
    created_at: Mapped[datetime]

    # Relationships
    items: Mapped[list["ChangeSetItem"]]
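The status field above implies a small state machine. A minimal sketch of the allowed transitions, where the transition table itself is an assumption inferred from the statuses listed, not part of the implemented code:

```python
# Hypothetical guard for ChangeSet status transitions, inferred from the
# statuses listed above; the actual service may allow different paths.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "draft": {"validating", "rejected"},
    "validating": {"ready", "draft"},  # validation failure returns to draft
    "ready": {"applied", "rejected"},
    "applied": set(),                  # terminal
    "rejected": set(),                 # terminal
}

def can_transition(current: str, target: str) -> bool:
    """Return True if moving a ChangeSet from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Making the terminal states explicit keeps "applied" and "rejected" ChangeSets immutable, which matches the audit posture of the design.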

ChangeSetItem Model

Individual artifact proposal within a ChangeSet.

class ChangeSetItem(Base):
    """Individual artifact proposal within a ChangeSet."""
    __tablename__ = "change_set_items"

    id: Mapped[uuid.UUID]
    changeset_id: Mapped[uuid.UUID]

    # Artifact identification
    artifact_type: Mapped[str]  # "study_event_def", "battery_version", "consent_version", "form_def", "rule"
    artifact_id: Mapped[Optional[uuid.UUID]]  # Existing ID if update, null if create
    artifact_oid: Mapped[Optional[str]]  # ODM OID for the artifact

    # Change specification
    action: Mapped[str]  # "create", "update", "deprecate"
    before_state: Mapped[Optional[dict]]  # JSONB - state before (for updates)
    after_state: Mapped[dict]  # JSONB - proposed new state

    # AI metadata
    rationale: Mapped[Optional[str]]  # Why the AI proposed this
    confidence: Mapped[Optional[float]]  # 0.0-1.0 confidence score
    source_references: Mapped[Optional[dict]]  # JSONB - protocol section, page, etc.

    # Review tracking
    status: Mapped[str]  # "pending", "accepted", "rejected", "needs_input"
    reviewer_notes: Mapped[Optional[str]]
    reviewed_by_id: Mapped[Optional[uuid.UUID]]
    reviewed_at: Mapped[Optional[datetime]]

    created_at: Mapped[datetime]

AgentDocument Model

Tracks uploaded documents for AI context.

class AgentDocument(Base):
    """Uploaded document for AI context (protocols, SoAs, etc.)."""
    __tablename__ = "agent_documents"

    id: Mapped[uuid.UUID]
    study_id: Mapped[uuid.UUID]

    # Document metadata
    document_type: Mapped[str]  # "protocol", "soa", "crf_spec", "consent", "battery_plan"
    filename: Mapped[str]
    mime_type: Mapped[str]
    file_size: Mapped[int]

    # Storage
    storage_path: Mapped[str]  # Local path or S3 key

    # Processing status
    processing_status: Mapped[str]  # "pending", "processing", "ready", "failed"
    extracted_text: Mapped[Optional[str]]  # Full text extraction
    extracted_sections: Mapped[Optional[dict]]  # JSONB - parsed sections

    uploaded_by_id: Mapped[uuid.UUID]
    created_at: Mapped[datetime]

1.2 LLM Service Layer

File: server/app/services/llm_service.py

class LLMService:
    """Service for LLM interactions with Anthropic Claude."""

    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)
        self.model = "claude-sonnet-4-20250514"

    async def generate_structured_output(
        self,
        system_prompt: str,
        user_prompt: str,
        output_schema: dict,  # JSON schema for structured output
        context_documents: Optional[list[AgentDocument]] = None,
        max_tokens: int = 4096,
    ) -> tuple[dict, dict]:
        """
        Generate structured output from Claude.

        Returns:
            tuple: (parsed_output, usage_stats)
        """
        pass

    async def stream_response(
        self,
        system_prompt: str,
        user_prompt: str,
        on_chunk: Callable[[str], None],
    ) -> dict:
        """Stream response for real-time UI updates."""
        pass

    def estimate_tokens(self, text: str) -> int:
        """Estimate token count for context management."""
        pass
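One plausible body for estimate_tokens, shown as a standalone sketch: a characters-per-token heuristic. The 4-characters-per-token ratio is a rough rule of thumb for English text, not a guarantee of what the service will actually use:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for context-window budgeting.

    Assumes ~4 characters per token (a common English-text heuristic);
    the real service may substitute a proper tokenizer.
    """
    return max(1, len(text) // 4)
```

This is good enough for deciding whether an extracted protocol fits in the context window before paying for an API call.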

1.3 Agent Service (Orchestrator)

File: server/app/services/agent_service.py

class AgentService:
    """Orchestrates AI agent tasks and manages the execution lifecycle."""

    # Task type definitions with required permissions
    TASK_TYPES = {
        "protocol_ingestion": {
            "name": "Protocol Ingestion",
            "required_permission": "manage_study_design",
            "produces": ["study_event_def", "form_def"],
        },
        "battery_config": {
            "name": "Battery Configuration",
            "required_permission": "manage_batteries",
            "produces": ["battery_version", "event_linking"],
        },
        "consent_config": {
            "name": "Consent Configuration",
            "required_permission": "manage_consent",
            "produces": ["consent_version", "consent_trigger"],
        },
        "amendment_analysis": {
            "name": "Amendment Impact Analysis",
            "required_permission": "manage_amendments",
            "produces": ["impact_report", "cutover_plan"],
        },
    }

    async def start_task(
        self,
        study_id: uuid.UUID,
        task_type: str,
        user_id: uuid.UUID,
        input_artifacts: dict,
        constraints: Optional[dict] = None,
    ) -> AgentRun:
        """Start a new AI task and return the run record."""
        pass

    async def get_run_status(self, run_id: uuid.UUID) -> AgentRun:
        """Get current status of an agent run."""
        pass

    async def cancel_run(self, run_id: uuid.UUID, user_id: uuid.UUID) -> bool:
        """Cancel a running task."""
        pass
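Before creating an AgentRun, start_task presumably checks the requested task type against the TASK_TYPES registry and the caller's permissions. A self-contained sketch, where the permission-set shape and helper name are assumptions:

```python
# Mirrors the TASK_TYPES registry above, trimmed to the field needed here.
TASK_TYPES = {
    "protocol_ingestion": {"required_permission": "manage_study_design"},
    "battery_config": {"required_permission": "manage_batteries"},
    "consent_config": {"required_permission": "manage_consent"},
    "amendment_analysis": {"required_permission": "manage_amendments"},
}

def authorize_task(task_type: str, user_permissions: set[str]) -> None:
    """Raise if the task type is unknown or the user lacks its permission."""
    spec = TASK_TYPES.get(task_type)
    if spec is None:
        raise ValueError(f"Unknown task type: {task_type}")
    if spec["required_permission"] not in user_permissions:
        raise PermissionError(f"Missing permission: {spec['required_permission']}")
```

Failing fast here, before any LLM call, keeps unauthorized requests out of the audit trail's "run started" events and avoids wasted tokens.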

1.4 ChangeSet Service

File: server/app/services/changeset_service.py

class ChangeSetService:
    """Manages ChangeSet lifecycle: create, validate, apply, reject."""

    # Validation rules per artifact type
    VALIDATORS = {
        "study_event_def": StudyEventDefValidator,
        "battery_version": BatteryVersionValidator,
        "consent_version": ConsentVersionValidator,
        "form_def": FormDefValidator,
    }

    async def create_changeset(
        self,
        study_id: uuid.UUID,
        agent_run_id: uuid.UUID,
        items: list[dict],
        summary: str,
    ) -> ChangeSet:
        """Create a new ChangeSet from AI outputs."""
        pass

    async def validate_changeset(self, changeset_id: uuid.UUID) -> ValidationResult:
        """Run all validators against the changeset items."""
        pass

    async def apply_changeset(
        self,
        changeset_id: uuid.UUID,
        target_version_id: uuid.UUID,
        user_id: uuid.UUID,
    ) -> bool:
        """Apply approved items to a draft metadata version."""
        pass

    async def reject_changeset(
        self,
        changeset_id: uuid.UUID,
        user_id: uuid.UUID,
        reason: str,
    ) -> bool:
        """Reject a changeset with reason."""
        pass

    async def get_changeset_diff(
        self,
        changeset_id: uuid.UUID,
    ) -> dict:
        """Generate diff view of proposed changes."""
        pass
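get_changeset_diff can be assembled from each item's before_state/after_state pair. A minimal field-level diff sketch; the output shape here is an assumption, not the implemented format:

```python
from typing import Any, Optional

def diff_item(before: Optional[dict[str, Any]], after: dict[str, Any]) -> dict[str, dict]:
    """Field-level diff of one ChangeSetItem: added, removed, and changed keys."""
    before = before or {}  # before_state is null for "create" actions
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: {"before": before[k], "after": after[k]}
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

A structure like this maps directly onto a side-by-side diff view in the Portal, with "create" items rendering as all-added.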

1.5 API Router

File: server/app/routers/ai_assistant.py

# Endpoints:

# Documents
POST   /api/studies/{study_id}/ai/documents              # Upload document
GET    /api/studies/{study_id}/ai/documents              # List documents
DELETE /api/studies/{study_id}/ai/documents/{doc_id}     # Delete document

# Agent Runs
POST   /api/studies/{study_id}/ai/runs                   # Start AI task
GET    /api/studies/{study_id}/ai/runs                   # List runs
GET    /api/studies/{study_id}/ai/runs/{run_id}          # Get run details
POST   /api/studies/{study_id}/ai/runs/{run_id}/cancel   # Cancel run

# ChangeSets
GET    /api/studies/{study_id}/ai/changesets             # List changesets
GET    /api/studies/{study_id}/ai/changesets/{id}        # Get changeset
GET    /api/studies/{study_id}/ai/changesets/{id}/diff   # Get diff view
POST   /api/studies/{study_id}/ai/changesets/{id}/validate  # Run validation
POST   /api/studies/{study_id}/ai/changesets/{id}/apply  # Apply to metadata version
POST   /api/studies/{study_id}/ai/changesets/{id}/reject # Reject changeset

# Individual Items
GET    /api/studies/{study_id}/ai/changesets/{id}/items  # List items
PATCH  /api/studies/{study_id}/ai/changesets/{id}/items/{item_id}  # Update item status

1.6 Audit Event Types ✅

Implemented via AuditLog entries with the following action codes:

- ai_document_upload - Document uploaded for AI processing
- ai_document_archive - Document archived
- ai_task_start - AI task execution started
- ai_task_cancel - AI task execution cancelled
- ai_changeset_apply - ChangeSet applied to metadata version
- ai_changeset_reject - ChangeSet rejected
- ai_item_accepted - ChangeSet item accepted
- ai_item_rejected - ChangeSet item rejected
- ai_item_modified - ChangeSet item modified by user
- ai_item_needs_input - ChangeSet item needs user input


Phase 2: Protocol Ingestion & Study Structure 🔜

2.1 Protocol Ingestion Task

File: server/app/services/agent_tasks/protocol_ingestion.py

This task:

1. Parses uploaded protocol PDF and SoA
2. Extracts visit schedule, timing windows, forms
3. Generates StudyEventDef proposals
4. Generates FormDef stubs
5. Creates ChangeSet with all proposals

Input Schema:

{
  "protocol_document_id": "uuid",
  "soa_document_id": "uuid",
  "study_metadata": {
    "phase": "II",
    "arms": ["Treatment", "Control"],
    "countries": ["CA", "US"],
    "estimated_enrollment": 100
  }
}

Output Schema:

{
  "study_events": [
    {
      "artifact_type": "study_event_def",
      "action": "create",
      "after_state": {
        "event_oid": "SE.SCREENING",
        "name": "Screening Visit",
        "type": "scheduled",
        "mandatory": true,
        "target_day": -14,
        "window_before": 7,
        "window_after": 0
      },
      "rationale": "Extracted from Protocol Section 6.1 - Visit Schedule",
      "source_references": {"section": "6.1", "page": 42}
    }
  ],
  "forms": [...],
  "summary": "Generated 8 visits and 12 form stubs from protocol",
  "risk_flags": []
}
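Items in this output can be pre-validated before they ever become ChangeSetItems. A lightweight sketch, assuming the "SE.{NAME}" OID rule stated in the prompt rules of section 2.2; the exact character class and required-key set are assumptions:

```python
import re

# Assumed concrete form of the "SE.{NAME}" pattern from the prompt rules.
OID_PATTERN = re.compile(r"^SE\.[A-Z0-9_]+$")
REQUIRED_KEYS = {"artifact_type", "action", "after_state", "rationale", "source_references"}

def check_event_proposal(item: dict) -> list[str]:
    """Return a list of problems with one study_event proposal; empty means OK."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - item.keys())]
    oid = item.get("after_state", {}).get("event_oid", "")
    if not OID_PATTERN.match(oid):
        problems.append(f"non-conforming event_oid: {oid!r}")
    return problems
```

Running a check like this immediately after parsing the LLM response lets the agent flag bad proposals as "needs_input" instead of surfacing them for review.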

2.2 System Prompt Template

File: server/app/services/agent_prompts/protocol_ingestion.md

You are a clinical research configuration assistant for the Metricis EDC platform.

## Your Task
Extract study structure from the provided protocol and Schedule of Activities (SoA).

## Output Requirements
You MUST output valid JSON matching the provided schema. Do not include any text outside the JSON.

## Rules
1. Every visit must have an ODM-compliant OID (pattern: SE.{NAME})
2. Target days are relative to enrollment (Day 0)
3. Windows must be clinically reasonable
4. Flag any ambiguities as "needs_input"
5. Include source references (section, page) for every proposal

## Protocol Context
{protocol_text}

## Schedule of Activities
{soa_text}

## Study Metadata
{study_metadata}
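Note that the template mixes literal braces ("SE.{NAME}" in rule 1) with placeholders, so naive str.format would raise a KeyError on {NAME}. A sketch of a substitution that only touches known placeholders; the helper name is hypothetical:

```python
def render_prompt(template: str, **context: str) -> str:
    """Fill {placeholder} slots by name, leaving other brace sequences
    (e.g. the literal "SE.{NAME}" in the rules) untouched."""
    for key, value in context.items():
        template = template.replace("{" + key + "}", value)
    return template
```

Usage: `render_prompt(template, protocol_text=..., soa_text=..., study_metadata=...)`, keeping the template file free of escaping concerns.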

Phase 3: Battery & Assessment Configuration

3.1 Battery Configuration Task

File: server/app/services/agent_tasks/battery_config.py

This task:

1. Reads battery specification (domains, timing, constraints)
2. Proposes BatteryVersion with modules
3. Proposes event linking (which battery at which visit)
4. Generates ODM ItemDef mappings for outputs

3.2 Event Linking Task

File: server/app/services/agent_tasks/event_linking.py

This task:

1. Takes existing visits and batteries
2. Proposes optimal linkings based on protocol
3. Identifies timing conflicts
4. Flags burden concerns (too many assessments per visit)


Phase 4: Amendment Impact Analysis

4.1 Amendment Analysis Task

File: server/app/services/agent_tasks/amendment_analysis.py

This task leverages the existing AmendmentImpactService:

1. Takes proposed amendment description
2. Identifies affected visits, forms, batteries, consents
3. Generates per-participant impact narratives
4. Proposes queue reconciliation policy
5. Creates cutover checklist

Integration with existing services:

# Uses existing amendment_impact_service.py
impact_service = AmendmentImpactService(db)
preview = await impact_service.preview_amendment_impact(
    study_id, proposed_changes
)

# AI enhances with narratives
narratives = await llm_service.generate_narratives(preview)


Portal UI Components

Study Assistant Page

File: portal/src/pages/StudyAssistant.tsx

Tabs:

1. Overview - Dashboard with recent runs, quick actions
2. Documents - Upload and manage protocol docs
3. Runs - List of AI task executions
4. ChangeSets - Review and apply proposals
5. Audit - AI activity log

Inline Assists

Add "Assist with AI" buttons to:

- StudyDesign.tsx - "Draft visits from protocol"
- BatteryBuilder.tsx - "Suggest module selection"
- ConsentDesigner.tsx - "Draft consent triggers"
- MetadataVersions.tsx - "Explain differences"

Components

| Component | Purpose |
|---|---|
| ArtifactCard.tsx | Display single proposed artifact |
| ChangeSetDiffViewer.tsx | Side-by-side diff display |
| ValidationPanel.tsx | Show validation results |
| DocumentUploader.tsx | Protocol/SoA upload |
| AgentRunProgress.tsx | Real-time task progress |
| AIRationalePanel.tsx | Show AI reasoning |

Configuration & Environment

Environment Variables

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional
AI_MODEL_ID=claude-sonnet-4-20250514
AI_MAX_TOKENS=4096
AI_TEMPERATURE=0.2
AI_ENABLED=true  # Feature flag

Feature Flags

# In study config
study.config = {
    "ai_assistant_enabled": True,
    "ai_allowed_tasks": ["protocol_ingestion", "battery_config"],
    "ai_require_approval": True,  # Always true for now
}
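A sketch of the per-study gate these flags imply; the function name and config accessor are assumptions:

```python
def is_task_allowed(study_config: dict, task_type: str) -> bool:
    """True if the AI assistant is enabled for the study and the task is allowlisted.

    Both flags default to disabled, so a study with no AI config gets no AI.
    """
    if not study_config.get("ai_assistant_enabled", False):
        return False
    return task_type in study_config.get("ai_allowed_tasks", [])
```

Defaulting closed means AI access must be opted into per study, which aligns with the approval-first posture described below.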

Security & Compliance

Access Control

| Role | Permissions |
|---|---|
| Data Manager | Start tasks, apply changesets, manage documents |
| PI | View changesets, approve metadata versions (existing flow) |
| CRC | No AI access |
| Monitor | View AI audit logs only |

Audit Requirements

Every AI interaction creates audit events:

1. Document upload → AI_DOCUMENT_UPLOADED
2. Task start → AI_RUN_STARTED
3. Task complete → AI_RUN_COMPLETED (with usage stats)
4. ChangeSet create → AI_CHANGESET_CREATED
5. Item accept/reject → AI_ITEM_ACCEPTED/REJECTED
6. Apply to version → AI_CHANGESET_APPLIED

Data Handling

  • Protocol documents are stored locally (not sent to 3rd party storage)
  • Extracted text may be sent to Claude API
  • No PHI/PII should be in protocol documents
  • Usage stats (tokens, cost) are tracked for governance

Implementation Progress

Phase 1 Foundation ✅ COMPLETE

  1. ✅ Create migration 025_ai_agent_foundation.py
  2. ✅ Implement llm_service.py (with optional anthropic dependency)
  3. ✅ Implement agent_service.py
  4. ✅ Implement changeset_validator.py
  5. ✅ Create ai_assistant.py router (documents, runs, changesets)
  6. ✅ Add audit event types

Phase 2 Protocol Ingestion 🔜 NEXT

  1. ⬜ Implement protocol_ingestion.py task
  2. ⬜ Create prompt templates
  3. ⬜ Add document processing (PDF extraction)
  4. ⬜ Build basic Portal UI (StudyAssistant page)
  5. ⬜ End-to-end testing

Phase 3 Battery Configuration 🔜 PLANNED

  1. ⬜ Implement battery_config.py task
  2. ⬜ Implement event_linking.py task
  3. ⬜ Add inline assists to BatteryBuilder
  4. ⬜ Validation pipeline for battery proposals

Phase 4 Amendment Analysis 🔜 PLANNED

  1. ⬜ Integrate with existing AmendmentImpactService
  2. ⬜ Implement amendment_analysis.py task
  3. ⬜ Narrative generation for impacts
  4. ⬜ Cutover planning assistance

Success Metrics

| Metric | Target |
|---|---|
| Study setup time reduction | 50% |
| Configuration errors caught by validation | >90% |
| Audit coverage | 100% of AI actions |
| Human approval rate | 100% (by design) |

Risks & Mitigations

| Risk | Mitigation |
|---|---|
| LLM hallucinations | Strict JSON schema validation, source references required |
| Regulatory concerns | All outputs are drafts, human approval required |
| Cost overruns | Token tracking, usage limits per study |
| Prompt injection | Input sanitization, structured prompts only |
| Breaking changes | Validation pipeline catches incompatible proposals |