Phase 1 Step 3 of PHASE1_RUNBOOK.md. compliance/api/schemas.py is
decomposed into 16 per-domain Pydantic schema modules under
compliance/schemas/:

  common.py            ( 79)  6 API enums + PaginationMeta
  regulation.py        ( 52)
  requirement.py       ( 80)
  control.py           (119)  Control + Mapping
  evidence.py          ( 66)
  risk.py              ( 79)
  ai_system.py         ( 63)
  dashboard.py         (195)  Dashboard, Export, Executive Dashboard
  service_module.py    (121)
  bsi.py               ( 58)  BSI + PDF extraction
  audit_session.py     (172)
  report.py            ( 53)
  isms_governance.py   (343)  Scope, Context, Policy, Objective, SoA
  isms_audit.py        (431)  Finding, CAPA, Review, Internal Audit, Readiness, Trail, ISO27001
  vvt.py               (168)
  tom.py               ( 71)

compliance/api/schemas.py becomes a 39-line re-export shim so existing
imports (from compliance.api.schemas import RegulationResponse) keep
working unchanged. New code should import from the domain module
directly (from compliance.schemas.regulation import RegulationResponse).
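
The re-export pattern described above can be sketched in isolation. This
is a minimal illustration, not the project's actual shim: the module
names demo_schemas_regulation and demo_api_schemas are hypothetical
stand-ins built in memory, and RegulationResponse is a plain placeholder
class rather than the real Pydantic model.

```python
import sys
import types

# Hypothetical stand-in for the domain module (compliance/schemas/regulation.py).
domain = types.ModuleType("demo_schemas_regulation")


class RegulationResponse:
    """Placeholder for the real schema class."""


domain.RegulationResponse = RegulationResponse
sys.modules[domain.__name__] = domain

# Hypothetical stand-in for the shim: its whole body is a re-export import.
shim = types.ModuleType("demo_api_schemas")
exec("from demo_schemas_regulation import RegulationResponse", shim.__dict__)
sys.modules[shim.__name__] = shim

# Old-style and new-style imports resolve to the very same class object.
from demo_api_schemas import RegulationResponse as via_shim
from demo_schemas_regulation import RegulationResponse as via_domain

same_object = via_shim is via_domain
```

Because the shim only re-binds names, isinstance checks and type
annotations behave identically whichever import path a caller uses.
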
Deferred-from-sweep: all 28 class Config blocks in the original file
were converted to model_config = ConfigDict(...) during the split.
schemas.py-sourced PydanticDeprecatedSince20 warnings are now gone.
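
As an illustration of the conversion described above, the sketch below
shows one Pydantic v2 model using model_config in place of a nested
class Config. It assumes Pydantic v2 is installed. RegulationRow and
RegulationSummary are hypothetical names, and from_attributes=True is
only an assumed example option; the commit does not list which options
the original 28 blocks carried.

```python
from pydantic import BaseModel, ConfigDict


class RegulationRow:
    """Hypothetical ORM-like object, used only for this illustration."""

    def __init__(self, code: str, name: str) -> None:
        self.code = code
        self.name = name


class RegulationSummary(BaseModel):
    # Pydantic v1 style, which emits PydanticDeprecatedSince20 under v2:
    #     class Config:
    #         orm_mode = True
    # Pydantic v2 style (orm_mode was renamed to from_attributes):
    model_config = ConfigDict(from_attributes=True)

    code: str
    name: str


# from_attributes lets model_validate read fields from object attributes.
row = RegulationRow("GDPR", "General Data Protection Regulation")
summary = RegulationSummary.model_validate(row)
```
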
Cross-domain references are handled via targeted imports (e.g.
dashboard.py imports EvidenceResponse from evidence and RiskResponse
from risk). Every domain module imports the shared API enums and
PaginationMeta from common.py.

Verified:
- 173/173 pytest compliance/tests/ tests/contracts/ pass
- OpenAPI 360 paths / 484 operations unchanged (contract test green)
- All new files under the 500-line hard cap (largest: isms_audit.py
  at 431, isms_governance.py at 343, dashboard.py at 195)
- No file in compliance/schemas/ or compliance/api/schemas.py
  exceeds the hard cap
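
The line-cap check in the list above could be reproduced with a small
script along these lines. This is a sketch only: the helper name
files_over_cap is hypothetical, and the commit does not say which tool
actually enforced the cap. A file is treated as an offender when it has
more than 500 lines.

```python
from pathlib import Path
from typing import Dict

HARD_CAP = 500  # line cap named in the commit message


def files_over_cap(root: Path, cap: int = HARD_CAP) -> Dict[str, int]:
    """Return {relative path: line count} for every .py file above the cap."""
    offenders: Dict[str, int] = {}
    for path in sorted(root.rglob("*.py")):
        with path.open(encoding="utf-8") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > cap:
            offenders[str(path.relative_to(root))] = line_count
    return offenders
```

Calling files_over_cap(Path("compliance/schemas")) from the repository
root would be expected to return an empty dict if the claim holds.
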
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
59 lines
2.0 KiB
Python
"""
BSI / PDF Extraction Pydantic schemas, extracted from compliance/api/schemas.py.

Phase 1 Step 3: the monolithic ``compliance.api.schemas`` module is being
split per domain under ``compliance.schemas``. This module is re-exported
from ``compliance.api.schemas`` for backwards compatibility.
"""

from datetime import datetime, date
from typing import Optional, List, Any, Dict

from pydantic import BaseModel, ConfigDict, Field

from compliance.schemas.common import (
    PaginationMeta, RegulationType, ControlType, ControlDomain,
    ControlStatus, RiskLevel, EvidenceStatus,
)


# ============================================================================
# PDF Extraction Schemas
# ============================================================================

class BSIAspectResponse(BaseModel):
    """A single extracted BSI-TR Pruefaspekt (test aspect)."""
    aspect_id: str
    title: str
    full_text: str
    category: str
    page_number: int
    section: str
    requirement_level: str
    source_document: str
    keywords: Optional[List[str]] = None
    related_aspects: Optional[List[str]] = None


class PDFExtractionRequest(BaseModel):
    """Request for PDF extraction."""
    document_code: str = Field(..., description="BSI-TR document code, e.g. BSI-TR-03161-2")
    save_to_db: bool = Field(True, description="Whether to save extracted requirements to database")
    force: bool = Field(False, description="Force re-extraction even if requirements exist")


class PDFExtractionResponse(BaseModel):
    """Response from PDF extraction endpoint."""
    # Simple endpoint format (new /pdf/extract/{doc_code})
    doc_code: Optional[str] = None
    total_extracted: Optional[int] = None
    saved_to_db: Optional[int] = None
    aspects: Optional[List[BSIAspectResponse]] = None
    # Legacy scraper endpoint format (/scraper/extract-pdf)
    success: Optional[bool] = None
    source_document: Optional[str] = None
    total_aspects: Optional[int] = None
    statistics: Optional[Dict[str, Any]] = None
    requirements_created: Optional[int] = None