refactor: phase 0 guardrails + phase 1 step 2 (models.py split)
Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).
## Phase 0 — Architecture guardrails
Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:
1. .claude/settings.json PreToolUse hook on Write/Edit rejects any edit
   that would push a file past the 500-line hard cap. Auto-loads in every
   Claude session in this repo.
2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
enforces the LOC cap locally, freezes migrations/ without
[migration-approved], and protects guardrail files without
[guardrail-change].
3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
packages (compliance/{services,repositories,domain,schemas}), and
tsc --noEmit for admin-compliance + developer-portal.
Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.
scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
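The cap logic itself is simple; a minimal Python sketch of what a soft/hard LOC check does (illustrative only — the real enforcement lives in scripts/check-loc.sh as shell, and the two thresholds are the only contract):

```python
from pathlib import Path

SOFT_CAP = 300  # warn: file should be split soon
HARD_CAP = 500  # fail: file may not grow past this

def classify(path: Path) -> str:
    """Return 'ok', 'soft', or 'hard' for one source file based on its LOC."""
    loc = sum(1 for _ in path.open(encoding="utf-8", errors="ignore"))
    if loc > HARD_CAP:
        return "hard"
    if loc > SOFT_CAP:
        return "soft"
    return "ok"
```

Gating only changed files in PRs then means: run `classify` on the `git diff --name-only` set, not the whole tree, so the 205-file legacy baseline never fails an unrelated PR.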
## Deprecation sweep
47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix on the Python side, not a schema change.
DeprecationWarning count dropped from 158 to 35.
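The distinction matters because utcnow() returns a naive datetime that drivers must reinterpret when writing a timezone=True column; a minimal demonstration of the before/after behavior:

```python
from datetime import datetime, timezone

naive = datetime.utcnow()            # deprecated since Python 3.12; tzinfo is None
aware = datetime.now(timezone.utc)   # replacement; carries explicit UTC tzinfo

# Naive and aware datetimes cannot be ordered or subtracted against each
# other, which is exactly the class of latent bug the sweep removes.
assert naive.tzinfo is None
assert aware.tzinfo is timezone.utc
```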
## Phase 1 Step 1 — Contract test harness
tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
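The path/operation part of that diff reduces to set arithmetic over the two specs; a sketch (assuming the real test additionally walks status codes and required request body fields):

```python
from typing import Any

def removed_operations(baseline: dict[str, Any], live: dict[str, Any]) -> list[str]:
    """Return 'METHOD path' entries present in the baseline but missing live."""
    missing: list[str] = []
    for path, ops in baseline.get("paths", {}).items():
        live_ops = live.get("paths", {}).get(path, {})
        for method in ops:
            if method not in live_ops:
                missing.append(f"{method.upper()} {path}")
    return sorted(missing)
```

A non-empty return fails the test run; regenerating the baseline is the only sanctioned way to make an intentional contract change pass.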
## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)
compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):
- regulation_models.py (134) — Regulation, Requirement
- control_models.py (279) — Control, Mapping, Evidence, Risk
- ai_system_models.py (141) — AISystem, AuditExport
- service_module_models.py (176) — ServiceModule, ModuleRegulation, ModuleRisk
- audit_session_models.py (177) — AuditSession, AuditSignOff
- isms_governance_models.py (323) — ISMSScope, Context, Policy, Objective, SoA
- isms_audit_models.py (468) — Finding, CAPA, MgmtReview, InternalAudit,
  AuditTrail, Readiness
models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.
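The shim is nothing more than ordered re-exports; a hypothetical excerpt (module and class names taken from the list above, import order following FK/relationship dependencies so SQLAlchemy registers mappers exactly as the old monolith did):

```python
# compliance/db/models.py — compatibility shim (sketch, not the actual file)
from .regulation_models import Regulation, Requirement          # noqa: F401
from .control_models import Control, Mapping, Evidence, Risk    # noqa: F401
from .ai_system_models import AISystem, AuditExport             # noqa: F401
# ... remaining aggregates in dependency order ...
```

Callers keep writing `from compliance.db.models import Control` and never see the split.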
## Phase 1 Step 3 — infrastructure only
backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
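A plausible shape for that hierarchy (sketch only; the actual definitions live in compliance/domain/, and `get_regulation` below is a hypothetical service function for illustration):

```python
class DomainError(Exception):
    """Base for errors services raise instead of framework HTTPException."""

class NotFoundError(DomainError): ...
class ConflictError(DomainError): ...
class ValidationError(DomainError): ...
class PermissionError(DomainError): ...  # deliberately shadows the builtin inside this package

def get_regulation(repo: dict, regulation_id: str) -> str:
    """Hypothetical service: raises domain errors; routers translate to HTTP."""
    try:
        return repo[regulation_id]
    except KeyError:
        raise NotFoundError(f"regulation {regulation_id} not found") from None
```

Routers then own the single place where NotFoundError becomes a 404, keeping HTTP concerns out of the service layer.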
PHASE1_RUNBOOK.md at backend-compliance/PHASE1_RUNBOOK.md documents
the nine-step execution plan for Phase 1: snapshot baseline,
characterization tests, split models.py (this commit), split schemas.py
(next), extract services, extract repositories, mypy --strict, coverage.
## Verification
backend-compliance/.venv-phase1: uv python install 3.12 + pip install -r requirements.txt
PYTHONPATH=. pytest compliance/tests/ tests/contracts/
-> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Representative hunks from the datetime.utcnow() sweep (reconstructed from the commit diff):

```diff
@@ -16,7 +16,7 @@ Uses reportlab for PDF generation (lightweight, no external dependencies).
 import io
 import logging
-from datetime import datetime
+from datetime import datetime, timezone
 from typing import Dict, List, Any, Optional, Tuple

 from sqlalchemy.orm import Session
@@ -255,7 +255,7 @@ class AuditPDFGenerator:
         doc.build(story)

         # Generate filename
-        date_str = datetime.utcnow().strftime('%Y%m%d')
+        date_str = datetime.now(timezone.utc).strftime('%Y%m%d')
         filename = f"audit_report_{session.name.replace(' ', '_')}_{date_str}.pdf"

         return buffer.getvalue(), filename
@@ -429,7 +429,7 @@ class AuditPDFGenerator:
         story.append(Spacer(1, 30*mm))
         gen_label = 'Generiert am' if language == 'de' else 'Generated on'
         story.append(Paragraph(
-            f"{gen_label}: {datetime.utcnow().strftime('%d.%m.%Y %H:%M')} UTC",
+            f"{gen_label}: {datetime.now(timezone.utc).strftime('%d.%m.%Y %H:%M')} UTC",
             self.styles['Footer']
         ))
@@ -11,7 +11,7 @@ Sprint 6: CI/CD Evidence Collection (2026-01-18)
 """

 import logging
-from datetime import datetime
+from datetime import datetime, timezone
 from typing import Dict, List, Optional
 from dataclasses import dataclass
 from enum import Enum
@@ -140,7 +140,7 @@ class AutoRiskUpdater:
         if new_status != old_status:
             control.status = ControlStatusEnum(new_status)
             control.status_notes = self._generate_status_notes(scan_result)
-            control.updated_at = datetime.utcnow()
+            control.updated_at = datetime.now(timezone.utc)
             control_updated = True
             logger.info(f"Control {scan_result.control_id} status changed: {old_status} -> {new_status}")
@@ -225,7 +225,7 @@ class AutoRiskUpdater:
             source="ci_pipeline",
             ci_job_id=scan_result.ci_job_id,
             status=EvidenceStatusEnum.VALID,
-            valid_from=datetime.utcnow(),
+            valid_from=datetime.now(timezone.utc),
             collected_at=scan_result.timestamp,
         )
@@ -298,8 +298,8 @@ class AutoRiskUpdater:
             risk_updated = True

         if risk_updated:
-            risk.last_assessed_at = datetime.utcnow()
-            risk.updated_at = datetime.utcnow()
+            risk.last_assessed_at = datetime.now(timezone.utc)
+            risk.updated_at = datetime.now(timezone.utc)
             affected_risks.append(risk.risk_id)
             logger.info(f"Updated risk {risk.risk_id} due to control {control.control_id} status change")
@@ -354,7 +354,7 @@ class AutoRiskUpdater:
         try:
             ts = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
         except (ValueError, AttributeError):
-            ts = datetime.utcnow()
+            ts = datetime.now(timezone.utc)

         # Determine scan type from evidence_type
         scan_type = ScanType.SAST  # Default
@@ -16,7 +16,7 @@ import os
 import shutil
 import tempfile
 import zipfile
-from datetime import datetime, date
+from datetime import datetime, date, timezone
 from pathlib import Path
 from typing import Dict, List, Optional, Any
@@ -98,7 +98,7 @@ class AuditExportGenerator:
         export_record.file_hash = file_hash
         export_record.file_size_bytes = file_size
         export_record.status = ExportStatusEnum.COMPLETED
-        export_record.completed_at = datetime.utcnow()
+        export_record.completed_at = datetime.now(timezone.utc)

         # Calculate statistics
         stats = self._calculate_statistics(
@@ -11,7 +11,7 @@ Similar pattern to edu-search and zeugnisse-crawler.

 import logging
 import re
-from datetime import datetime
+from datetime import datetime, timezone
 from typing import Dict, List, Any, Optional
 from enum import Enum
@@ -198,7 +198,7 @@ class RegulationScraperService:
     async def scrape_all(self) -> Dict[str, Any]:
         """Scrape all known regulation sources."""
         self.status = ScraperStatus.RUNNING
-        self.stats["last_run"] = datetime.utcnow().isoformat()
+        self.stats["last_run"] = datetime.now(timezone.utc).isoformat()

         results = {
             "success": [],
@@ -11,7 +11,7 @@ Reports include:
 """

 import logging
-from datetime import datetime, date, timedelta
+from datetime import datetime, date, timedelta, timezone
 from typing import Dict, List, Any, Optional
 from enum import Enum
@@ -75,7 +75,7 @@ class ComplianceReportGenerator:

         report = {
             "report_metadata": {
-                "generated_at": datetime.utcnow().isoformat(),
+                "generated_at": datetime.now(timezone.utc).isoformat(),
                 "period": period.value,
                 "as_of_date": as_of_date.isoformat(),
                 "date_range_start": date_range["start"].isoformat(),
@@ -415,7 +415,7 @@ class ComplianceReportGenerator:
         evidence_stats = self.evidence_repo.get_statistics()

         return {
-            "generated_at": datetime.utcnow().isoformat(),
+            "generated_at": datetime.now(timezone.utc).isoformat(),
             "compliance_score": stats.get("compliance_score", 0),
             "controls": {
                 "total": stats.get("total", 0),
```