refactor: phase 0 guardrails + phase 1 step 2 (models.py split)

Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).

## Phase 0 — Architecture guardrails

Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:

  1. .claude/settings.json PreToolUse hook on Write/Edit blocks any file
     that would exceed the 500-line hard cap. Auto-loads in every Claude
     session in this repo.
  2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
     enforces the LOC cap locally, freezes migrations/ without
     [migration-approved], and protects guardrail files without
     [guardrail-change].
  3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
     sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
     packages (compliance/{services,repositories,domain,schemas}), and
     tsc --noEmit for admin-compliance + developer-portal.

Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.

scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
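
The two-tier budget above can be sketched in Python. This is not the contents of scripts/check-loc.sh; `classify` and `check_files` are hypothetical helpers, and treating "exceeds" as strictly greater than the cap is an assumption.

```python
from pathlib import Path

SOFT_CAP = 300  # soft limit quoted in the commit message
HARD_CAP = 500  # hard limit quoted in the commit message


def classify(line_count: int) -> str:
    """Return 'ok', 'soft', or 'hard' for a file's line count.

    Assumption: a file violates a cap only when it has MORE lines than the cap.
    """
    if line_count > HARD_CAP:
        return "hard"
    if line_count > SOFT_CAP:
        return "soft"
    return "ok"


def check_files(paths: list[Path]) -> dict[str, list[str]]:
    """Group the given files by violation level (hypothetical helper)."""
    report: dict[str, list[str]] = {"ok": [], "soft": [], "hard": []}
    for path in paths:
        with path.open(encoding="utf-8", errors="replace") as handle:
            count = sum(1 for _ in handle)
        report[classify(count)].append(path.name)
    return report
```

Passing only the files changed in a PR to `check_files` would mirror the "gate only CHANGED files" behaviour, leaving the legacy baseline untouched.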

## Deprecation sweep

47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix at the Python side, not a schema change.
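
A minimal sketch of why the swap matters, using only the standard library. `utc_now` is a hypothetical helper name; the commit itself inlines `datetime.now(timezone.utc)`.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


aware = utc_now()
# Same wall-clock values with tzinfo stripped: the shape utcnow() returns.
naive = aware.replace(tzinfo=None)

# Ordering comparisons between naive and aware datetimes raise TypeError,
# which is the latent bug when a naive value meets a timezone-aware column.
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```

A function like `utc_now` can also be passed uncalled as a column `default=` callable, so it is evaluated per insert rather than once at import time.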

DeprecationWarning count dropped from 158 to 35.

## Phase 1 Step 1 — Contract test harness

tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
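
The three failure rules can be illustrated on toy schemas. This sketch collects messages instead of asserting, unlike the real test; the `/items` and `/health` paths and the `breaking_changes` helper are hypothetical.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "options", "head"}


def operations(schema: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Flatten an OpenAPI document into a {'METHOD /path': operation} map."""
    return {
        f"{method.upper()} {path}": op
        for path, methods in schema.get("paths", {}).items()
        for method, op in methods.items()
        if method.lower() in HTTP_METHODS
    }


def required_fields(op: dict[str, Any]) -> set[str]:
    """Required request-body fields of the first declared media type."""
    content = (op.get("requestBody") or {}).get("content") or {}
    for media in content.values():
        return set((media.get("schema") or {}).get("required") or [])
    return set()


def breaking_changes(baseline: dict[str, Any], live: dict[str, Any]) -> list[str]:
    """List contract violations of `live` relative to `baseline`."""
    problems: list[str] = []
    base_ops, live_ops = operations(baseline), operations(live)
    for key in sorted(base_ops):
        if key not in live_ops:
            problems.append(f"removed operation: {key}")
            continue
        base_codes = set(base_ops[key].get("responses") or {})
        live_codes = set(live_ops[key].get("responses") or {})
        if base_codes - live_codes:
            problems.append(f"{key}: removed status code(s) {sorted(base_codes - live_codes)}")
        added = required_fields(live_ops[key]) - required_fields(base_ops[key])
        if added:
            problems.append(f"{key}: new required field(s) {sorted(added)}")
    return problems


def _post(required: list[str]) -> dict[str, Any]:
    body = {"content": {"application/json": {"schema": {"required": required}}}}
    return {"requestBody": body, "responses": {"201": {}}}


baseline = {"paths": {"/items": {
    "get": {"responses": {"200": {}, "404": {}}},
    "post": _post(["name"]),
}}}
live = {"paths": {
    "/items": {"get": {"responses": {"200": {}}}, "post": _post(["name", "owner"])},
    "/health": {"get": {"responses": {"200": {}}}},  # additive, so allowed
}}

problems = breaking_changes(baseline, live)
```

Here `problems` reports the dropped 404 on `GET /items` and the new required `owner` field on `POST /items`, while the added `/health` path passes.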

## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)

compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):

  regulation_models.py       (134) — Regulation, Requirement
  control_models.py          (279) — Control, Mapping, Evidence, Risk
  ai_system_models.py        (141) — AISystem, AuditExport
  service_module_models.py   (176) — ServiceModule, ModuleRegulation, ModuleRisk
  audit_session_models.py    (177) — AuditSession, AuditSignOff
  isms_governance_models.py  (323) — ISMSScope, Context, Policy, Objective, SoA
  isms_audit_models.py       (468) — Finding, CAPA, MgmtReview, InternalAudit,
                                     AuditTrail, Readiness

models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
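
The re-export shim pattern can be demonstrated with a throwaway package. The `demo_db` package, its two classes, and their table names are hypothetical stand-ins, not the real compliance/db/ modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk so the imports below are real imports.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_db"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "regulation_models.py").write_text(
    "class Regulation:\n    __tablename__ = 'regulations'\n"
)
(pkg / "control_models.py").write_text(
    "class Control:\n    __tablename__ = 'controls'\n"
)
# The shim defines nothing itself; it only re-exports, in dependency order.
(pkg / "models.py").write_text(
    "from .regulation_models import Regulation  # noqa: F401\n"
    "from .control_models import Control  # noqa: F401\n"
    "__all__ = ['Regulation', 'Control']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

legacy = importlib.import_module("demo_db.models")
split = importlib.import_module("demo_db.regulation_models")

# Old and new import paths resolve to the very same class object.
same_class = legacy.Regulation is split.Regulation
```

Because both paths yield one class object, code that still imports from the shim shares identity with code that imports from the new sibling module.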

All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.

## Phase 1 Step 3 — infrastructure only

backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
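
A minimal sketch of how such a hierarchy might be used. Only the five class names come from this commit; the subclassing of DomainError, the status-code mapping, and the `status_for` and `get_regulation` helpers are assumptions for illustration.

```python
class DomainError(Exception):
    """Base class for domain-level failures (assumed root of the hierarchy)."""


class NotFoundError(DomainError):
    pass


class ConflictError(DomainError):
    pass


class ValidationError(DomainError):
    pass


class PermissionError(DomainError):  # note: shadows the builtin in this module
    pass


# Hypothetical mapping a thin HTTP layer could apply at the API boundary.
STATUS_BY_ERROR: dict[type[DomainError], int] = {
    NotFoundError: 404,
    ConflictError: 409,
    ValidationError: 422,
    PermissionError: 403,
}


def status_for(exc: DomainError) -> int:
    """Translate a domain error into an HTTP status code (assumed mapping)."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(exc, error_type):
            return status
    return 400


def get_regulation(store: dict[str, str], regulation_id: str) -> str:
    """Service-style lookup that raises a domain error, not an HTTP one."""
    try:
        return store[regulation_id]
    except KeyError:
        raise NotFoundError(f"regulation {regulation_id!r} not found") from None
```

Keeping HTTP concerns out of the service means the same function can be called from a route, a background job, or a test without importing the web framework.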

backend-compliance/PHASE1_RUNBOOK.md documents the nine-step execution
plan for Phase 1, which includes: snapshot baseline, characterization
tests, split models.py (this commit), split schemas.py (next), extract
services, extract repositories, mypy --strict, and coverage.

## Verification

  backend-compliance/.venv-phase1: uv python install 3.12 + pip install -r requirements.txt
  PYTHONPATH=. pytest compliance/tests/ tests/contracts/
  -> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Author: Sharang Parnerkar
Date:   2026-04-07 13:18:29 +02:00
Parent: 1dfea51919
Commit: 3320ef94fc
84 changed files with 52849 additions and 1731 deletions

File diff suppressed because it is too large.


@@ -0,0 +1,25 @@
#!/usr/bin/env python3
"""Regenerate the OpenAPI baseline.

Run this ONLY when you have intentionally made an additive API change and want
the contract test to pick up the new baseline. Removing or renaming anything is
a breaking change and requires updating every consumer in the same change set.

Usage:
    python tests/contracts/regenerate_baseline.py
"""
from __future__ import annotations

import json
import sys
from pathlib import Path

THIS_DIR = Path(__file__).parent
REPO_ROOT = THIS_DIR.parent.parent  # backend-compliance/
sys.path.insert(0, str(REPO_ROOT))

from main import app  # type: ignore[import-not-found]  # noqa: E402

out = THIS_DIR / "openapi.baseline.json"
out.write_text(json.dumps(app.openapi(), indent=2, sort_keys=True) + "\n")
print(f"wrote {out}")


@@ -0,0 +1,102 @@
"""OpenAPI contract test.

This test pins the public HTTP contract of backend-compliance. It loads the
FastAPI app, extracts the live OpenAPI schema, and compares it against a
checked-in baseline at ``tests/contracts/openapi.baseline.json``.

Rules:
  - Adding new paths/operations/fields → OK (additive change).
  - Removing a path, changing a method, changing a status code, removing or
    renaming a response/request field → FAIL. Such changes require updating
    every consumer (admin-compliance, developer-portal, SDKs) in the same
    change, then regenerating the baseline with:
        python tests/contracts/regenerate_baseline.py
    and explaining the contract change in the PR description.

The baseline is missing on first run: the test prints the command to create
it and skips. This is intentional: Phase 1 step 1 generates it fresh from the
current app state before any refactoring begins.
"""
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

import pytest

BASELINE_PATH = Path(__file__).parent / "openapi.baseline.json"


def _load_live_schema() -> dict[str, Any]:
    """Import the FastAPI app and extract its OpenAPI schema.

    Kept inside the function so that test collection does not fail if the app
    has import-time side effects that aren't satisfied in the test env.
    """
    from main import app  # type: ignore[import-not-found]

    return app.openapi()


def _collect_operations(schema: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return a flat {f'{METHOD} {path}': operation} map for diffing."""
    out: dict[str, dict[str, Any]] = {}
    for path, methods in schema.get("paths", {}).items():
        for method, op in methods.items():
            if method.lower() in {"get", "post", "put", "patch", "delete", "options", "head"}:
                out[f"{method.upper()} {path}"] = op
    return out


@pytest.mark.contract
def test_openapi_no_breaking_changes() -> None:
    if not BASELINE_PATH.exists():
        pytest.skip(
            f"Baseline missing. Run: python {Path(__file__).parent}/regenerate_baseline.py"
        )

    baseline = json.loads(BASELINE_PATH.read_text())
    live = _load_live_schema()

    baseline_ops = _collect_operations(baseline)
    live_ops = _collect_operations(live)

    # 1. No operation may disappear.
    removed = sorted(set(baseline_ops) - set(live_ops))
    assert not removed, (
        f"Breaking change: {len(removed)} operation(s) removed from public API:\n "
        + "\n ".join(removed)
    )

    # 2. For operations that exist in both, response status codes must be a superset.
    for key, baseline_op in baseline_ops.items():
        live_op = live_ops[key]
        baseline_codes = set((baseline_op.get("responses") or {}).keys())
        live_codes = set((live_op.get("responses") or {}).keys())
        missing = baseline_codes - live_codes
        assert not missing, (
            f"Breaking change: {key} no longer returns status code(s) {sorted(missing)}"
        )

    # 3. Required request-body fields may not be added (would break existing clients).
    for key, baseline_op in baseline_ops.items():
        live_op = live_ops[key]
        base_req = _required_body_fields(baseline_op)
        live_req = _required_body_fields(live_op)
        new_required = live_req - base_req
        assert not new_required, (
            f"Breaking change: {key} added required request field(s) {sorted(new_required)}"
        )


def _required_body_fields(op: dict[str, Any]) -> set[str]:
    """Return the required fields of the operation's request body.

    Only the first declared media type is inspected, and only an inline
    ``required`` list is read; ``$ref`` schemas are not resolved.
    """
    rb = op.get("requestBody") or {}
    content = rb.get("content") or {}
    for media in content.values():
        schema = media.get("schema") or {}
        return set(schema.get("required") or [])
    return set()


@@ -10,7 +10,7 @@ import pytest
 import uuid
 import os
 import sys
-from datetime import datetime
+from datetime import datetime, timezone
 from unittest.mock import MagicMock
 from fastapi import FastAPI
@@ -51,7 +51,7 @@ _RawSessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
 @event.listens_for(engine, "connect")
 def _register_sqlite_functions(dbapi_conn, connection_record):
     """Register PostgreSQL-compatible functions for SQLite."""
-    dbapi_conn.create_function("NOW", 0, lambda: datetime.utcnow().isoformat())
+    dbapi_conn.create_function("NOW", 0, lambda: datetime.now(timezone.utc).isoformat())
 TENANT_ID = "default"


@@ -6,7 +6,7 @@ Pattern: app.dependency_overrides[get_db] for FastAPI DI.
 import uuid
 import os
 import sys
-from datetime import datetime, timedelta
+from datetime import datetime, timedelta, timezone
 import pytest
 from fastapi import FastAPI
@@ -75,7 +75,7 @@ def db_session():
 def _create_dsr_in_db(db, **kwargs):
     """Helper to create a DSR directly in DB."""
-    now = datetime.utcnow()
+    now = datetime.now(timezone.utc)
     defaults = {
         "tenant_id": uuid.UUID(TENANT_ID),
         "request_number": f"DSR-2026-{str(uuid.uuid4())[:6].upper()}",
@@ -241,8 +241,8 @@ class TestListDSR:
         assert len(data["requests"]) == 2
     def test_list_overdue_only(self, db_session):
-        _create_dsr_in_db(db_session, deadline_at=datetime.utcnow() - timedelta(days=5), status="processing")
-        _create_dsr_in_db(db_session, deadline_at=datetime.utcnow() + timedelta(days=20), status="processing")
+        _create_dsr_in_db(db_session, deadline_at=datetime.now(timezone.utc) - timedelta(days=5), status="processing")
+        _create_dsr_in_db(db_session, deadline_at=datetime.now(timezone.utc) + timedelta(days=20), status="processing")
         resp = client.get("/api/compliance/dsr?overdue_only=true", headers=HEADERS)
         assert resp.status_code == 200
@@ -339,7 +339,7 @@ class TestDSRStats:
         _create_dsr_in_db(db_session, status="intake", request_type="access")
         _create_dsr_in_db(db_session, status="processing", request_type="erasure")
         _create_dsr_in_db(db_session, status="completed", request_type="access",
-                          completed_at=datetime.utcnow())
+                          completed_at=datetime.now(timezone.utc))
         resp = client.get("/api/compliance/dsr/stats", headers=HEADERS)
         assert resp.status_code == 200
@@ -561,9 +561,9 @@ class TestDeadlineProcessing:
     def test_process_deadlines_with_overdue(self, db_session):
         _create_dsr_in_db(db_session, status="processing",
-                          deadline_at=datetime.utcnow() - timedelta(days=5))
+                          deadline_at=datetime.now(timezone.utc) - timedelta(days=5))
         _create_dsr_in_db(db_session, status="processing",
-                          deadline_at=datetime.utcnow() + timedelta(days=20))
+                          deadline_at=datetime.now(timezone.utc) + timedelta(days=20))
         resp = client.post("/api/compliance/dsr/deadlines/process", headers=HEADERS)
         assert resp.status_code == 200
@@ -609,7 +609,7 @@ class TestDSRTemplates:
             subject="Bestaetigung",
             body_html="<p>Test</p>",
             status="published",
-            published_at=datetime.utcnow(),
+            published_at=datetime.now(timezone.utc),
         )
         db_session.add(v)
         db_session.commit()


@@ -7,7 +7,7 @@ Consent widerrufen, Statistiken.
 import pytest
 from unittest.mock import MagicMock, patch
-from datetime import datetime
+from datetime import datetime, timezone
 import uuid
@@ -25,7 +25,7 @@ def make_catalog(tenant_id='test-tenant'):
     rec.tenant_id = tenant_id
     rec.selected_data_point_ids = ['dp-001', 'dp-002']
     rec.custom_data_points = []
-    rec.updated_at = datetime.utcnow()
+    rec.updated_at = datetime.now(timezone.utc)
     return rec
@@ -34,7 +34,7 @@ def make_company(tenant_id='test-tenant'):
     rec.id = uuid.uuid4()
     rec.tenant_id = tenant_id
     rec.data = {'company_name': 'Test GmbH', 'email': 'datenschutz@test.de'}
-    rec.updated_at = datetime.utcnow()
+    rec.updated_at = datetime.now(timezone.utc)
     return rec
@@ -47,7 +47,7 @@ def make_cookies(tenant_id='test-tenant'):
         {'id': 'analytics', 'name': 'Analyse', 'isRequired': False, 'defaultEnabled': False},
     ]
     rec.config = {'position': 'bottom', 'style': 'bar'}
-    rec.updated_at = datetime.utcnow()
+    rec.updated_at = datetime.now(timezone.utc)
     return rec
@@ -58,13 +58,13 @@ def make_consent(tenant_id='test-tenant', user_id='user-001', data_point_id='dp-
     rec.user_id = user_id
     rec.data_point_id = data_point_id
     rec.granted = granted
-    rec.granted_at = datetime.utcnow()
+    rec.granted_at = datetime.now(timezone.utc)
     rec.revoked_at = None
     rec.consent_version = '1.0'
     rec.source = 'website'
     rec.ip_address = None
     rec.user_agent = None
-    rec.created_at = datetime.utcnow()
+    rec.created_at = datetime.now(timezone.utc)
     return rec
@@ -263,7 +263,7 @@ class TestConsentDB:
             user_id='user-001',
             data_point_id='dp-marketing',
             granted=True,
-            granted_at=datetime.utcnow(),
+            granted_at=datetime.now(timezone.utc),
             consent_version='1.0',
             source='website',
         )
@@ -276,13 +276,13 @@ class TestConsentDB:
         consent = make_consent()
         assert consent.revoked_at is None
-        consent.revoked_at = datetime.utcnow()
+        consent.revoked_at = datetime.now(timezone.utc)
         assert consent.revoked_at is not None
     def test_cannot_revoke_already_revoked(self):
         """Should not be possible to revoke an already revoked consent."""
         consent = make_consent()
-        consent.revoked_at = datetime.utcnow()
+        consent.revoked_at = datetime.now(timezone.utc)
         # Simulate the guard logic from the route
         already_revoked = consent.revoked_at is not None
@@ -315,7 +315,7 @@ class TestConsentStats:
             make_consent(user_id='user-2', data_point_id='dp-1', granted=True),
         ]
         # Revoke one
-        consents[1].revoked_at = datetime.utcnow()
+        consents[1].revoked_at = datetime.now(timezone.utc)
         total = len(consents)
         active = sum(1 for c in consents if c.granted and not c.revoked_at)
@@ -334,7 +334,7 @@ class TestConsentStats:
             make_consent(user_id='user-2', granted=True),
             make_consent(user_id='user-3', granted=True),
         ]
-        consents[2].revoked_at = datetime.utcnow() # user-3 revoked
+        consents[2].revoked_at = datetime.now(timezone.utc) # user-3 revoked
         unique_users = len(set(c.user_id for c in consents))
         users_with_active = len(set(c.user_id for c in consents if c.granted and not c.revoked_at))
@@ -501,7 +501,7 @@ class TestConsentHistoryTracking:
         from compliance.db.einwilligungen_models import EinwilligungenConsentHistoryDB
         consent = make_consent()
-        consent.revoked_at = datetime.utcnow()
+        consent.revoked_at = datetime.now(timezone.utc)
         entry = EinwilligungenConsentHistoryDB(
             consent_id=consent.id,
             tenant_id=consent.tenant_id,
@@ -516,7 +516,7 @@ class TestConsentHistoryTracking:
         entry_id = _uuid.uuid4()
         consent_id = _uuid.uuid4()
-        now = datetime.utcnow()
+        now = datetime.now(timezone.utc)
         row = {
             "id": str(entry_id),


@@ -13,7 +13,7 @@ Run with: cd backend-compliance && python3 -m pytest tests/test_isms_routes.py -
 import os
 import sys
 import pytest
-from datetime import date, datetime
+from datetime import date, datetime, timezone
 from fastapi import FastAPI
 from fastapi.testclient import TestClient
@@ -40,7 +40,7 @@ def _set_sqlite_pragma(dbapi_conn, connection_record):
     cursor = dbapi_conn.cursor()
     cursor.execute("PRAGMA foreign_keys=ON")
     cursor.close()
-    dbapi_conn.create_function("NOW", 0, lambda: datetime.utcnow().isoformat())
+    dbapi_conn.create_function("NOW", 0, lambda: datetime.now(timezone.utc).isoformat())
 app = FastAPI()


@@ -7,7 +7,7 @@ Rejection-Flow, approval history.
 import pytest
 from unittest.mock import MagicMock, patch
-from datetime import datetime
+from datetime import datetime, timezone
 import uuid
@@ -27,7 +27,7 @@ def make_document(type='privacy_policy', name='Datenschutzerklärung', tenant_id
     doc.name = name
     doc.description = 'Test description'
     doc.mandatory = False
-    doc.created_at = datetime.utcnow()
+    doc.created_at = datetime.now(timezone.utc)
     doc.updated_at = None
     return doc
@@ -46,7 +46,7 @@ def make_version(document_id=None, version='1.0', status='draft', title='Test Ve
     v.approved_by = None
     v.approved_at = None
     v.rejection_reason = None
-    v.created_at = datetime.utcnow()
+    v.created_at = datetime.now(timezone.utc)
     v.updated_at = None
     return v
@@ -58,7 +58,7 @@ def make_approval(version_id=None, action='created'):
     a.action = action
     a.approver = 'admin@test.de'
     a.comment = None
-    a.created_at = datetime.utcnow()
+    a.created_at = datetime.now(timezone.utc)
     return a
@@ -179,7 +179,7 @@ class TestVersionToResponse:
         from compliance.api.legal_document_routes import _version_to_response
         v = make_version(status='approved')
         v.approved_by = 'dpo@company.de'
-        v.approved_at = datetime.utcnow()
+        v.approved_at = datetime.now(timezone.utc)
         resp = _version_to_response(v)
         assert resp.status == 'approved'
         assert resp.approved_by == 'dpo@company.de'
@@ -254,7 +254,7 @@ class TestApprovalWorkflow:
         # Step 2: Approve
         mock_db.reset_mock()
         _transition(mock_db, str(v.id), ['review'], 'approved', 'approved', 'dpo', 'Korrekt',
-                    extra_updates={'approved_by': 'dpo', 'approved_at': datetime.utcnow()})
+                    extra_updates={'approved_by': 'dpo', 'approved_at': datetime.now(timezone.utc)})
         assert v.status == 'approved'
         # Step 3: Publish


@@ -5,7 +5,7 @@ Tests for Legal Document extended routes (User Consents, Audit Log, Cookie Categ
 import uuid
 import os
 import sys
-from datetime import datetime
+from datetime import datetime, timezone
 import pytest
 from fastapi import FastAPI
@@ -103,7 +103,7 @@ def _publish_version(version_id):
     v = db.query(LegalDocumentVersionDB).filter(LegalDocumentVersionDB.id == vid).first()
     v.status = "published"
     v.approved_by = "admin"
-    v.approved_at = datetime.utcnow()
+    v.approved_at = datetime.now(timezone.utc)
     db.commit()
     db.refresh(v)
     result = {"id": str(v.id), "status": v.status}


@@ -15,7 +15,7 @@ import pytest
 import uuid
 import os
 import sys
-from datetime import datetime
+from datetime import datetime, timezone
 from fastapi import FastAPI
 from fastapi.testclient import TestClient
@@ -40,7 +40,7 @@ TENANT_ID = "default"
 @event.listens_for(engine, "connect")
 def _register_sqlite_functions(dbapi_conn, connection_record):
-    dbapi_conn.create_function("NOW", 0, lambda: datetime.utcnow().isoformat())
+    dbapi_conn.create_function("NOW", 0, lambda: datetime.now(timezone.utc).isoformat())
 class _DictRow(dict):


@@ -186,7 +186,7 @@ class TestActivityToResponse:
         act.next_review_at = kwargs.get("next_review_at", None)
         act.created_by = kwargs.get("created_by", None)
         act.dsfa_id = kwargs.get("dsfa_id", None)
-        act.created_at = datetime.utcnow()
+        act.created_at = datetime.now(timezone.utc)
         act.updated_at = None
         return act
@@ -330,7 +330,7 @@ class TestVVTConsolidationResponse:
         act.next_review_at = kwargs.get("next_review_at", None)
         act.created_by = kwargs.get("created_by", None)
         act.dsfa_id = kwargs.get("dsfa_id", None)
-        act.created_at = datetime.utcnow()
+        act.created_at = datetime.now(timezone.utc)
         act.updated_at = None
         return act


@@ -10,7 +10,7 @@ Verifies that:
 import pytest
 import uuid
 from unittest.mock import MagicMock, AsyncMock, patch
-from datetime import datetime
+from datetime import datetime, timezone
 from fastapi import HTTPException
 from fastapi.testclient import TestClient
@@ -144,8 +144,8 @@ def _make_activity(tenant_id, vvt_id="VVT-001", name="Test", **kwargs):
     act.next_review_at = None
     act.created_by = "system"
     act.dsfa_id = None
-    act.created_at = datetime.utcnow()
-    act.updated_at = datetime.utcnow()
+    act.created_at = datetime.now(timezone.utc)
+    act.updated_at = datetime.now(timezone.utc)
     return act