breakpilot-compliance/backend-compliance/tests/test_legal_document_routes_extended.py
Sharang Parnerkar 3320ef94fc refactor: phase 0 guardrails + phase 1 step 2 (models.py split)
Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).

## Phase 0 — Architecture guardrails

Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:

  1. .claude/settings.json PreToolUse hook on Write/Edit blocks any file
     that would exceed the 500-line hard cap. Auto-loads in every Claude
     session in this repo.
  2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
     enforces the LOC cap locally, freezes migrations/ without
     [migration-approved], and protects guardrail files without
     [guardrail-change].
  3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
     sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
     packages (compliance/{services,repositories,domain,schemas}), and
     tsc --noEmit for admin-compliance + developer-portal.

Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.

scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
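
The soft/hard budget can be sketched as follows. This is a minimal Python illustration, not the contents of scripts/check-loc.sh: the names `count_lines`, `classify`, and `check` are hypothetical, and the real script may count lines differently (for example by skipping blank or generated lines).

```python
from pathlib import Path

SOFT_CAP = 300  # files above this are reported as soft violations
HARD_CAP = 500  # files above this are reported as hard violations


def count_lines(path: Path) -> int:
    """Count physical lines in a text file (assumption: blank lines count too)."""
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


def classify(loc: int) -> str:
    """Return 'ok', 'soft' or 'hard' for a given line count."""
    if loc > HARD_CAP:
        return "hard"
    if loc > SOFT_CAP:
        return "soft"
    return "ok"


def check(paths: list[Path]) -> dict[str, list[Path]]:
    """Group the given files (e.g. only those changed in a PR) by violation level."""
    report: dict[str, list[Path]] = {"ok": [], "soft": [], "hard": []}
    for p in paths:
        report[classify(count_lines(p))].append(p)
    return report
```

Passing only the changed files to `check` mirrors the CI behaviour described above: legacy files over the cap are not inspected unless a PR touches them.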

## Deprecation sweep

47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix at the Python side, not a schema change.

DeprecationWarning count dropped from 158 to 35.
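
A minimal sketch of the datetime part of the sweep, using only the standard library. The helper name `utc_now` is hypothetical; the commit may use a lambda or another helper for the SQLAlchemy defaults.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


# As a column default, pass the callable itself (not its result) so that it
# is evaluated once per inserted row rather than once at import time.
stamp = utc_now()

# utcnow() returned a naive value (tzinfo is None); this one carries UTC.
print(stamp.tzinfo is timezone.utc)
print(stamp.isoformat())
```

The aware value matters because a naive datetime written to a timezone=True column leaves the database driver to guess its offset.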

## Phase 1 Step 1 — Contract test harness

tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
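
The three failure conditions can be sketched as a plain dict comparison. This is an illustration under stated assumptions, not the code in tests/contracts/test_openapi_baseline.py: `breaking_changes` and `_required_body_fields` are hypothetical names, only inline application/json schemas are inspected ($ref resolution is omitted), and a removed operation on a surviving path is treated as a break as well.

```python
from typing import Any

Spec = dict[str, Any]
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def _required_body_fields(operation: Spec) -> set[str]:
    """Required top-level JSON body fields of one operation (inline schemas only)."""
    schema = (
        operation.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {})
    )
    return set(schema.get("required", []))


def breaking_changes(baseline: Spec, live: Spec) -> list[str]:
    """List removed paths, operations and status codes, plus new required fields."""
    problems: list[str] = []
    live_paths = live.get("paths", {})
    for path, base_item in baseline.get("paths", {}).items():
        if path not in live_paths:
            problems.append(f"removed path: {path}")
            continue
        for method, base_op in base_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            live_op = live_paths[path].get(method)
            if live_op is None:
                problems.append(f"removed operation: {method.upper()} {path}")
                continue
            for code in base_op.get("responses", {}):
                if code not in live_op.get("responses", {}):
                    problems.append(f"removed status {code}: {method.upper()} {path}")
            added = _required_body_fields(live_op) - _required_body_fields(base_op)
            for field in sorted(added):
                problems.append(f"new required field {field!r}: {method.upper()} {path}")
    return problems
```

A contract test built on this would assert that `breaking_changes(baseline, live)` is empty. Additive changes (new paths, new optional fields) pass, which is why the baseline only needs regenerating after a deliberate contract change.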

## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)

compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):

  regulation_models.py       (134) — Regulation, Requirement
  control_models.py          (279) — Control, Mapping, Evidence, Risk
  ai_system_models.py        (141) — AISystem, AuditExport
  service_module_models.py   (176) — ServiceModule, ModuleRegulation, ModuleRisk
  audit_session_models.py    (177) — AuditSession, AuditSignOff
  isms_governance_models.py  (323) — ISMSScope, Context, Policy, Objective, SoA
  isms_audit_models.py       (468) — Finding, CAPA, MgmtReview, InternalAudit,
                                     AuditTrail, Readiness

models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
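
Why a re-export shim keeps old imports working can be shown with a self-contained sketch. All module names here (`regulation_models_demo`, `control_models_demo`, `models_demo`) are hypothetical stand-ins registered in sys.modules so the example runs without the repo; the real shim simply imports from the sibling files.

```python
import sys
import types


class Regulation:
    """Stand-in for a model class that moved to a sibling module."""


class Control:
    """Stand-in for a second model class in another sibling module."""


def _register(name: str, **attrs: object) -> types.ModuleType:
    """Create an in-memory module and make it importable by name."""
    mod = types.ModuleType(name)
    mod.__dict__.update(attrs)
    sys.modules[name] = mod
    return mod


_register("regulation_models_demo", Regulation=Regulation)
_register("control_models_demo", Control=Control)

# The shim body: nothing but imports from the siblings, in dependency order.
SHIM_SOURCE = """
from regulation_models_demo import Regulation
from control_models_demo import Control

__all__ = ["Regulation", "Control"]
"""

shim = types.ModuleType("models_demo")
exec(SHIM_SOURCE, shim.__dict__)
sys.modules["models_demo"] = shim

# Existing call sites keep importing from the old module name and receive
# the very same class objects, so ORM mappings are not duplicated.
from models_demo import Regulation as ImportedRegulation

print(ImportedRegulation is Regulation)
```

Because the shim re-exports the same objects rather than redefining them, each table is declared exactly once.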

All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.

## Phase 1 Step 3 — infrastructure only

backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
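
A minimal sketch of such a hierarchy and of how a route layer could translate it. The class names come from the list above; the status-code mapping and the `http_status_for` helper are assumptions for illustration, not the contents of compliance/domain/.

```python
class DomainError(Exception):
    """Base class for errors raised by services, independent of HTTP."""


class NotFoundError(DomainError):
    """The requested entity does not exist."""


class ConflictError(DomainError):
    """The operation conflicts with current state."""


class ValidationError(DomainError):
    """The input violates a domain rule."""


class PermissionError(DomainError):  # note: shadows the builtin in this module
    """The caller is not allowed to perform the operation."""


# Assumed mapping; a router or exception handler would own this table.
STATUS_BY_ERROR: dict[type[DomainError], int] = {
    NotFoundError: 404,
    ConflictError: 409,
    ValidationError: 422,
    PermissionError: 403,
}


def http_status_for(exc: DomainError) -> int:
    """Pick the status of the nearest mapped class in the exception's MRO."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    return 500  # unmapped domain errors surface as server errors
```

Keeping the mapping at the edge means services can be unit-tested without FastAPI, and a subclass of NotFoundError still resolves to 404 through the MRO walk.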

backend-compliance/PHASE1_RUNBOOK.md documents the nine-step execution
plan for Phase 1, including: snapshot baseline, characterization tests,
split models.py (this commit), split schemas.py (next), extract
services, extract repositories, mypy --strict, coverage.

## Verification

  backend-compliance/.venv-phase1: uv python install 3.12 + pip install -r requirements.txt
  PYTHONPATH=. pytest compliance/tests/ tests/contracts/
  -> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:18:29 +02:00

428 lines
16 KiB
Python

"""
Tests for Legal Document extended routes (User Consents, Audit Log, Cookie Categories, Public endpoints).
"""
import uuid
import os
import sys
from datetime import datetime, timezone
import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))

from classroom_engine.database import Base, get_db
from compliance.db.legal_document_models import (
    LegalDocumentDB, LegalDocumentVersionDB, LegalDocumentApprovalDB,
)
from compliance.db.legal_document_extend_models import (
    UserConsentDB, ConsentAuditLogDB, CookieCategoryDB,
)
from compliance.api.legal_document_routes import router as legal_document_router

SQLALCHEMY_DATABASE_URL = "sqlite:///./test_legal_docs_ext.db"
engine = create_engine(SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False})
TestingSessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

TENANT_ID = "9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
HEADERS = {"X-Tenant-ID": TENANT_ID}

app = FastAPI()
app.include_router(legal_document_router, prefix="/api/compliance")


def override_get_db():
    db = TestingSessionLocal()
    try:
        yield db
    finally:
        db.close()


app.dependency_overrides[get_db] = override_get_db
client = TestClient(app)


@pytest.fixture(autouse=True)
def setup_db():
    Base.metadata.create_all(bind=engine)
    yield
    Base.metadata.drop_all(bind=engine)


# =============================================================================
# Helpers — use raw SQLAlchemy to avoid UUID-string issue in SQLite
# =============================================================================

def _create_document(doc_type="privacy_policy", name="Datenschutzerklaerung"):
    """Create a doc directly via SQLAlchemy and return dict with string id."""
    db = TestingSessionLocal()
    doc = LegalDocumentDB(
        tenant_id=TENANT_ID,
        type=doc_type,
        name=name,
    )
    db.add(doc)
    db.commit()
    db.refresh(doc)
    result = {"id": str(doc.id), "type": doc.type, "name": doc.name}
    db.close()
    return result


def _create_version(document_id, version="1.0", title="DSE v1", content="<p>Content</p>"):
    """Create a version directly via SQLAlchemy."""
    db = TestingSessionLocal()
    # Accept either a string id (as returned by the helpers) or a UUID object.
    doc_uuid = uuid.UUID(document_id) if isinstance(document_id, str) else document_id
    v = LegalDocumentVersionDB(
        document_id=doc_uuid,
        version=version,
        title=title,
        content=content,
        language="de",
        status="draft",
    )
    db.add(v)
    db.commit()
    db.refresh(v)
    result = {"id": str(v.id), "document_id": str(v.document_id), "version": v.version, "status": v.status}
    db.close()
    return result


def _publish_version(version_id):
    """Directly set version to published via SQLAlchemy."""
    db = TestingSessionLocal()
    # Accept either a string id (as returned by the helpers) or a UUID object.
    vid = uuid.UUID(version_id) if isinstance(version_id, str) else version_id
    v = db.query(LegalDocumentVersionDB).filter(LegalDocumentVersionDB.id == vid).first()
    v.status = "published"
    v.approved_by = "admin"
    v.approved_at = datetime.now(timezone.utc)
    db.commit()
    db.refresh(v)
    result = {"id": str(v.id), "status": v.status}
    db.close()
    return result


# =============================================================================
# Public Endpoints
# =============================================================================

class TestPublicDocuments:

    def test_list_public_empty(self):
        r = client.get("/api/compliance/legal-documents/public", headers=HEADERS)
        assert r.status_code == 200
        assert r.json() == []

    def test_list_public_only_published(self):
        doc = _create_document()
        v = _create_version(doc["id"])
        # Still draft, so it should not appear
        r = client.get("/api/compliance/legal-documents/public", headers=HEADERS)
        assert len(r.json()) == 0
        # Publish it
        _publish_version(v["id"])
        r = client.get("/api/compliance/legal-documents/public", headers=HEADERS)
        data = r.json()
        assert len(data) == 1
        assert data[0]["type"] == "privacy_policy"
        assert data[0]["version"] == "1.0"

    def test_get_latest_published(self):
        doc = _create_document()
        v = _create_version(doc["id"])
        _publish_version(v["id"])
        r = client.get("/api/compliance/legal-documents/public/privacy_policy/latest?language=de", headers=HEADERS)
        assert r.status_code == 200
        data = r.json()
        assert data["type"] == "privacy_policy"
        assert data["version"] == "1.0"

    def test_get_latest_not_found(self):
        r = client.get("/api/compliance/legal-documents/public/nonexistent/latest", headers=HEADERS)
        assert r.status_code == 404


# =============================================================================
# User Consents
# =============================================================================

class TestUserConsents:

    def test_record_consent(self):
        doc = _create_document()
        r = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-123",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
            "consented": True,
            "ip_address": "1.2.3.4",
        }, headers=HEADERS)
        assert r.status_code == 200
        data = r.json()
        assert data["user_id"] == "user-123"
        assert data["consented"] is True
        assert data["withdrawn_at"] is None

    def test_record_consent_doc_not_found(self):
        r = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-123",
            "document_id": str(uuid.uuid4()),
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        assert r.status_code == 404

    def test_get_my_consents(self):
        doc = _create_document()
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-A",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-B",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/consents/my?user_id=user-A", headers=HEADERS)
        assert r.status_code == 200
        assert len(r.json()) == 1
        assert r.json()[0]["user_id"] == "user-A"

    def test_check_consent_exists(self):
        doc = _create_document()
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-X",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/consents/check/privacy_policy?user_id=user-X", headers=HEADERS)
        assert r.status_code == 200
        assert r.json()["has_consent"] is True

    def test_check_consent_not_exists(self):
        r = client.get("/api/compliance/legal-documents/consents/check/privacy_policy?user_id=nobody", headers=HEADERS)
        assert r.status_code == 200
        assert r.json()["has_consent"] is False

    def test_withdraw_consent(self):
        doc = _create_document()
        cr = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-W",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        consent_id = cr.json()["id"]
        r = client.delete(f"/api/compliance/legal-documents/consents/{consent_id}", headers=HEADERS)
        assert r.status_code == 200
        assert r.json()["consented"] is False
        assert r.json()["withdrawn_at"] is not None

    def test_withdraw_already_withdrawn(self):
        doc = _create_document()
        cr = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-W2",
            "document_id": doc["id"],
            "document_type": "terms",
        }, headers=HEADERS)
        consent_id = cr.json()["id"]
        client.delete(f"/api/compliance/legal-documents/consents/{consent_id}", headers=HEADERS)
        r = client.delete(f"/api/compliance/legal-documents/consents/{consent_id}", headers=HEADERS)
        assert r.status_code == 400

    def test_check_after_withdraw(self):
        doc = _create_document()
        cr = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "user-CW",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        client.delete(f"/api/compliance/legal-documents/consents/{cr.json()['id']}", headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/consents/check/privacy_policy?user_id=user-CW", headers=HEADERS)
        assert r.json()["has_consent"] is False


# =============================================================================
# Consent Statistics
# =============================================================================

class TestConsentStats:

    def test_stats_empty(self):
        r = client.get("/api/compliance/legal-documents/stats/consents", headers=HEADERS)
        assert r.status_code == 200
        data = r.json()
        assert data["total"] == 0
        assert data["active"] == 0
        assert data["withdrawn"] == 0
        assert data["unique_users"] == 0

    def test_stats_with_data(self):
        doc = _create_document()
        # Two users consent
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "u1", "document_id": doc["id"], "document_type": "privacy_policy",
        }, headers=HEADERS)
        cr = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "u2", "document_id": doc["id"], "document_type": "privacy_policy",
        }, headers=HEADERS)
        # Withdraw one
        client.delete(f"/api/compliance/legal-documents/consents/{cr.json()['id']}", headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/stats/consents", headers=HEADERS)
        data = r.json()
        assert data["total"] == 2
        assert data["active"] == 1
        assert data["withdrawn"] == 1
        assert data["unique_users"] == 2
        assert data["by_type"]["privacy_policy"] == 2


# =============================================================================
# Audit Log
# =============================================================================

class TestAuditLog:

    def test_audit_log_empty(self):
        r = client.get("/api/compliance/legal-documents/audit-log", headers=HEADERS)
        assert r.status_code == 200
        assert r.json()["entries"] == []

    def test_audit_log_after_consent(self):
        doc = _create_document()
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "audit-user",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/audit-log", headers=HEADERS)
        entries = r.json()["entries"]
        assert len(entries) >= 1
        assert entries[0]["action"] == "consent_given"

    def test_audit_log_after_withdraw(self):
        doc = _create_document()
        cr = client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "wd-user",
            "document_id": doc["id"],
            "document_type": "privacy_policy",
        }, headers=HEADERS)
        client.delete(f"/api/compliance/legal-documents/consents/{cr.json()['id']}", headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/audit-log", headers=HEADERS)
        actions = [e["action"] for e in r.json()["entries"]]
        assert "consent_given" in actions
        assert "consent_withdrawn" in actions

    def test_audit_log_filter(self):
        doc = _create_document()
        client.post("/api/compliance/legal-documents/consents", json={
            "user_id": "f-user",
            "document_id": doc["id"],
            "document_type": "terms",
        }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/audit-log?action=consent_given", headers=HEADERS)
        assert r.json()["total"] >= 1
        for e in r.json()["entries"]:
            assert e["action"] == "consent_given"

    def test_audit_log_pagination(self):
        doc = _create_document()
        for i in range(5):
            client.post("/api/compliance/legal-documents/consents", json={
                "user_id": f"p-user-{i}",
                "document_id": doc["id"],
                "document_type": "privacy_policy",
            }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/audit-log?limit=2&offset=0", headers=HEADERS)
        data = r.json()
        assert data["total"] == 5
        assert len(data["entries"]) == 2


# =============================================================================
# Cookie Categories
# =============================================================================

class TestCookieCategories:

    def test_list_empty(self):
        r = client.get("/api/compliance/legal-documents/cookie-categories", headers=HEADERS)
        assert r.status_code == 200
        assert r.json() == []

    def test_create_category(self):
        r = client.post("/api/compliance/legal-documents/cookie-categories", json={
            "name_de": "Notwendig",
            "name_en": "Necessary",
            "is_required": True,
            "sort_order": 0,
        }, headers=HEADERS)
        assert r.status_code == 200
        data = r.json()
        assert data["name_de"] == "Notwendig"
        assert data["is_required"] is True

    def test_list_ordered(self):
        client.post("/api/compliance/legal-documents/cookie-categories", json={
            "name_de": "Marketing", "sort_order": 30,
        }, headers=HEADERS)
        client.post("/api/compliance/legal-documents/cookie-categories", json={
            "name_de": "Notwendig", "sort_order": 0,
        }, headers=HEADERS)
        r = client.get("/api/compliance/legal-documents/cookie-categories", headers=HEADERS)
        data = r.json()
        assert len(data) == 2
        assert data[0]["name_de"] == "Notwendig"
        assert data[1]["name_de"] == "Marketing"

    def test_update_category(self):
        cr = client.post("/api/compliance/legal-documents/cookie-categories", json={
            "name_de": "Analyse", "sort_order": 20,
        }, headers=HEADERS)
        cat_id = cr.json()["id"]
        r = client.put(f"/api/compliance/legal-documents/cookie-categories/{cat_id}", json={
            "name_de": "Analytics", "description_de": "Tracking-Cookies",
        }, headers=HEADERS)
        assert r.status_code == 200
        assert r.json()["name_de"] == "Analytics"
        assert r.json()["description_de"] == "Tracking-Cookies"

    def test_update_not_found(self):
        r = client.put(f"/api/compliance/legal-documents/cookie-categories/{uuid.uuid4()}", json={
            "name_de": "X",
        }, headers=HEADERS)
        assert r.status_code == 404

    def test_delete_category(self):
        cr = client.post("/api/compliance/legal-documents/cookie-categories", json={
            "name_de": "Temp",
        }, headers=HEADERS)
        cat_id = cr.json()["id"]
        r = client.delete(f"/api/compliance/legal-documents/cookie-categories/{cat_id}", headers=HEADERS)
        assert r.status_code == 204
        r = client.get("/api/compliance/legal-documents/cookie-categories", headers=HEADERS)
        assert len(r.json()) == 0

    def test_delete_not_found(self):
        r = client.delete(f"/api/compliance/legal-documents/cookie-categories/{uuid.uuid4()}", headers=HEADERS)
        assert r.status_code == 404