breakpilot-compliance

Author	SHA1	Message	Date
Sharang Parnerkar	ae008d7d25	refactor(backend/api): extract DSFA schemas + services (Step 4 — file 14 of 18) - Create compliance/schemas/dsfa.py (161 LOC) — extract DSFACreate, DSFAUpdate, DSFAStatusUpdate, DSFASectionUpdate, DSFAApproveRequest - Create compliance/services/dsfa_service.py (386 LOC) — CRUD + helpers + stats + audit-log + CSV export; uses domain errors - Create compliance/services/dsfa_workflow_service.py (347 LOC) — status update, section update, submit-for-review, approve, export JSON, versions - Rewrite compliance/api/dsfa_routes.py (339 LOC) as thin handlers with Depends + translate_domain_errors(); re-export legacy symbols via __all__ - Add [mypy-compliance.api.dsfa_routes] ignore_errors = False to mypy.ini - Update tests: 422 -> 400 for domain ValidationError (6 assertions) - Regenerate OpenAPI baseline (360 paths / 484 operations — unchanged) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 19:20:48 +02:00
Sharang Parnerkar	d2c94619d8	refactor(backend/api): extract LegalDocumentConsentService (Step 4 — file 12 of 18) Extract consent, audit log, cookie category, and consent stats endpoints from legal_document_routes into LegalDocumentConsentService. The route file is now a thin handler layer delegating to LegalDocumentService and LegalDocumentConsentService with translate_domain_errors(). Legacy helpers (_doc_to_response, _version_to_response, _transition, _log_approval) and schemas are re-exported for existing tests. Two transition tests updated to expect domain errors instead of HTTPException. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:47:56 +02:00
Sharang Parnerkar	cc1c61947d	refactor(backend/api): extract Incident services (Step 4 — file 11 of 18) compliance/api/incident_routes.py (916 LOC) -> 280 LOC thin routes + two services + 95-line schemas file. Two-service split for DSGVO Art. 33/34 Datenpannen-Management: incident_service.py (460 LOC): - CRUD (create, list, get, update, delete) - Stats, status update, timeline append, close - Module-level helpers: _calculate_risk_level, _is_notification_required, _calculate_72h_deadline, _incident_to_response, _measure_to_response, _parse_jsonb, _append_timeline, DEFAULT_TENANT_ID incident_workflow_service.py (329 LOC): - Risk assessment (likelihood x impact -> risk_level) - Art. 33 authority notification (with 72h deadline tracking) - Art. 34 data subject notification - Corrective measures CRUD Both services use raw SQL via sqlalchemy.text() — no ORM models for incident_incidents / incident_measures tables. Migrated from the Go ai-compliance-sdk; Python backend is Source of Truth. Legacy test compat: tests/test_incident_routes.py imports _calculate_risk_level, _is_notification_required, _calculate_72h_deadline, _incident_to_response, _measure_to_response, _parse_jsonb, DEFAULT_TENANT_ID directly from compliance.api.incident_routes — all re-exported via __all__. Verified: - 223/223 pytest pass (173 core + 50 incident) - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 141 source files - incident_routes.py 916 -> 280 LOC - Hard-cap violations: 8 -> 7 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:35:57 +02:00
Sharang Parnerkar	0c2e03f294	refactor(backend/api): extract Email Template services (Step 4 — file 10 of 18) compliance/api/email_template_routes.py (823 LOC) -> 295 LOC thin routes + 402-line EmailTemplateService + 241-line EmailTemplateVersionService + 61-line schemas file. Two-service split along natural responsibility seam: email_template_service.py (402 LOC): - Template type catalog (TEMPLATE_TYPES constant) - Template CRUD (list, create, get) - Stats, settings, send logs, initialization, default content - Shared _template_to_dict / _version_to_dict / _render_template helpers email_template_version_service.py (241 LOC): - Version CRUD (create, list, get, update) - Workflow transitions (submit, approve, reject, publish) - Preview and test-send TEMPLATE_TYPES, VALID_CATEGORIES, VALID_STATUSES re-exported from the route module for any legacy consumers. State-transition errors use ValidationError (-> HTTPException 400) to preserve the original handler's 400 status for "Only draft/review versions can be ..." checks, since the existing TestClient integration tests (47 tests) assert status_code == 400. Verified: - 47/47 tests/test_email_template_routes.py pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 138 source files - email_template_routes.py 823 -> 295 LOC - Hard-cap violations: 9 -> 8 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 22:39:19 +02:00
Sharang Parnerkar	a638d0e527	refactor(backend/api): extract EvidenceService (Step 4 — file 9 of 18) compliance/api/evidence_routes.py (641 LOC) -> 240 LOC thin routes + 460-line EvidenceService. Manages evidence CRUD, file upload, CI/CD evidence collection (SAST/dependency/SBOM/container scans), and CI status dashboard. Service injection pattern: EvidenceService takes the EvidenceRepository, ControlRepository, and AutoRiskUpdater classes as constructor parameters. The route's get_evidence_service factory reads these class references from its own module namespace so tests that ``patch("compliance.api.evidence_routes.EvidenceRepository", ...)`` still take effect through the factory. The `_store_evidence` and `_update_risks` helpers stay as module-level callables in evidence_service and are re-exported from the route module. The collect_ci_evidence handler remains inline (not delegated to a service method) so tests can patch `compliance.api.evidence_routes._store_evidence` and have the patch take effect at the handler's call site. Legacy re-exports via __all__: SOURCE_CONTROL_MAP, EvidenceRepository, ControlRepository, AutoRiskUpdater, _parse_ci_evidence, _extract_findings_detail, _store_evidence, _update_risks. Verified: - 208/208 pytest (core + 35 evidence tests) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 135 source files - evidence_routes.py 641 -> 240 LOC - Hard-cap violations: 10 -> 9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 21:59:03 +02:00
Sharang Parnerkar	e613af1a7d	refactor(backend/api): extract ScreeningService (Step 4 — file 8 of 18) compliance/api/screening_routes.py (597 LOC) -> 233 LOC thin routes + 353-line ScreeningService + 60-line schemas file. Manages SBOM generation (CycloneDX 1.5) and OSV.dev vulnerability scanning. Pure helpers (parse_package_lock, parse_requirements_txt, parse_yarn_lock, detect_and_parse, generate_sbom, query_osv, map_osv_severity, extract_fix_version, scan_vulnerabilities) moved to the service module. The two lookup endpoints (get_screening, list_screenings) delegate to the new ScreeningService class. Test-mock compatibility: tests/test_screening_routes.py uses `patch("compliance.api.screening_routes.SessionLocal", ...)` and `patch("compliance.api.screening_routes.scan_vulnerabilities", ...)`. Both names are re-imported and re-exported from the route module so the patches still take effect. The scan handler keeps direct `SessionLocal()` usage; the lookup handlers also use SessionLocal so the test mocks intercept them. Latent bug fixed: the original scan handler had text = content.decode("utf-8") on line 339, shadowing the imported `sqlalchemy.text` so that the subsequent `text("INSERT ...")` calls would have raised at runtime. The variable is now named `file_text`. Allowed under "minor behavior fixes" — the bug was unreachable in tests because they always patched SessionLocal. Verified: - 240/240 pytest pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 134 source files - screening_routes.py 597 -> 233 LOC - Hard-cap violations: 11 -> 10 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 20:03:16 +02:00
Sharang Parnerkar	7107a31496	refactor(backend/api): extract SourcePolicyService (Step 4 — file 7 of 18) compliance/api/source_policy_router.py (580 LOC) -> 253 LOC thin routes + 453-line SourcePolicyService + 83-line schemas file. Manages allowed data sources, operations matrix, PII rules, blocked-content log, audit trail, and dashboard stats/report. Single-service split. ORM-based (uses compliance.db.source_policy_models). Date-string parsing extracted to a module-level _parse_iso_optional helper so the audit + blocked-content list endpoints share it instead of duplicating try/except blocks. Legacy test compat: SourceCreate, SourceUpdate, SourceResponse, PIIRuleCreate, PIIRuleUpdate, OperationUpdate, _log_audit re-exported from compliance.api.source_policy_router via __all__. Verified: - 208/208 pytest pass (173 core + 35 source policy) - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 132 source files - source_policy_router.py 580 -> 253 LOC - Hard-cap violations: 12 -> 11 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:58:02 +02:00
Sharang Parnerkar	b850368ec9	refactor(backend/api): extract CanonicalControlService (Step 4 — file 6 of 18) compliance/api/canonical_control_routes.py (514 LOC) -> 192 LOC thin routes + 316-line CanonicalControlService + 105-line schemas file. Canonical Control Library manages OWASP/NIST/ENISA-anchored security control frameworks and controls. Like company_profile_routes, this file uses raw SQL via sqlalchemy.text() because there are no SQLAlchemy models for canonical_control_frameworks or canonical_controls. Single-service split. Session management moved from bespoke `with SessionLocal() as db:` blocks to Depends(get_db) for consistency. Legacy test imports preserved via re-export (FrameworkResponse, ControlResponse, SimilarityCheckRequest, SimilarityCheckResponse, _control_row). Validation extracted to a module-level `_validate_control_input` helper so both create and update share the same checks. ValidationError (from compliance.domain) replaces raw HTTPException(400) raises. Verified: - 187/187 pytest (173 core + 14 canonical) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 130 source files - canonical_control_routes.py 514 -> 192 LOC - Hard-cap violations: 13 -> 12 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:53:55 +02:00
Sharang Parnerkar	4fa0dd6f6d	refactor(backend/api): extract VVTService (Step 4 — file 5 of 18) compliance/api/vvt_routes.py (550 LOC) -> 225 LOC thin routes + 475-line VVTService. Covers the organization header, processing activities CRUD, audit log, JSON/CSV export, stats, and version lookups for the Art. 30 DSGVO Verzeichnis. Single-service split: organization + activities + audit + stats all revolve around the same tenant's VVT document, and the existing test suite (tests/test_vvt_routes.py — 768 LOC, tests/test_vvt_tenant_isolation.py — 205 LOC) exercises them together. Module-level helpers (_activity_to_response, _log_audit, _export_csv) stay module-level in compliance.services.vvt_service and are re-exported from compliance.api.vvt_routes so the two test files keep importing from the old path. Pydantic schemas already live in compliance.schemas.vvt from Step 3 — no new schema file needed this round. mypy.ini flips compliance.api.vvt_routes from ignore_errors=True to False. Two SQLAlchemy Column[str] vs str dict-index errors fixed with explicit str() casts on status/business_function in the stats loop. Verified: - 242/242 pytest (173 core + 69 VVT integration) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 128 source files - vvt_routes.py 550 -> 225 LOC - vvt_service.py 475 LOC (under 500 hard cap) - Hard-cap violations: 14 -> 13 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:50:40 +02:00
Sharang Parnerkar	f39c7ca40c	refactor(backend/api): extract CompanyProfileService (Step 4 — file 4 of 18) compliance/api/company_profile_routes.py (640 LOC) -> 154 LOC thin routes. Unusual for this repo: persistence uses raw SQL via sqlalchemy.text() because the underlying compliance_company_profiles table has ~45 columns with complex jsonb coercion and there is no SQLAlchemy model for it. New files: compliance/schemas/company_profile.py (127) — 4 request/response models compliance/services/company_profile_service.py (340) — Service class + row_to_response + log_audit compliance/services/_company_profile_sql.py (139) — 70-line INSERT/UPDATE statements separated for readability Minor behavioral improvement: the handlers now use Depends(get_db) for session management instead of the bespoke `db = SessionLocal(); try: ... finally: db.close()` pattern. This makes the routes consistent with every other refactored service, fixes the breakage under test dependency_overrides, and removes 6 duplicate try/finally blocks. Legacy exports preserved: CompanyProfileRequest, CompanyProfileResponse, AuditEntryResponse, AuditListResponse, row_to_response, and log_audit are re-exported from compliance.api.company_profile_routes so that the two existing test files (tests/test_company_profile_routes.py, tests/test_company_profile_extend.py) keep importing from the same path. Pre-existing broken tests noted: 6 tests in those files feed a 40-tuple row into row_to_response, but _BASE_COLUMNS_LIST has 46 columns (has had since the Phase 2 Stammdaten extension). These tests fail on main too (verified via `git stash` round-trip). Not fixed in this commit — they require a rewrite of the test's _make_row helper, which is out of scope for a pure structural refactor. Flagged for follow-up. Verified: - 173/173 pytest compliance/tests/ tests/contracts/ pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 127 source files - company_profile_routes.py 640 -> 154 LOC - All new files under soft 300 target except service (340, under hard 500) - Hard-cap violations: 15 -> 14 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:47:29 +02:00
Sharang Parnerkar	d571412657	refactor(backend/api): extract TOMService (Step 4 — file 3 of 18) compliance/api/tom_routes.py (609 LOC) -> 215 LOC thin routes + 434-line TOMService. Request bodies (TOMStateBody, TOMMeasureCreate, TOMMeasureUpdate, TOMMeasureBulkItem, TOMMeasureBulkBody) moved to compliance/schemas/tom.py (joining the existing response models from the Step 3 split). Single-service split (not two like banner): state, measures CRUD + bulk upsert, stats, export, and version lookups are all tightly coupled around the TOMMeasureDB aggregate, so splitting would create artificial boundaries. TOMService is 434 LOC — comfortably under the 500 hard cap. Domain error mapping: - ConflictError -> 409 (version conflict on state save; duplicate control_id on create) - NotFoundError -> 404 (missing measure on update; missing version) - ValidationError -> 400 (missing tenant_id on DELETE /state) Legacy test compat: the existing tests/test_tom_routes.py imports TOMMeasureBulkItem, _parse_dt, _measure_to_dict, and DEFAULT_TENANT_ID directly from compliance.api.tom_routes. All re-exported via __all__ so the 44-test file runs unchanged. mypy.ini flips compliance.api.tom_routes from ignore_errors=True to False. TOMService carries the scoped Column[T] header. Verified: - 217/217 pytest (173 baseline + 44 TOM) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 124 source files - tom_routes.py 609 -> 215 LOC - Hard-cap violations: 16 -> 15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:42:17 +02:00
Sharang Parnerkar	10073f3ef0	refactor(backend/api): extract BannerConsent + BannerAdmin services (Step 4) Phase 1 Step 4, file 2 of 18. Same cookbook as audit_routes (`4a91814` + `883ef70`) applied to banner_routes.py. compliance/api/banner_routes.py (653 LOC) is decomposed into: compliance/api/banner_routes.py (255) — thin handlers compliance/services/banner_consent_service.py (298) — public SDK surface compliance/services/banner_admin_service.py (238) — site/category/vendor CRUD compliance/services/_banner_serializers.py (81) — ORM-to-dict helpers shared between the two services compliance/schemas/banner.py (85) — Pydantic request models Split rationale: the SDK-facing endpoints (consent CRUD, config retrieval, export, stats) and the admin CRUD endpoints (sites + categories + vendors) have distinct audiences and different auth stories, and combined they would push the service file over the 500 hard cap. Two focused services are cleaner than one ~540-line god class. The shared ORM-to-dict helpers live in a private sibling module (_banner_serializers) rather than a static method on either service, so both services can import without a cycle. Handlers follow the established pattern: - Depends(get_consent_service) or Depends(get_admin_service) - `with translate_domain_errors():` wrapping the service call - Explicit return type annotations - ~3-5 lines per handler Services raise NotFoundError / ConflictError / ValidationError from compliance.domain; no HTTPException in the service layer. mypy.ini flips compliance.api.banner_routes from ignore_errors=True to False, joining audit_routes in the strict scope. The services carry the same scoped `# mypy: disable-error-code="arg-type,assignment"` header used by the audit services for the ORM Column[T] issue. Pydantic schemas moved to compliance.schemas.banner (mirroring the Step 3 schemas split). They were previously defined inline in banner_routes.py and not referenced by anything outside it, so no backwards-compat shim is needed. Verified: - 224/224 pytest (173 baseline + 26 audit integration + 25 banner integration) pass - tests/contracts/test_openapi_baseline.py green (360/484 unchanged) - mypy compliance/ -> Success: no issues found in 123 source files - All new files under the 300 soft target (largest: 298) - banner_routes.py drops from 653 -> 255 LOC (below hard cap) Hard-cap violations remaining: 16 (was 17). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 18:52:31 +02:00
Sharang Parnerkar	883ef702ac	tech-debt: mypy --strict config + integration tests for audit routes Phase 1 Step 4 follow-up addressing the debt flagged in the worked-example commit (`4a91814`). ## mypy --strict policy Adds backend-compliance/mypy.ini declaring the strict-mode scope: Fully strict (enforced today): - compliance/domain/ - compliance/schemas/ - compliance/api/_http_errors.py - compliance/api/audit_routes.py (refactored in Step 4) - compliance/services/audit_session_service.py - compliance/services/audit_signoff_service.py Loose (ignore_errors=True) with a migration path: - compliance/db/* — SQLAlchemy 1.x Column[] vs runtime T; unblocks Phase 1 until a Mapped[T] migration. - compliance/api/<route>.py — each route file flips to strict as its own Step 4 refactor lands. - compliance/services/<legacy util> — 14 utility services (llm_provider, pdf_extractor, seeder, ...) that predate the clean-arch refactor. - compliance/tests/ — excluded (legacy placeholder style). The new TestClient-based integration suite is type-annotated. The two new service files carry a scoped `# mypy: disable-error-code="arg-type,assignment"` header for the ORM Column[T] issue — same underlying SQLAlchemy limitation, narrowly scoped rather than wholesale ignore_errors. Flow: `cd backend-compliance && mypy compliance/` -> clean on 119 files. CI yaml updated to use the config instead of ad-hoc package lists. ## Bugs fixed while enabling strict mypy --strict surfaced two latent bugs in the pre-refactor code. Both were invisible because the old `compliance/tests/test_audit_routes.py` is a placeholder suite that asserts on request-data shape and never calls the handlers: - AuditSessionResponse.updated_at is a required field in the schema, but the original handler didn't pass it. Fixed in AuditSessionService._to_response. - PaginationMeta requires has_next + has_prev. The original audit checklist handler didn't compute them. Fixed in AuditSignOffService.get_checklist. Both are behavior-preserving at the HTTP level because the old code would have raised Pydantic ValidationError at response serialization had the endpoint actually been exercised. ## Integration test suite Adds backend-compliance/tests/test_audit_routes_integration.py — 26 real TestClient tests against an in-memory sqlite backend (StaticPool). Replaces the coverage gap left by the placeholder suite. Covers: - Session CRUD + lifecycle transitions (draft -> in_progress -> completed -> archived), including the 409 paths for illegal transitions - Checklist pagination, filtering, search - Sign-off create / update / auto-start-session / count-flipping - Sign-off 400 (invalid result), 404 (missing requirement), 409 (completed session) - Get-signoff 404 / 200 round-trip Uses a module-scoped schema fixture + per-test DELETE-sweep so the suite runs in ~2.3s despite the ~50-table ORM surface. Verified: - 199/199 pytest (173 original + 26 new audit integration) pass - tests/contracts/test_openapi_baseline.py green, OpenAPI 360/484 unchanged - mypy compliance/ -> Success: no issues found in 119 source files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 18:39:40 +02:00
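
Most of the commits above rely on one pattern: services raise NotFoundError / ConflictError / ValidationError, and thin handlers wrap the service call in translate_domain_errors(). The following is a minimal sketch of that pattern, not the repository's code. The status mapping (404 / 409 / 400) follows the TOMService commit message; the DomainError base class, the 500 fallback, MeasureService, get_measure_handler, and the local HTTPException stand-in (used so the sketch runs without FastAPI) are assumptions made for illustration.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class DomainError(Exception):
    """Hypothetical base class for errors raised by the service layer."""


class NotFoundError(DomainError):
    pass


class ConflictError(DomainError):
    pass


class ValidationError(DomainError):
    pass


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException so the sketch has no dependencies."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


# Mapping as described in the TOMService commit message.
_STATUS = {NotFoundError: 404, ConflictError: 409, ValidationError: 400}


@contextmanager
def translate_domain_errors() -> Iterator[None]:
    """Convert domain errors raised inside the block into HTTP errors."""
    try:
        yield
    except DomainError as exc:
        # Assumption: unmapped domain errors fall back to 500.
        raise HTTPException(_STATUS.get(type(exc), 500), str(exc)) from exc


class MeasureService:
    """Hypothetical service: raises domain errors, never HTTPException."""

    def __init__(self) -> None:
        self._measures: Dict[str, str] = {}

    def create(self, control_id: str, title: str) -> Dict[str, str]:
        if not title:
            raise ValidationError("title is required")
        if control_id in self._measures:
            raise ConflictError(f"duplicate control_id {control_id}")
        self._measures[control_id] = title
        return {"control_id": control_id, "title": title}

    def get(self, control_id: str) -> Dict[str, str]:
        if control_id not in self._measures:
            raise NotFoundError(f"measure {control_id} not found")
        return {"control_id": control_id, "title": self._measures[control_id]}


def get_measure_handler(service: MeasureService, control_id: str) -> Dict[str, str]:
    """Thin handler: delegate to the service and translate its errors."""
    with translate_domain_errors():
        return service.get(control_id)
```

With this shape the service layer stays free of HTTP concerns, and a test can assert on either the domain error (calling the service directly) or the status code (calling the handler).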

13 Commits
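
The latent bug described in e613af1a7d (a local variable named text shadowing the imported sqlalchemy.text) can be reproduced in isolation. This sketch uses a stand-in text() function rather than SQLAlchemy, and the function names, table name, and SQL string are hypothetical; it only illustrates why renaming the variable to file_text removes the failure.

```python
from typing import Dict, Tuple


def text(sql: str) -> Dict[str, str]:
    """Stand-in for sqlalchemy.text(): wraps a SQL string (hypothetical)."""
    return {"sql": sql}


def scan_buggy(content: bytes) -> Dict[str, str]:
    # Assigning to `text` makes it a local name for the whole function,
    # so the module-level text() is no longer reachable from here.
    text = content.decode("utf-8")
    # `text` is now a str, and calling a str raises TypeError.
    return text("INSERT INTO screenings (body) VALUES (:body)")


def scan_fixed(content: bytes) -> Tuple[Dict[str, str], str]:
    # Renaming the local variable leaves text() bound to the function above.
    file_text = content.decode("utf-8")
    stmt = text("INSERT INTO screenings (body) VALUES (:body)")
    return stmt, file_text
```

As the commit message notes, a bug like this can stay hidden when tests always patch the database session, because the failing call is never reached.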