Non-negotiable structural rules that apply to every Claude Code session in
this repo and to every commit, enforced via three defense-in-depth layers:
1. PreToolUse hook in .claude/settings.json blocks any Write/Edit that
would push a file past the 500-line hard cap. Auto-loads for any
Claude session in this repo regardless of who launched it.
2. scripts/githooks/pre-commit (installed via scripts/install-hooks.sh)
enforces the LOC cap, freezes migrations/ unless [migration-approved],
and protects guardrail files unless [guardrail-change] is present.
3. .gitea/workflows/ci.yaml gets loc-budget + guardrail-integrity jobs,
plus mypy --strict on new Python packages, tsc --noEmit on Node
services, and a syft+grype SBOM scan.
Per-language conventions are documented in AGENTS.python.md / AGENTS.go.md /
AGENTS.typescript.md at the repo root — layering (router->service->repo for
Python, hexagonal for Go, colocation for Next.js), tooling baseline, and
explicit "what you may NOT do" lists.
Adds scripts/check-loc.sh (soft 300 / hard 500, reports 205 hard and 161
soft violations in the current codebase) plus .claude/rules/loc-exceptions.txt
(initially empty — the list is designed to shrink over time).
Per-service READMEs for all 10 services + PHASE1_RUNBOOK.md for the
backend-compliance refactor. Skeleton packages (compliance/{domain,
repositories,schemas}) are the landing zone for the clean-arch rewrite that
begins in Phase 1.
CLAUDE.md is prepended with the six non-negotiable rules.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
182 lines
8.7 KiB
Markdown
182 lines
8.7 KiB
Markdown
# Phase 1 Runbook — backend-compliance refactor
|
|
|
|
This document is the step-by-step execution guide for Phase 1 of the repo refactor plan at `~/.claude/plans/vectorized-purring-barto.md`. It exists because the refactor must be driven from a session that can actually run `pytest` against the service, and every step must be verified green before moving to the next.
|
|
|
|
## Prerequisites
|
|
|
|
- Python 3.12 venv with `backend-compliance/requirements.txt` installed.
|
|
- Local Postgres reachable via `COMPLIANCE_DATABASE_URL` (use the compose db).
|
|
- Existing 48 pytest test files pass from a clean checkout: `pytest compliance/tests/ -v` → all green. **Do not proceed until this is true.**
|
|
|
|
## Step 0 — Record the baseline
|
|
|
|
```bash
|
|
cd backend-compliance
|
|
pytest compliance/tests/ -v --tb=short | tee /tmp/baseline.txt
|
|
pytest --cov=compliance --cov-report=term | tee /tmp/baseline-coverage.txt
|
|
python tests/contracts/regenerate_baseline.py # creates openapi.baseline.json
|
|
git add tests/contracts/openapi.baseline.json
|
|
git commit -m "phase1: pin OpenAPI baseline before refactor"
|
|
```
|
|
|
|
The baseline file is the contract. From this point forward, `pytest tests/contracts/` MUST stay green.
|
|
|
|
## Step 1 — Characterization tests (before any code move)
|
|
|
|
For each oversized route file we will refactor, add a happy-path + 1-error-path test **before** touching the source. These are called "characterization tests" and their purpose is to freeze current observable behavior so the refactor cannot change it silently.
|
|
|
|
Oversized route files to cover (ordered by size):
|
|
|
|
| File | LOC | Endpoints to cover |
|
|
|---|---:|---|
|
|
| `compliance/api/isms_routes.py` | 1676 | one happy + one 4xx per route |
|
|
| `compliance/api/dsr_routes.py` | 1176 | same |
|
|
| `compliance/api/vvt_routes.py` | *N* | same |
|
|
| `compliance/api/dsfa_routes.py` | *N* | same |
|
|
| `compliance/api/tom_routes.py` | *N* | same |
|
|
| `compliance/api/schemas.py` | 1899 | N/A (covered transitively) |
|
|
| `compliance/db/models.py` | 1466 | N/A (covered by existing + route tests) |
|
|
| `compliance/db/repository.py` | 1547 | add unit tests per repo class as they are extracted |
|
|
|
|
Use `httpx.AsyncClient` + factory fixtures; see `AGENTS.python.md`. Place under `tests/integration/test_<domain>_contract.py`.
|
|
|
|
Commit: `phase1: characterization tests for <domain> routes`.
|
|
|
|
## Step 2 — Split `compliance/db/models.py` (1466 → <500 per file)
|
|
|
|
⚠️ **Atomic step.** A `compliance/db/models/` package CANNOT coexist with the existing `compliance/db/models.py` module — Python's import system shadows the module with the package, breaking every `from compliance.db.models import X` call. The directory skeleton was intentionally NOT pre-created for this reason. Do the following in **one commit**:
|
|
|
|
1. Create `compliance/db/models/` directory with `__init__.py` (re-export shim — see template below).
|
|
2. Move aggregate model classes into `compliance/db/models/<aggregate>.py` modules.
|
|
3. Delete the old `compliance/db/models.py` file in the same commit.
|
|
|
|
Strategy uses a **re-export shim** so no import sites change:
|
|
|
|
1. For each aggregate, create `compliance/db/models/<aggregate>.py` containing the model classes. Copy verbatim; do not rename `__tablename__`, columns, or relationship strings.
|
|
2. Aggregate suggestions (verify by reading `models.py`):
|
|
- `dsr.py` (DSR requests, exports)
|
|
- `dsfa.py`
|
|
- `vvt.py`
|
|
- `tom.py`
|
|
- `ai.py` (AI systems, compliance checks)
|
|
- `consent.py`
|
|
- `evidence.py`
|
|
- `vendor.py`
|
|
- `audit.py`
|
|
- `policy.py`
|
|
- `project.py`
|
|
3. After every aggregate is moved, replace `compliance/db/models.py` with:
|
|
```python
|
|
"""Re-export shim — see compliance.db.models package."""
|
|
from compliance.db.models.dsr import * # noqa: F401,F403
|
|
from compliance.db.models.dsfa import * # noqa: F401,F403
|
|
# ... one per module
|
|
```
|
|
This keeps `from compliance.db.models import XYZ` working everywhere it's used today.
|
|
4. Run `pytest` after every move. Green → commit. Red → revert that move and investigate.
|
|
5. Existing aggregate-level files (`compliance/db/dsr_models.py`, `vvt_models.py`, `tom_models.py`, etc.) should be folded into the new `compliance/db/models/` package in the same pass — do not leave two parallel naming conventions.
|
|
|
|
**Do not** add `__init__.py` star-imports that change `Base.metadata` discovery order. Alembic's autogenerate depends on it. Verify via: `alembic check` if the env is set up.
|
|
|
|
## Step 3 — Split `compliance/api/schemas.py` (1899 → per domain)
|
|
|
|
Mirror the models split:
|
|
|
|
1. For each domain, create `compliance/schemas/<domain>.py` with the Pydantic models.
|
|
2. Replace `compliance/api/schemas.py` with a re-export shim.
|
|
3. Keep `Create`/`Update`/`Read` variants separated; do not merge them into unions.
|
|
4. Run `pytest` + contract test after each domain. Green → commit.
|
|
|
|
## Step 4 — Extract services (router → service delegation)
|
|
|
|
For each route file > 500 LOC, pull handler bodies into a service class under `compliance/services/<domain>_service.py` (new-style domain services, not the utility `compliance/services/` modules that already exist — consider renaming those to `compliance/services/_legacy/` if collisions arise).
|
|
|
|
Router handlers become:
|
|
|
|
```python
|
|
@router.post("/dsr/requests", response_model=DSRRequestRead, status_code=201)
|
|
async def create_dsr_request(
|
|
payload: DSRRequestCreate,
|
|
service: DSRService = Depends(get_dsr_service),
|
|
tenant_id: UUID = Depends(get_tenant_id),
|
|
) -> DSRRequestRead:
|
|
try:
|
|
return await service.create(tenant_id, payload)
|
|
except ConflictError as exc:
|
|
raise HTTPException(409, str(exc)) from exc
|
|
except NotFoundError as exc:
|
|
raise HTTPException(404, str(exc)) from exc
|
|
```
|
|
|
|
Rules:
|
|
- Handler body ≤ 30 LOC.
|
|
- Service raises domain errors (`compliance.domain`), never `HTTPException`.
|
|
- Inject service via `Depends` on a factory that wires the repository.
|
|
|
|
Run tests after each router is thinned. Contract test must stay green.
|
|
|
|
## Step 5 — Extract repositories
|
|
|
|
`compliance/db/repository.py` (1547) and `compliance/db/isms_repository.py` (838) split into:
|
|
|
|
```
|
|
compliance/repositories/
|
|
├── dsr_repository.py
|
|
├── dsfa_repository.py
|
|
├── vvt_repository.py
|
|
├── isms_repository.py # <500 LOC, split if needed
|
|
└── ...
|
|
```
|
|
|
|
Each repository class:
|
|
- Takes `AsyncSession` (or equivalent) in constructor.
|
|
- Exposes intent-named methods (`get_pending_for_tenant`, not `select_where`).
|
|
- Returns ORM instances or domain VOs. No `Row`.
|
|
- No business logic.
|
|
|
|
Unit-test every repo class against the compose Postgres with a transactional fixture (begin → rollback).
|
|
|
|
## Step 6 — mypy --strict on new packages
|
|
|
|
CI already runs `mypy --strict` against `compliance/{services,repositories,domain,schemas}/`. After every extraction, verify locally:
|
|
|
|
```bash
|
|
mypy --strict --ignore-missing-imports compliance/schemas compliance/repositories compliance/domain compliance/services
|
|
```
|
|
|
|
If you have type errors, fix them in the extracted module. **Do not** add `# type: ignore` blanket waivers. If a third-party lib is poorly typed, add it to `[mypy.overrides]` in `pyproject.toml`/`mypy.ini` with a one-line rationale.
|
|
|
|
## Step 7 — Expand test coverage
|
|
|
|
- Unit tests per service (mocked repo).
|
|
- Integration tests per repository (real db, transactional).
|
|
- Contract test stays green.
|
|
- Target: 80% coverage on new code. Never decrease the service baseline.
|
|
|
|
## Step 8 — Guardrail enforcement
|
|
|
|
After Phase 1 completes, `compliance/db/models.py`, `compliance/db/repository.py`, and `compliance/api/schemas.py` are either re-export shims (≤50 LOC each) or deleted. No file in `backend-compliance/compliance/` exceeds 500 LOC. Run:
|
|
|
|
```bash
|
|
../scripts/check-loc.sh backend-compliance/
|
|
```
|
|
|
|
Any remaining hard violations → document in `.claude/rules/loc-exceptions.txt` with rationale, or keep splitting.
|
|
|
|
## Done when
|
|
|
|
- `pytest compliance/tests/ tests/ -v` all green.
|
|
- `pytest tests/contracts/` green — OpenAPI has no removals, no renames, no new required request fields.
|
|
- Coverage ≥ baseline.
|
|
- `mypy --strict` clean on new packages.
|
|
- `scripts/check-loc.sh backend-compliance/` reports 0 hard violations in new/touched files (legacy allowlisted in `loc-exceptions.txt` only with rationale).
|
|
- CI all green on PR.
|
|
|
|
## Pitfalls
|
|
|
|
- **Do not change `__tablename__` or column names.** Even a rename breaks the DB contract.
|
|
- **Do not change relationship back_populates / backref strings.** SQLAlchemy resolves these by name at mapper configuration.
|
|
- **Do not change route paths or pydantic field names.** Contract test will catch most — but JSON field aliasing (`Field(alias=...)`) is easy to break accidentally.
|
|
- **Do not eagerly reformat unrelated code.** Keep the diff reviewable. One PR per major step.
|
|
- **Do not bypass the pre-commit hook.** If a file legitimately must be >500 LOC during an intermediate step, squash commits at the end so the final state is clean.
|