phase0: add architecture guardrails, CI gates, per-language AGENTS.md

Non-negotiable structural rules that apply to every Claude Code session in
this repo and to every commit, enforced via three defense-in-depth layers:

  1. PreToolUse hook in .claude/settings.json blocks any Write/Edit that
     would push a file past the 500-line hard cap. Auto-loads for any
     Claude session in this repo regardless of who launched it.
  2. scripts/githooks/pre-commit (installed via scripts/install-hooks.sh)
     enforces the LOC cap, freezes migrations/ unless [migration-approved],
     and protects guardrail files unless [guardrail-change] is present.
  3. .gitea/workflows/ci.yaml gets loc-budget + guardrail-integrity jobs,
     plus mypy --strict on new Python packages, tsc --noEmit on Node
     services, and a syft+grype SBOM scan.

Per-language conventions are documented in AGENTS.python.md / AGENTS.go.md /
AGENTS.typescript.md at the repo root — layering (router->service->repo for
Python, hexagonal for Go, colocation for Next.js), tooling baseline, and
explicit "what you may NOT do" lists.

Adds scripts/check-loc.sh (soft 300 / hard 500, reports 205 hard and 161
soft violations in the current codebase) plus .claude/rules/loc-exceptions.txt
(initially empty — the list is designed to shrink over time).

Per-service READMEs for all 10 services + PHASE1_RUNBOOK.md for the
backend-compliance refactor. Skeleton packages (compliance/{domain,
repositories,schemas}) are the landing zone for the clean-arch rewrite that
begins in Phase 1.

CLAUDE.md is prepended with the six non-negotiable rules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Sharang Parnerkar
2026-04-07 13:09:26 +02:00
parent 1dfea51919
commit 512b7a0f6c
25 changed files with 1308 additions and 10 deletions

94
AGENTS.python.md Normal file
View File

@@ -0,0 +1,94 @@
# AGENTS.python.md — Python Service Conventions
Applies to: `backend-compliance/`, `document-crawler/`, `dsms-gateway/`, `compliance-tts-service/`.
## Layered architecture (FastAPI)
```
compliance/
├── api/ # HTTP layer — routers only. Thin (≤30 LOC per handler).
│ └── <domain>_routes.py
├── services/ # Business logic. Pure-ish; no FastAPI imports.
│ └── <domain>_service.py
├── repositories/ # DB access. Owns SQLAlchemy session usage.
│ └── <domain>_repository.py
├── domain/ # Value objects, enums, domain exceptions.
├── schemas/ # Pydantic models, split per domain. NEVER one giant schemas.py.
│ └── <domain>.py
└── db/
└── models/ # SQLAlchemy ORM, one module per aggregate. __tablename__ frozen.
```
**Dependency direction:** `api → services → repositories → db.models`. Lower layers must not import upper layers.
## Routers
- One `APIRouter` per domain file.
- Handlers do exactly: parse request → call service → map domain errors to HTTPException → return response model.
- Inject services via `Depends`. No globals.
- Tag routes; document with summary + response_model.
```python
@router.post("/dsr/requests", response_model=DSRRequestRead, status_code=201)
async def create_dsr_request(
payload: DSRRequestCreate,
service: DSRService = Depends(get_dsr_service),
tenant_id: UUID = Depends(get_tenant_id),
) -> DSRRequestRead:
try:
return await service.create(tenant_id, payload)
except DSRConflict as exc:
raise HTTPException(409, str(exc)) from exc
```
## Services
- Constructor takes the repository (interface, not concrete).
- No `Request`, `Response`, or HTTP knowledge.
- Raise domain exceptions (e.g. `DSRConflict`, `DSRNotFound`), never `HTTPException`.
- Return domain objects or Pydantic schemas — pick one and stay consistent inside a service.
## Repositories
- Methods are intent-named (`get_pending_for_tenant`), not CRUD-named (`select_where`).
- Sessions injected, not constructed inside.
- No business logic. No cross-aggregate joins for unrelated workflows — that belongs in a service.
- Return ORM models or domain VOs; never `Row`.
## Schemas (Pydantic v2)
- One module per domain. Module ≤300 lines.
- Use `model_config = ConfigDict(from_attributes=True, frozen=True)` for read models.
- Separate `*Create`, `*Update`, `*Read`. No giant union schemas.
## Tests (`pytest`)
- Layout: `tests/unit/`, `tests/integration/`, `tests/contracts/`.
- Unit tests mock the repository. Use `pytest.fixture` + `unittest.mock.AsyncMock`.
- Integration tests run against the real Postgres from `docker-compose.yml` via a transactional fixture (rollback after each test).
- Contract tests diff `/openapi.json` against `tests/contracts/openapi.baseline.json`.
- Naming: `test_<unit>_<scenario>_<expected>.py::TestClass::test_method`.
- `pytest-asyncio` mode = `auto`. Mark slow tests with `@pytest.mark.slow`.
- Coverage target: 80% for new code; never decrease the service baseline.
## Tooling
- `ruff check` + `ruff format` (line length 100).
- `mypy --strict` on `services/`, `repositories/`, `domain/`. Expand outward.
- `pip-audit` in CI.
- Async-first: prefer `httpx.AsyncClient`, `asyncpg`/`SQLAlchemy 2.x async`.
## Errors & logging
- Domain errors inherit from a single `DomainError` base per service.
- Log via `structlog` with bound context (`tenant_id`, `request_id`). Never log secrets, PII, or full request bodies.
- Audit-relevant actions go through the audit logger, not the application logger.
## What you may NOT do
- Add a new Alembic migration.
- Rename a `__tablename__`, column, or enum value.
- Change a public route's path/method/status/schema without simultaneous dashboard fix.
- Catch `Exception` broadly — catch the specific domain or library error.
- Put business logic in a router or in a Pydantic validator.
- Create a new file >500 lines. Period.