Files
breakpilot-compliance/AGENTS.python.md
Sharang Parnerkar c05a71163b
Some checks failed
Build + Deploy / build-admin-compliance (push) Successful in 1m37s
Build + Deploy / build-backend-compliance (push) Successful in 12s
Build + Deploy / build-ai-sdk (push) Successful in 10s
Build + Deploy / build-developer-portal (push) Successful in 12s
Build + Deploy / build-tts (push) Successful in 12s
Build + Deploy / build-document-crawler (push) Successful in 11s
Build + Deploy / build-dsms-gateway (push) Successful in 12s
CI/CD / loc-budget (push) Successful in 21s
CI/CD / guardrail-integrity (push) Has been skipped
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 42s
CI/CD / test-python-backend-compliance (push) Has started running
CI/CD / test-python-document-crawler (push) Has been cancelled
CI/CD / test-python-dsms-gateway (push) Has been cancelled
CI/CD / sbom-scan (push) Has been cancelled
CI/CD / validate-canonical-controls (push) Has been cancelled
Build + Deploy / trigger-orca (push) Successful in 2m19s
fix: resolve CI failures in Python tests and admin-compliance build
Python: add missing 'import enum' to compliance/db/models.py shim.
TypeScript: remove duplicate export of useVendorCompliance from
  vendor-compliance/context.tsx (already exported from ./hooks).
Docs: add mandatory pre-push checklist (lint + test + build) to
  AGENTS.python.md and AGENTS.go.md. [guardrail-change]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 16:41:39 +02:00

169 lines
6.6 KiB
Markdown

# AGENTS.python.md — Python Service Conventions
Applies to: `backend-compliance/`, `document-crawler/`, `dsms-gateway/`, `compliance-tts-service/`.
## Layered architecture (FastAPI)
```
compliance/
├── api/ # HTTP layer — routers only. Thin (≤30 LOC per handler).
│ └── <domain>_routes.py
├── services/ # Business logic. Pure-ish; no FastAPI imports.
│ └── <domain>_service.py
├── repositories/ # DB access. Owns SQLAlchemy session usage.
│ └── <domain>_repository.py
├── domain/ # Value objects, enums, domain exceptions.
├── schemas/ # Pydantic models, split per domain. NEVER one giant schemas.py.
│ └── <domain>.py
└── db/
└── models/ # SQLAlchemy ORM, one module per aggregate. __tablename__ frozen.
```
**Dependency direction:** `api → services → repositories → db.models`. Lower layers must not import upper layers.
## Routers
- One `APIRouter` per domain file.
- Handlers do exactly: parse request → call service → map domain errors to HTTPException → return response model.
- Inject services via `Depends`. No globals.
- Tag routes; document with summary + response_model.
```python
@router.post("/dsr/requests", response_model=DSRRequestRead, status_code=201)
async def create_dsr_request(
payload: DSRRequestCreate,
service: DSRService = Depends(get_dsr_service),
tenant_id: UUID = Depends(get_tenant_id),
) -> DSRRequestRead:
try:
return await service.create(tenant_id, payload)
except DSRConflict as exc:
raise HTTPException(409, str(exc)) from exc
```
## Services
- Constructor takes the repository (interface, not concrete).
- No `Request`, `Response`, or HTTP knowledge.
- Raise domain exceptions (e.g. `DSRConflict`, `DSRNotFound`), never `HTTPException`.
- Return domain objects or Pydantic schemas — pick one and stay consistent inside a service.
## Repositories
- Methods are intent-named (`get_pending_for_tenant`), not CRUD-named (`select_where`).
- Sessions injected, not constructed inside.
- No business logic. No cross-aggregate joins for unrelated workflows — that belongs in a service.
- Return ORM models or domain VOs; never `Row`.
## Schemas (Pydantic v2)
- One module per domain. Module ≤300 lines.
- Use `model_config = ConfigDict(from_attributes=True, frozen=True)` for read models.
- Separate `*Create`, `*Update`, `*Read`. No giant union schemas.
## Tests (`pytest`)
- Layout: `tests/unit/`, `tests/integration/`, `tests/contracts/`.
- Unit tests mock the repository. Use `pytest.fixture` + `unittest.mock.AsyncMock`.
- Integration tests run against the real Postgres from `docker-compose.yml` via a transactional fixture (rollback after each test).
- Contract tests diff `/openapi.json` against `tests/contracts/openapi.baseline.json`.
- Naming: `test_<unit>_<scenario>_<expected>.py::TestClass::test_method`.
- `pytest-asyncio` mode = `auto`. Mark slow tests with `@pytest.mark.slow`.
- Coverage target: 80% for new code; never decrease the service baseline.
## Tooling
- `ruff check` + `ruff format` (line length 100).
- `mypy --strict` on `services/`, `repositories/`, `domain/`. Expand outward.
- `pip-audit` in CI.
- Async-first: prefer `httpx.AsyncClient`, `asyncpg`/`SQLAlchemy 2.x async`.
## mypy configuration
`backend-compliance/mypy.ini` is the mypy config. Strict mode is on globally; per-module overrides exist only for legacy files that have not been cleaned up yet.
- New modules added to `compliance/services/` or `compliance/repositories/` **must** pass `mypy --strict`.
- To type-check a new module: `cd backend-compliance && mypy compliance/your_new_module.py`
- When you fully type a legacy file, **remove its loose-override block** from `mypy.ini` as part of the same PR.
## Dependency injection
Services and repositories are wired via FastAPI `Depends`. Never instantiate a service or repository directly inside a handler.
```python
# dependencies.py
def get_my_service(db: AsyncSession = Depends(get_db)) -> MyService:
return MyService(MyRepository(db))
# router
@router.get("/items", response_model=list[ItemRead])
async def list_items(svc: MyService = Depends(get_my_service)) -> list[ItemRead]:
return await svc.list()
```
- Services take repositories in `__init__`; repositories take `Session` or `AsyncSession`.
## Structured logging
```python
import structlog
logger = structlog.get_logger()
# Always bind context before logging:
logger.bind(tenant_id=str(tid), action="create_dsfa").info("dsfa created")
```
- Audit-relevant actions must use the audit logger with a `legal_basis` field.
- Never log secrets, PII, or full request bodies.
## Barrel re-export pattern
When an oversized file (e.g. `schemas.py`, `models.py`) is split into a sub-package, the original stays as a **thin re-exporter** so existing consumer imports keep working:
```python
# compliance/schemas.py (barrel — DO NOT ADD NEW CODE HERE)
from .schemas.ai import * # noqa: F401, F403
from .schemas.consent import * # noqa: F401, F403
```
- New code imports from the specific module (e.g. `from compliance.schemas.ai import AIRiskRead`), not the barrel.
- `from module import *` is only permitted in barrel files.
## Errors & logging
- Domain errors inherit from a single `DomainError` base per service.
- Log via `structlog` with bound context (`tenant_id`, `request_id`). Never log secrets, PII, or full request bodies.
- Audit-relevant actions go through the audit logger, not the application logger.
## Before every push — MANDATORY
Run all three steps for every Python service you touched before pushing. CI runs the same checks and will fail if you skip this.
```bash
cd <service> # backend-compliance | document-crawler | dsms-gateway | compliance-tts-service
# 1. Lint
ruff check .
mypy compliance/ # only for backend-compliance
# 2. Tests
pytest -x
# 3. Import sanity (catches NameError at collection time)
python -c "import compliance" # or the service's main module
```
All steps must exit 0. Do not push if any step fails.
## What you may NOT do
- Add a new Alembic migration.
- Rename a `__tablename__`, column, or enum value.
- Change a public route's path/method/status/schema without simultaneous dashboard fix.
- Catch `Exception` broadly — catch the specific domain or library error.
- Put business logic in a router or in a Pydantic validator.
- `from module import *` in new code — only in barrel re-exporters.
- `raise HTTPException` inside the service layer — raise domain exceptions; map them in the router.
- Use `model_validate` on untrusted external data without an explicit schema boundary.
- Create a new file >500 lines. Period.