b6cfc0a503
The bottleneck is not content, it is knowledge PRODUCTION. Instead of writing 200 playbooks by hand, generate drafts deterministically from data the software already owns, then have an expert review them. Mirrors the legal pipeline (Gesetz [statute] -> Parser -> Obligation -> Review) for BreakPilot's own knowledge: new Capability -> Registry -> Transition Pattern -> Playbook Draft Generator -> Expert Review -> versioned Playbook.

- compliance/knowledge_production/: generate_playbook_draft(capability, requirement, control_links) + drafts_from_pattern(pattern) -> one PlaybookDraft per delta capability. Owned fields (why / closes_regulations / expected_evidence / typical_controls) are assembled with per-field provenance; the practitioner know-how (tools / process_steps / how_others) is left as an explicit TODO.
- DraftStatus lifecycle (Freigabestatus, i.e. approval status): draft_generated -> in_review -> reviewed -> validated -> proven. Deterministic, NO LLM in the core (any model enrichment stays offline/advisory/propose-only).
- ADR-005: extends "the engine does not change, the corpus grows" with "and the corpus is not written by hand: it is deterministically prepared, then curated".
- reference suite: the "Knowledge Production" section turns the convergence pattern into 12 auto-assembled drafts (why/closes/evidence filled, tools/steps TODO) -> review 12 drafts, don't write 12 playbooks.

10 tests (50 with playbook/optimization/transition/company), mypy --strict clean, check-loc 0. Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
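The lifecycle named in the commit message can be read as a strictly ordered sequence of stages. The following is a minimal sketch of how forward-only transitions might be checked, not the repository's implementation: `next_status` and `can_transition` are hypothetical helpers, and the "one step forward at a time" policy is an assumption that the commit does not state.

```python
from enum import Enum


class DraftStatus(str, Enum):
    """Stage names as listed in the commit message, in lifecycle order."""

    DRAFT_GENERATED = "draft_generated"
    IN_REVIEW = "in_review"
    REVIEWED = "reviewed"
    VALIDATED = "validated"
    PROVEN = "proven"


# Enum definition order doubles as lifecycle order.
_ORDER = list(DraftStatus)


def next_status(current: DraftStatus) -> DraftStatus:
    """Hypothetical helper: return the stage that follows `current`."""
    index = _ORDER.index(current)
    if index == len(_ORDER) - 1:
        raise ValueError("'proven' is the final stage")
    return _ORDER[index + 1]


def can_transition(current: DraftStatus, target: DraftStatus) -> bool:
    """Assumed policy: only a single forward step is allowed."""
    return _ORDER.index(target) - _ORDER.index(current) == 1
```

Under this assumed policy a draft cannot jump from `in_review` straight to `proven`; whether the real code allows skipping or moving backwards is not visible from the commit.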
47 lines
2.4 KiB
Python
"""Schemas for Knowledge Production: deterministic draft assembly + lifecycle.

The corpus is no longer written by hand: it is deterministically PREPARED from data the software
already owns (Capability, Transition Pattern, Controls, Evidence, leverage), then curated by an
expert. A `PlaybookDraft` is a machine-assembled skeleton with per-field provenance and an explicit
TODO list of what still needs human (or offline-propose) input. No LLM in the deterministic core.
Python 3.9 compatible (no `|` unions).
"""

from __future__ import annotations

from enum import Enum
from typing import Dict, List

from pydantic import BaseModel, Field


class DraftStatus(str, Enum):
    """Freigabestatus (approval status): the knowledge lifecycle from machine draft to proven
    (mirrors the transition-pattern / playbook maturity, with a machine-assembled pre-stage)."""

    DRAFT_GENERATED = "draft_generated"  # machine-assembled, NOT yet expert-touched
    IN_REVIEW = "in_review"  # an expert is curating it
    REVIEWED = "reviewed"  # internally reviewed
    VALIDATED = "validated"  # domain expert confirmed
    PROVEN = "proven"  # confirmed in the field


class PlaybookDraft(BaseModel):
    """A deterministically assembled playbook draft for one capability.

    Owned fields (why / closes_regulations / expected_evidence / typical_controls) are filled from
    existing data with provenance; the practitioner know-how (tools / process_steps / how_others)
    is left as TODO. The expert reviews a draft instead of writing from a blank page.
    """

    capability_id: str
    status: DraftStatus = DraftStatus.DRAFT_GENERATED
    title: str = ""
    why: str = ""  # from the transition pattern (why_asked/missing_because)
    closes_regulations: List[str] = Field(default_factory=list)  # from leverage (covers_targets)
    expected_evidence: List[str] = Field(default_factory=list)  # from the transition pattern
    typical_controls: List[str] = Field(default_factory=list)  # injected from Execution (may be empty)
    provenance: Dict[str, str] = Field(default_factory=dict)  # field -> source it was assembled from
    todo: List[str] = Field(default_factory=list)  # fields the expert/offline-propose must still add
    disclaimer: str = ""  # machine draft, requires expert curation
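To illustrate how the owned fields, `provenance` and `todo` could fit together, here is a rough sketch of a deterministic assembly step. It is not the repository's `generate_playbook_draft`: the function `assemble_draft`, its parameters, the provenance labels and the sample values are all hypothetical, and a plain dataclass stands in for `PlaybookDraft` so the sketch runs without pydantic installed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple, Union


@dataclass
class DraftSketch:
    """Stand-in for PlaybookDraft with a subset of its field names (not the real model)."""

    capability_id: str
    status: str = "draft_generated"
    why: str = ""
    closes_regulations: List[str] = field(default_factory=list)
    expected_evidence: List[str] = field(default_factory=list)
    typical_controls: List[str] = field(default_factory=list)
    provenance: Dict[str, str] = field(default_factory=dict)
    todo: List[str] = field(default_factory=list)


# Practitioner know-how that the draft leaves as an explicit TODO.
PRACTITIONER_FIELDS = ["tools", "process_steps", "how_others"]


def assemble_draft(
    capability_id: str,
    why: str,
    covers_targets: Sequence[str],
    evidence: Sequence[str],
    controls: Sequence[str],
) -> DraftSketch:
    """Hypothetical assembler: copy owned data in, record where each field came from."""
    draft = DraftSketch(capability_id=capability_id)
    # (field name, value, assumed source label) for every owned field.
    owned: List[Tuple[str, Union[str, Sequence[str]], str]] = [
        ("why", why, "transition_pattern"),
        ("closes_regulations", covers_targets, "leverage.covers_targets"),
        ("expected_evidence", evidence, "transition_pattern"),
        ("typical_controls", controls, "execution"),
    ]
    for name, value, source in owned:
        if not value:
            continue  # empty input: leave the default and record no provenance
        setattr(draft, name, value if isinstance(value, str) else list(value))
        draft.provenance[name] = source
    draft.todo = list(PRACTITIONER_FIELDS)
    return draft


example = assemble_draft(
    capability_id="cap.example",  # hypothetical identifier
    why="Asked by auditors; missing because no owner is assigned.",
    covers_targets=["REG-A", "REG-B"],  # placeholder regulation names
    evidence=["signed policy"],
    controls=[],  # typical_controls may legitimately be empty
)
```

The same inputs always produce the same draft, which is the property the module docstring asks of the deterministic core. Recording provenance only for fields that were actually filled is a design choice of this sketch, not something the source specifies.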