Phase 5: Timefold timetable-solver-service + solution persistence

school-service additions:
  - tt_solution + tt_lesson migration. tt_lesson carries three UNIQUEs
    (solution+class, solution+teacher, solution+room per slot) so the
    DB itself rejects any double-booking the solver might emit by
    mistake.
  - Solution CRUD + GET solutions/:id/lessons endpoint with joined
    class/subject/teacher/room names for display.
  - POST /timetable/solutions creates the row, then triggers the
    solver-service via HTTP (5s timeout; the solution is marked failed
    if the service is unreachable).
  - SOLVER_SERVICE_URL config wired through main.go/handlers.
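
The three UNIQUEs on tt_lesson can be illustrated with a small sketch. It
uses an in-memory SQLite database as a stand-in for the real database, and
the column names (solution_id, timeslot_id, class_id, teacher_id, room_id)
are assumptions for illustration, not the actual migration:

```python
import sqlite3

# Hypothetical, simplified tt_lesson schema; column names are assumptions.
SCHEMA = """
CREATE TABLE tt_lesson (
    id          INTEGER PRIMARY KEY,
    solution_id TEXT NOT NULL,
    timeslot_id TEXT NOT NULL,
    class_id    TEXT NOT NULL,
    teacher_id  TEXT NOT NULL,
    room_id     TEXT NOT NULL,
    UNIQUE (solution_id, timeslot_id, class_id),
    UNIQUE (solution_id, timeslot_id, teacher_id),
    UNIQUE (solution_id, timeslot_id, room_id)
);
"""


def try_insert(conn, row):
    """Insert one lesson; return False if a UNIQUE constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO tt_lesson "
            "(solution_id, timeslot_id, class_id, teacher_id, room_id) "
            "VALUES (?, ?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# First lesson in slot mon-1 is accepted.
first = try_insert(conn, ("s1", "mon-1", "5a", "t-meier", "r101"))
# Same teacher in the same slot of the same solution is rejected.
same_teacher = try_insert(conn, ("s1", "mon-1", "5b", "t-meier", "r102"))
# The same teacher in a different slot is fine.
other_slot = try_insert(conn, ("s1", "mon-2", "5b", "t-meier", "r102"))
```

Because each constraint includes the solution, two different solutions can
still place the same teacher in the same slot without conflict.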

New service timetable-solver-service:
  - Python 3.11 + FastAPI + Timefold Solver 1.21 (Apache-2.0). Dockerfile
    bundles OpenJDK 17 since Timefold for Python is a JPype bridge.
  - app/domain.py — Timefold @planning_entity Lesson with timeslot+room
    as PlanningVariables; @planning_solution Timetable holds problem
    facts (rooms/teachers/etc.) AND rule-fact collections.
  - app/rules.py — frozen dataclasses mirroring 6 of the 15
    tt_constraint_* tables initially.
  - app/constraints.py — ConstraintProvider with 3 universal hard
    constraints (no double-booking) + 5 DB-driven constraints
    (teacher_unavailable_day/window, teacher_excluded_room,
    room_unavailable, room_requires_type) + 1 quality soft constraint
    (subject_preferred_period). Remaining 9 constraint types ready to
    plug in via the same join pattern.
  - app/repository.py — async loaders for master data (Stammdaten) +
    rules; builds one Lesson per (curriculum row × weekly_hours),
    skipping rows without a tt_assignment teacher.
  - app/runner.py — runs solver in ThreadPoolExecutor so the FastAPI
    event loop stays responsive. Updates tt_solution status
    pending→running→completed|infeasible|failed.
  - app/main.py — POST /api/v1/solve (202 Accepted, background task),
    GET /api/v1/jobs/{id}, /health. School-service polls tt_solution
    directly instead of GET /jobs for the typical case.
  - docker-compose.yml adds the service on port 8095, depending on
    core-health-check.
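
The lesson expansion described for app/repository.py could look roughly
like the sketch below. CurriculumRow, the simplified Lesson, and
expand_lessons are hypothetical names with assumed fields; the real loader
reads from the database and builds Timefold planning entities:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CurriculumRow:
    """Hypothetical curriculum row; field names are assumptions."""
    class_id: str
    subject_id: str
    weekly_hours: int
    teacher_id: Optional[str]  # None when no tt_assignment exists


@dataclass
class Lesson:
    """Simplified stand-in for the planning entity (no timeslot/room)."""
    id: str
    class_id: str
    subject_id: str
    teacher_id: str


def expand_lessons(rows: List[CurriculumRow]) -> List[Lesson]:
    """Build one Lesson per weekly hour, skipping rows without a teacher."""
    lessons = []
    for row in rows:
        if row.teacher_id is None:
            continue  # nothing to schedule without an assigned teacher
        for n in range(row.weekly_hours):
            lessons.append(
                Lesson(
                    id=f"{row.class_id}-{row.subject_id}-{n}",
                    class_id=row.class_id,
                    subject_id=row.subject_id,
                    teacher_id=row.teacher_id,
                )
            )
    return lessons


rows = [
    CurriculumRow("5a", "math", 4, "t-meier"),
    CurriculumRow("5a", "art", 2, None),  # skipped: no teacher assigned
]
lessons = expand_lessons(rows)
```

With these two rows, only the four math lessons are created; the art row
contributes nothing until a teacher is assigned.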

Tests:
  - school-service: validator test for CreateTimetableSolutionRequest
    (allows empty name).
  - solver-service: tests/test_domain.py + tests/test_rules.py cover
    construction + hashability of the planning facts. Full solve flow
    deferred to Phase 8 integration with seed data.
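
A hashability check of the kind mentioned for tests/test_rules.py might
look like the following sketch. TeacherUnavailableDay and its fields are
assumptions for illustration and may not match the dataclasses in
app/rules.py:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class TeacherUnavailableDay:
    """Hypothetical rule fact; field names are assumptions."""
    teacher_id: str
    day_of_week: int


a = TeacherUnavailableDay("t-meier", 1)
b = TeacherUnavailableDay("t-meier", 1)

# Equal frozen dataclasses hash alike, so they collapse in a set.
facts = {a, b}


def is_immutable(fact: TeacherUnavailableDay) -> bool:
    """Return True if assigning to a field raises FrozenInstanceError."""
    try:
        fact.day_of_week = 2
    except FrozenInstanceError:
        return True
    return False
```

Frozen dataclasses get a generated __hash__ based on their fields, which is
what allows them to be used in sets and as dictionary keys.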

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-22 00:16:52 +02:00
parent 082a5bb68c
commit f042f2896b
25 changed files with 1431 additions and 2 deletions
@@ -0,0 +1,94 @@
"""Solver job runner. One async entry point per solve.

Lifecycle:
  1. mark_running                 -> tt_solution.status = 'running'
  2. build_problem                -> Timetable from DB
  3. SolverFactory.build_solver() -> Timefold solver
  4. solver.solve(problem)        -> completed Timetable
  5. persist_solution or mark_infeasible based on hard_score

Errors at any step -> mark_failed.

Long solves are CPU-bound. We run the solver in an executor so the FastAPI
event loop stays responsive for other requests.
"""

import asyncio
import logging
import traceback
from concurrent.futures import ThreadPoolExecutor

from timefold.solver import SolverFactory
from timefold.solver.config import (
    Duration,
    ScoreDirectorFactoryConfig,
    SolverConfig,
    TerminationConfig,
)

from .config import settings
from .constraints import define_constraints
from .db import get_pool
from .domain import Lesson, Timetable
from .repository import (
    build_problem,
    mark_failed,
    mark_infeasible,
    mark_running,
    persist_solution,
)

logger = logging.getLogger(__name__)

# At most two solves run at once; further jobs wait in the executor queue.
_executor = ThreadPoolExecutor(max_workers=2)

# Built once at import time; each job gets its own solver via build_solver().
_solver_factory = SolverFactory.create(
    SolverConfig(
        solution_class=Timetable,
        entity_class_list=[Lesson],
        # Pass the typed config object rather than a plain dict.
        score_director_factory_config=ScoreDirectorFactoryConfig(
            constraint_provider_function=define_constraints,
        ),
        termination_config=TerminationConfig(
            spent_limit=Duration(seconds=settings.solver_seconds_limit),
        ),
    )
)


def _solve_sync(problem: Timetable) -> Timetable:
    """Blocking solver call; runs in a worker thread."""
    solver = _solver_factory.build_solver()
    return solver.solve(problem)


async def run_solve(solution_id: str, user_id: str) -> None:
    """Top-level async entry. Caller fires-and-forgets via BackgroundTasks."""
    pool = await get_pool()
    try:
        await mark_running(pool, solution_id)
        problem = await build_problem(pool, user_id)

        # Fail fast on empty inputs. The messages are user-facing (German).
        if not problem.lessons:
            # "No lessons: check the curriculum and teaching assignments."
            await mark_failed(pool, solution_id,
                              "Keine Lessons — pruefe Stundentafel + Lehrauftraege.")
            return
        if not problem.timeslots:
            # "No time grid defined."
            await mark_failed(pool, solution_id,
                              "Kein Zeitraster definiert.")
            return
        if not problem.rooms:
            # "No rooms defined."
            await mark_failed(pool, solution_id,
                              "Keine Raeume definiert.")
            return

        # Run the blocking solve in a worker thread, off the event loop.
        loop = asyncio.get_running_loop()
        solved: Timetable = await loop.run_in_executor(_executor, _solve_sync, problem)

        # hard_score / soft_score are attributes of the score, not methods.
        score = solved.score
        hard = score.hard_score if score is not None else 0
        soft = score.soft_score if score is not None else 0

        if hard < 0:
            await mark_infeasible(pool, solution_id, hard, soft)
            logger.info("Solution %s infeasible: hard=%d soft=%d", solution_id, hard, soft)
        else:
            await persist_solution(pool, solution_id, solved, hard, soft)
            logger.info("Solution %s completed: hard=%d soft=%d", solution_id, hard, soft)
    except Exception as exc:
        logger.exception("Solver failed for %s", solution_id)
        try:
            # Store the exception plus a truncated traceback on the solution row.
            await mark_failed(
                pool,
                solution_id,
                f"{exc.__class__.__name__}: {exc}\n{traceback.format_exc()[:1000]}",
            )
        except Exception:
            logger.exception("Failed to even mark solution as failed")
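
The executor hand-off used by run_solve can be tried in isolation. The
following is a minimal sketch rather than the service code: blocking_solve
is a hypothetical stand-in for solver.solve() that only sleeps, and the
heartbeat task exists solely to show that the event loop keeps running
while the worker thread is busy.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)


def blocking_solve(seconds: float) -> str:
    """Hypothetical stand-in for solver.solve(); blocks its thread."""
    time.sleep(seconds)
    return "solved"


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Append a tick every `interval` seconds; needs a free event loop."""
    for i in range(count):
        await asyncio.sleep(interval)
        ticks.append(i)


async def main():
    ticks = []
    loop = asyncio.get_running_loop()
    hb = asyncio.create_task(heartbeat(ticks, 0.01, 3))
    # The blocking call runs in a worker thread; the loop stays free.
    result = await loop.run_in_executor(executor, blocking_solve, 0.3)
    ticks_at_completion = len(ticks)
    await hb
    return result, ticks, ticks_at_completion


result, ticks, ticks_at_completion = asyncio.run(main())
executor.shutdown()
```

If blocking_solve were called directly inside main() instead of through
run_in_executor, the heartbeat could not tick until the call returned.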