58957a4aaa
1. Dockerfile: install Playwright as appuser (not root) so the Chromium binary is accessible at runtime. This was causing a 500 error.
2. DSE service matching: text-search fallback when LLM extraction fails. If "etracker" appears in the DSE text, the service is marked as documented even if the LLM did not parse the service list.
3. CMP skip: consent managers in category "cmp" are skipped (not just those in category "other" with id "cmp").

NOT DEPLOYED: RAG pipeline is running.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
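The matching rules in items 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `DetectedService`, `is_consent_manager`, and `match_services` are hypothetical names, and it assumes a failed LLM extraction is signalled by passing `None` for the service list.

```python
from dataclasses import dataclass


@dataclass
class DetectedService:
    """Hypothetical record for a service found on the scanned site."""
    id: str        # e.g. "etracker"
    name: str      # display name searched for in the DSE text
    category: str  # e.g. "analytics", "cmp", "other"


def is_consent_manager(service):
    # Skip anything in category "cmp", plus the older case of
    # category "other" with id "cmp".
    return service.category == "cmp" or (
        service.category == "other" and service.id == "cmp"
    )


def match_services(detected, dse_text, llm_services):
    """Map service id -> documented flag, skipping consent managers.

    llm_services is the list of names the LLM extracted, or None when
    extraction failed (an assumption of this sketch).
    """
    text = dse_text.casefold()
    llm_names = {n.casefold() for n in llm_services} if llm_services else set()
    result = {}
    for service in detected:
        if is_consent_manager(service):
            continue
        name = service.name.casefold()
        # Fallback: a plain substring hit in the DSE text counts as
        # documented even when the LLM list is missing or incomplete.
        result[service.id] = name in llm_names or name in text
    return result
```

A plain substring search can produce false positives for short service names, so a real implementation might want word-boundary matching; the sketch keeps the simplest form of the fallback.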
32 lines
867 B
Docker
FROM python:3.12-slim-bookworm

WORKDIR /app

# Install system dependencies for Playwright/Chromium
RUN apt-get update && apt-get install -y --no-install-recommends \
    libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 \
    libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 \
    libxrandr2 libgbm1 libpango-1.0-0 libcairo2 libasound2 \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Create user BEFORE installing Playwright (so browsers are in user's cache)
RUN useradd --create-home appuser

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Install Playwright browsers AS appuser (so they land in /home/appuser/.cache/)
USER appuser
RUN playwright install chromium
USER root

COPY . .
RUN chown -R appuser:appuser /app

USER appuser

EXPOSE 8094

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8094"]
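
Because the fix depends on the browsers landing in the runtime user's cache, a small startup check along these lines could surface a missing or unreadable install before the first request fails. It is a sketch under stated assumptions: the function names are hypothetical, it relies on Playwright's default Linux cache location (`~/.cache/ms-playwright`) and the `PLAYWRIGHT_BROWSERS_PATH` override, and it does not handle the special value `0` of that variable.

```python
import os
from pathlib import Path


def playwright_cache_root(env=None, home=None):
    """Return the directory where Playwright is expected to keep browsers.

    env and home are injectable for testing; by default the real
    environment and the current user's home directory are used.
    """
    env = os.environ if env is None else env
    override = env.get("PLAYWRIGHT_BROWSERS_PATH")
    if override:
        # Note: the special value "0" is not handled in this sketch.
        return Path(override)
    home = Path.home() if home is None else Path(home)
    return home / ".cache" / "ms-playwright"


def find_chromium_installs(cache_root):
    """List chromium* directories the current user can read and traverse."""
    cache_root = Path(cache_root)
    if not cache_root.is_dir():
        return []
    return sorted(
        p for p in cache_root.glob("chromium*")
        if p.is_dir() and os.access(p, os.R_OK | os.X_OK)
    )
```

Calling `find_chromium_installs(playwright_cache_root())` as appuser inside the container should return a non-empty list if the `playwright install chromium` step ran under the same user; an empty list points to the root-versus-appuser mismatch described in the commit message.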