## What Changes

- LLM goes from optional sidecar → core of the pipeline
- Text-only Groq → vision-capable DeepSeek VLM
- Sequential whole-doc processing → parallel per-page processing
- Heuristic semantic analysis eliminated entirely
- Rule-based HTML rendering eliminated — VLM generates HTML directly
- NormalizedNode/BBox/SemanticIntent data model no longer needed
- converter.py becomes async to support parallelism
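The async, parallel per-page flow can be sketched as below. This is a minimal illustration under assumptions: `render_page_html`, `convert`, and the `vlm_call` parameter are hypothetical names, not identifiers from the codebase, and `vlm_call` stands in for an async DeepSeek VLM client wrapper.

```python
import asyncio

async def render_page_html(page_png: bytes, vlm_call) -> str:
    # vlm_call is an async callable (e.g. a DeepSeek VLM client wrapper)
    # that takes a page image and returns an HTML fragment for that page.
    return await vlm_call(page_png)

async def convert(pages: list[bytes], vlm_call, max_concurrency: int = 4) -> str:
    # Bound concurrency so we don't flood the VLM endpoint.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(page: bytes) -> str:
        async with sem:
            return await render_page_html(page, vlm_call)

    # gather() runs pages concurrently but returns results in page order,
    # so the assembled document preserves the original page sequence.
    fragments = await asyncio.gather(*(bounded(p) for p in pages))
    return "\n".join(fragments)
```

The semaphore keeps the fan-out bounded; raising `max_concurrency` trades API pressure for wall-clock time.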
## What Stays

- FastAPI app structure and API contract unchanged
- Job lifecycle (create → poll → result) unchanged
- Output format: document.html + manifest.json
- A11y validation pass (updated, not replaced)
- Frontend (static/index.html) unchanged
- File upload validation logic unchanged
- PPTX support (convert to PDF first, then same pipeline)