Xentropics Tiny ETL

Lightweight PDF ETL for teams that need files, not ceremony.

Sign in with Google, upload a PDF batch, run an entity-extraction profile, and keep every intermediate JSON artifact beside the generated HTML output.

Access is controlled by an email allowlist in config/allowed-emails.json.

If you do not have a Google account, use this sign-in option instead. You must still be on the allowlist.
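
The allowlist check that applies to both sign-in paths can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes config/allowed-emails.json holds a flat JSON array of address strings (the real schema is not documented here), and the function names parse_allowlist, load_allowlist, and is_allowed are hypothetical.

```python
import json
from pathlib import Path


def parse_allowlist(json_text):
    """Parse a JSON array of emails into a set of normalized addresses.

    Assumption: the file is a flat array such as ["a@example.com"].
    """
    entries = json.loads(json_text)
    if not isinstance(entries, list):
        raise ValueError("allowlist must be a JSON array of strings")
    # Normalize so that comparison ignores case and stray whitespace.
    return {str(entry).strip().lower() for entry in entries}


def load_allowlist(path="config/allowed-emails.json"):
    """Read the allowlist file from disk and parse it."""
    return parse_allowlist(Path(path).read_text(encoding="utf-8"))


def is_allowed(email, allowlist):
    """Return True when the signed-in email appears on the allowlist."""
    return email.strip().lower() in allowlist
```

Normalizing both sides to lowercase avoids rejecting a user whose identity provider reports the address with different capitalization than the file uses.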

Extract

Parse PDFs into normalized text and store the raw extraction as JSON.
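
As a rough sketch of this step, the code below normalizes already-extracted page text and writes the raw and normalized versions to a JSON file. The PDF parsing itself is left out because the parser used by the tool is not named here; the record layout, the ".extract.json" suffix, and the function names are all assumptions made for illustration.

```python
import json
import re
import unicodedata
from pathlib import Path


def normalize_text(raw):
    """Apply NFKC normalization, unify line endings, and collapse spaces."""
    text = unicodedata.normalize("NFKC", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line but keep the line structure of the page.
    return "\n".join(line.strip() for line in text.split("\n")).strip()


def build_extraction_record(source_name, pages):
    """Pair each page's raw text with its normalized form (hypothetical layout)."""
    return {
        "source": source_name,
        "pages": [
            {"page": number, "raw": raw, "text": normalize_text(raw)}
            for number, raw in enumerate(pages, start=1)
        ],
    }


def write_extraction(record, out_dir):
    """Store the record as JSON next to other artifacts and return its path."""
    out_path = Path(out_dir) / (Path(record["source"]).stem + ".extract.json")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_path
```

Keeping the raw text alongside the normalized text means a later step can be re-run or audited without parsing the PDF again.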

Transform

Run OCR repair tools and optional OpenAI prompts for cleanup and entity capture.
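
One way to picture this stage is as a chain of small repair functions with an optional model-based cleanup at the end. The sketch below is illustrative only: the two repair functions are examples of common OCR fixes rather than the tool's actual repair set, and llm_cleanup is a hypothetical caller-supplied callable standing in for an OpenAI prompt, so no API call is shown.

```python
import re


def join_hyphenated_breaks(text):
    """Rejoin a word that was split by a hyphen at the end of a line."""
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def collapse_blank_lines(text):
    """Reduce runs of three or more newlines to a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


def run_transform(text, repairs, llm_cleanup=None):
    """Apply each repair in order and record every intermediate result.

    llm_cleanup is optional; when given, it runs last on the repaired text.
    """
    steps = []
    for repair in repairs:
        text = repair(text)
        steps.append({"step": repair.__name__, "text": text})
    if llm_cleanup is not None:
        text = llm_cleanup(text)
        steps.append({"step": "llm_cleanup", "text": text})
    return text, steps
```

The returned steps list can be serialized to JSON so each intermediate artifact is kept. Note that join_hyphenated_breaks also merges genuine compounds that happen to break at the hyphen, so a real repair tool would likely need a dictionary check.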

Load

Render HTML reports from a template and keep outputs on disk for review.
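
A minimal sketch of template-based rendering, using only the Python standard library. The template text, the entity fields ("text" and "type"), and the function names are assumptions for illustration; the tool's real template and entity schema are not described here.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical report template; $title and $items are filled at render time.
REPORT_TEMPLATE = Template(
    "<!doctype html>\n"
    "<html><head><meta charset=\"utf-8\"><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n<ul>\n$items\n</ul></body></html>\n"
)


def render_report(title, entities):
    """Render an HTML report, escaping all extracted values."""
    items = "\n".join(
        "<li>{} ({})</li>".format(html.escape(e["text"]), html.escape(e["type"]))
        for e in entities
    )
    return REPORT_TEMPLATE.substitute(title=html.escape(title), items=items)


def write_report(html_text, out_path):
    """Write the rendered report to disk so it can be reviewed later."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(html_text, encoding="utf-8")
    return out_path
```

Escaping matters here because entity text comes from uploaded PDFs and may contain characters such as < or & that would otherwise break the report markup.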