Cerova LabelCore ingests the US FDA drug-labeling corpus, normalises it to a canonical FHIR ePI model, and machine-translates the patient-facing leaflet to Spanish behind a blocking numeric/unit fidelity gate and a risk-tiered human-review workflow. It then serves both languages over a fast, cache-first public API. The control plane is called the Product Information Control Plane.
A regulated product-information hosting and translation system. Every served record carries provenance, a mandatory disclaimer, and a deep link to the authoritative FDA source — and every Spanish output has passed a deterministic safety gate before a human ever sees it.
The full DailyMed Structured Product Labeling corpus is parsed and normalised to a single FHIR ePI document model with LOINC-coded sections, NDCs, set IDs, and content hashes. One schema for 1,000 products, queryable as data, not PDFs.
An AI agent fleet produces the Spanish, but the qualifying control is independent verification — a blocking numeric/unit gate plus risk-tiered human review. Zero high-risk sections auto-publish.
Both languages served from the edge — search by name, NDC, or ingredient; fetch a leaflet or a FHIR Composition. Zero-egress object storage, and unchanged content is never re-translated, so steady-state translation cost approaches zero.
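The translate-changed-only behaviour can be illustrated with a small content-addressed cache. This is a minimal sketch, not the production design: SHA-256 as the hash, the in-memory dictionary standing in for object storage, and the names `content_hash` and `TranslationCache` are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict


def content_hash(text: str) -> str:
    """Stable fingerprint of a source segment (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TranslationCache:
    """Hypothetical cache keyed by source content hash.

    The dict stands in for whatever persistent store the real system uses.
    """

    def __init__(self, translate: Callable[[str], str]) -> None:
        self._translate = translate
        self._by_hash: Dict[str, str] = {}
        self.engine_calls = 0  # counts how often the translation engine ran

    def get(self, source_text: str) -> str:
        key = content_hash(source_text)
        if key not in self._by_hash:
            # Only unseen (new or changed) content reaches the engine.
            self.engine_calls += 1
            self._by_hash[key] = self._translate(source_text)
        return self._by_hash[key]
```

With a cache like this, re-ingesting a label whose sections are byte-identical triggers no engine calls, which is why steady-state cost depends on how much content changes rather than on corpus size.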
Source set ID, content hashes, engine + version, reviewer, and an append-only audit trail travel with each document. Unofficial translations are disclaimed and deep-linked to the official FDA labeling — defensible by construction.
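As a rough picture of what "travels with each document", the provenance fields listed above could be modelled as an immutable record. The class name, field names, disclaimer wording, and the tuple-based audit trail are illustrative assumptions; the source does not publish its actual schema.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Placeholder wording; the real disclaimer text is not given in the source.
DISCLAIMER = "Unofficial translation. Refer to the official FDA labeling."


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance envelope attached to a served document."""

    source_set_id: str
    source_hash: str
    target_hash: str
    engine: str
    engine_version: str
    source_url: str  # deep link to the authoritative FDA/DailyMed record
    reviewer: Optional[str] = None
    disclaimer: str = DISCLAIMER
    audit_trail: Tuple[str, ...] = ()

    def with_event(self, event: str, **changes) -> "Provenance":
        """Return a new record with the event appended; the original is unchanged."""
        return replace(self, audit_trail=self.audit_trail + (event,), **changes)
```

Freezing the dataclass and appending through `with_event` mimics an append-only trail in memory: earlier states are never mutated, only superseded. A real system would also need durable, tamper-evident storage, which this sketch does not attempt.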
A write plane ingests, normalises, translates, and gates content asynchronously. A read plane serves it from cache. The gate sits on the boundary — nothing reaches a reader until it has either passed deterministically or been approved by a human.
This is the patient-safety crux of doing pharma translation with AI. Before any Spanish segment can publish, every number and every locale-invariant unit in the English source must be present and unchanged in the target. The check is deterministic, it fails closed, and it is the reason an AI fleet can be trusted to touch a drug label at all.
| Source (EN) | Candidate (ES) | Why | Verdict |
|---|---|---|---|
| 200 mg | 200 mg | Number and unit identical. Atoms match exactly. | PASS · publishable |
| 200 mg | 207 mg | 200 missing, 207 added. A dose drifted. Blocked and routed to a human — it can never auto-publish. | BLOCK → review |
| 2.5 mL | 2,5 mL | Spanish decimal comma. Canonicalised to the same value — benign locale formatting is neutralised, not blocked. | PASS · locale-equal |
Only international unit symbols (mg, mcg, mL, IU, %, mg/mL…) are gated — they are identical across languages. Words like "hours" or "tablets" do translate, so the gate verifies their numerals but never the words. Word-level dosing fidelity is covered by the model plus mandatory human review of every dosing section.
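The three rows of the table can be reproduced with a short sketch of such a gate. It is written under stated assumptions rather than taken from the service: the `UNITS` tuple is a hypothetical subset of the gated symbols, the regular expression is one possible tokeniser, and the names `atoms`, `numeric_unit_gate`, and `GateResult` are invented here. Numbers are compared as canonicalised strings, so a decimal comma equals a decimal point, but thousands separators are not interpreted and any other difference blocks.

```python
import re
from collections import Counter
from typing import List, NamedTuple, Tuple

# Hypothetical subset of locale-invariant unit symbols.
UNITS = ("mg/mL", "mcg", "mg", "mL", "IU", "%")

# Longest symbols first so "mg/mL" is preferred over "mg".
_UNIT_ALT = "|".join(re.escape(u) for u in sorted(UNITS, key=len, reverse=True))
# A number, optionally followed by a gated unit symbol that is not part of a longer word.
_ATOM = re.compile(rf"(\d+(?:[.,]\d+)?)(?!\d)(?:\s*({_UNIT_ALT})(?![A-Za-z/]))?")

Atom = Tuple[str, str]  # (canonical number, unit symbol or "" if none)


class GateResult(NamedTuple):
    passed: bool
    missing: List[Atom]  # present in the source, absent from the candidate
    added: List[Atom]    # present in the candidate, absent from the source


def atoms(text: str) -> Counter:
    """Extract a multiset of (number, unit) atoms, with decimal commas canonicalised."""
    found: Counter = Counter()
    for number, unit in _ATOM.findall(text):
        found[(number.replace(",", "."), unit)] += 1
    return found


def numeric_unit_gate(source_en: str, candidate_es: str) -> GateResult:
    """Pass only if both texts contain exactly the same atoms; otherwise block."""
    src, tgt = atoms(source_en), atoms(candidate_es)
    missing = sorted((src - tgt).elements())
    added = sorted((tgt - src).elements())
    return GateResult(not missing and not added, missing, added)
```

Called on the table's rows, `numeric_unit_gate("200 mg", "207 mg")` reports `("200", "mg")` as missing and `("207", "mg")` as added, while `"2.5 mL"` against `"2,5 mL"` passes. A sentence such as "Take 2 tablets every 6 hours" yields only the bare numerals 2 and 6, so the translated words are left to the model and the human reviewer.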
0 high-risk sections auto-published. ~99% of 30,000+ translated segments clear the gate; the rest are held. OTC Drug Facts auto-publish on a clean pass, while Rx patient leaflets, Medication Guides, and boxed warnings are always held for human approval regardless of gate result.
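The publishing policy described here reduces to a small decision function. The sketch below is an assumed shape, not the service's code: the enum members and the `route` function are hypothetical names, and the allowlist design is a choice made so that any document type not explicitly listed defaults to human review.

```python
from enum import Enum


class DocType(Enum):
    OTC_DRUG_FACTS = "otc_drug_facts"
    RX_PATIENT_LEAFLET = "rx_patient_leaflet"
    MEDICATION_GUIDE = "medication_guide"
    BOXED_WARNING = "boxed_warning"


class Decision(Enum):
    PUBLISH = "publish"
    HOLD_FOR_REVIEW = "hold_for_review"


# Allowlist: only these types may ever auto-publish.
AUTO_PUBLISH_ELIGIBLE = frozenset({DocType.OTC_DRUG_FACTS})


def route(doc_type: DocType, gate_passed: bool) -> Decision:
    """Auto-publish only low-risk content that cleared the gate; hold everything else."""
    if gate_passed and doc_type in AUTO_PUBLISH_ELIGIBLE:
        return Decision.PUBLISH
    return Decision.HOLD_FOR_REVIEW
```

Under this policy a clean gate result is necessary but never sufficient for high-risk content: `route(DocType.BOXED_WARNING, True)` still returns `Decision.HOLD_FOR_REVIEW`.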
Every response below was returned by the live service. Base URL: https://lumen.james-564.workers.dev (migrating to api.cerova.io).
For a pharma buyer, the differentiator isn't the translation; it's being able to prove, per record, where it came from, how it was produced, who verified it, and that nothing was published that shouldn't have been. That's built into the data model, not bolted on.
Predicate-rule driven. The authoritative record stays with the FDA; the served content is a disclaimed, deep-linked derivative, so full 21 CFR Part 11 e-records/e-signature controls aren't triggered. The rationale is documented, and integrity controls are applied in proportion to risk.
Assurance sized by risk, not ceremony, per FDA's Computer Software Assurance (CSA) approach and GAMP 5. Because AI authors the work, the qualifying control is independent verification (test, gate, and human review), not authorship.
The unofficial-translation disclaimer and the deep link to FDA/DailyMed are part of every payload by construction — a reader, an integrator, or an auditor always reaches the authoritative source in one hop.
The data is public but unusable. DailyMed is open, but it's English-only, document-shaped, and not built to query. Turning it into governed, bilingual, structured ePI is the hard part — and the durable one.
AI makes the translation tractable; controls make it shippable. The reason no one has a trustworthy Spanish FDA-leaflet source isn't translation cost — it's the patient-safety risk of getting a dose wrong. The gate is what unlocks doing it at scale.
Edge economics make it defensible. Translate-changed-only plus zero-egress storage and cache-first reads mean coverage compounds while marginal cost trends to zero. Because unchanged content is never re-translated, carrying it forward costs roughly nothing.
The same controls-around-AI philosophy that governs the product governs how it's built. An AI fleet authors code and translations; independent verification is the qualifying control. Work is traced in an SDLC tool with risk-tiered rigour, and assurance follows CSA — effort proportionate to the risk each capability poses to patient safety and data integrity.