Cerova LabelCore — Governed product information, translated and hosted with control

What it is

Not just translated. Governed.

A regulated product-information hosting and translation system. Every served record carries provenance, a mandatory disclaimer, and a deep link to the authoritative FDA source — and every Spanish output has passed a deterministic safety gate before a human ever sees it.

Canonical FHIR ePI model

The full DailyMed Structured Product Labeling corpus is parsed and normalised to a single FHIR ePI document model, with LOINC-coded sections, NDCs, set IDs, and content hashes. One schema for 1,000 products, queryable as data, not PDFs.

AI translation with hard controls

An AI agent fleet produces the Spanish, but the qualifying control is independent verification — a blocking numeric/unit gate plus risk-tiered human review. Zero high-risk sections auto-publish.

Cache-first public API

Both languages served from the edge — search by name, NDC, or ingredient; fetch a leaflet or a FHIR Composition. Zero-egress object storage, and unchanged content is never re-translated, so steady-state translation cost approaches zero.
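A minimal sketch of the "unchanged content is never re-translated" idea, assuming segments are keyed by a SHA-256 hash of their whitespace-normalised text. The function names, the normalisation rule, and the in-memory dictionary standing in for the translation memory are all illustrative assumptions, not the service's actual implementation.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the whitespace-normalised text (normalisation is an assumption)."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def translate_changed_only(segments, memory, translate):
    """Translate only segments whose hash is not yet in the memory.

    `memory` maps content hash -> stored translation; `translate` is any
    callable that turns one source segment into a target segment.
    Returns the translated segments and the number of engine calls made.
    """
    results, engine_calls = [], 0
    for segment in segments:
        key = content_hash(segment)
        if key not in memory:
            memory[key] = translate(segment)
            engine_calls += 1
        results.append(memory[key])
    return results, engine_calls


# Hypothetical stand-in for the translation engine.
def fake_translate(segment: str) -> str:
    return f"[es] {segment}"


memory = {}
first, calls_first = translate_changed_only(["Take 1 tablet", "Purpose"], memory, fake_translate)
second, calls_second = translate_changed_only(["Take 1 tablet", "Purpose"], memory, fake_translate)
print(calls_first, calls_second)
```

On the second pass nothing has changed, so no engine call is made, which is the sense in which steady-state cost trends toward zero.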

Provenance on every record

Source set ID, content hashes, engine + version, reviewer, and an append-only audit trail travel with each document. Unofficial translations are disclaimed and deep-linked to the official FDA labeling — defensible by construction.
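One way to picture a provenance record with an append-only trail is sketched here. The field names mirror the items listed above, but the class names and the hash-chaining of audit entries are assumptions made for illustration; the source only states that the trail is append-only.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Immutable provenance fields carried with a served document."""
    source_set_id: str
    source_sha256: str
    target_sha256: str
    engine: str
    reviewer: str
    review_status: str
    official_source: str


class AuditTrail:
    """Append-only log. Chaining each entry to the previous hash is an
    assumed technique for tamper evidence, not a documented feature."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else ""
        digest = self._digest(prev, event)
        self._entries.append({"event": event, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = ""
        for entry in self._entries:
            if entry["prev"] != prev or self._digest(prev, entry["event"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


trail = AuditTrail()
trail.append({"action": "translated"})   # hypothetical event shapes
trail.append({"action": "approved"})
print(trail.verify())
```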

The safety gate

A dose can never drift.

This is the patient-safety crux of doing pharma translation with AI. Before any Spanish segment can publish, every number and every locale-invariant unit in the English source must be present and unchanged in the target. The check is deterministic, it fails closed, and it is the reason an AI fleet can be trusted to touch a drug label at all.

Source (EN)	Candidate (ES)	Why	Verdict
200 mg	200 mg	Number and unit identical. Atoms match exactly.	PASS · publishable
200 mg	207 mg	200 missing, 207 added. A dose drifted. Blocked and routed to a human — it can never auto-publish.	BLOCK → review
2.5 mL	2,5 mL	Spanish decimal comma. Canonicalised to the same value — benign locale formatting is neutralised, not blocked.	PASS · locale-equal

Scope is deliberate

Only international unit symbols (mg, mcg, mL, IU, %, mg/mL…) are gated — they are identical across languages. Words like "hours" or "tablets" do translate, so the gate verifies their numerals but never the words. Word-level dosing fidelity is covered by the model plus mandatory human review of every dosing section.
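The gate described above can be sketched in a few lines, under stated assumptions: the unit list, the regular expression, and the function names are illustrative, and thousands separators are out of scope for this sketch. It extracts (number, unit) atoms from both sides, canonicalises a decimal comma to a decimal point, and blocks on any difference.

```python
import re
from collections import Counter

# Assumed list of locale-invariant unit symbols; the real gate's list is not given.
INVARIANT_UNITS = ("mg/mL", "mcg", "mg", "mL", "IU", "%")

# Longest symbols first so "mg/mL" is not matched as plain "mg".
_UNITS = "|".join(re.escape(u) for u in sorted(INVARIANT_UNITS, key=len, reverse=True))
# A number (decimal point or comma), optionally followed by a gated unit symbol.
_ATOM = re.compile(rf"(\d+(?:[.,]\d+)?)(?!\d)(?:\s*({_UNITS})(?![A-Za-z]))?")


def atoms(text: str) -> Counter:
    """Multiset of (number, unit) pairs; unit is '' when no gated symbol follows."""
    found = Counter()
    for number, unit in _ATOM.findall(text):
        found[(number.replace(",", "."), unit)] += 1
    return found


def gate(source_en: str, candidate_es: str):
    """Return (verdict, missing, added). Any difference in atoms blocks."""
    src, tgt = atoms(source_en), atoms(candidate_es)
    missing = src - tgt   # atoms in the source that the target lost
    added = tgt - src     # atoms in the target that the source never had
    verdict = "PASS" if not missing and not added else "BLOCK"
    return verdict, missing, added


print(gate("Take 200 mg", "Tome 200 mg")[0])
print(gate("Take 200 mg", "Tome 207 mg")[0])
print(gate("Give 2.5 mL", "Administre 2,5 mL")[0])
print(gate("every 24 hours", "cada 24 horas")[0])
```

The last call illustrates the scope rule: "hours" is not a gated symbol, so only the numeral 24 is compared and the translated word is left to human review.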

The KPI that matters

0 high-risk sections auto-published. ~99% of 30,000+ translated segments clear the gate; the rest are held. OTC Drug Facts auto-publish on a clean pass, while Rx patient leaflets, Medication Guides, and boxed warnings are always held for human approval regardless of gate result.
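The publish rule in the paragraph above can be read as a small fail-closed decision function. This is a sketch only: the document-kind identifiers are hypothetical, and per-section risk tiers are deliberately left out.

```python
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    HOLD_FOR_REVIEW = "hold_for_review"


# Kinds the text says are always held; the identifier strings are hypothetical.
ALWAYS_HELD = {"rx_patient_leaflet", "medication_guide", "boxed_warning"}


def decide(doc_kind: str, gate_passed: bool) -> Decision:
    """Auto-publish only OTC Drug Facts with a clean gate pass."""
    if doc_kind in ALWAYS_HELD:
        return Decision.HOLD_FOR_REVIEW      # held regardless of gate result
    if doc_kind == "otc_drug_facts" and gate_passed:
        return Decision.PUBLISH
    # Fail closed: gate failures and unrecognised kinds go to a human.
    return Decision.HOLD_FOR_REVIEW


print(decide("otc_drug_facts", True).value)
print(decide("boxed_warning", True).value)
```

Making the final branch a hold rather than a publish means a new or misspelled document kind can never slip through by default.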

See it live

It's real. Here are the calls.

Every response below was returned by the live service. Base URL: https://lumen.james-564.workers.dev (migrating to api.cerova.io).

$ curl -H "x-api-key: lmn_live_…" \
https://lumen.james-564.workers.dev/v1/stats

{
  "products": 1000,
  "productsWithEs": 312,
  "documents": 1000,
  "patientFacingSectionsEn": 2081,
  "esSectionsPublishable": 858,
  "esSectionsInReview": 901,
  "esSectionsGateFailed": 73,
  "sourceSegments": 32472,
  "sourceSegmentsTranslated": 30692,
  "translationMemoryApproved": 10402,
  "reviewQueuePending": 974,
  "segmentCoveragePct": 94.5,
  "generatedAt": "2026-06-16T15:11:49.492Z"
}

Coverage, reported honestly. 1,000 products are live; Spanish coverage is still being completed toward 100%, prioritised by usage. The gate-failed and in-review counts are not hidden: they are the governance working as designed.
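The figures in the /v1/stats response are internally consistent, which a short check makes visible. The two relationships tested here are observed in this one sample; treating them as the service's definitions of these fields is an assumption.

```python
# Values copied from the /v1/stats response above.
stats = {
    "esSectionsInReview": 901,
    "esSectionsGateFailed": 73,
    "sourceSegments": 32472,
    "sourceSegmentsTranslated": 30692,
    "reviewQueuePending": 974,
    "segmentCoveragePct": 94.5,
}

# Coverage as translated segments over source segments, to one decimal place.
coverage = round(100 * stats["sourceSegmentsTranslated"] / stats["sourceSegments"], 1)

# Sections awaiting a human: in review plus gate-failed.
held = stats["esSectionsInReview"] + stats["esSectionsGateFailed"]

print(coverage == stats["segmentCoveragePct"])
print(held == stats["reviewQueuePending"])
```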

$ curl -H "x-api-key: lmn_live_…" \
"https://lumen.james-564.workers.dev/v1/search?q=omeprazole&type=name"

{
  "query": "omeprazole", "type": "name", "count": 5,
  "results": [
    {
      "setId": "6919399e-f112-4274-b4de-df0b4c391e63",
      "brandName": "omeprazole",
      "genericName": "omeprazole",
      "ndcs": ["11822-1040", "11822-1040-1"],
      "docType": "otc", "usageRank": 47,
      "languages": ["en", "es"]
    },
    { "brandName": "Omeprazole", "docType": "rx", "usageRank": 46, "languages": ["en","es"] }
    … 3 more
  ]
}

Query as data. Search by name, NDC, or ingredient. Each hit carries its doc-type and which languages are live — note OTC vs Rx is first-class, because it changes how a section is classified and reviewed.
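For callers who prefer code to curl, the search call might be wrapped as follows. Only type=name appears in the sample above; the "ndc" and "ingredient" values are guesses based on the description, and the helper name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://lumen.james-564.workers.dev"

# "name" is shown in the sample call; "ndc" and "ingredient" are assumed values.
SEARCH_TYPES = {"name", "ndc", "ingredient"}


def search_request(query: str, kind: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for /v1/search."""
    if kind not in SEARCH_TYPES:
        raise ValueError(f"unsupported search type: {kind!r}")
    url = f"{BASE_URL}/v1/search?{urlencode({'q': query, 'type': kind})}"
    return Request(url, headers={"x-api-key": api_key})


req = search_request("omeprazole", "name", "YOUR_API_KEY")
print(req.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step needs a valid API key and is left out so the sketch runs offline.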

$ curl -H "x-api-key: lmn_live_…" \
"https://lumen.james-564.workers.dev/v1/products/6919399e-…/leaflet?lang=es"

{
  "setId": "6919399e-f112-4274-b4de-df0b4c391e63",
  "lang": "es", "official": false,
  "officialSource": "https://dailymed.nlm.nih.gov/…?setid=6919399e-…",
  "productTitle": "Rite Aid Corporation Omeprazole Drug Facts",
  "docType": "otc",
  "disclaimer": "Traducción no oficial. Fuente oficial: FDA/DailyMed.",
  "engine": "claude-opus-4-8@2026-06-11",
  "reviewStatus": "approved",
  "sections": [
    {
      "title": "Active ingredient (in each tablet)",
      "html": "<p>Omeprazol 20 mg</p>", "riskTier": "none"
    },
    { "title": "Purpose", "html": "<p>Reductor de ácido</p>" },
    { "title": "Warnings", "riskTier": "review", "html": "…Advertencia de alergia…" }
    … directions, etc.
  ]
}

"20 mg" survives translation exactly — that is the gate doing its job. The unofficial-translation disclaimer and the deep link to the official FDA source are part of the payload, not an afterthought, and the Warnings section carries riskTier: "review".

$ curl -H "x-api-key: lmn_live_…" \
"https://lumen.james-564.workers.dev/v1/fhir/Composition/6919399e-…?lang=es"

{
  "resourceType": "Bundle", "type": "document", "language": "es",
  "entry": [{ "resource": {
    "resourceType": "Composition", "status": "final",
    "type": { "coding": [{ "system": "http://loinc.org",
      "code": "34390-5", "display": "HUMAN OTC DRUG LABEL" }]},
    "title": "Rite Aid Corporation Omeprazole Drug Facts",
    "section": [{
      "title": "Active ingredient (in each tablet)",
      "code": { "coding": [{ "code": "55106-9" }]},
      "text": { "status": "generated",
        "div": "<div…><p>Omeprazol 20 mg</p></div>" }
    }, … ]
  }}]
}

Standards-native output. The same governed content is also served as a FHIR R4 document Bundle with LOINC-coded sections — interoperable with any system that already speaks FHIR ePI, no bespoke schema to integrate.
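Because the output is a plain FHIR document Bundle, pulling out the coded sections takes very little code. The sketch below walks a trimmed copy of the response above; the helper name is hypothetical and it reads only the first coding of each section.

```python
def loinc_sections(bundle: dict):
    """Yield (code, title) for each section of every Composition in the Bundle."""
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Composition":
            continue
        for section in resource.get("section", []):
            codings = section.get("code", {}).get("coding", [])
            code = codings[0].get("code") if codings else None
            yield code, section.get("title")


# Trimmed from the sample response; elided parts are simply omitted.
sample = {
    "resourceType": "Bundle", "type": "document", "language": "es",
    "entry": [{"resource": {
        "resourceType": "Composition", "status": "final",
        "title": "Rite Aid Corporation Omeprazole Drug Facts",
        "section": [{
            "title": "Active ingredient (in each tablet)",
            "code": {"coding": [{"code": "55106-9"}]},
        }],
    }}],
}

for code, title in loinc_sections(sample):
    print(code, title)
```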

Governance & compliance

The moat is the paper trail.

For a pharma buyer, the differentiator isn't the translation. It's being able to prove, per record, where it came from, how it was produced, who verified it, and that nothing was published that shouldn't have been. That's built into the data model, not bolted on.

Provenance — carried on every served record

Source Set ID: 6919399e-f112-…b4c391e63
Engine + version: claude-opus-4-8@2026-06-11
Content hashes: SHA-256 · source + target
Reviewer & status: approved · risk-tiered
Audit trail: append-only
Official source: deep-linked to DailyMed

Translations are unofficial. Every Spanish record ships the disclaimer "Traducción no oficial" and a link to the authoritative FDA labeling. Regulatory affairs has green-lit the alpha; GA-scale counsel sign-off is pending.

Compliance posture — GxP-adjacent, risk-based

21 CFR Part 11 — applicability assessment

Predicate-rule driven. The authoritative record stays with the FDA; the served content is a disclaimed, deep-linked derivative — so full Part 11 e-records/e-signature controls aren't triggered, with the rationale documented and integrity controls applied proportionate to risk.

Computer Software Assurance (CSA)

Assurance sized by risk, not ceremony, per FDA's CSA approach and GAMP 5. Because AI authors the work, the qualifying control is independent verification — test + gate + human review — not authorship.

Disclaimer + official-source linking, enforced

The unofficial-translation disclaimer and the deep link to FDA/DailyMed are part of every payload by construction — a reader, an integrator, or an auditor always reaches the authoritative source in one hop.
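"By construction" can be made concrete with a builder that is the only way to produce a Spanish payload. This is a sketch under assumptions: the function name and the URL-prefix check are illustrative, while the field names and disclaimer text are taken from the sample leaflet response.

```python
# Disclaimer text as it appears in the sample leaflet response.
DISCLAIMER_ES = "Traducción no oficial. Fuente oficial: FDA/DailyMed."
DAILYMED_PREFIX = "https://dailymed.nlm.nih.gov/"


def build_leaflet_payload(set_id: str, official_source: str, sections: list) -> dict:
    """Build a Spanish leaflet payload; refuse if the official link is missing."""
    if not official_source.startswith(DAILYMED_PREFIX):
        raise ValueError("refusing to serve a translation without a DailyMed link")
    return {
        "setId": set_id,
        "lang": "es",
        "official": False,            # translations are never marked official
        "officialSource": official_source,
        "disclaimer": DISCLAIMER_ES,  # always attached, not caller-supplied
        "sections": list(sections),
    }


payload = build_leaflet_payload(
    "6919399e-f112-4274-b4de-df0b4c391e63",
    "https://dailymed.nlm.nih.gov/",  # placeholder link for the sketch
    [{"title": "Purpose", "html": "<p>Reductor de ácido</p>"}],
)
print(payload["disclaimer"])
```

Because callers cannot pass their own disclaimer or set the official flag, there is no code path that yields a payload without them.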

Why now

A gap hiding in plain sight.

40M+

US residents who speak Spanish — the largest Spanish-speaking population outside Mexico, and a group disproportionately affected by health-literacy barriers.

0

first-class, structured, freely queryable sources of Spanish-language FDA drug leaflets that exist today. Patient information is locked in English PDFs.

i.

The data is public but unusable. DailyMed is open, but it's English-only, document-shaped, and not built to query. Turning it into governed, bilingual, structured ePI is the hard part, and the durable one.

ii.

AI makes the translation tractable; controls make it shippable. The reason no one has a trustworthy Spanish FDA-leaflet source isn't translation cost — it's the patient-safety risk of getting a dose wrong. The gate is what unlocks doing it at scale.

iii.

Edge economics make it defensible. Translate-changed-only plus zero-egress storage and cache-first reads means coverage compounds while marginal cost trends to zero. Re-translating unchanged content costs roughly nothing.
