Alpha · live
Cloudflare-native · Workers · D1 · R2 · KV · Queues · Durable Objects

Governed product information,
translated and hosted with control.

Cerova LabelCore ingests the US FDA drug-labeling corpus, normalises it to a canonical FHIR ePI model, machine-translates the patient-facing leaflet to Spanish behind a blocking numeric/unit fidelity gate and a risk-tiered human-review workflow — then serves both languages over a fast, cache-first public API. The control plane is called the Product Information Control Plane.

1,000: top-by-usage US drug products live in English (100%)
32k+: source segments under translation governance
~99%: safety-gate pass across 30,000+ segments
≤80 ms: p95 global read latency target (cache-first)

What it is

Not just translated. Governed.

A regulated product-information hosting and translation system. Every served record carries provenance, a mandatory disclaimer, and a deep link to the authoritative FDA source, and every Spanish output is checked by a deterministic safety gate before it can reach a reader.

Canonical FHIR ePI model

The full DailyMed Structured Product Labeling corpus is parsed and normalised to a single FHIR ePI document model — LOINC-coded sections, NDCs, set IDs, content-hashed. One schema for 1,000 products, queryable as data, not PDFs.

AI translation with hard controls

An AI agent fleet produces the Spanish, but the qualifying control is independent verification — a blocking numeric/unit gate plus risk-tiered human review. Zero high-risk sections auto-publish.

Cache-first public API

Both languages served from the edge — search by name, NDC, or ingredient; fetch a leaflet or a FHIR Composition. Zero-egress object storage, and unchanged content is never re-translated, so steady-state translation cost approaches zero.
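
The translate-changed-only idea can be sketched as a lookup keyed by content hash. This is a minimal illustration, not the service's implementation: the function names, the whitespace normalisation, and the in-memory dictionary standing in for translation memory are all assumptions.

```python
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace (normalisation is an assumption)."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def segments_to_translate(segments: list[str], memory: dict[str, str]) -> list[str]:
    """Return only the segments whose hash has no stored translation."""
    return [s for s in segments if content_hash(s) not in memory]

# Hypothetical translation memory: hash of the English segment -> Spanish text.
memory = {content_hash("Take 1 tablet daily."): "Tome 1 tableta al día."}

# The first segment differs only in whitespace, so it is treated as unchanged.
pending = segments_to_translate(["Take 1 tablet  daily.", "Do not crush."], memory)
```

Only "Do not crush." is left in `pending`, which is why a re-ingest of an unchanged label would trigger no new translation work under this scheme.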

Provenance on every record

Source set ID, content hashes, engine + version, reviewer, and an append-only audit trail travel with each document. Unofficial translations are disclaimed and deep-linked to the official FDA labeling — defensible by construction.
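
One common way to make an append-only audit trail tamper-evident is to chain each entry to the hash of the previous one. The sketch below assumes that approach purely for illustration; the source does not say how the trail is stored, and the class and field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry (an assumption)

class AuditTrail:
    """Hypothetical hash-chained log: each entry commits to the one before it."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        digest = self._digest(prev, event)
        self._entries.append({"event": event, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or self._digest(prev, entry["event"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

trail = AuditTrail()
trail.append({"action": "translated", "engine": "claude-opus-4-8@2026-06-11"})
trail.append({"action": "reviewed", "status": "approved"})
```

Calling `trail.verify()` returns `True` until a stored event is altered after the fact.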

How it works

One pipeline, two planes.

A write plane ingests, normalises, translates, and gates content asynchronously. A read plane serves it from cache. The gate sits on the boundary — nothing reaches a reader until it has either passed deterministically or been approved by a human.

[Pipeline diagram, text summary]

WRITE PLANE · async · Queues + Durable Objects
Ingest: DailyMed SPL, mirror + hash, R2 · KV
Normalise → FHIR ePI: LOINC · doc-type, canonical model
Translate: AI agent fleet, EN → ES, patient, + translation memory
GATE: numeric / unit fidelity · blocking, fails closed
Human review: risk-tiered queue, high-risk held, approve → publish
Publish: R2 + D1
Gate outcomes: PASS · auto-publish (OTC), or BLOCK

READ PLANE · cache-first · p95 ≤ 80ms target
Public API · /v1: search · leaflet · FHIR
Edge cache + KV: ETag · revalidate
Reader / Portal: EN + ES, disclaimed, cache-first reads

Legend
Write plane: asynchronous, idempotent, translate-changed-only
Read plane: cache-first, globally distributed
The gate: the boundary nothing crosses unverified

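The "ETag · revalidate" step on the read plane follows the standard HTTP conditional-request pattern: the client sends back the ETag it holds, and an unchanged document is answered with 304 and no body. A minimal sketch, assuming a hash-derived ETag; the `serve` function and `Response` type are illustrative, not the service's code.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Response:
    status: int
    etag: str
    body: bytes

def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (truncated SHA-256 is an assumption)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def serve(body: bytes, if_none_match: Optional[str]) -> Response:
    """Return 304 with an empty body when the client's cached copy is current."""
    etag = make_etag(body)
    if if_none_match == etag:
        return Response(304, etag, b"")
    return Response(200, etag, body)

first = serve(b"<p>Omeprazol 20 mg</p>", None)           # cold read: full body
second = serve(b"<p>Omeprazol 20 mg</p>", first.etag)    # revalidation: unchanged
third = serve(b"<p>Omeprazol 40 mg</p>", first.etag)     # content changed
```

Because the ETag is derived from the content, a label that has not changed keeps revalidating cheaply, which fits the cache-first design described above.
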
The safety gate

A dose can never drift.

This is the patient-safety crux of doing pharma translation with AI. Before any Spanish segment can publish, every number and every locale-invariant unit in the English source must be present and unchanged in the target. The check is deterministic, it fails closed, and it is the reason an AI fleet can be trusted to touch a drug label at all.

Source (EN) | Candidate (ES) | Why | Verdict
200 mg | 200 mg | Number and unit identical. Atoms match exactly. | PASS · publishable
200 mg | 207 mg | 200 missing, 207 added. A dose drifted. Blocked and routed to a human; it can never auto-publish. | BLOCK → review
2.5 mL | 2,5 mL | Spanish decimal comma. Canonicalised to the same value; benign locale formatting is neutralised, not blocked. | PASS · locale-equal

Scope is deliberate

Only international unit symbols (mg, mcg, mL, IU, %, mg/mL…) are gated — they are identical across languages. Words like "hours" or "tablets" do translate, so the gate verifies their numerals but never the words. Word-level dosing fidelity is covered by the model plus mandatory human review of every dosing section.
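
The gate's behaviour can be approximated in a few lines. This is a sketch under stated assumptions, not the production gate: the unit list is the subset named above, numbers are canonicalised only by turning a decimal comma into a dot, and thousands separators are out of scope. All names are hypothetical.

```python
import re
from collections import Counter

# Locale-invariant unit symbols, longest first so "mg/mL" wins over "mg".
UNITS = ["mg/mL", "mcg", "mg", "mL", "IU", "%"]
_UNIT_ALT = "|".join(re.escape(u) for u in UNITS)
# A number, optionally followed by a gated unit symbol.
_ATOM = re.compile(r"(\d+(?:[.,]\d+)?)(?:\s*(" + _UNIT_ALT + r")(?![A-Za-z]))?")

def atoms(text: str) -> Counter:
    """Count (number, unit) pairs; unit is '' when the number has no gated unit."""
    found: Counter = Counter()
    for number, unit in _ATOM.findall(text):
        found[(number.replace(",", "."), unit)] += 1
    return found

def gate(source: str, target: str):
    """Fail closed: any atom missing from or added to the target blocks publication."""
    src, tgt = atoms(source), atoms(target)
    missing = sorted((src - tgt).keys())
    added = sorted((tgt - src).keys())
    verdict = "PASS" if not missing and not added else "BLOCK"
    return verdict, missing, added

ok = gate("Take 200 mg daily", "Tome 200 mg al día")
drift = gate("200 mg", "207 mg")
locale = gate("2.5 mL", "2,5 mL")
words = gate("every 8 hours", "cada 8 horas")
```

The three table rows come out as PASS, BLOCK, and PASS, and "8 hours" versus "8 horas" passes because only the numeral is compared, not the translated word.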

The KPI that matters

0 high-risk sections auto-published. ~99% of 30,000+ translated segments clear the gate; the rest are held. OTC Drug Facts auto-publish on a clean pass, while Rx patient leaflets, Medication Guides, and boxed warnings are always held for human approval regardless of gate result.
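
The publish rule in the paragraph above reduces to a small decision function. The sketch below encodes only what is stated there; the section-kind labels and function name are hypothetical, and anything not explicitly allowed to auto-publish is held.

```python
from enum import Enum

class Decision(Enum):
    AUTO_PUBLISH = "auto-publish"
    HOLD_FOR_REVIEW = "hold-for-review"

# Section kinds that are held regardless of the gate result (labels are assumptions).
ALWAYS_HELD = {"rx_patient_leaflet", "medication_guide", "boxed_warning"}

def route(section_kind: str, gate_passed: bool) -> Decision:
    """Only OTC Drug Facts with a clean gate pass may skip human review."""
    if section_kind in ALWAYS_HELD:
        return Decision.HOLD_FOR_REVIEW
    if section_kind == "otc_drug_facts" and gate_passed:
        return Decision.AUTO_PUBLISH
    # Default is to hold: unknown kinds and gate failures never auto-publish.
    return Decision.HOLD_FOR_REVIEW
```

Defaulting to `HOLD_FOR_REVIEW` mirrors the fail-closed stance of the gate itself.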

See it live

It's real. Here are the calls.

Every response below was returned by the live service. Base URL https://lumen.james-564.workers.dev (migrating to api.cerova.io). Pick an endpoint.

$ curl -H "x-api-key: lmn_live_…" \
  https://lumen.james-564.workers.dev/v1/stats
{
  "products": 1000,
  "productsWithEs": 312,
  "documents": 1000,
  "patientFacingSectionsEn": 2081,
  "esSectionsPublishable": 858,
  "esSectionsInReview": 901,
  "esSectionsGateFailed": 73,
  "sourceSegments": 32472,
  "sourceSegmentsTranslated": 30692,
  "translationMemoryApproved": 10402,
  "reviewQueuePending": 974,
  "segmentCoveragePct": 94.5,
  "generatedAt": "2026-06-16T15:11:49.492Z"
}

Coverage, reported honestly. 1,000 products are live in English; 312 of them have Spanish so far, and the rest are being completed toward 100% in usage-rank order. The gate-failed and in-review counts are not hidden; they are the governance working as designed.
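
The coverage figures can be re-derived from the response above. The snippet copies four fields from that payload and recomputes the percentages; the variable names are illustrative.

```python
# Values copied from the /v1/stats response shown above.
stats = {
    "products": 1000,
    "productsWithEs": 312,
    "sourceSegments": 32472,
    "sourceSegmentsTranslated": 30692,
}

# Share of source segments that have a translation, as a percentage.
segment_coverage_pct = round(
    100 * stats["sourceSegmentsTranslated"] / stats["sourceSegments"], 1
)

# Share of products that have any Spanish content, as a percentage.
product_es_pct = round(100 * stats["productsWithEs"] / stats["products"], 1)

# Segments still waiting for a translation.
segments_remaining = stats["sourceSegments"] - stats["sourceSegmentsTranslated"]
```

The recomputed segment coverage matches the reported `segmentCoveragePct` of 94.5, with 1,780 segments remaining and 31.2% of products carrying Spanish.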

$ curl -H "x-api-key: lmn_live_…" \
  "https://lumen.james-564.workers.dev/v1/search?q=omeprazole&type=name"
{
  "query": "omeprazole", "type": "name", "count": 5,
  "results": [
    {
      "setId": "6919399e-f112-4274-b4de-df0b4c391e63",
      "brandName": "omeprazole",
      "genericName": "omeprazole",
      "ndcs": ["11822-1040", "11822-1040-1"],
      "docType": "otc", "usageRank": 47,
      "languages": ["en", "es"]
    },
    { "brandName": "Omeprazole", "docType": "rx", "usageRank": 46, "languages": ["en","es"] }
    … 3 more
  ]
}

Query as data. Search by name, NDC, or ingredient. Each hit carries its doc-type and which languages are live — note OTC vs Rx is first-class, because it changes how a section is classified and reviewed.
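
For integrators, the same search call can be built programmatically. This sketch only constructs the request and does not send it; the helper name is hypothetical, and `type` values other than `name` (the only one shown above) are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://lumen.james-564.workers.dev"

def build_search_request(query: str, search_type: str, api_key: str) -> Request:
    """Build a GET request for /v1/search with the x-api-key header set."""
    params = urlencode({"q": query, "type": search_type})
    return Request(f"{BASE_URL}/v1/search?{params}", headers={"x-api-key": api_key})

# "demo-key" is a placeholder, not a real credential.
req = build_search_request("omeprazole", "name", "demo-key")
```

Passing `req` to `urllib.request.urlopen` would perform the call; `urlencode` takes care of escaping multi-word queries.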

$ curl -H "x-api-key: lmn_live_…" \
  "https://lumen.james-564.workers.dev/v1/products/6919399e-…/leaflet?lang=es"
{
  "setId": "6919399e-f112-4274-b4de-df0b4c391e63",
  "lang": "es", "official": false,
  "officialSource": "https://dailymed.nlm.nih.gov/…?setid=6919399e-…",
  "productTitle": "Rite Aid Corporation Omeprazole Drug Facts",
  "docType": "otc",
  "disclaimer": "Traducción no oficial. Fuente oficial: FDA/DailyMed.",
  "engine": "claude-opus-4-8@2026-06-11",
  "reviewStatus": "approved",
  "sections": [
    {
      "title": "Active ingredient (in each tablet)",
      "html": "<p>Omeprazol 20 mg</p>", "riskTier": "none"
    },
    { "title": "Purpose", "html": "<p>Reductor de ácido</p>" },
    { "title": "Warnings", "riskTier": "review", "html": "…Advertencia de alergia…" }
    … directions, etc.
  ]
}

"20 mg" survives translation exactly — that is the gate doing its job. The unofficial-translation disclaimer and the deep link to the official FDA source are part of the payload, not an afterthought, and the Warnings section carries riskTier: "review".
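
A consumer can check these guarantees on its own side before displaying a leaflet. The function below is a hypothetical client-side check based on the field names in the response above; treating every non-English record as requiring `official: false` is an assumption.

```python
REQUIRED_FIELDS = ("setId", "lang", "officialSource", "disclaimer", "engine", "reviewStatus")

def provenance_problems(record: dict) -> list[str]:
    """List what is missing from a served leaflet; an empty list means it looks complete."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    if record.get("lang") != "en" and record.get("official") is not False:
        problems.append("translation not marked official=false")
    return problems

# Fields copied from the leaflet response above (elisions kept as shown).
leaflet = {
    "setId": "6919399e-f112-4274-b4de-df0b4c391e63",
    "lang": "es",
    "official": False,
    "officialSource": "https://dailymed.nlm.nih.gov/…?setid=6919399e-…",
    "disclaimer": "Traducción no oficial. Fuente oficial: FDA/DailyMed.",
    "engine": "claude-opus-4-8@2026-06-11",
    "reviewStatus": "approved",
}
```

For the payload above the check returns an empty list; dropping the disclaimer yields `["missing disclaimer"]`.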

$ curl -H "x-api-key: lmn_live_…" \
  "https://lumen.james-564.workers.dev/v1/fhir/Composition/6919399e-…?lang=es"
{
  "resourceType": "Bundle", "type": "document", "language": "es",
  "entry": [{ "resource": {
    "resourceType": "Composition", "status": "final",
    "type": { "coding": [{ "system": "http://loinc.org",
      "code": "34390-5", "display": "HUMAN OTC DRUG LABEL" }]},
    "title": "Rite Aid Corporation Omeprazole Drug Facts",
    "section": [{
      "title": "Active ingredient (in each tablet)",
      "code": { "coding": [{ "code": "55106-9" }]},
      "text": { "status": "generated",
        "div": "<div…><p>Omeprazol 20 mg</p></div>" }
    }, … ]
  }}]
}

Standards-native output. The same governed content is also served as a FHIR R4 document Bundle with LOINC-coded sections — interoperable with any system that already speaks FHIR ePI, no bespoke schema to integrate.
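
Reading the Bundle needs nothing beyond plain dictionary access. The sketch below walks a trimmed copy of the response above and pulls out each section's LOINC code and title; the helper name is hypothetical.

```python
def composition_sections(bundle: dict) -> list[tuple[str, str]]:
    """Return (LOINC code, title) for each section of the first Composition entry."""
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "Composition":
            return [
                (section["code"]["coding"][0]["code"], section["title"])
                for section in resource.get("section", [])
            ]
    return []

# Trimmed copy of the Bundle shown above, with only the one section it lists in full.
bundle = {
    "resourceType": "Bundle", "type": "document", "language": "es",
    "entry": [{"resource": {
        "resourceType": "Composition", "status": "final",
        "title": "Rite Aid Corporation Omeprazole Drug Facts",
        "section": [{
            "title": "Active ingredient (in each tablet)",
            "code": {"coding": [{"code": "55106-9"}]},
        }],
    }}],
}

sections = composition_sections(bundle)
```

Here `sections` holds one pair, the code `55106-9` with its title, which is the shape a FHIR-aware consumer would index on.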

Governance & compliance

The moat is the paper trail.

For a pharma buyer, the differentiator isn't the translation — it's being able to prove, per record, where it came from, how it was produced, who verified it, and that nothing was published it shouldn't have been. That's built into the data model, not bolted on.

Provenance: carried on every served record

Source Set ID: 6919399e-f112-…b4c391e63
Engine + version: claude-opus-4-8@2026-06-11
Content hashes: SHA-256 · source + target
Reviewer & status: approved · risk-tiered
Audit trail: append-only
Official source: deep-linked to DailyMed

Translations are unofficial. Every Spanish record ships the disclaimer "Traducción no oficial" and a link to the authoritative FDA labeling. Regulatory affairs has green-lit the alpha; GA-scale counsel sign-off is pending.

Compliance posture: GxP-adjacent, risk-based

21 CFR Part 11 — applicability assessment

Predicate-rule driven. The authoritative record stays with the FDA; the served content is a disclaimed, deep-linked derivative — so full Part 11 e-records/e-signature controls aren't triggered, with the rationale documented and integrity controls applied proportionate to risk.

Computer Software Assurance (CSA)

Assurance sized by risk, not ceremony, per FDA's CSA approach and GAMP 5. Because AI authors the work, the qualifying control is independent verification — test + gate + human review — not authorship.

Disclaimer + official-source linking, enforced

The unofficial-translation disclaimer and the deep link to FDA/DailyMed are part of every payload by construction — a reader, an integrator, or an auditor always reaches the authoritative source in one hop.

Why now

A gap hiding in plain sight.

40M+: US residents who speak Spanish, the largest Spanish-speaking population outside Mexico, and a group disproportionately affected by health-literacy barriers.
0: first-class, structured, freely queryable sources of Spanish-language FDA drug leaflets that exist today. Patient information is locked in English PDFs.

i. The data is public but unusable. DailyMed is open, but it's English-only, document-shaped, and not built to query. Turning it into governed, bilingual, structured ePI is the hard part, and the durable one.

ii. AI makes the translation tractable; controls make it shippable. The reason no one has a trustworthy Spanish FDA-leaflet source isn't translation cost; it's the patient-safety risk of getting a dose wrong. The gate is what unlocks doing it at scale.

iii. Edge economics make it defensible. Translate-changed-only plus zero-egress storage and cache-first reads means coverage compounds while marginal cost trends to zero. Unchanged content is never re-translated, so keeping it live costs roughly nothing.

How we build

AI-driven SDLC, governed end to end.

The same controls-around-AI philosophy that governs the product governs how it's built. An AI fleet authors code and translations; independent verification is the qualifying control. Work is traced in an SDLC tool with risk-tiered rigour, and assurance follows CSA — effort proportionate to the risk each capability poses to patient safety and data integrity.

Epic (intent) → Feature (capability) → Requirement (FR / KPI) → Test (risk-tiered) → Gate + human review (assurance)