An illustrative audit of a fictitious manufacturer. The brand NOVEXA, its references and any specific customer attribution have been fabricated for this sample. The structure, scoring system and finding types are exactly what we deliver on a real audit run against 12 production LLMs.
Each deliverable maps to a pillar and a metric. The Single / Quarterly / Continuous tiers progressively unlock the full set. The sample PDF below carries all six.
Structured view of what AI models say about each SKU.
AI output vs ground-truth catalogue, classified by error type.
Classification of sensitive inferences AI can reconstruct — what and how much, never how.
Which competitors dominate which category in AI answers.
Per SKU, which competitor is suggested as substitute, by which model, at what frequency.
Action plan without publishing the catalogue. Boundary statements + authorised channels.
Each model was queried in parallel, with fixed seeds and identical prompts. On any single reference, no two models agreed more than 64% of the time.
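One way to read the agreement figure is as a pairwise match rate: for a given reference, compare two models' answers prompt by prompt and take the highest rate over all model pairs. The sketch below assumes answers have already been normalised to comparable labels; the function names, the model labels and the sample answers are hypothetical and are not taken from the audit pipeline.

```python
from itertools import combinations


def pairwise_agreement(answers_a, answers_b):
    """Fraction of prompts on which two models gave the same normalised answer."""
    if len(answers_a) != len(answers_b) or not answers_a:
        raise ValueError("both models need the same non-empty list of answers")
    matches = sum(a == b for a, b in zip(answers_a, answers_b))
    return matches / len(answers_a)


def max_agreement(responses):
    """Highest pairwise agreement across all model pairs for one reference.

    `responses` maps a model label to its list of normalised answers,
    one per prompt, in the same prompt order for every model.
    """
    return max(
        pairwise_agreement(responses[m1], responses[m2])
        for m1, m2 in combinations(sorted(responses), 2)
    )


# Hypothetical answers for one reference across four prompts.
example = {
    "model_a": ["fits", "fits", "no_fit", "fits"],
    "model_b": ["fits", "no_fit", "no_fit", "fits"],
    "model_c": ["no_fit", "no_fit", "fits", "no_fit"],
}

print(max_agreement(example))
```

Exact-match comparison is the simplest choice here; a real run would likely need a normalisation step before answers can be compared this way.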
These six references account for 71% of the leak findings and 58% of the hallucinations across the full sample. Fixing the documentation on these alone would lift the Sentinel Score by 12 points.
The NVX-22010 fits the newer engine variant; the NVX-22910 fits the previous generation. Five models conflate them. One model also volunteers the internal SAP code NVX-2207-AB — a code that has never been published. See §3.
# Model A-2 response (excerpt)
NVX-22010 — Exhaust silencer for HD truck Class A (Euro VI), suitable for the current engine family.
Compatible cross-references: OE 21357249, aftermarket-A 24056, aftermarket-B 281-857.
Internal supplier code: NVX-2207-AB.
[PartSentinel] ⚠ wrong engine variant · ⚠ aftermarket xref off by one digit · 🔓 LEAK on internal code
None of these codes appears in any industry catalogue, BMEcat feed, public site or PDF. Their presence in model responses indicates training-data contamination, most likely from a forwarded internal email or a leaked supplier feed.
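Two of the finding types flagged above lend themselves to a short sketch: a cross-reference that differs from the ground truth by a single digit, and an internal code that appears in a response but in no public source. The code below is a minimal illustration, not the PartSentinel classifier. The regular expression, the function names and the split of codes into public and internal sets are assumptions; only the sample text and the NVX codes come from the excerpt.

```python
import re

# Assumed shape of a part code: letters, hyphen, digits, optional suffix.
CODE_PATTERN = re.compile(r"\b[A-Z]{2,5}-\d{3,6}(?:-[A-Z0-9]{1,4})?\b")


def normalise(code):
    """Upper-case a code and drop separators such as hyphens and spaces."""
    return "".join(ch for ch in code.upper() if ch.isalnum())


def is_off_by_one_digit(claimed, truth):
    """True when the codes have equal length and differ in exactly one digit."""
    a, b = normalise(claimed), normalise(truth)
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and diffs[0][0].isdigit() and diffs[0][1].isdigit()


def find_leaks(response_text, public_codes, internal_codes):
    """Codes mentioned in a response that are internal and never published."""
    mentioned = set(CODE_PATTERN.findall(response_text))
    return sorted(
        code for code in mentioned
        if code in internal_codes and code not in public_codes
    )


# Text adapted from the excerpt; the public/internal sets are assumed.
response = (
    "NVX-22010 is an exhaust silencer for HD truck Class A (Euro VI). "
    "Internal supplier code: NVX-2207-AB."
)
public = {"NVX-22010", "NVX-22910"}
internal = {"NVX-2207-AB"}

print(find_leaks(response, public, internal))
```

Matching against explicit public and internal sets keeps the check conservative: a code is reported as a leak only when it is known to be internal and absent from every published source supplied to the function.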
Every audit run is reconstructible from the manifest below. The same prompts re-run on the same models with the same seeds will produce a Sentinel Score within ±1 point.
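A reproducibility check of this kind can be sketched in two parts: confirming that two runs used the same manifest, and confirming that their scores fall within the stated tolerance. The sketch assumes the manifest digest is a SHA-256 over the canonicalised JSON with the `manifest_sha256` field itself excluded; the source does not say how the digest is computed, so that scheme, the function names and the sample values are hypothetical.

```python
import hashlib
import json


def manifest_digest(manifest):
    """SHA-256 over a canonical JSON form of the manifest.

    Assumption: the digest field is excluded from its own input, and
    keys are sorted so that key order does not change the result.
    """
    payload = {k: v for k, v in manifest.items() if k != "manifest_sha256"}
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def same_run_config(manifest_a, manifest_b):
    """True when two manifests describe the same prompts, models and seed."""
    return manifest_digest(manifest_a) == manifest_digest(manifest_b)


def within_tolerance(score_a, score_b, tolerance=1.0):
    """True when two scores differ by at most `tolerance` points."""
    return abs(score_a - score_b) <= tolerance


# Hypothetical manifests: same settings, different key order.
first = {"seed": 7, "temperature": 0.0, "models": 12}
second = {"models": 12, "temperature": 0.0, "seed": 7}

print(same_run_config(first, second))
```

Sorting keys and fixing separators is a lightweight stand-in for a formal canonicalisation scheme; it is enough to make the digest independent of key order, which is the property the comparison relies on.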
Read the full methodology →
{
"audit_id": "AUD-2026Q2-1",
"vertical": "automotive_aftermarket",
"stratification": "top_200_by_revenue",
"references": 50,
"models": 12,
"prompt_template_version": "id_v3.7.2",
"seed": 7,
"temperature": 0.0,
"ground_truth_source": "reference_data_2026q1 + customer_pim",
"audited_at": "2026-04-22T08:14:00Z",
"manifest_sha256": "e3a14f9a87bc...",
"ai_act_dossier_id": "PT-AI-ACT-2026-0014"
}
Try one of your own references in 90 seconds, or talk to us about scoping a full catalogue audit on 12 LLMs.