METHODOLOGY · v3.1 · 2026

How PartSentinel audits AI knowledge of your catalogue.

A calibrated, reproducible measurement protocol. No SEO games, no rewriting, no marketing math — every step is auditable.

THE CANONICAL FRAMEWORK

Three pillars. Six metrics. Six deliverables.

Everything PartSentinel produces maps to this framework. Pillars set the score weighting. Metrics are the units of measurement. Deliverables are what the client receives.

PILLAR 1 · ACCURACY

Does AI describe your SKU correctly?

Reality Gap — divergence between AI output and catalogue truth.

PILLAR 2 · INFERENCE

What can AI infer that you never disclosed?

Inference Risk — sensitivity of what AI can reconstruct. Key differentiator.

PILLAR 3 · VISIBILITY

Who owns the AI answer in your category?

Competitive Visibility — Share of Answer + Category Authority.

SIX METRICS
  1. AI Visibility Score — SKU mention frequency in AI answers
  2. Reality Gap Score — divergence between AI output and catalogue truth
  3. Inference Risk Score — sensitivity of what AI can reconstruct
  4. Competitive Share of Answer — your presence vs competitors
  5. Substitution Risk Index — competitor-substitution frequency
  6. Category Authority Score — AI-perceived category authority
SIX DELIVERABLES
  1. AI SKU Exposure Report — structured view of AI outputs
  2. SKU Reality Gap Matrix — AI vs catalogue truth comparison
  3. Inference Risk Register — sensitive inferences classification
  4. AI Competitive Visibility Map — competitors by category
  5. Substitution Risk Matrix — AI demand redirection
  6. Control & Remediation Roadmap — action plan without publishing the catalogue
REVEAL · v0.x

Diagnostic on 50–100 SKUs. Mode 1 (Blind) or Mode 2 (Full).

MAP · v1.x

Extension across families, markets, languages, competitors.

CONTROL · v2.x

Continuous monitoring + remediation + Negative Knowledge Layer.

PRINCIPLES

Five operating principles

01

Reference-level, not brand-level

Brand monitoring tells you what people say about your company. PartSentinel measures what AI says about each individual SKU, OE number, or technical reference in your catalogue.

02

Multi-model, deterministic

We query 8–12 large language models in parallel with deterministic seeds, fixed temperatures, and rate-limited concurrency. The same audit run twice produces the same result within 1 score point.
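
The "within 1 score point" claim can be checked mechanically by comparing two runs of the same audit. The sketch below is illustrative only: the function name, the dictionary layout (reference code mapped to score), and the sample values are assumptions, not part of the published protocol.

```python
def reproducibility_violations(run_a, run_b, tolerance=1.0):
    """Return references whose scores differ by more than `tolerance`
    between two runs, or that appear in only one of the runs."""
    violations = {}
    for ref in sorted(run_a.keys() | run_b.keys()):
        a, b = run_a.get(ref), run_b.get(ref)
        if a is None or b is None or abs(a - b) > tolerance:
            violations[ref] = (a, b)
    return violations


# Hypothetical scores from two runs of the same audit.
run_1 = {"REF-001": 72.4, "REF-002": 55.0, "REF-003": 90.1}
run_2 = {"REF-001": 72.9, "REF-002": 57.2, "REF-003": 90.1}

print(reproducibility_violations(run_1, run_2))
# {'REF-002': (55.0, 57.2)}
```

An empty result means the two runs agree within tolerance for every reference.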

03

Calibrated per vertical

Aftermarket, electrotechnical, aerospace, chemicals — each vertical has its own prompt template, scoring weights, and ground-truth schema. No one-size-fits-all.

04

Ground-truth comes from you

We never invent the right answer. The reference truth is your BMEcat / ETIM / PIM / product reference data. We measure deviation, not opinion.

05

Auditable, not magic

Every Sentinel Score is reconstructible from the raw model responses, the calibration set, and the scoring formula. All three are disclosed to the customer; per-vertical weights and leak-penalty constants are disclosed under NDA.

PIPELINE

The audit pipeline

From your catalogue to a regulator-ready dossier.

01 / INGEST

Catalogue ingestion

BMEcat, ETIM, CSV, JSON, or PIM API (SAP, Inriver, Akeneo, Pimcore). Native parsing — no manual mapping for standard formats.

02 / SAMPLE

Stratified sampling

We stratify references by vertical, age, OE coverage, and revenue contribution. Default audit: 50–500 references; full-catalogue: every reference.
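
A minimal sketch of seeded stratified sampling with proportional allocation, to make the idea concrete. The record fields (`sku`, `vertical`), the function name, and the "at least one per stratum" rule are assumptions for illustration; the actual allocation rule is not stated here. In practice the stratum key could be a tuple such as (vertical, age band, OE coverage, revenue band).

```python
import random
from collections import Counter, defaultdict


def stratified_sample(references, strata_key, sample_size, seed=0):
    """Draw a proportional sample from each stratum, reproducibly."""
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    strata = defaultdict(list)
    for ref in references:
        strata[strata_key(ref)].append(ref)

    total = len(references)
    sample = []
    for key in sorted(strata):  # sorted so iteration order is stable
        members = strata[key]
        # Proportional quota, with at least one reference per stratum.
        quota = max(1, round(sample_size * len(members) / total))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Hypothetical catalogue: 70 aftermarket and 30 electrotechnical references.
catalogue = (
    [{"sku": f"A{i}", "vertical": "aftermarket"} for i in range(70)]
    + [{"sku": f"E{i}", "vertical": "electrotechnical"} for i in range(30)]
)
picked = stratified_sample(catalogue, lambda r: r["vertical"], 10, seed=42)
print(Counter(r["vertical"] for r in picked))
# Counter({'aftermarket': 7, 'electrotechnical': 3})
```

Because quotas are rounded per stratum, the total can differ slightly from `sample_size` when there are many small strata.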

03 / PROBE

Multi-model probing

Each reference is queried against 8–12 LLMs with calibrated, vertical-specific prompts. Per-reference: ~32 prompts × N models.
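
To give a sense of scale, the probe volume follows directly from "~32 prompts × N models". The helper below is a back-of-the-envelope sketch; the function name is hypothetical and the figure of 32 is the approximate value quoted above.

```python
def probe_calls(references, models, prompts_per_reference=32):
    """Approximate number of model calls for one audit run."""
    return references * prompts_per_reference * models


print(probe_calls(50, 8))    # 12800
print(probe_calls(500, 12))  # 192000
```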

04 / EXTRACT

Structured extraction

Free-form responses are parsed into a structured schema (identifier, application, cross-references, specs, procedural depth) using a deterministic extractor.

05 / COMPARE

Ground-truth alignment

Extracted facts are aligned to your authoritative data. Each fact is labeled accurate / partial / hallucinated / leaked / obsolete.
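
The five labels can be illustrated with a small classifier for code-like facts such as cross-references. This is a sketch under stated assumptions: the label definitions are not spelled out here, so the rules below (confidential match means leaked, superseded match means obsolete, match only after normalization means partial) are guesses, and the `GroundTruth` structure and sample codes are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GroundTruth:
    """Hypothetical ground-truth buckets for one reference."""
    current: set = field(default_factory=set)       # valid, public values
    confidential: set = field(default_factory=set)  # valid, never disclosed
    superseded: set = field(default_factory=set)    # formerly valid values


def _norm(code: str) -> str:
    # Ignore case, spaces, and separators when comparing codes.
    return re.sub(r"[^A-Z0-9]", "", code.upper())


def label_fact(value: str, truth: GroundTruth) -> str:
    """Assumed labeling rules; the real definitions may differ."""
    if value in truth.confidential:
        return "leaked"
    if value in truth.current:
        return "accurate"
    if value in truth.superseded:
        return "obsolete"
    if _norm(value) in {_norm(v) for v in truth.current}:
        return "partial"  # right code, wrong formatting
    return "hallucinated"


truth = GroundTruth(
    current={"BP-1001"},
    confidential={"INT-7741"},
    superseded={"BP-0900"},
)
for fact in ["BP-1001", "bp1001", "BP-0900", "INT-7741", "ZZ-9999"]:
    print(fact, "->", label_fact(fact, truth))
# BP-1001 -> accurate
# bp1001 -> partial
# BP-0900 -> obsolete
# INT-7741 -> leaked
# ZZ-9999 -> hallucinated
```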

06 / SCORE

Per-reference and aggregate scoring

We compute the Sentinel Score (0–100) for each reference and roll it up by vertical, brand, model, and time.

07 / REPORT

Multi-format delivery

Excel raw export, executive PDF, drilldown dashboard, and AI Act dossier (Article 53(1)(d) compliant) — all under signed checksums.

MODELS

The model panel

Refreshed quarterly. Last refresh: 2026-04-22.

  • ANTHROPIC · Claude Opus 4.7 (1M) · Reasoning
  • ANTHROPIC · Claude Sonnet 4.6 · Recall
  • OPENAI · GPT-5 · Reasoning
  • OPENAI · GPT-5 mini · Recall
  • GOOGLE · Gemini 2.5 Pro · Multimodal
  • MISTRAL · Mistral Large 3 · EU sovereign
  • META · Llama 4 405B · Open-weight
  • DEEPSEEK · DeepSeek V4 · Open-weight
PROMPTS

Prompt design

Each prompt is calibrated to elicit a single, schema-conformant fact. Free-form prose is rejected. Examples are versioned and published.

# Vertical: automotive_aftermarket
# Schema: identifier_v3
You are answering a single question about an automotive aftermarket
reference. Reply ONLY with the JSON schema below — no prose.

REFERENCE: "{{ref_code}}"

{
  "identifier_confidence": 0.0–1.0,
  "applications": ["{vehicle make/model/year}"],
  "cross_references": ["{competing OE/IAM codes}"],
  "specs": { "{spec_name}": "{value}" },
  "source_hints": ["{public_url_or_null}"]
}
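
One way to enforce "free-form prose is rejected" is to parse each reply strictly and check its shape against the field names in the template above. The sketch below uses only the Python standard library; the function name and the exact validation rules (exact key set, confidence within 0.0 to 1.0) are assumptions rather than the production extractor.

```python
import json

# Field names taken from the prompt template; types are assumed.
EXPECTED = {
    "identifier_confidence": (int, float),
    "applications": list,
    "cross_references": list,
    "specs": dict,
    "source_hints": list,
}


def parse_response(raw: str) -> dict:
    """Parse a model reply, raising ValueError if it is not schema-conformant."""
    data = json.loads(raw)  # JSONDecodeError is a ValueError, so prose fails here
    if not isinstance(data, dict) or set(data) != set(EXPECTED):
        raise ValueError("reply does not contain exactly the expected keys")
    for key, types in EXPECTED.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(data[key], bool) or not isinstance(data[key], types):
            raise ValueError(f"field {key!r} has the wrong type")
    if not 0.0 <= data["identifier_confidence"] <= 1.0:
        raise ValueError("identifier_confidence must be between 0.0 and 1.0")
    return data


reply = (
    '{"identifier_confidence": 0.8, "applications": [], '
    '"cross_references": ["BP-1001"], "specs": {}, "source_hints": []}'
)
print(parse_response(reply)["cross_references"])  # ['BP-1001']
```

A reply that fails validation can be logged and scored as non-conformant instead of being silently repaired, which keeps the extraction step deterministic.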
SCORING

Scoring

The Sentinel Score is a calibrated, weighted aggregate. Each dimension is scored on 0–100. Per-vertical weights and leak-penalty constants are disclosed inside the signed audit dossier under NDA.

WEIGHT · core

Identification

Does the model know the reference exists and what it is?

WEIGHT · core

Cross-references

Does it correctly map to OE / IAM / competing codes?

WEIGHT · core

Application

Does it correctly state where the part fits (vehicle, machine, system)?

WEIGHT · supporting

Spec depth

Does it know the technical specifications, not just the marketing copy?

WEIGHT · supporting

Freshness

Is the information current — or stuck on a 2019 catalogue?

score = Σ_i ( w_i × dim_i ) − leakPenalty(leak_count)
where i ∈ { id, xref, app, spec, fresh }
w_i and leakPenalty calibrated per vertical · disclosed under NDA
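
The formula can be sketched in a few lines of Python. Every number below is a placeholder: the real weights and leak-penalty constants are calibrated per vertical and disclosed only under NDA, and the linear capped penalty and the clamping to 0–100 are assumptions made for illustration.

```python
# Placeholder weights (sum to 1.0); NOT the calibrated values.
WEIGHTS = {"id": 0.25, "xref": 0.25, "app": 0.25, "spec": 0.15, "fresh": 0.10}


def leak_penalty(leak_count, per_leak=5.0, cap=30.0):
    """Assumed penalty shape: linear in the number of leaks, capped."""
    return min(cap, per_leak * leak_count)


def sentinel_score(dims, leak_count):
    """Weighted sum of 0-100 dimension scores minus the leak penalty."""
    weighted = sum(WEIGHTS[k] * dims[k] for k in WEIGHTS)
    # Clamping keeps the result on the 0-100 scale (an assumption).
    return max(0.0, min(100.0, weighted - leak_penalty(leak_count)))


dims = {"id": 90, "xref": 60, "app": 80, "spec": 40, "fresh": 50}
print(round(sentinel_score(dims, leak_count=2), 1))  # 58.5
```

With these placeholder values the weighted sum is 68.5 and two leaks subtract 10 points.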
GROUND TRUTH

Ground-truth alignment

We never declare a model wrong without your authoritative data. Three sources are accepted:

  • Your PIM / MDM authoritative data (preferred).
  • Industry standards (BMEcat, ETIM, ATA) when you certify them.
  • Public regulator-validated data (EUR-Lex, ECHA, EASA) for safety/regulatory facts.
GOVERNANCE

Governance & reproducibility

Every audit run produces an immutable manifest: prompt versions, model versions, calibration constants, and seed values. Reproducibility is the contract.

  • All prompt templates are versioned (semver) and signed.
  • Each delivery includes a SHA-256 manifest of all inputs and outputs.
  • Recalibration events are publicly logged in the changelog.
  • Customers can request the raw model responses for any audit.
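
A SHA-256 manifest of the kind described above can be built with Python's standard `hashlib`. This is a minimal sketch, not the delivered manifest format: the artifact names, the dictionary layout, and the top-level `manifest_sha256` digest are assumptions.

```python
import hashlib
import json


def build_manifest(artifacts: dict[str, bytes]) -> dict:
    """Hash each artifact, then hash the canonical list of hashes."""
    files = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sorted(artifacts.items())
    }
    # Canonical JSON (sorted keys, no whitespace) gives a stable digest.
    canonical = json.dumps(files, sort_keys=True, separators=(",", ":"))
    return {
        "files": files,
        "manifest_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


# Hypothetical inputs and outputs of one audit run.
manifest = build_manifest({
    "prompts/identifier_v3.txt": b"You are answering a single question...",
    "responses/raw.jsonl": b"",
})
print(manifest["files"]["responses/raw.jsonl"])
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

Changing any byte of any artifact changes its file hash and therefore the top-level digest, which is what makes the manifest useful as a tamper check.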
FAQ

FAQ

Can I use PartSentinel without sharing my catalogue?

Yes. We support a self-hosted mode: the audit pipeline runs inside your VPC, and only the aggregated scorecard leaves the perimeter.

How often should I re-audit?

Quarterly is the default. Verticals with rapid catalogue rotation (electrotechnical, automotive aftermarket) benefit from monthly delta audits.

What about confidentiality of cross-references?

We never publish or train on customer data. Leaks are only ever flagged to the customer, never disclosed externally.

Why these specific models?

Coverage of the LLMs your customers actually use. Panel is updated quarterly to follow market share, with prior-quarter overlap for trend continuity.

Audit your catalogue with this exact protocol.

Run this methodology on one of your references: free, about 90 seconds, no account required.