AI Hallucination Detection in Production — What Actually Works
The Invoice That Never Existed
A client deployed an AI customer support agent for their SaaS platform. Within 48 hours, the agent told a customer they had an outstanding invoice for $4,200 — an invoice that didn't exist. The customer panicked, called their accountant, and almost churned. The AI had hallucinated a specific invoice number, amount, and due date with perfect confidence.
This is the hallucination problem. It's not that the model says "I don't know." It's that it fabricates specific, plausible-sounding information that's completely wrong.
Why Hallucinations Happen
LLMs don't "know" things — they predict the next likely token. When there's no grounding data, they generate what sounds right:
User: "What's the status of order #ORD-7842?"
What the model should do: Look up order #ORD-7842 in the database
What the model often does: Generate a plausible-sounding status
Hallucinated response: "Order #ORD-7842 was shipped on March 2nd via
FedEx tracking #7891234567. Expected delivery: March 5th."
Every detail is fabricated. The order number format looks right.
The dates are reasonable. The tracking number has the right length.
A human reading this would assume it's real.
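The fix for this pattern is structural, not prompt-based: answer order-status questions only from looked-up data, and refuse when the lookup comes back empty. A minimal sketch (the in-memory `orders` map is a hypothetical stand-in for your real database):

```typescript
// Hypothetical stand-in for the real orders database.
const orders = new Map<string, { status: string; carrier: string }>([
  ["ORD-1001", { status: "shipped", carrier: "FedEx" }],
]);

// Answer only from retrieved data; never let the model free-generate a status.
function orderStatus(orderId: string): string {
  const order = orders.get(orderId);
  if (!order) {
    // No grounding data: refuse rather than fabricate.
    return `I can't find order ${orderId} in our system. Let me connect you with our team.`;
  }
  return `Order ${orderId} is ${order.status} via ${order.carrier}.`;
}
```

With this shape, the hallucinated #ORD-7842 response from above becomes impossible: an unknown order ID hits the refusal branch instead of a fabricated tracking number.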
The Three-Layer Detection System
I've built hallucination detection for several production AI systems. Here's the architecture that works:
Layer 1: Grounding Verification
Every factual claim the AI makes should be traceable to a source document:
interface GroundedResponse {
  answer: string;
  citations: Document[]; // source documents that support the answer's claims
  groundingScore: number; // 0-1, fraction of claims grounded in sources
}

async function verifyGrounding(
  response: string,
  sourceDocuments: Document[]
): Promise<GroundedResponse> {
  // Extract factual claims from the response (typically an LLM call)
  const claims = await extractClaims(response);

  // Check each claim against the source documents
  const verified = await Promise.all(
    claims.map(async (claim) => {
      const match = await findSupportingEvidence(claim, sourceDocuments);
      return {
        claim: claim.text,
        supported: match.confidence > 0.8,
        source: match.document,
        confidence: match.confidence,
      };
    })
  );

  // Guard against division by zero when the response makes no factual claims
  const groundingScore =
    verified.length === 0
      ? 1
      : verified.filter((v) => v.supported).length / verified.length;

  return {
    answer: response,
    citations: verified.filter((v) => v.supported).map((v) => v.source),
    groundingScore,
  };
}

If the grounding score drops below 0.7, the response is flagged for human review or replaced with "I don't have that information — let me connect you with our team."
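That gate can be sketched directly. The 0.7 threshold and the fallback wording come from the text above; the review-queue wiring is a hypothetical detail you'd adapt to your own system:

```typescript
// Grounding-score gate: serve the answer, or fall back to a safe reply
// and flag the original for human review.
const GROUNDING_THRESHOLD = 0.7; // tune on your own data

interface GatedResponse {
  text: string;
  needsReview: boolean;
}

function gateByGrounding(answer: string, groundingScore: number): GatedResponse {
  if (groundingScore < GROUNDING_THRESHOLD) {
    return {
      text: "I don't have that information. Let me connect you with our team.",
      needsReview: true, // queue the original answer for human review
    };
  }
  return { text: answer, needsReview: false };
}
```

Keeping the threshold in one named constant matters in practice: you will retune it as you collect labeled examples of grounded vs. hallucinated responses.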
Layer 2: Structural Validation
For responses that include structured data (numbers, dates, IDs), validate against your actual systems:
// `Validator`, `db`, and `verifyDateInContext` are assumed app-level types and helpers
const validators: Record<string, Validator> = {
  orderNumber: {
    pattern: /ORD-\d{4,6}/,
    validate: async (id) => {
      const exists = await db.orders.exists({ id });
      return { valid: exists, type: "order_reference" };
    },
  },
  invoiceAmount: {
    pattern: /\$[\d,]+\.?\d{0,2}/,
    validate: async (amount, context) => {
      if (!context.orderId) return { valid: false, type: "unverifiable" };
      const invoice = await db.invoices.find({ orderId: context.orderId });
      return {
        valid: invoice?.amount === parseFloat(amount.replace(/[$,]/g, "")),
        type: "invoice_amount",
      };
    },
  },
  dateReference: {
    pattern:
      /\b\d{4}-\d{2}-\d{2}\b|(?:January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2}/,
    validate: async (date, context) => {
      // Verify the date appears in related records
      return await verifyDateInContext(date, context);
    },
  },
};

Layer 3: Confidence Calibration
LLMs are notoriously poorly calibrated — they're confident even when wrong. Add an explicit confidence layer:
async function calibrateResponse(response: string, context: RetrievalContext) {
  // Ask the model to self-evaluate (works better than you'd expect)
  const evaluation = await llm.evaluate({
    prompt: `Given ONLY the following source documents, rate your confidence
that the response is factually accurate. Be conservative.
Source documents: ${context.documents}
Response: ${response}
Rate confidence 0-100 and explain any uncertain claims.`,
  });

  // Apply a calibration curve (learned from historical data)
  const calibrated = calibrationCurve(evaluation.rawConfidence);

  return {
    response,
    confidence: calibrated,
    uncertainClaims: evaluation.uncertainClaims,
    action: calibrated > 0.8 ? "serve" : calibrated > 0.5 ? "flag" : "escalate",
  };
}

The Fallback Hierarchy
When hallucination is detected, don't just show an error. Have a graceful fallback:
Confidence > 80%: Serve the response with citations
Confidence 50-80%: Serve with disclaimer: "Based on available information..."
Confidence 30-50%: Offer partial answer + "Would you like me to connect you with our team for the specific details?"
Confidence < 30%: "I don't have reliable information about that. Let me connect you with a specialist."
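The hierarchy above is simple enough to encode as a single dispatch function. This is one possible encoding, not a prescribed API; the thresholds and messages mirror the tiers listed above, and the `FallbackAction` shape is a hypothetical name you'd adapt to your pipeline:

```typescript
// Map a calibrated confidence score to one of the four fallback tiers.
type FallbackAction =
  | { kind: "serve"; withCitations: true }
  | { kind: "serve_with_disclaimer"; prefix: string }
  | { kind: "partial_answer"; offer: string }
  | { kind: "escalate"; message: string };

function chooseFallback(confidence: number): FallbackAction {
  if (confidence > 0.8) return { kind: "serve", withCitations: true };
  if (confidence > 0.5)
    return {
      kind: "serve_with_disclaimer",
      prefix: "Based on available information...",
    };
  if (confidence > 0.3)
    return {
      kind: "partial_answer",
      offer: "Would you like me to connect you with our team for the specific details?",
    };
  return {
    kind: "escalate",
    message: "I don't have reliable information about that. Let me connect you with a specialist.",
  };
}
```

Using a discriminated union here means the rendering layer is forced by the type checker to handle all four tiers, so a new tier can't silently fall through to a raw (possibly hallucinated) answer.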
The brands getting AI right aren't the ones with the smartest models. They're the ones with the best safety nets around the model. Build the detection layers, implement the fallbacks, and treat every AI response as guilty until proven grounded.