How It Works
Technical details on the three-stage pipeline (classification, calibration, and response strategy selection) and on the safety guardrails that run before it.
1. Classification
The system uses structured LLM output to identify antipatterns from practitioner descriptions. Rather than committing to a single diagnosis, it returns multiple hypotheses with confidence scores and reasoning.
```typescript
import { z } from "zod";

const ClassificationSchema = z.object({
  classifications: z.array(
    z.object({
      antipattern_slug: z.string(),
      confidence: z.number().min(0).max(1),
      reasoning: z.string(),
    })
  ),
  overall_assessment: z.string(),
  clarifying_questions: z.array(z.string()).optional(),
  safety_flags: z.array(z.string()).optional(),
});
```
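For illustration, a parsed result for a description like "I keep trying harder to relax and get frustrated when the good feelings fade" might look like the following; the slugs, scores, and wording are invented for the example, not drawn from a real session.

```typescript
// Illustrative values only - two co-occurring hypotheses plus a follow-up question.
const example = ClassificationSchema.parse({
  classifications: [
    {
      antipattern_slug: "over-efforting",
      confidence: 0.85,
      reasoning: "Describes 'trying harder' and escalating effort when relaxation stalls.",
    },
    {
      antipattern_slug: "grasping-pleasure",
      confidence: 0.6,
      reasoning: "Frustration when pleasant sensations fade suggests grasping at them.",
    },
  ],
  overall_assessment: "Effort and grasping appear to be reinforcing each other.",
  clarifying_questions: ["What happens in the body when the frustration arises?"],
});
```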
Why Multi-Hypothesis?
- Antipatterns frequently co-occur (over-efforting often comes with grasping)
- Preserves uncertainty rather than forcing false precision
- Enables response strategy to match actual confidence level
Reasoning Traces
Each classification includes a reasoning field explaining why the pattern was detected. This serves multiple purposes:
- Makes classifications auditable
- Helps identify when the model is confused
- Provides material for response generation
2. Confidence Calibration
Raw LLM confidence scores are systematically biased—models tend toward overconfidence, especially for common patterns they've seen frequently. We apply per-antipattern calibration to produce more reliable estimates.
```typescript
const DIFFICULTY_MULTIPLIERS: Record<string, number> = {
  // Easier to detect - clear verbal indicators
  "session-structure": 1.0,
  "over-efforting": 0.95,
  "self-criticism": 0.9,

  // Moderate difficulty
  "grasping-pleasure": 0.8,
  "doubt-spirals": 0.85,
  "expectation-mismatch": 0.8,

  // Hard to detect - subtle or rarely articulated
  "insufficient-surrender": 0.7,
  "forcing-transitions": 0.65,

  // Very hard - internal states rarely described directly
  "fear-of-depth": 0.55,
  "premature-exit": 0.5,
};
```
The Intuition
A raw confidence of 0.85 for "over-efforting" (which has clear verbal indicators like "trying harder") is not comparable to 0.85 for "fear of depth" (which is rarely articulated directly). Multipliers account for this asymmetry.
```typescript
function calibrateConfidence(
  rawConfidence: number,
  antipatternSlug: string
): { calibrated: number; bucket: string } {
  const multiplier = DIFFICULTY_MULTIPLIERS[antipatternSlug] ?? 0.75;
  let calibrated = rawConfidence * multiplier;

  // Cap at 92% - always leave room for uncertainty
  calibrated = Math.min(calibrated, 0.92);

  const bucket =
    calibrated >= 0.75 ? "high" :
    calibrated >= 0.50 ? "medium" :
    calibrated >= 0.25 ? "low" : "uncertain";

  return { calibrated, bucket };
}
```
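Applied to the two cases from the intuition above, the same raw score lands in very different buckets:

```typescript
calibrateConfidence(0.85, "over-efforting");
// { calibrated: 0.8075, bucket: "high" } - small discount for a clearly signaled pattern

calibrateConfidence(0.85, "fear-of-depth");
// { calibrated: 0.4675, bucket: "low" } - heavy discount for a rarely articulated state

calibrateConfidence(0.99, "session-structure");
// { calibrated: 0.92, bucket: "high" } - the 92% cap always applies
```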
Why 92% Maximum?
- Epistemic humility: We can never be certain about internal states from text descriptions alone.
- Practitioner agency: Leaving room for uncertainty encourages reflection rather than passive acceptance.
- Error tolerance: Even clear-seeming cases can be wrong.
3. Response Strategy
Based on calibrated confidence, the system selects how to respond:
```typescript
function getResponseStrategy(
  classifications: Classification[]
): "advise" | "suggest" | "clarify" | "encourage" {
  if (classifications.length === 0) {
    return "encourage"; // No antipatterns - practitioner may be on track
  }

  const topConfidence = classifications[0].confidence;
  const secondConfidence = classifications[1]?.confidence ?? 0;

  // Clear top pattern with good separation
  if (topConfidence >= 0.75 && topConfidence - secondConfidence > 0.15) {
    return "advise";
  }

  // Moderate confidence or ambiguity
  if (topConfidence >= 0.50) {
    return "suggest";
  }

  // Low confidence - need more information
  return "clarify";
}
```
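A few illustrative calls make the thresholds concrete; the inputs are assumed to already be calibrated and sorted by confidence, with reasoning abbreviated:

```typescript
getResponseStrategy([
  { antipattern_slug: "over-efforting", confidence: 0.81, reasoning: "..." },
  { antipattern_slug: "grasping-pleasure", confidence: 0.45, reasoning: "..." },
]);
// "advise" - top confidence >= 0.75 and the 0.36 gap clears the 0.15 separation threshold

getResponseStrategy([
  { antipattern_slug: "doubt-spirals", confidence: 0.62, reasoning: "..." },
  { antipattern_slug: "self-criticism", confidence: 0.58, reasoning: "..." },
]);
// "suggest" - moderate confidence, and the two hypotheses are too close to separate

getResponseStrategy([]);
// "encourage" - nothing detected
```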
| Strategy | When Used | Response Style |
|---|---|---|
| ADVISE | High confidence, clear pattern | Direct guidance, specific experiments |
| SUGGEST | Moderate confidence | Tentative framing, "it sounds like..." |
| CLARIFY | Low confidence | Ask questions before advising |
| ENCOURAGE | No clear antipatterns | Validate, support continued exploration |
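Putting the stages together might look something like the sketch below. The wrapper name and the sort step are assumptions for illustration, not part of the documented API.

```typescript
// Hypothetical glue code: calibrate each raw classification, order by calibrated
// confidence, then select a response strategy for the whole set.
function planResponse(result: z.infer<typeof ClassificationSchema>) {
  const calibrated = result.classifications
    .map((c) => ({
      ...c,
      confidence: calibrateConfidence(c.confidence, c.antipattern_slug).calibrated,
    }))
    .sort((a, b) => b.confidence - a.confidence);

  return { strategy: getResponseStrategy(calibrated), classifications: calibrated };
}
```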
4. Safety Guardrails
Safety checks run before classification using regex pattern matching. This ensures concerning content is flagged immediately, regardless of how the LLM might interpret it.
```typescript
const SAFETY_PATTERNS = [
  // High severity - immediate escalation
  {
    // Truncated stems take \w* so inflected forms ("suicidal", "suicide") still match
    pattern: /\b(suicid\w*|kill (myself|me)|end (my|it all))\b/i,
    type: "suicidal_ideation",
    severity: "high",
  },
  {
    pattern: /\b(psychotic|hearing voices|delusion)\b/i,
    type: "psychosis_risk",
    severity: "high",
  },

  // Medium severity - redirect with care
  {
    pattern: /\b(dissociat\w*|depersonaliz\w*|not in my body)\b/i,
    type: "dissociation",
    severity: "medium",
  },
  {
    pattern: /\b(trauma|abuse|ptsd|flashback)\b/i,
    type: "trauma_disclosure",
    severity: "medium",
  },
];
```
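The patterns themselves are just data; a small scanner can run them against the raw description before any LLM call. The helper below is a sketch with an assumed name, not the system's actual implementation.

```typescript
// Returns every matching safety pattern, highest severity first, so callers can
// gate the rest of the pipeline on the top match.
function checkSafety(text: string) {
  const rank: Record<string, number> = { high: 0, medium: 1, low: 2 };
  return SAFETY_PATTERNS
    .filter((p) => p.pattern.test(text))
    .sort((a, b) => rank[a.severity] - rank[b.severity]);
}
```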
Why Pattern Matching?
- Speed: Instant detection, no LLM latency
- Reliability: Doesn't vary with prompting
- Fail-safe: Catches what the LLM might miss
Severity Responses
| Severity | Action |
|---|---|
| High | Bypass classification, provide crisis resources, express concern |
| Medium | Acknowledge content, suggest professional support, proceed with care |
| Low | Soften response, be extra gentle, acknowledge context |
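One way the severity table might translate into control flow, assuming the checkSafety sketch above; the branching is illustrative, and the actual response wording comes from response generation, not from this gate.

```typescript
// Hypothetical gating step run before classification: high-severity matches
// short-circuit the pipeline, lower severities annotate it and let it proceed.
function applySafetyGate(text: string) {
  const matches = checkSafety(text);
  const flags = matches.map((m) => m.type);

  if (matches[0]?.severity === "high") {
    // Bypass classification, provide crisis resources, express concern
    return { proceed: false, flags };
  }

  // Medium/low: acknowledge the content, suggest professional support where
  // appropriate, and proceed with extra care
  return { proceed: true, flags };
}
```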