How It Works

Technical details on the three-stage pipeline (classification, calibration, and response strategy selection) and on the safety guardrails that run ahead of it.

1. Classification

The system uses structured LLM output to identify antipatterns from practitioner descriptions. Rather than committing to a single diagnosis, it returns multiple hypotheses with confidence scores and reasoning.

Output Schema (Zod)
import { z } from "zod";

const ClassificationSchema = z.object({
  classifications: z.array(
    z.object({
      antipattern_slug: z.string(),
      confidence: z.number().min(0).max(1),
      reasoning: z.string(),
    })
  ),
  overall_assessment: z.string(),
  clarifying_questions: z.array(z.string()).optional(),
  safety_flags: z.array(z.string()).optional(),
});

Why Multi-Hypothesis?

  • Antipatterns frequently co-occur (over-efforting often comes with grasping)
  • Preserves uncertainty rather than forcing false precision
  • Lets the response strategy match the actual confidence level
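As a rough illustration of the points above, here is what a multi-hypothesis result might look like. The interfaces are plain TypeScript shapes written to mirror the schema; the interface names, the example values, and the rankHypotheses helper are assumptions made for this sketch, not part of the system's actual code.

```typescript
// Plain TypeScript shapes mirroring ClassificationSchema (names are assumed).
interface Classification {
  antipattern_slug: string;
  confidence: number; // 0..1, raw model confidence
  reasoning: string;
}

interface ClassificationResult {
  classifications: Classification[];
  overall_assessment: string;
  clarifying_questions?: string[];
  safety_flags?: string[];
}

// Hypothetical output for a practitioner who reports "trying harder"
// while chasing a pleasant sensation: two co-occurring hypotheses.
const example: ClassificationResult = {
  classifications: [
    {
      antipattern_slug: "grasping-pleasure",
      confidence: 0.6,
      reasoning: "Describes chasing a pleasant sensation once it appears.",
    },
    {
      antipattern_slug: "over-efforting",
      confidence: 0.8,
      reasoning: 'Repeatedly mentions "trying harder" to concentrate.',
    },
  ],
  overall_assessment: "Effortful striving, possibly combined with grasping.",
};

// Return hypotheses sorted by confidence, highest first, without mutating the input.
function rankHypotheses(result: ClassificationResult): Classification[] {
  return [...result.classifications].sort((a, b) => b.confidence - a.confidence);
}

const ranked = rankHypotheses(example);
console.log(ranked.map((c) => `${c.antipattern_slug}: ${c.confidence}`));
```

Both hypotheses survive into later stages, so a downstream step can still see that the second pattern was plausible rather than discarding it.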

Reasoning Traces

Each classification includes a reasoning field explaining why the pattern was detected. This serves multiple purposes:

  • Makes classifications auditable
  • Helps identify when the model is confused
  • Provides material for response generation

2. Confidence Calibration

Raw LLM confidence scores are systematically biased: models tend toward overconfidence, especially for common patterns they have seen frequently. We apply a per-antipattern calibration to produce more reliable estimates.

Detection Difficulty Multipliers
const DIFFICULTY_MULTIPLIERS: Record<string, number> = {
  // Easier to detect - clear verbal indicators
  "session-structure": 1.0,
  "over-efforting": 0.95,
  "self-criticism": 0.9,

  // Moderate difficulty
  "grasping-pleasure": 0.8,
  "doubt-spirals": 0.85,
  "expectation-mismatch": 0.8,

  // Hard to detect - subtle or rarely articulated
  "insufficient-surrender": 0.7,
  "forcing-transitions": 0.65,

  // Very hard - internal states rarely described directly
  "fear-of-depth": 0.55,
  "premature-exit": 0.5,
};

The Intuition

A raw confidence of 0.85 for "over-efforting" (which has clear verbal indicators like "trying harder") is not comparable to 0.85 for "fear of depth" (which is rarely articulated directly). Multipliers account for this asymmetry.

Calibration Function
function calibrateConfidence(
  rawConfidence: number,
  antipatternSlug: string
): { calibrated: number; bucket: string } {
  const multiplier = DIFFICULTY_MULTIPLIERS[antipatternSlug] ?? 0.75;
  let calibrated = rawConfidence * multiplier;

  // Cap at 92% - always leave room for uncertainty
  calibrated = Math.min(calibrated, 0.92);

  const bucket =
    calibrated >= 0.75 ? "high" :
    calibrated >= 0.50 ? "medium" :
    calibrated >= 0.25 ? "low" : "uncertain";

  return { calibrated, bucket };
}
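To make the asymmetry described under "The Intuition" concrete, the sketch below runs the same raw score through the calibration logic for several patterns. It restates the function in compact form so that it runs on its own; the names MULTIPLIERS, calibrate, and Bucket, and the slug "unlisted-pattern", are placeholders introduced for illustration only.

```typescript
// Subset of the multipliers listed above, repeated so the snippet is standalone.
const MULTIPLIERS: Record<string, number> = {
  "session-structure": 1.0,
  "over-efforting": 0.95,
  "fear-of-depth": 0.55,
};

type Bucket = "high" | "medium" | "low" | "uncertain";

// Same logic as calibrateConfidence above: scale, cap at 0.92, then bucket.
function calibrate(raw: number, slug: string): { calibrated: number; bucket: Bucket } {
  const multiplier = MULTIPLIERS[slug] ?? 0.75; // default for unlisted slugs
  const calibrated = Math.min(raw * multiplier, 0.92);
  const bucket: Bucket =
    calibrated >= 0.75 ? "high" :
    calibrated >= 0.5 ? "medium" :
    calibrated >= 0.25 ? "low" : "uncertain";
  return { calibrated, bucket };
}

const cases: Array<[number, string]> = [
  [0.85, "over-efforting"],   // 0.85 * 0.95 is about 0.8075
  [0.85, "fear-of-depth"],    // 0.85 * 0.55 is about 0.4675
  [0.85, "unlisted-pattern"], // hypothetical slug, falls back to 0.75: about 0.6375
  [1.0, "session-structure"], // 1.0 * 1.0 = 1.0, then capped to 0.92
];

for (const [raw, slug] of cases) {
  const { calibrated, bucket } = calibrate(raw, slug);
  console.log(`${slug}: ${raw} -> ${calibrated.toFixed(4)} (${bucket})`);
}
```

The same raw 0.85 lands in the "high" bucket for "over-efforting" but only in "low" for "fear-of-depth", which is the effect the multipliers are meant to produce.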

Why 92% Maximum?

  • Epistemic humility: We can never be certain about internal states from text descriptions alone.
  • Practitioner agency: Leaving room for uncertainty encourages reflection rather than passive acceptance.
  • Error tolerance: Even clear-seeming cases can be wrong.

3. Response Strategy

Based on calibrated confidence, the system selects how to respond:

Strategy Selection
// Classification is assumed to be one entry of the `classifications` array,
// with `confidence` already calibrated and the array sorted highest first.
function getResponseStrategy(
  classifications: Classification[]
): "advise" | "suggest" | "clarify" | "encourage" {
  if (classifications.length === 0) {
    return "encourage"; // No antipatterns - practitioner may be on track
  }

  const topConfidence = classifications[0].confidence;
  const secondConfidence = classifications[1]?.confidence ?? 0;

  // Clear top pattern with good separation
  if (topConfidence >= 0.75 && topConfidence - secondConfidence > 0.15) {
    return "advise";
  }

  // Moderate confidence or ambiguity
  if (topConfidence >= 0.50) {
    return "suggest";
  }

  // Low confidence - need more information
  return "clarify";
}

Strategy   When Used                       Response Style
---------  ------------------------------  ---------------------------------------
ADVISE     High confidence, clear pattern  Direct guidance, specific experiments
SUGGEST    Moderate confidence             Tentative framing, "it sounds like..."
CLARIFY    Low confidence                  Ask questions before advising
ENCOURAGE  No clear antipatterns           Validate, support continued exploration
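The effect of the separation rule is easiest to see on concrete inputs. The sketch below restates the selection thresholds in compact form and runs four hypothetical confidence lists through them. The names Scored and selectStrategy are placeholders for this example, and the input is assumed to be calibrated and sorted highest first.

```typescript
type Strategy = "advise" | "suggest" | "clarify" | "encourage";

// Only the fields this sketch needs; confidence is assumed to be calibrated.
interface Scored {
  antipattern_slug: string;
  confidence: number;
}

// Same thresholds as getResponseStrategy above; expects input sorted
// by confidence, highest first.
function selectStrategy(sorted: Scored[]): Strategy {
  if (sorted.length === 0) return "encourage";
  const top = sorted[0].confidence;
  const second = sorted[1]?.confidence ?? 0;
  if (top >= 0.75 && top - second > 0.15) return "advise";
  if (top >= 0.5) return "suggest";
  return "clarify";
}

// One clear hypothesis: high confidence and nothing competing with it.
const clear: Scored[] = [{ antipattern_slug: "over-efforting", confidence: 0.81 }];

// Same top score, but a close runner-up (gap of about 0.11) blocks "advise".
const ambiguous: Scored[] = [
  { antipattern_slug: "over-efforting", confidence: 0.81 },
  { antipattern_slug: "grasping-pleasure", confidence: 0.7 },
];

// Nothing reaches the 0.50 threshold.
const weak: Scored[] = [
  { antipattern_slug: "fear-of-depth", confidence: 0.45 },
  { antipattern_slug: "premature-exit", confidence: 0.3 },
];

console.log(selectStrategy(clear));     // "advise"
console.log(selectStrategy(ambiguous)); // "suggest"
console.log(selectStrategy(weak));      // "clarify"
console.log(selectStrategy([]));        // "encourage"
```

Note that a high top score alone is not enough for ADVISE: the second hypothesis must also trail by more than 0.15.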

4. Safety Guardrails

Safety checks run before classification using regex pattern matching. This ensures concerning content is flagged immediately, regardless of how the LLM might interpret it.

Safety Pattern Detection
const SAFETY_PATTERNS = [
  // High severity - immediate escalation
  {
    // Word stems end in \w* so the trailing \b still matches "suicide", "suicidal", etc.
    pattern: /\b(suicid\w*|kill (myself|me)|end (my|it all))\b/i,
    type: "suicidal_ideation",
    severity: "high",
  },
  {
    pattern: /\b(psychotic|hearing voices|delusion)\b/i,
    type: "psychosis_risk",
    severity: "high",
  },

  // Medium severity - redirect with care
  {
    // Stems again take \w* so "dissociating" and "depersonalization" match
    pattern: /\b(dissociat\w*|depersonaliz\w*|not in my body)\b/i,
    type: "dissociation",
    severity: "medium",
  },
  {
    pattern: /\b(trauma|abuse|ptsd|flashback)\b/i,
    type: "trauma_disclosure",
    severity: "medium",
  },
];
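One way a pattern list like this could be applied is a small scanner that returns every matching flag. This is a minimal sketch, not the system's actual implementation: the names SafetyPattern, SafetyFlag, PATTERNS, and scanForSafetyFlags are assumptions, and only two of the patterns are repeated so that the snippet runs on its own.

```typescript
type Severity = "high" | "medium" | "low";

interface SafetyPattern {
  pattern: RegExp;
  type: string;
  severity: Severity;
}

interface SafetyFlag {
  type: string;
  severity: Severity;
}

// Two entries in the same shape as SAFETY_PATTERNS, repeated for a standalone example.
const PATTERNS: SafetyPattern[] = [
  { pattern: /\b(psychotic|hearing voices|delusion)\b/i, type: "psychosis_risk", severity: "high" },
  { pattern: /\b(trauma|abuse|ptsd|flashback)\b/i, type: "trauma_disclosure", severity: "medium" },
];

// Test every pattern against the text and collect a flag for each match.
// The patterns carry no `g` flag, so RegExp.test keeps no state between calls.
function scanForSafetyFlags(text: string, patterns: SafetyPattern[] = PATTERNS): SafetyFlag[] {
  return patterns
    .filter(({ pattern }) => pattern.test(text))
    .map(({ type, severity }) => ({ type, severity }));
}

console.log(scanForSafetyFlags("I had a flashback during the sit."));
console.log(scanForSafetyFlags("I started hearing voices and then had a flashback."));
console.log(scanForSafetyFlags("My breath felt calm and steady."));
```

Collecting all matches, rather than stopping at the first, lets the caller respond to the most severe flag present.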

Why Pattern Matching?

  • Speed: Instant detection, no LLM latency
  • Reliability: Doesn't vary with prompting
  • Fail-safe: Catches what the LLM might miss

Severity Responses

Severity  Action
--------  ---------------------------------------------------------------------
High      Bypass classification, provide crisis resources, express concern
Medium    Acknowledge content, suggest professional support, proceed with care
Low       Soften response, be extra gentle, acknowledge context

Design principle: When in doubt, prioritize wellbeing over antipattern analysis. A missed classification is recoverable; a missed crisis is not.
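
The severity table could be turned into a dispatch step along the following lines. This is a sketch under stated assumptions: the action labels, the SEVERITY_RANK map, and the chooseSafetyAction function are hypothetical names, and the only behavior taken from the table is that high severity bypasses classification while medium and low proceed with adjusted tone.

```typescript
type Severity = "high" | "medium" | "low";
type SafetyAction = "crisis_resources" | "suggest_support" | "soften" | "none";

interface SafetyFlag {
  type: string;
  severity: Severity;
}

// Numeric ranks so the most severe flag can be found with a simple max.
const SEVERITY_RANK: Record<Severity, number> = { low: 1, medium: 2, high: 3 };

// Pick the action for the most severe flag and say whether the
// classification pipeline should still run afterwards.
function chooseSafetyAction(
  flags: SafetyFlag[]
): { action: SafetyAction; runClassification: boolean } {
  const worst = flags.reduce((max, f) => Math.max(max, SEVERITY_RANK[f.severity]), 0);
  if (worst === 3) return { action: "crisis_resources", runClassification: false };
  if (worst === 2) return { action: "suggest_support", runClassification: true };
  if (worst === 1) return { action: "soften", runClassification: true };
  return { action: "none", runClassification: true };
}

console.log(
  chooseSafetyAction([
    { type: "trauma_disclosure", severity: "medium" },
    { type: "suicidal_ideation", severity: "high" },
  ])
);
console.log(chooseSafetyAction([]));
```

When several flags are present, the most severe one decides the outcome, in line with the design principle of prioritizing wellbeing over antipattern analysis.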