How It Works
Technical details on the three-stage pipeline (classification, calibration, and response strategy selection) and on the safety guardrails that run before it.
1. Classification
The system uses structured LLM output to identify antipatterns from practitioner descriptions. Rather than committing to a single diagnosis, it returns multiple hypotheses with confidence scores and reasoning.
```typescript
import { z } from "zod";

const ClassificationSchema = z.object({
  classifications: z.array(
    z.object({
      antipattern_slug: z.string(),
      confidence: z.number().min(0).max(1),
      reasoning: z.string(),
    })
  ),
  overall_assessment: z.string(),
  clarifying_questions: z.array(z.string()).optional(),
  safety_flags: z.array(z.string()).optional(),
});
```
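For illustration, a parsed result for a description like "I keep trying harder to relax and get frustrated when the good feelings fade" might look like the following; the slugs, scores, and wording are invented for the example, not drawn from a real session.

```typescript
// Illustrative values only - two co-occurring hypotheses plus a follow-up question.
const example = ClassificationSchema.parse({
  classifications: [
    {
      antipattern_slug: "over-efforting",
      confidence: 0.85,
      reasoning: "Describes 'trying harder' and escalating effort when relaxation stalls.",
    },
    {
      antipattern_slug: "grasping-pleasure",
      confidence: 0.6,
      reasoning: "Frustration when pleasant sensations fade suggests grasping at them.",
    },
  ],
  overall_assessment: "Effort and grasping appear to be reinforcing each other.",
  clarifying_questions: ["What happens in the body when the frustration arises?"],
});
```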
Why Multi-Hypothesis?
- Antipatterns frequently co-occur (over-efforting often comes with grasping)
- Preserves uncertainty rather than forcing false precision
- Enables response strategy to match actual confidence level
Reasoning Traces
Each classification includes a reasoning field explaining why the pattern was detected. This serves multiple purposes:
- Makes classifications auditable
- Helps identify when the model is confused
- Provides material for response generation
2. Confidence Calibration
Raw LLM confidence scores are systematically biased—models tend toward overconfidence, especially for common patterns they've seen frequently. We apply per-antipattern calibration to produce more reliable estimates.
```typescript
const DIFFICULTY_MULTIPLIERS: Record<string, number> = {
  // Easier to detect - clear verbal indicators
  "session-structure": 1.0,
  "over-efforting": 0.95,
  "self-criticism": 0.9,

  // Moderate difficulty
  "grasping-pleasure": 0.8,
  "doubt-spirals": 0.85,
  "expectation-mismatch": 0.8,

  // Hard to detect - subtle or rarely articulated
  "insufficient-surrender": 0.7,
  "forcing-transitions": 0.65,

  // Very hard - internal states rarely described directly
  "fear-of-depth": 0.55,
  "premature-exit": 0.5,
};
```
The Intuition
A raw confidence of 0.85 for "over-efforting" (which has clear verbal indicators like "trying harder") is not comparable to 0.85 for "fear of depth" (which is rarely articulated directly). Multipliers account for this asymmetry.
```typescript
function calibrateConfidence(
  rawConfidence: number,
  antipatternSlug: string
): { calibrated: number; bucket: string } {
  const multiplier = DIFFICULTY_MULTIPLIERS[antipatternSlug] ?? 0.75;
  let calibrated = rawConfidence * multiplier;

  // Cap at 92% - always leave room for uncertainty
  calibrated = Math.min(calibrated, 0.92);

  const bucket =
    calibrated >= 0.75 ? "high" :
    calibrated >= 0.50 ? "medium" :
    calibrated >= 0.25 ? "low" : "uncertain";

  return { calibrated, bucket };
}
```
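Applied to the two cases from the intuition above, the same raw score lands in very different buckets:

```typescript
calibrateConfidence(0.85, "over-efforting");
// { calibrated: 0.8075, bucket: "high" } - small discount for a clearly signaled pattern

calibrateConfidence(0.85, "fear-of-depth");
// { calibrated: 0.4675, bucket: "low" } - heavy discount for a rarely articulated state

calibrateConfidence(0.99, "session-structure");
// { calibrated: 0.92, bucket: "high" } - the 92% cap always applies
```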
Why 92% Maximum?
- Epistemic humility: We can never be certain about internal states from text descriptions alone.
- Practitioner agency: Leaving room for uncertainty encourages reflection rather than passive acceptance.
- Error tolerance: Even clear-seeming cases can be wrong.
3. Response Strategy
Based on calibrated confidence, the system selects how to respond:
```typescript
function getResponseStrategy(
  classifications: Classification[]
): "advise" | "suggest" | "clarify" | "encourage" {
  if (classifications.length === 0) {
    return "encourage"; // No antipatterns - practitioner may be on track
  }

  const topConfidence = classifications[0].confidence;
  const secondConfidence = classifications[1]?.confidence ?? 0;

  // Clear top pattern with good separation
  if (topConfidence >= 0.75 && topConfidence - secondConfidence > 0.15) {
    return "advise";
  }

  // Moderate confidence or ambiguity
  if (topConfidence >= 0.50) {
    return "suggest";
  }

  // Low confidence - need more information
  return "clarify";
}
```
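A few illustrative calls make the thresholds concrete; the inputs are assumed to already be calibrated and sorted by confidence, with reasoning abbreviated:

```typescript
getResponseStrategy([
  { antipattern_slug: "over-efforting", confidence: 0.81, reasoning: "..." },
  { antipattern_slug: "grasping-pleasure", confidence: 0.45, reasoning: "..." },
]);
// "advise" - top confidence >= 0.75 and the 0.36 gap clears the 0.15 separation threshold

getResponseStrategy([
  { antipattern_slug: "doubt-spirals", confidence: 0.62, reasoning: "..." },
  { antipattern_slug: "self-criticism", confidence: 0.58, reasoning: "..." },
]);
// "suggest" - moderate confidence, and the two hypotheses are too close to separate

getResponseStrategy([]);
// "encourage" - nothing detected
```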
| Strategy | When Used | Response Style |
|---|---|---|
| ADVISE | High confidence, clear pattern | Direct guidance, specific experiments |
| SUGGEST | Moderate confidence | Tentative framing, "it sounds like..." |
| CLARIFY | Low confidence | Ask questions before advising |
| ENCOURAGE | No clear antipatterns | Validate, support continued exploration |
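Putting the stages together might look something like the sketch below. The wrapper name and the sort step are assumptions for illustration, not part of the documented API.

```typescript
// Hypothetical glue code: calibrate each raw classification, order by calibrated
// confidence, then select a response strategy for the whole set.
function planResponse(result: z.infer<typeof ClassificationSchema>) {
  const calibrated = result.classifications
    .map((c) => ({
      ...c,
      confidence: calibrateConfidence(c.confidence, c.antipattern_slug).calibrated,
    }))
    .sort((a, b) => b.confidence - a.confidence);

  return { strategy: getResponseStrategy(calibrated), classifications: calibrated };
}
```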
4. Safety Guardrails
Safety checks run before classification using regex pattern matching. This ensures concerning content is flagged immediately, regardless of how the LLM might interpret it.
```typescript
const SAFETY_PATTERNS = [
  // High severity - immediate escalation
  {
    // Truncated stems take \w* so inflected forms ("suicidal", "suicide") still match
    pattern: /\b(suicid\w*|kill (myself|me)|end (my|it all))\b/i,
    type: "suicidal_ideation",
    severity: "high",
  },
  {
    pattern: /\b(psychotic|hearing voices|delusion)\b/i,
    type: "psychosis_risk",
    severity: "high",
  },

  // Medium severity - redirect with care
  {
    pattern: /\b(dissociat\w*|depersonaliz\w*|not in my body)\b/i,
    type: "dissociation",
    severity: "medium",
  },
  {
    pattern: /\b(trauma|abuse|ptsd|flashback)\b/i,
    type: "trauma_disclosure",
    severity: "medium",
  },
];
```
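The patterns themselves are just data; a small scanner can run them against the raw description before any LLM call. The helper below is a sketch with an assumed name, not the system's actual implementation.

```typescript
// Returns every matching safety pattern, highest severity first, so callers can
// gate the rest of the pipeline on the top match.
function checkSafety(text: string) {
  const rank: Record<string, number> = { high: 0, medium: 1, low: 2 };
  return SAFETY_PATTERNS
    .filter((p) => p.pattern.test(text))
    .sort((a, b) => rank[a.severity] - rank[b.severity]);
}
```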
Why Pattern Matching?
- Speed: Instant detection, no LLM latency
- Reliability: Doesn't vary with prompting
- Fail-safe: Catches what the LLM might miss
Severity Responses
| Severity | Action |
|---|---|
| High | Bypass classification, provide crisis resources, express concern |
| Medium | Acknowledge content, suggest professional support, proceed with care |
| Low | Soften response, be extra gentle, acknowledge context |
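One way the severity table might translate into control flow, assuming the checkSafety sketch above; the branching is illustrative, and the actual response wording comes from response generation, not from this gate.

```typescript
// Hypothetical gating step run before classification: high-severity matches
// short-circuit the pipeline, lower severities annotate it and let it proceed.
function applySafetyGate(text: string) {
  const matches = checkSafety(text);
  const flags = matches.map((m) => m.type);

  if (matches[0]?.severity === "high") {
    // Bypass classification, provide crisis resources, express concern
    return { proceed: false, flags };
  }

  // Medium/low: acknowledge the content, suggest professional support where
  // appropriate, and proceed with extra care
  return { proceed: true, flags };
}
```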