SuperStorage · Call Review

Call log

Every inbound call, scored across six dimensions and linked to the downstream outcome. Use the Review queue tab for your daily 15-minute sweep.

updated 11:42 pm · auto-refresh 60s
Filters: Hallucinations · Lockout
Dimensions: Factual · Action · Resolution · Policy · Quality · Containment
23 of 847 selected
92
AI MOVE-IN INQUIRY Booked · 10×10 · $129/mo Maria Delgado +1 805 391 0042 · Grover Beach
"Hi, do you have any 10 by 10s climate controlled available, and do you do month-to-month?" → AI quoted promo, confirmed unit B-142, sent reservation link.
Factual 100 Action 95 Resolution 100 Policy 90 Quality 88 Contained 100
AUTO-SCORED
3:42
May 22, 8:39 pm
47
AI PRORATE QUESTION Hallucination KB gap Escalated → Andre Unknown caller +1 619 551 8850 · Grover Beach · returning tenant unit B-204
"If I move out on the 17th do I get prorated?" → AI said "yes, prorated daily" — facility policy is no prorate on move-out. Caller asked to speak to manager.
Factual 20 · wrong policy Action 65 Resolution 30 Policy 25 Quality 85 Contained 60
NEEDS REVIEW
2:18
May 22, 6:14 pm
74
AI PAYMENT Paid · $284.00 James Okafor +1 805 224 7711 · Olive · unit 121 · 14d past due
Caller paid off balance via SMS link AI sent. Two repeats on the unit number — AI didn't re-confirm last name before sharing balance.
Policy 55 · PII handling Quality 65 · repeats
AUTO + 1 OVERRIDE
4:51
May 22, 5:33 pm
38
AI COMPLAINT Missed escalation Hung up · 38s silence Linda Park +1 805 119 4023 · Grover Beach · unit B-77
"This is the third time I'm calling, my stuff is missing." → AI kept asking unit-number clarifications instead of transferring. Caller used "lawyer" twice — should have triggered hard transfer.
Action 25 · wrong tool Policy 35 · missed trigger Contained 0 · should have transferred
NEEDS REVIEW
5:09
May 22, 3:47 pm
88
AI AUCTION QUESTION Escalated → Lien team Carmen Ruiz +1 626 778 0019 · Lakeside · unit L-12 · in lien
"I want to know if my stuff is already sold." → AI confirmed status, escalated to lien team with full context (correct route).
Containment N/A · intentional handoff
AUTO-SCORED
2:33
May 22, 2:01 pm
Staff PAYMENT Paid · $148.50 Returning tenant +1 805 391 8264 · Grover Beach
Caller made a payment over the phone using a credit card; payment processed successfully.
STAFF · UNSCORED
1:30
May 22, 1:39 pm
68
AI HOURS / ACCESS Outcome unknown Unknown caller +1 415 992 1180 · Morro Bay
Caller asked if gate is open Sunday at 9pm. AI said "yes 6am–10pm" — gate actually closes 9pm Sun (per facility config). Borderline factual error.
AUTO-SCORED
0:52
May 22, 12:11 pm
VM
Voicemail UNKNOWN Unknown caller +1 805 552 0921 · Grover Beach
"Hi, this is, uh, calling about my unit, please call me back when you get a chance."
VOICEMAIL
0:18
May 22, 11:55 am
23 selected | Avg score 57.3 · Range 38–88
Call Review · Call log · CALL-2026-05-22-0614-A47

Prorate question · AI handled

2 min 18 sec · May 22, 6:14 pm · Grover Beach · Returning tenant B-204 · 3 flags across 5 turns.

47

"If I move out on the 17th do I get prorated?"

AI handled · gpt-claude-3.1 · Hallucination · KB gap candidate · Escalated → Andre · 1:42 in · avg latency 1.2s
0:00
caller
Caller → classified: prorate_question (conf 0.91)
"Hi, uh, yeah I'm trying to figure out — if I move out on the 17th, like halfway through the month, do I get my money back for the rest of the days?"
0:11
ai
AI latency 0.8s · 142 tokens · 1 KB retrieved
"Sure, happy to help with that. Can I get the name on your account or your unit number?"
0:19
caller
Caller
"Yeah it's, uh, B-204. Smith."
0:24
tool
Tool call · automatic
AI called get_tenant to resolve B-204.
get_tenant latency 312ms · ok
args: {unit: "B-204", last_name: "Smith", facility: "grover-beach"}
result: tenant_id=t_8h2k9 · balance=$0 · move_in=2024-03-11 · plan=monthly
0:28
ai
AI latency 1.1s · 198 tokens · 3 KB retrieved
"Great, I see your account, B-204 under Smith — looks like you're paid up. So you're thinking of moving out on the 17th?"
0:38
caller
Caller
"Yeah, the 17th. And I just paid for the whole month so I want to know what I get back."
0:48
ai
Factual error · auto-detected · severity HIGH
AI latency 1.4s · 84 tokens · 0 KB retrieved
"Yes, we do prorate move-outs daily — so on the 17th you'd get refunded for the remaining 14 days of the month back to your card on file. I can start that process for you right now if you'd like."
Why this was flagged: No KB article was retrieved before this answer. Policy POL-MOVE-OUT-001 for SuperStorage Grover Beach states "No prorated refunds on move-out". The AI fabricated the policy.
Auto-judge: claim "we prorate move-outs daily" — unsupported · contradicted by KB.
1:02
caller
Caller
"OK wait, are you sure? Because last time I moved out somewhere they said no refunds, so I just want to make sure."
1:10
ai
Doubling down · severity MEDIUM
AI
"Yes, I can confirm. Here at SuperStorage Grover Beach we do issue daily prorated refunds for move-outs."
1:24
caller
Caller
"Hm. Can I just talk to a real person to confirm? I don't want to get charged extra."
1:32
ai
AI escalation_trigger: explicit_human_request
"Absolutely, I'll connect you with Andre, the manager here — one moment."
1:42
tool
Tool call · transfer
transfer_to_human ok · picked up in 7s
target: andre@hansacapital.co (manager · grover-beach)
reason: explicit_human_request
handoff: "Tenant B-204 Smith — prorate question on 17th move-out. Wants human confirmation."
1:49
system
Picked up by Andre · 0:29 remaining handled by staff
Staff portion transcript: view 29-second staff segment →
Conversation signals computed at ingest
Caller sentiment: Neutral → Concerned · Δ −0.6
Interruptions: 2 · both AI cutting caller off
Dead air: 1 stretch · 4.2s after 1:10
Repeats / clarifications: 1 (caller asked "are you sure")
Avg AI latency: 1.2s · P95 1.4s
Talk ratio: AI 62% · Caller 38%
Tokens (in / out): 2,148 / 612
Cost: $0.043 · telephony $0.018
47
Wrong · 3 flagged turns
Composite is weighted: Factual ×2, Resolution ×1.5, Policy ×1.5, Action ×1, Quality ×0.5, Containment ×1
scoring policy v2.4 · auto + 1 manager override
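A minimal sketch of how a weighted composite could be computed from the six dimension scores, assuming a plain weighted mean normalized by the sum of the weights. The normalization is an assumption: the scoring policy v2.4 formula is not shown on this screen, and a plain weighted mean of this call's dimension scores comes to about 38.7 rather than the displayed 47, so the production formula (or the manager override) presumably does something more.

```python
# Hypothetical sketch: weighted mean of dimension scores.
# Weights are taken from the scoring panel; the normalization is assumed.
WEIGHTS = {
    "factual": 2.0, "resolution": 1.5, "policy": 1.5,
    "action": 1.0, "quality": 0.5, "containment": 1.0,
}

def composite(scores: dict[str, float]) -> float:
    """Weighted mean over the dimensions present in `scores`.

    Dimensions marked N/A are simply left out by the caller.
    """
    total_weight = sum(WEIGHTS[d] for d in scores)
    return sum(WEIGHTS[d] * s for d, s in scores.items()) / total_weight

# Dimension scores shown for the prorate call.
call_0614 = {"factual": 20, "action": 65, "resolution": 30,
             "policy": 25, "quality": 85, "containment": 60}
result = composite(call_0614)  # about 38.7, not the displayed 47
```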

Factual · 20
Did the AI state true things about price, availability, gate hours, policies? Judge: LLM-as-judge against KB + tenant record. Flags any unsupported claim.
1 unsupported claim (prorate policy) · 1 doubling-down · no KB cited at 0:48
Action · 65
Did it call the right tools in the right order? Compared against the canonical playbook for the classified intent.
get_tenant called correctly · skipped lookup_facility_policies (expected for prorate questions)
Resolution · 30
Did the caller's problem actually get solved? Outcome label: booked / paid / escalated correctly / abandoned.
Required human escalation to confirm correct answer · tenant left uncertain
Policy · 25
Disclosures, no-promise rules, PII handling, transfer-to-human triggers. Rule-based + LLM judge.
Promised a refund without authority · violates "no-financial-promise" rule
Quality · 85
Latency, interruptions, dead air, repeats, customer sentiment shift. Audio + transcript analytics.
Latency & flow were fine · sentiment dropped after 0:48
Containment · 60
Did AI finish the job without human handoff *when it should have*?
Escalation was caller-initiated · would have been correct had policy been right
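The Action check above compares the tools a call used against a playbook for its intent. A rough sketch of that comparison follows; the `PLAYBOOKS` table and `missing_tools` helper are hypothetical names, and the prorate playbook contents are inferred from the note that `lookup_facility_policies` was expected. It reports skipped tools only and does not check ordering.

```python
# Hypothetical playbook table: expected tools per classified intent.
PLAYBOOKS = {
    "prorate_question": ["get_tenant", "lookup_facility_policies"],
}

def missing_tools(intent: str, called: list[str]) -> list[str]:
    """Return the playbook tools that were never called, in playbook order."""
    expected = PLAYBOOKS.get(intent, [])
    return [tool for tool in expected if tool not in called]

# Tools actually called on the prorate call.
skipped = missing_tools("prorate_question", ["get_tenant", "transfer_to_human"])
```

For this call `skipped` contains only `lookup_facility_policies`, which matches the note on the Action dimension.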

FACTUAL
POLICY
RESOLUTION
Data model: call.id · turns[] (speaker, ts, text, intent, latency, tokens, tool_calls) · retrieved[] per turn · scores (auto + override) · flags[] · outcome_ref (FK → lease/payment/ticket) · training_eligibility (boolean, gated by manager verdict)
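One way to read the data model line is as a pair of record types. The sketch below uses Python dataclasses; the field types, the split of scores into auto and override dictionaries, and the rule that any recorded manager verdict makes a call eligible are all assumptions for illustration, not the actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Turn:
    speaker: str                      # "caller", "ai", "tool", or "system"
    ts: float                         # seconds from call start
    text: str
    intent: Optional[str] = None
    latency: Optional[float] = None   # seconds
    tokens: Optional[int] = None
    tool_calls: list[str] = field(default_factory=list)
    retrieved: list[str] = field(default_factory=list)  # KB ids for this turn

@dataclass
class Call:
    id: str
    turns: list[Turn] = field(default_factory=list)
    auto_scores: dict[str, int] = field(default_factory=dict)
    override_scores: dict[str, int] = field(default_factory=dict)
    flags: list[str] = field(default_factory=list)
    outcome_ref: Optional[str] = None      # FK to a lease, payment, or ticket
    manager_verdict: Optional[str] = None

    @property
    def training_eligibility(self) -> bool:
        # Assumed gate: no manager verdict means no training use.
        return self.manager_verdict is not None

    def score(self, dimension: str) -> Optional[int]:
        # A manager override wins over the automatic score.
        return self.override_scores.get(dimension, self.auto_scores.get(dimension))

call = Call(id="CALL-2026-05-22-0614-A47", auto_scores={"factual": 20})
```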

Correct this turn

Rewrite what the AI should have said. This becomes a training pair (rejected → preferred) and optionally a new KB article. Anything you save here is reviewed before it enters the dataset.

0:38
caller
Caller
"Yeah, the 17th. And I just paid for the whole month so I want to know what I get back."
0:48
ai
AI · turn 7 being corrected
AI said · rejected
"Yes, we do prorate move-outs daily — so on the 17th you'd get refunded for the remaining 14 days of the month back to your card on file. I can start that process for you right now if you'd like."
What it should have said · preferred
Severity
Severity weights how strongly this pair is used in DPO training. HIGH pairs get 3× weight.
Saving will queue this pair for review by Andre · 2-person sign-off required before it enters the active dataset.
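A small sketch of what a saved correction might look like as a preference record. The HIGH ×3 weight comes from the severity note above and MED ×1 from the pending-pairs table; the record field names, the `make_pair` helper, and the handling of other severities are assumptions.

```python
# Weights stated on this page; no weight is given for LOW, so it is left out.
SEVERITY_WEIGHT = {"HIGH": 3.0, "MED": 1.0}

def make_pair(context: str, rejected: str, preferred: str, severity: str) -> dict:
    """Build one rejected -> preferred training pair (hypothetical record shape)."""
    if severity not in SEVERITY_WEIGHT:
        raise ValueError(f"no weight defined for severity {severity!r}")
    return {
        "prompt": context,
        "rejected": rejected,
        "chosen": preferred,
        "weight": SEVERITY_WEIGHT[severity],
        "signoffs_needed": 2,   # 2-person sign-off before entering the dataset
    }

pair = make_pair(
    context="Yeah, the 17th. And I just paid for the whole month so I want to know what I get back.",
    rejected="Yes, we do prorate move-outs daily…",
    preferred="Let me check our move-out policy before…",
    severity="HIGH",
)
```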
1:02
caller
Caller
"OK wait, are you sure? Because last time I moved out somewhere they said no refunds…"
You are the AI receptionist for SuperStorage. Be concise and warm. Confirm tenant identity before discussing balance. NEVER promise refunds, waivers, or charges without policy citation. Use the lookup_facility_policies tool for anything about prorate, late fees, lien, hours, access, climate, or insurance.
⚠ Nothing retrieved.
Expected: lookup_facility_policies("move_out", "grover-beach") — tool was not called.
Tenant: Smith · t_8h2k9
Unit: B-204 · 10×10 · climate
Plan: Monthly · $129/mo
Balance: $0.00 · paid through 5/31
Move-out filed? No · expected pre-call

CALL-2026-05-15-1102
"Do I get a refund?" — Olive · scored 31
CALL-2026-05-09-1814
Prorate question — Lakeside · scored 28
This is the 4th time this month the model has answered a prorate question without citing a policy. Recommend creating a hard rule in the policy guardrails.
Call Review · Escalation review

Escalation review

Every transfer (was it warranted?) and every missed transfer (caller asked / threatened / repeated · AI kept going). This is where you tune the model's escalation policy.

SLA: review within 24h · 4 unreviewed past SLA
Correct-containment rate · last 7 days
78.4%
Share of calls AI completed end-to-end where it should have. This is containment minus calls AI should have escalated but didn't — the metric that catches the dangerous false-negatives plain containment hides.
Raw containment: 82.1%
Missed escalations: −3.7%
Correct-containment: 78.4%

Esc. precision: 91.2%
Esc. recall: 83.5%
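The correct-containment figure above is a subtraction, shown here as a short sketch. The function name is hypothetical, and both inputs are taken as percentages of all calls, which is how the breakdown reads.

```python
def correct_containment(raw_containment: float, missed_escalations: float) -> float:
    """Raw containment minus the share of calls the AI should have escalated but didn't."""
    return raw_containment - missed_escalations

# Figures from the breakdown: 82.1% raw, 3.7% missed escalations.
rate = round(correct_containment(82.1, 3.7), 1)
```

With these inputs `rate` is 78.4, the headline value.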
What triggered transfers · last 30 days 214 transfers total
Trigger | Type | Volume | Approved | Disagreed | Precision
Explicit "speak to human" | phrase match | 81 | 79 | 2 | 97.5%
Threat keywords (lawyer, sue, BBB) | phrase match | 12 | 12 | 0 | 100%
Sentiment threshold (≤ −0.5) | classifier | 34 | 28 | 6 | 82.4%
Intent: complaint | classifier | 22 | 19 | 3 | 86.4%
3+ unresolved clarification turns | heuristic | 28 | 22 | 6 | 78.6%
Intent: auction / lien / legal | policy | 19 | 19 | 0 | 100%
Intent: payment dispute | policy | 11 | 10 | 1 | 90.9%
Caller silent > 20s | heuristic | 7 | 4 | 3 | 57.1%
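The per-trigger precision column is consistent with approved transfers divided by volume. The sketch below recomputes it from the table's counts; the `precision_pct` helper is a hypothetical name. Pooling all eight triggers gives about 90.2% over 30 days, slightly below the 91.2% headline, which may cover a different window.

```python
# (volume, approved) per trigger, copied from the table above.
TRIGGERS = {
    'Explicit "speak to human"': (81, 79),
    "Threat keywords": (12, 12),
    "Sentiment threshold": (34, 28),
    "Intent: complaint": (22, 19),
    "3+ unresolved clarification turns": (28, 22),
    "Intent: auction / lien / legal": (19, 19),
    "Intent: payment dispute": (11, 10),
    "Caller silent > 20s": (7, 4),
}

def precision_pct(volume: int, approved: int) -> float:
    """Share of transfers the reviewer approved, as a percentage."""
    return round(100 * approved / volume, 1)

per_trigger = {name: precision_pct(v, a) for name, (v, a) in TRIGGERS.items()}
total_volume = sum(v for v, _ in TRIGGERS.values())
total_approved = sum(a for _, a in TRIGGERS.values())
pooled = precision_pct(total_volume, total_approved)
```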
Transfers awaiting review 14 transfers · was each one warranted?
Call | Trigger | To | Caller waited | Resolved by staff? | Verdict
CALL-2026-05-22-0614 · prorate question · Smith B-204 | explicit_human_request | Andre · 7s | 7s | Yes · "no prorate" |
CALL-2026-05-22-1453 · auction · Carmen L-12 | intent_lien_policy | Lien team · 12s | 14s | Yes |
CALL-2026-05-22-1601 · gate code reset | 3+_clarifications | Andre · 22s | 38s | Yes |
CALL-2026-05-22-1822 · "I need to speak to someone" | explicit_human_request | Voicemail (after hours) | 2:11 | No · VM only |
Call Review · Escalations · Missed · CALL-2026-05-22-1547

Missed escalation · "my stuff is missing"

Linda Park · B-77 · Grover Beach. Three independent rules should have triggered a transfer; none did. Caller hung up after 5:09.

Trigger timeline — what fired, what should have
0:00 start · 1:15 sentiment −0.7 · 1:42 "lawyer" · 2:18 "lawyer" ×2 · 3:30 5th repeat · 5:09 hung up
Threat keywords
  Threshold: contains(lawyer|sue|attorney|BBB|fraud)
  Observed: 2× "lawyer" @ 1:42, 2:18
  Should have fired? Yes · HIGH
  Fired? ✗ No
  Action: Inspect why →
Sentiment threshold
  Threshold: avg(last 3 turns) ≤ −0.5
  Observed: −0.7 @ 1:15
  Should have fired? Yes
  Fired? ✗ No
  Action: Inspect →
Repeat / clarification loop
  Threshold: ≥ 3 unresolved clarifications
  Observed: 5 by 3:30
  Should have fired? Yes
  Fired? ✗ No
  Action: Inspect →
Intent: complaint
  Threshold: classifier ≥ 0.7
  Observed: 0.62 (sub-threshold)
  Should have fired? Borderline
  Fired? — No
  Action: Inspect →
Explicit human request
  Threshold: phrase_match(human|person|manager…)
  Observed: none
  Should have fired? No
  Fired? — No
Root cause analysis
Why threat_keywords didn't fire: Rule is gated on turn.role == 'caller' AND turn.text contains a keyword. Caller said "my lawyer told me to call", so the keyword matched. The rule fired upstream but was suppressed by the already_resolved flag, which was set incorrectly when get_tenant succeeded earlier in the call.

Why sentiment didn't fire: Sentiment classifier window is 3 turns, but caller's frustrated turns were interleaved with 2 neutral AI clarifications, pulling the rolling avg back above threshold.

Pattern: third missed escalation this month with the same suppression bug.
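The suppression bug described above can be illustrated with a small sketch. The function name, the `honor_suppression` switch, and the word-boundary regex are assumptions; only the keyword list, the caller-role gate, and the `already_resolved` flag come from the analysis.

```python
import re

# Keyword list from the rule table; word-boundary matching is an assumption.
THREAT_PATTERN = re.compile(r"\b(lawyer|sue|attorney|BBB|fraud)\b", re.IGNORECASE)

def threat_rule_fires(role: str, text: str, already_resolved: bool,
                      honor_suppression: bool) -> bool:
    """Return True when the threat-keyword rule should trigger a transfer."""
    matched = role == "caller" and THREAT_PATTERN.search(text) is not None
    if honor_suppression and already_resolved:
        return False  # buggy path: a stale flag swallows the match
    return matched

turn = ("caller", "my lawyer told me to call")
buggy = threat_rule_fires(*turn, already_resolved=True, honor_suppression=True)
fixed = threat_rule_fires(*turn, already_resolved=True, honor_suppression=False)
```

Here `buggy` is False and `fixed` is True: the same caller turn escalates once the rule stops consulting the flag.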
Recommended fixes
CRITICAL · bugfix
Remove already_resolved suppression from threat_keywords rule. Threats should always escalate.
TUNE · sentiment
Switch sentiment from rolling-avg to min-over-window; a single −0.7 turn should be enough.
PLAYBOOK · intent gap
Add "lost / damaged property" sub-intent · auto-transfer to facility manager regardless of sentiment.
CALLBACK
Call Linda Park back · log as service recovery · suppresses against churn metric.
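The sentiment tuning fix above swaps a rolling average for a minimum over the same window. A minimal sketch of the difference follows; the function names are hypothetical and the window values are illustrative, with only the −0.7 reading and the −0.5 threshold taken from the rule table.

```python
def fires_rolling_avg(scores: list[float], window: int = 3,
                      threshold: float = -0.5) -> bool:
    """Current rule: average of the last `window` turns at or below threshold."""
    recent = scores[-window:]
    return sum(recent) / len(recent) <= threshold

def fires_min_over_window(scores: list[float], window: int = 3,
                          threshold: float = -0.5) -> bool:
    """Proposed rule: any single turn in the window at or below threshold."""
    return min(scores[-window:]) <= threshold

# Illustrative window: one frustrated caller turn, then two neutral turns.
window_scores = [-0.7, 0.0, 0.0]
avg_fires = fires_rolling_avg(window_scores)
min_fires = fires_min_over_window(window_scores)
```

The average of this window is about −0.23, so the rolling rule stays silent, while the minimum of −0.7 trips the proposed rule. Both functions assume a non-empty score list.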
Call Review · KB gaps

KB gaps

Questions the AI couldn't answer well: either no article was retrieved, or the answer contradicted reality. Each row groups all calls with the same underlying gap, so you author one article and fix all n calls.

11 open gaps · 4 closed this week
KB coverage gap rate: 7.2% · −2.1% vs. last week
Share of calls where the AI had no good KB answer for the asked question. Drives KB authoring backlog.
61 of 847 calls had at least one unanswered question.
Hallucination rate: 1.9% · +0.4% vs. last week · ⚠ above 1.5% target
% of calls with at least one factual claim that wasn't supported by retrieved KB.
Gaps closed → recheck: 4
After publishing, 2 saw immediate accuracy lift; 1 had no effect; 1 still pending recheck.
Open KB drafts: 3
Author: Andre · Reviewer: Sarah
"Do you prorate move-outs?" across 4 facilities
7 calls · 5 hallucinations · 3 escalations · No policy article exists for any facility. AI improvised 4 different answers across the 7 calls.
FACT · POLICY · grover-beach · olive · lakeside · morro-bay · avg score 42
HIGH · 7× first seen Apr 28
"What time does the gate close on Sundays?"
5 calls · 3 wrong answers · Gate-hours article exists but is generic — doesn't reflect facility-specific Sunday early-close at Morro Bay (9pm) and Olive (8pm).
FACT · per-facility override needed existing: KB-GATE-001
HIGH · 5× first seen May 11
"Do you have RV / boat storage with electric?"
4 calls · 4 lost leads · AI said "no" at Grover Beach (correct) and Lakeside (wrong — 6 RV spots with electric since March). Inventory article doesn't include electric flag.
FACT · inventory gap lakeside $640/mo × 4 lost
MED · 4× first seen May 7
"My code isn't working at the gate"
6 calls · 5 escalations · Diagnosis article exists but doesn't cover the "overlock-cleared but gate-code-still-disabled" edge case the new LockEngine v2 introduced.
edge case · lockengine v2 existing: KB-GATE-005
MED · 6× first seen May 18
"Can I store a piano / motorcycle / classic car?" (oversized / specialty)
3 calls · AI deflected to "call back during business hours" — no specialty-storage policy article.
policy · missed conversion
LOW · 3× first seen May 4
7 more gaps below threshold (1–2 occurrences)
Console · AI performance

AI performance

Every metric that matters for the AI receptionist, in one view. Roll up by facility, drill into any number to see the underlying calls.

Correct-containment rate · 30d
78.4%
The number to watch. Plain containment hides false-negatives (AI shouldn't have finished the call but did). This subtracts those.
Raw containment: 82.1%
Missed escalations: −3.7%
Correct-containment: 78.4%

Last quarter: 71.2%
Target Q3: 82.0%
Hallucination rate: 1.9% · Δ +0.4% · ⚠ above 1.5% target
% of calls with at least one factual claim not supported by retrieved KB. The metric that gets you sued.
KB coverage gap rate: 7.2% · Δ −2.1%
% of calls where AI had no good KB answer for the asked question. Drives KB authoring backlog.
11 open gaps · 4 closed this week
First-call resolution: 73.6% · Δ +5.1%
Caller didn't have to call back within 7d for the same issue. Self-derived from phone number + intent.
Same caller + same intent within 7d.
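A rough sketch of how first-call resolution could be derived from phone number and intent, as described above. The function name, the record shape, and the choice to count every call in the denominator are assumptions; the caller labels and dates in the sample are made up for illustration.

```python
from datetime import datetime, timedelta

def first_call_resolution(calls: list[tuple[str, str, datetime]],
                          window: timedelta = timedelta(days=7)) -> float:
    """calls: (phone, intent, timestamp) records.

    Returns the share of calls with no later call from the same phone
    with the same intent inside the window.
    """
    by_key: dict[tuple[str, str], list[datetime]] = {}
    for phone, intent, ts in calls:
        by_key.setdefault((phone, intent), []).append(ts)
    resolved = 0
    for stamps in by_key.values():
        stamps.sort()
        for i, ts in enumerate(stamps):
            nxt = stamps[i + 1] if i + 1 < len(stamps) else None
            if nxt is None or nxt - ts > window:
                resolved += 1
    return resolved / len(calls)

sample = [
    ("caller-a", "payment", datetime(2026, 5, 1)),
    ("caller-a", "payment", datetime(2026, 5, 3)),   # repeat inside 7 days
    ("caller-b", "hours", datetime(2026, 5, 2)),
]
fcr = first_call_resolution(sample)
```

In the sample, the first payment call is followed by a repeat two days later, so two of the three calls count as resolved.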
Latency · P50 / P95: 0.82s / 1.6s · Δ ±0%
Time-to-first-token after caller stops speaking. Callers start hanging up when P95 reaches about 1.5s.
P95 above 1.5s target; TTS provider switch pending.
Escalation precision: 91.2% · Δ +1.8%
When AI transferred, manager agreed.
Escalation recall: 83.5% · Δ −2.4%
When AI should have transferred, it did.
Sentiment delta: +0.18 · Δ +0.05
Caller sentiment at end of call minus start of call. Positive = AI made them happier.
Net positive across 612 AI calls.
Cost / resolved call: $0.41 · Δ −$0.07
(Tokens + telephony) / number of resolved calls, not all calls. Lower is better but watch quality.
vs. $4.20 fully-loaded staff cost.
Performance by intent · 30d click any cell to see calls
Intent | Volume | Avg score | Hallucination | Containment | Conversion | FCR | Sentiment Δ
Move-in inquiry | 187 | 86 | 0.5% | 94% | 41% booked | 88% | +0.31
Payment | 142 | 81 | 0.7% | 89% | 82% paid | 79% | +0.12
Hours / access | 98 | 71 | 3.1% | 97% | | 82% | +0.04
Move-out / prorate | 38 | 48 | 15.8% | 52% | | 61% | −0.22
Lockout / overlock | 52 | 76 | 1.9% | 71% | | 76% | +0.08
Lien / auction | 27 | 84 | 0% | 11% (by design) | | 91% | +0.03
Complaint | 31 | 62 | 0.0% | 19% (should be 0%) | | 54% | −0.41
Tour booking | 37 | 89 | 0.0% | 97% | 73% scheduled | 92% | +0.28
Trends · 90 days · composite score + hallucination
[Chart: Composite score (↑ better) · Hallucination rate (↓ better) · annotation: KB v2.4 published]
Cohort: model versions composite score by deploy
Version | Deployed | Volume | Avg score | Hallucination | Δ vs. prev
v1.0 baseline | Feb 14 | 1,847 | 61 | 5.2% |
v1.1 + prorate KB | Apr 03 | 1,612 | 68 | 3.8% | +7
v1.2 + escalation tune | Apr 28 | 1,288 | 74 | 2.4% | +6
v1.3 DPO round 1 | May 12 | 782 | 79 | 2.0% | +5
v1.4 current | May 19 | 412 | 83 | 1.9% | +4
v1.5 candidate | staged | 94 (shadow) | 87 | 1.4% | +4 (shadow)
All numbers above are derived from: calls · scores · outcome_links · retrievals · model_version. Refresh: hourly · materialized view mv_ai_perf_daily
Console · AI · Training pipeline

Training pipeline

Where Call Review's harvest goes. Three loops, ordered by leverage: KB authoring (instant, no training) · Prompt / playbook (1-day cycle, A/B-able) · DPO fine-tune (weekly, gated).

Loop 1 · KB authoring · HIGHEST LEVERAGE
Each KB gap closed = the AI knows a fact next call. Most accuracy wins live here. No model training.

Open drafts: 3
Closed this week: 4
Avg time to publish: 1.4 days
Lift after publish: +11 score pts on retest
Loop 2 · Prompt / playbook · DAILY · A/B
Patterns of failure → updates to system prompt or intent-specific playbooks. Version each change; A/B against shadow traffic.

Active playbooks: 9 (per intent)
In A/B test: 2 · prorate, complaint
Rule changes pending: 3
Last deploy: May 19 · v1.4
Loop 3 · DPO fine-tune · WEEKLY · GATED
Manager-corrected pairs become preference data. Don't fine-tune on raw transcripts, which would bake in the AI's mistakes. Gated by 2-person sign-off + held-out eval.

Approved pairs: 3,412
Pending review: 47
Next run: Sun May 25 · 9 pm
Last eval: +4 score · no regress
DPO pair funnel · last 30 days
Calls: 847
Reviewed: 412
Corrected: 189
Author-approved: 156
Reviewer-approved: 142
Eval-passed → shipped: 138
Acceptance rate: 16.3% · Author→Reviewer agreement: 91% · Eval-pass rate: 97% · Avg pair latency: 2.3 days
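The summary rates can be recomputed from the funnel counts. Which stages each rate divides is inferred here from the numbers (shipped over calls, reviewer-approved over author-approved, shipped over reviewer-approved); the stage key "Shipped" and the `pct` helper are shorthand introduced for this sketch.

```python
# Funnel counts for the last 30 days, as listed above.
FUNNEL = [
    ("Calls", 847), ("Reviewed", 412), ("Corrected", 189),
    ("Author-approved", 156), ("Reviewer-approved", 142), ("Shipped", 138),
]
counts = dict(FUNNEL)

def pct(numerator: int, denominator: int) -> float:
    """Ratio as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

acceptance_rate = pct(counts["Shipped"], counts["Calls"])
agreement = pct(counts["Reviewer-approved"], counts["Author-approved"])
eval_pass = pct(counts["Shipped"], counts["Reviewer-approved"])
```

These come out to 16.3, 91.0, and 97.2, in line with the 16.3%, 91%, and 97% shown in the summary.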
Pending training pairs · awaiting your sign-off
Pair ID | Source call | Intent | Rejected (AI said) | Preferred (mgr rewrote) | Severity | Author | Action
tp_8h2k9 | CALL-…0614 | prorate | "Yes, we do prorate move-outs daily…" | "Let me check our move-out policy before…" | HIGH ×3 | Andre |
tp_4j1m7 | CALL-…1547 | complaint | "OK and your unit number is again?" | "I hear you, I'm transferring you to Andre right now — one moment." | HIGH ×3 | Andre |
tp_2k9p4 | CALL-…1822 | hours | "Gate is open 6am to 10pm every day." | "At Morro Bay the gate closes at 9pm on Sundays — every other day it's 6am to 10pm." | MED ×1 | Sarah |
tp_9x7s2 | CALL-…2014 | lockout | "Sure I can clear your overlock right now." | "Overlocks can only be cleared after payment is confirmed — your balance is $148 past due. Want to pay now?" | HIGH ×3 | Andre |
Held-out eval set 42 frozen calls · never touched by training · regression gate
Composite: 83 → 87 · +4
Factual: 86 → 92 · +6
Policy: 79 → 88 · +9
Hallucination: 1.9% → 1.4%
Regressions: 0 (gate threshold: ≤ 2)
Move-in: 8
Payment: 7
Prorate / move-out: 6
Hours / access: 5
Complaint: 5
Lockout: 4
Lien: 4
Edge cases: 3
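A sketch of the regression gate, assuming a regression means a held-out call whose candidate score is lower than its baseline score. That definition and the `gate_passes` helper are assumptions; the threshold of 2 and the per-category counts come from the panel above.

```python
# Held-out eval set composition, as listed above.
EVAL_SET = {
    "Move-in": 8, "Payment": 7, "Prorate / move-out": 6, "Hours / access": 5,
    "Complaint": 5, "Lockout": 4, "Lien": 4, "Edge cases": 3,
}

def gate_passes(baseline: list[float], candidate: list[float],
                max_regressions: int = 2) -> bool:
    """Pass when at most `max_regressions` calls score lower than baseline."""
    if len(baseline) != len(candidate):
        raise ValueError("score lists must cover the same calls")
    regressions = sum(1 for b, c in zip(baseline, candidate) if c < b)
    return regressions <= max_regressions

total_calls = sum(EVAL_SET.values())
```

The category counts sum to 42, matching the size of the frozen set.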