Claude Sonnet 4.5 output token usage increased 52% vs 7-day average
Current impact: $3,100 vs baseline
Forecasted impact: $3,600/mo if trend continues
Change vs baseline: +52.0% (output tokens)
Confidence: 76% (likely cause attribution)
Event Context
Usage Timeline — Output Tokens
Actual vs 7-day rolling baseline
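The timeline compares actual daily output tokens against a trailing 7-day average. A minimal sketch of that comparison (function names and the sample data are illustrative, not the dashboard's implementation):

```python
def rolling_baseline(series, window=7):
    """Trailing mean over the previous `window` days (None until enough history)."""
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)
        else:
            out.append(sum(series[i - window:i]) / window)
    return out

def pct_change(actual, baseline):
    return (actual - baseline) / baseline * 100

# Illustrative daily output-token counts (in thousands); the last day spikes.
daily = [300, 310, 290, 305, 295, 300, 300, 710]
base = rolling_baseline(daily)
print(f"baseline={base[-1]:.0f}k, actual={daily[-1]}k, "
      f"change={pct_change(daily[-1], base[-1]):+.0f}%")
```

With these sample values the last day lands at +137% over its 7-day baseline, matching the output-token delta in the breakdown below only by construction of the sample data.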
Usage Metric Breakdown
Current vs baseline — all contributing metrics
Metric               Current    Baseline   Change
Input Tokens         1.4M       800k       +75%
Output Tokens        710k       300k       +137%
Cache Read Tokens    496k       420k       +18%
Cache Write Tokens   220k       180k       +22%
API Requests         4k         2k         +81%
Root Cause Analysis
1 confirmed · 2 need confirmation
Signal: Calls to /api/contracts/analyze rose from 1,400/day to 2,140/day (+53%).
Signal: Avg completion_tokens per call rose from 1,800 to 2,900 (+61%).
Interpretation: Longer output suggests contracts are more complex or the response format changed. Check whether the analysis prompt template was modified.
Open question: Was the contract analysis prompt or response format changed around Apr 20?
Signal: Call volume to /api/contracts/analyze jumped 53% on Apr 20 with no prior ramp.
Open question: Did an internal rollout or feature flag change happen on Apr 20?
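Separating the two signals above (more calls vs longer completions) is a per-day, per-endpoint aggregation over request logs. A sketch with a hypothetical log schema; the field names `day`, `endpoint`, and `completion_tokens` are assumptions, not the actual log format:

```python
from collections import defaultdict

# Hypothetical request-log records (field names are illustrative).
logs = [
    {"day": "2025-04-19", "endpoint": "/api/contracts/analyze", "completion_tokens": 1800},
    {"day": "2025-04-19", "endpoint": "/api/contracts/analyze", "completion_tokens": 1800},
    {"day": "2025-04-20", "endpoint": "/api/contracts/analyze", "completion_tokens": 2900},
    {"day": "2025-04-20", "endpoint": "/api/contracts/analyze", "completion_tokens": 2900},
]

totals = defaultdict(lambda: [0, 0])  # (day, endpoint) -> [token_sum, call_count]
for r in logs:
    key = (r["day"], r["endpoint"])
    totals[key][0] += r["completion_tokens"]
    totals[key][1] += 1

avg = {k: s / n for k, (s, n) in totals.items()}
```

A jump in the per-call average (1,800 to 2,900 here) points at a prompt or response-format change, while a jump in call count with a flat average points at a rollout or feature-flag change.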
Likely cause: Increased usage of the AI Contract Review workflow after an internal rollout (76% confidence · 2 signals awaiting confirmation).
Connect a deployment webhook to auto-correlate releases with future spend events.
Suggested Next Steps
Owner: Legal Ops / Product Engineering
Confirm internal rollout scope with Legal Ops team
Enable prompt caching on /api/contracts/analyze to cut input token costs
Review output length — shorter responses may be viable for initial triage
Set a cache write/read ratio target of 80% to optimize costs
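The caching recommendation above can be sketched against the Anthropic Messages API, which lets a stable prompt prefix be marked cacheable with a `cache_control` block. The prompt text, variable names, and model string below are illustrative, and no API call is made here:

```python
# Sketch: mark the large, shared analysis instructions as cacheable so
# repeated /api/contracts/analyze calls reuse them via prompt caching.
ANALYSIS_INSTRUCTIONS = "You are a contract analyst. ..."  # large, stable prefix (illustrative)

def build_request(contract_text: str) -> dict:
    """Build a Messages API request body; only the stable prefix is cached."""
    return {
        "model": "claude-sonnet-4-5",  # illustrative model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": ANALYSIS_INSTRUCTIONS,
                # The stable prefix is written to the cache once, then read
                # back (at a lower token rate) on subsequent calls.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            # The contract body varies per call, so it sits after the
            # cached prefix and is billed as regular input tokens.
            {"role": "user", "content": contract_text}
        ],
    }
```

Cache effectiveness then shows up in the response `usage` fields (`cache_read_input_tokens` vs `cache_creation_input_tokens`), which is what the ratio target above would be measured against.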
Related dimensions