What It Measures
Compute CPI tracks the changing cost of a representative "basket" of AI work, expressed as an index (base=100) and an inflation rate (MoM/YoY).
Just as the Consumer Price Index measures the cost of household goods over time, Compute CPI measures the cost of common AI workloads. This enables organizations to understand AI cost trends, forecast budgets, and benchmark procurement decisions.
The Basket
The index is built on six workload categories, each representing a common pattern of AI usage:
| Category | Input Tokens | Output Tokens | Weight |
|---|---|---|---|
| Chat / Drafting | 2,000 | 500 | 20% |
| Summarization | 10,000 | 500 | 25% |
| Classification | 500 | 50 | 20% |
| Coding | 3,000 | 1,000 | 15% |
| Judgment / Reasoning | 5,000 | 2,000 | 10% |
| Long-Context Synthesis | 50,000 | 1,000 | 10% |
These weights reflect typical institutional usage patterns. The Civic CPI (coming Q2 2026) will use different weights optimized for public sector workloads.
Calculation
For each workload category i, we calculate cost as:

Cost_i = (InputTokens_i / 1,000,000) × InputPrice_i + (OutputTokens_i / 1,000,000) × OutputPrice_i

where InputPrice_i and OutputPrice_i are the tier-averaged prices per million tokens for that category (see Model Selection below).

The weighted basket cost is:

BasketCost = Σ_i Weight_i × Cost_i

The index value is calculated relative to the baseline period:

Index = (BasketCost_current / BasketCost_baseline) × 100
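A minimal Python sketch of this calculation, assuming tier-averaged prices quoted in USD per million tokens. The token counts and weights come from the basket table above; any price figures you pass in are placeholders, not values from the live feed.

```python
BASKET = {
    # category: (input_tokens, output_tokens, weight) from the basket table above
    "chat":           (2_000,   500,   0.20),
    "summarization":  (10_000,  500,   0.25),
    "classification": (500,     50,    0.20),
    "coding":         (3_000,   1_000, 0.15),
    "judgment":       (5_000,   2_000, 0.10),
    "long_context":   (50_000,  1_000, 0.10),
}

def category_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of one workload run, with prices in USD per 1M tokens."""
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price

def basket_cost(prices):
    """Weighted basket cost; `prices` maps category -> (input_price, output_price)."""
    total = 0.0
    for category, (inp, out, weight) in BASKET.items():
        in_price, out_price = prices[category]
        total += weight * category_cost(inp, out, in_price, out_price)
    return total

def index_value(current_prices, baseline_prices):
    """Index relative to the baseline period (baseline = 100)."""
    return basket_cost(current_prices) / basket_cost(baseline_prices) * 100
```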
Model Selection
Each workload category uses prices from representative models in the appropriate tier:
| Tier | Used For | Representative Models |
|---|---|---|
| Budget | Classification | GPT-4o-mini, Gemini Flash, Claude Haiku |
| General | Chat, Summarization | GPT-4o, Gemini Pro, Claude Sonnet |
| Frontier | Coding | GPT-4o, Claude Sonnet, Gemini Pro |
| Reasoning | Judgment | o1, o3-mini, DeepSeek-R1 |
| Long-Context | Synthesis | Gemini Pro (2M), Claude (200K) |
Costs are averaged across models in each tier to reduce sensitivity to any single provider's pricing decisions.
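For illustration, a small sketch of the tier-averaging step. The prices shown are hypothetical placeholders, not values from the index's price feed.

```python
def tier_cost(model_prices, input_tokens, output_tokens):
    """Average workload cost across the representative models in one tier.

    `model_prices` is a list of (input_price, output_price) pairs in USD per 1M tokens.
    """
    costs = [
        (input_tokens / 1_000_000) * inp + (output_tokens / 1_000_000) * out
        for inp, out in model_prices
    ]
    return sum(costs) / len(costs)

# Example: classification (500 in / 50 out) priced against a hypothetical budget tier.
budget_tier = [(0.15, 0.60), (0.10, 0.40), (0.25, 1.25)]  # placeholder prices
print(tier_cost(budget_tier, 500, 50))
```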
Subindices
| Ticker | Name | What It Tracks |
|---|---|---|
| $JUDGE | Judgment CPI | Cost of reasoning-intensive workloads |
| $LCTX | LongContext CPI | Cost of high-context workloads |
| $BULK | Budget Tier | Cost of cheapest throughput models |
| $FRONT | Frontier Tier | Cost of best capability models |
Index Series
The index series allows comparison against multiple base periods, providing different perspectives on AI cost trends:
| Series | Ticker | Base Period | Use Case |
|---|---|---|---|
| Since Launch | $CPI-L | February 2025 | Long-term trend analysis |
| Year-over-Year | $CPI-Y | 365 days ago | Annual comparison |
| Quarter-to-Date | $CPI-Q | Start of current quarter | Quarterly budgeting |
| Month-to-Date | $CPI-M | Start of current month | Monthly tracking |
| Week-over-Week | $CPI-W | 7 days ago | Short-term changes |
All series use the same basket and methodology—only the comparison period changes. A value of 100 means costs are unchanged from the base period; values below 100 indicate deflation.
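As a sketch, a single basket-cost history can serve every series; only the lookup date for the base period changes. The dates and costs below are hypothetical.

```python
from datetime import date, timedelta

def series_value(history, current_day, base_day):
    """Index of `current_day` against `base_day` (base = 100)."""
    return history[current_day] / history[base_day] * 100

# Hypothetical two-point history used for a week-over-week reading ($CPI-W).
history = {date(2025, 6, 1): 0.0105, date(2025, 6, 8): 0.0098}
today = date(2025, 6, 8)
print(series_value(history, today, today - timedelta(days=7)))  # below 100 => deflation
```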
Methodology Variants
Different organizations have different workload mixes. Methodology variants apply alternative weightings to the same basket categories:
| Variant | Ticker | Focus |
|---|---|---|
| General Purpose | $CPI-GEN | Balanced workload mix (default weights) |
| Frontier Heavy | $CPI-FRO | Emphasis on coding (35%) and reasoning (25%) |
| Budget Optimized | $CPI-BUD | Cost-conscious: classification (30%), chat (30%) |
| Reasoning Focus | $CPI-REA | Heavy reasoning (45%), long context (15%) |
| Enterprise Mix | $CPI-ENT | Summarization (30%), classification (25%) |
Compare your organization's workload mix to these variants to find the most relevant inflation measure for your use case.
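A sketch of how a variant is computed: the per-category costs are unchanged and only the weight vector differs. The table above lists only the headline weights for each variant, so this example takes a full weight vector as input rather than hard-coding one.

```python
def variant_basket_cost(category_costs, variant_weights):
    """Weighted basket cost under a variant weighting.

    `category_costs` maps category -> current-period cost (same costs as the default index);
    `variant_weights` maps category -> weight and must sum to 1.0.
    """
    assert abs(sum(variant_weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(variant_weights[c] * category_costs[c] for c in variant_weights)
```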
Trend Analysis
Beyond point-in-time measurements, we provide trend analysis to help forecast future costs:
- Direction: Deflating, stable, or inflating based on recent velocity
- Velocity: Rate of change in CPI points per month
- Acceleration: Change in velocity (is deflation speeding up or slowing?)
- 30-day Forecast: Projected CPI value if current trend continues
- Confidence: High/medium/low based on data availability
Direction thresholds: velocity < -1 points per month = deflating, velocity > +1 = inflating; values in between are classified as stable.
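A sketch of the trend calculations under one simplifying assumption: velocity is measured as the index change over the trailing 30 days and the 30-day forecast is a straight-line extrapolation. The index values are hypothetical.

```python
def velocity(index_30d_ago, index_now):
    """Rate of change in CPI points per month."""
    return index_now - index_30d_ago

def direction(v):
    """Classify the trend using the stated thresholds (+/- 1 point per month)."""
    if v < -1:
        return "deflating"
    if v > 1:
        return "inflating"
    return "stable"

def forecast_30d(index_now, v):
    """Projected CPI value if the current trend continues for another month."""
    return index_now + v

v = velocity(92.0, 88.5)                    # hypothetical index values
print(direction(v), forecast_30d(88.5, v))  # -> deflating 85.0
```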
Yield Curve
The yield curve shows annualized deflation (or inflation) rates at different time horizons—similar to bond yield curves but measuring AI compute cost changes:
| Horizon | Calculation |
|---|---|
| 1W, 1M, 3M, 6M, 1Y | ((current - historical) / historical) × (365 / days) × 100 |
Interpretation: Negative rates indicate deflation (costs falling), positive rates indicate inflation. A "normal" curve shows steeper deflation at shorter horizons that moderates over longer periods.
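The per-horizon rates follow directly from the table's formula; in this sketch the current and historical index values are hypothetical.

```python
HORIZON_DAYS = {"1W": 7, "1M": 30, "3M": 91, "6M": 182, "1Y": 365}

def annualized_rate(current, historical, days):
    """Annualized percentage change over the given horizon."""
    return (current - historical) / historical * (365 / days) * 100

current_index = 88.0  # hypothetical
historical = {"1W": 88.9, "1M": 91.5, "3M": 95.0, "6M": 98.0, "1Y": 100.0}  # hypothetical
curve = {h: annualized_rate(current_index, historical[h], d) for h, d in HORIZON_DAYS.items()}
print(curve)  # negative values = annualized deflation at that horizon
```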
Spreads
Spreads measure the premium paid for specific capabilities:
| Ticker | Name | Calculation |
|---|---|---|
| $COG-P | Cognition Premium | $FRONT − $BULK |
| $JDG-P | Judgment Premium | $JUDGE − $FRONT |
| $CTX-P | Context Premium | $LCTX − $FRONT |
Exchange Rates
Cognitive Exchange Rates show the relative cost between model tiers, expressed as token equivalents. This makes opportunity cost instantly visible—like forex cross-rates for AI compute.
Base Currency: $UTIL (Utility Token)
The base is Gemini Flash, representing cheap utility compute. All other tiers are expressed as multiples of this base cost.
Calculation:
Rate = TierBlendedCost / BaseBlendedCost
BlendedCost = (InputCost × 0.75) + (OutputCost × 0.25)
A rate of "1 $FRONT = 64 $UTIL" means one frontier token costs as much as 64 utility tokens. This helps teams understand the opportunity cost of using expensive models for tasks that could run on cheaper ones.
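A sketch of the exchange-rate calculation using the stated 75/25 blend; the prices are placeholders, not live $UTIL or $FRONT quotes.

```python
def blended_cost(input_price, output_price):
    """Blend input and output prices 75/25, per the stated formula."""
    return input_price * 0.75 + output_price * 0.25

def exchange_rate(tier_prices, base_prices):
    """How many base-tier tokens one tier token is worth in cost terms."""
    return blended_cost(*tier_prices) / blended_cost(*base_prices)

util = (0.075, 0.30)   # hypothetical $UTIL (utility tier) prices per 1M tokens
front = (5.00, 15.00)  # hypothetical $FRONT (frontier tier) prices per 1M tokens
print(f"1 $FRONT = {exchange_rate(front, util):.0f} $UTIL")
```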
Build Cost Index (Persona Baskets)
Different teams have different workload mixes. The Build Cost Index tracks inflation for three representative build patterns, each with its own basket weightings.
| Persona | Ticker | Description |
|---|---|---|
| Startup Builder | $START | Building AI-first products: 50% coding, 30% RAG context, 20% routing |
| Agentic Team | $AGENT | Running autonomous agents: 70% reasoning, 20% tool use, 10% final output |
| Throughput | $THRU | High-volume processing: 80% extraction, 20% classification |
Each persona sees different inflation depending on which model tiers they rely on most heavily. A team building agents (heavy reasoning) will see different cost pressure than a team doing high-volume extraction (mostly budget tier).
Data Sources
| Source | Data | Purpose |
|---|---|---|
| OpenRouter API | Live prices | Primary source for current spot rates |
| LiteLLM Database | Comprehensive pricing | 2000+ models for coverage depth |
| pydantic/genai-prices | Historical prices | Backfill for MoM/YoY calculations |
| simonw/llm-prices | Historical archive | Cross-reference and validation |
Historical Methodology
Baseline: February 2026 = 100. This baseline is immutable once set.
Historical Reconstruction: To enable MoM and YoY calculations from day one, we backfilled historical data using archived prices from pydantic/genai-prices and simonw/llm-prices.
Model Substitution: When exact historical models aren't available, we use the closest equivalent from the same provider and tier. For example, historical data may use gemini-1.5-pro where current data uses gemini-2.5-pro.
Reconstructed Flag: Historical snapshots are marked with reconstructed: true to distinguish them from live calculations.
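For reference, a hypothetical snapshot record showing where the flag sits; apart from the reconstructed flag itself, the field names and values are illustrative.

```python
snapshot = {
    "date": "2024-08-01",       # placeholder date
    "index": 100.0,             # placeholder index value
    "basket_cost_usd": 0.0123,  # placeholder basket cost
    "reconstructed": True,      # backfilled from archived prices, not a live calculation
}
```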
Update Schedule
- Daily: Index values updated at 06:00 UTC
- Quarterly: Full reports with analysis and commentary
- As needed: Methodology revisions (versioned and documented)
Independence
Occupant does not accept referral fees, sponsored rankings, or payments from model providers. The index is funded independently and maintained as a public resource.
Our incentive is accuracy and utility, not revenue from recommendations.
Limitations
- Basket assumptions: The workload definitions and weights reflect our best estimate of typical usage. Your organization's actual usage may differ.
- Model selection: We track major commercial models. Open-source and self-hosted options are not included in the current methodology.
- Quality normalization: We group models by tier but do not adjust for quality differences within tiers.
- Latency and throughput: The index measures cost only, not performance characteristics like speed or availability.
Future Development
- Civic CPI: Weights optimized for public sector workloads (intake, eligibility, appeals, compliance)
- Quality-adjusted indices: Incorporating capability scores into cost calculations
- Regional indices: Tracking price differences across deployment regions
- API access: Programmatic access for researchers and governance teams
Contact
Questions about methodology? Suggestions for improvement? Interested in collaboration?