Opus Rich Output: Cost & Token Impact Analysis

Output

💰 Opus Rich Output: Cost & Token Impact

The Short Answer: Yes, It Costs More

Rich HTML output means more output tokens, which directly increases cost. How much more depends on the baseline: compared with a plain text answer, a basic HTML card is roughly 2.5x the tokens and a richly styled card roughly 7.5-15x.

Token Comparison: Same Content, Different Formats

Output Type	~Tokens	Relative
Plain text answer	~200	1x
Markdown formatted	~250	1.25x
Basic HTML kontask	~500	2.5x
Rich styled kontask (like YOLO card)	~1500-3000	7.5-15x
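The "Relative" column is each format's token count divided by the plain text baseline. A minimal sketch of that arithmetic, using the approximate counts from the table; the dictionary keys and the function name are made up for illustration:

```python
# Approximate output token counts from the table above, as (low, high) ranges.
# The keys are illustrative labels, not identifiers from any real API.
TOKENS = {
    "plain_text": (200, 200),
    "markdown": (250, 250),
    "basic_html": (500, 500),
    "rich_styled": (1500, 3000),
}


def relative_range(fmt, baseline="plain_text"):
    """Return (low, high) multipliers of `fmt` relative to the baseline format."""
    low, high = TOKENS[fmt]
    base = TOKENS[baseline][0]
    return low / base, high / base


for fmt in TOKENS:
    low, high = relative_range(fmt)
    label = f"{low:g}x" if low == high else f"{low:g}-{high:g}x"
    print(f"{fmt}: {label}")
```

For the rich styled card this gives a range of 7.5x to 15x.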

Pricing Impact (Opus 4.5)

Metric	Rate
Input tokens	$15 / 1M tokens
Output tokens	$75 / 1M tokens
Cache read	$1.50 / 1M tokens
Cache write	$18.75 / 1M tokens

Key insight: Output is 5x more expensive than input. Rich HTML hits the expensive side.
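To see how the four rates combine into a per-request cost, here is a minimal sketch. The rates are copied from the table above and have not been checked against a live price list; the `Rates` class, the `request_cost` function, and the example token counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Price in US dollars per 1M tokens, one field per row of the table."""
    input: float
    output: float
    cache_read: float
    cache_write: float


# Rates as listed in the table above (assumed, not verified).
OPUS_RATES = Rates(input=15.00, output=75.00, cache_read=1.50, cache_write=18.75)


def request_cost(rates, input_tokens=0, output_tokens=0,
                 cache_read_tokens=0, cache_write_tokens=0):
    """Return the dollar cost of one request for the given token counts."""
    total_per_million = (
        input_tokens * rates.input
        + output_tokens * rates.output
        + cache_read_tokens * rates.cache_read
        + cache_write_tokens * rates.cache_write
    )
    return total_per_million / 1_000_000


# Hypothetical request: 2,000 uncached input tokens and a 2,500-token rich card.
cost = request_cost(OPUS_RATES, input_tokens=2_000, output_tokens=2_500)
output_share = request_cost(OPUS_RATES, output_tokens=2_500) / cost
print(f"total ${cost:.4f}, output share {output_share:.0%}")
```

In this made-up request the output tokens account for most of the bill even though the input and output token counts are similar.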

Real Cost Examples

Scenario	Output Tokens	Cost
Plain "Yes, task completed"	50	$0.00375
Basic kontask card	500	$0.0375
Rich YOLO-style card	2500	$0.1875

The YOLO card costs ~50x as much as the plain text reply (2,500 vs 50 output tokens), counting output tokens alone.
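The cost column is output tokens multiplied by the output rate, divided by one million. A short sketch that reproduces the three rows, assuming the $75 / 1M output rate quoted above; the function and scenario names are illustrative:

```python
OPUS_OUTPUT_RATE = 75.0  # dollars per 1M output tokens, as quoted above


def output_cost(output_tokens, rate_per_million=OPUS_OUTPUT_RATE):
    """Dollar cost of the output tokens alone."""
    return output_tokens * rate_per_million / 1_000_000


# Token counts copied from the scenarios table.
SCENARIOS = {
    "plain reply": 50,
    "basic card": 500,
    "rich card": 2500,
}

for name, tokens in SCENARIOS.items():
    print(f"{name}: {tokens} tokens -> ${output_cost(tokens):.5f}")

# At a fixed rate, the cost ratio equals the token ratio: 2500 / 50.
ratio = output_cost(SCENARIOS["rich card"]) / output_cost(SCENARIOS["plain reply"])
print(f"rich vs plain: about {ratio:.0f}x")
```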

Haiku vs Opus: The Full Picture

Model	Input	Output	Rich Card Cost (2,500 output tokens)
Haiku 3.5	$0.80/M	$4/M	~$0.01
Opus 4.5	$15/M	$75/M	~$0.19

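Opus is ~19x more expensive per rich card ($75/M vs $4/M output is 18.75x). But Haiku may not be able to produce rich output of the same quality, so the cheaper option is not always a like-for-like substitute.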
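The per-card figures can be checked with a few lines of arithmetic. This is a sketch under two assumptions: a rich card is 2,500 output tokens (as in the cost examples), and the output rates are the ones in the table. The dictionary keys are shorthand labels, not API model identifiers.

```python
# Output rates in dollars per 1M tokens, copied from the table above.
# Keys are shorthand labels for this example only.
OUTPUT_RATE_PER_MILLION = {
    "haiku-3.5": 4.0,
    "opus-4.5": 75.0,
}

RICH_CARD_TOKENS = 2500  # assumed size of one rich card


def card_cost(model, output_tokens=RICH_CARD_TOKENS):
    """Dollar cost of generating one card's output tokens on `model`."""
    return output_tokens * OUTPUT_RATE_PER_MILLION[model] / 1_000_000


haiku = card_cost("haiku-3.5")
opus = card_cost("opus-4.5")
print(f"Haiku: ${haiku:.4f}  Opus: ${opus:.4f}  ratio: {opus / haiku:.2f}x")
```

Because both costs use the same token count, the ratio is just the ratio of the two output rates.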

Rate Limits

Rate limits come in two kinds, and rich output affects only the first:

Tokens per minute (TPM) - Larger responses use up the token budget faster
Requests per minute (RPM) - Unchanged by output size

With rich output, you are likely to hit the TPM limit before the RPM limit.
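Which limit binds first depends on response size. The sketch below works that out for a given pair of limits; the limit values are invented for illustration and are not taken from any real account tier.

```python
def max_responses_per_minute(tokens_per_response, tpm_limit, rpm_limit):
    """Return (responses per minute, name of the limit that binds first)."""
    by_tokens = tpm_limit // tokens_per_response
    if by_tokens < rpm_limit:
        return by_tokens, "TPM"
    return rpm_limit, "RPM"


# Hypothetical limits, chosen only to make the effect visible.
TPM_LIMIT = 8_000  # output tokens per minute
RPM_LIMIT = 50     # requests per minute

print(max_responses_per_minute(2500, TPM_LIMIT, RPM_LIMIT))  # rich card
print(max_responses_per_minute(50, TPM_LIMIT, RPM_LIMIT))    # plain reply
```

Under these assumed limits, rich cards are capped by the token budget at 3 per minute, while plain replies reach the request cap of 50 per minute first.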

💡 Optimization Strategies

Use Haiku for simple tasks - Status checks, yes/no questions
Reserve Opus for rich output - When visual quality matters
Cache aggressively - CLAUDE.md templates get cached, so repeated input is billed at the cache-read rate ($1.50/M) instead of $15/M; this does not reduce output costs
Consider hybrid - Haiku for logic, Opus for presentation
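One way to read the strategies above is as a small routing rule: simple tasks go to Haiku, and tasks that need a rich card split into a Haiku step for the logic and an Opus step for the presentation. The sketch below is hypothetical. The `Task` class, the `plan_request` function, and the tier labels are invented here and do not describe how konsole routes requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    description: str
    needs_rich_output: bool  # True when visual quality matters


def plan_request(task):
    """Return a list of (model tier, role) steps for the task."""
    if not task.needs_rich_output:
        # Status checks and yes/no questions stay on the cheaper tier.
        return [("haiku", "answer")]
    # Hybrid: cheap tier works out the content, expensive tier renders it.
    return [("haiku", "logic"), ("opus", "presentation")]


print(plan_request(Task("Did the build pass?", needs_rich_output=False)))
print(plan_request(Task("Weekly usage report card", needs_rich_output=True)))
```

Whether the hybrid split saves money depends on how many tokens each step produces, so it is worth measuring before adopting.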

Summary

Rich HTML kontask output uses 7.5-15x as many tokens as a plain text answer. With Opus at $75/M output tokens, a rich card of 1,500-3,000 tokens costs about $0.11-$0.23, versus about $0.004 for a 50-token plain text reply. The visual quality may be worth it for user-facing output, but use Haiku ($4/M output) for high-volume internal tasks.

Details

Type General

Status Completed

Scope vibetools

Tags costs, tokens, opus, performance

Created 7 Jan 2026, 8:48 am

Updated 7 Jan 2026, 8:48 am

Timing

⏱️ 46.6s

0 tools

● TTFE 0% ● Thinking 100% ● Tools 0%

Created By claude

Raw Data

{
  "id": "308582bb-98ac-4e8b-90fb-d874bdc38de4",
  "type": "general",
  "status": "completed",
  "title": "Opus Rich Output: Cost & Token Impact Analysis",
  "description": "How rich HTML kontask output affects Claude API usage and costs",
  "context": {
    "output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:sans-serif;\"><h2 style=\"color:#fff;margin:0 0 20px 0;\">💰 Opus Rich Output: Cost & Token Impact</h2><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#ef4444;\">The Short Answer: Yes, It Costs More</h3><p style=\"color:#e0e0e0;line-height:1.7;margin:0;\">Rich HTML output means <strong>more output tokens</strong>, which directly increases cost. But the impact varies significantly by what you're comparing.</p></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#61afef;\">Token Comparison: Same Content, Different Formats</h3><table style=\"width:100%;border-collapse:collapse;margin-top:8px;\"><tr style=\"border-bottom:1px solid #3d3d3d;\"><th style=\"text-align:left;padding:10px;color:#7f848e;\">Output Type</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">~Tokens</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Relative</th></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Plain text answer</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">~200</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">1x</td></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Markdown formatted</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">~250</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">1.25x</td></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Basic HTML kontask</td><td style=\"padding:10px;text-align:right;color:#f97316;\">~500</td><td style=\"padding:10px;text-align:right;color:#f97316;\">2.5x</td></tr><tr><td style=\"padding:10px;color:#e0e0e0;\">Rich styled kontask (like YOLO card)</td><td 
style=\"padding:10px;text-align:right;color:#ef4444;\">~1500-3000</td><td style=\"padding:10px;text-align:right;color:#ef4444;\">7-15x</td></tr></table></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#61afef;\">Pricing Impact (Opus 4.5)</h3><table style=\"width:100%;border-collapse:collapse;margin-top:8px;\"><tr style=\"border-bottom:1px solid #3d3d3d;\"><th style=\"text-align:left;padding:10px;color:#7f848e;\">Metric</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Rate</th></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Input tokens</td><td style=\"padding:10px;text-align:right;color:#98c379;\">$15 / 1M tokens</td></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Output tokens</td><td style=\"padding:10px;text-align:right;color:#e5c07b;\">$75 / 1M tokens</td></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Cache read</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">$1.50 / 1M tokens</td></tr><tr><td style=\"padding:10px;color:#e0e0e0;\">Cache write</td><td style=\"padding:10px;text-align:right;color:#f97316;\">$18.75 / 1M tokens</td></tr></table><p style=\"color:#9ca3af;margin:12px 0 0 0;font-size:0.9em;\"><strong>Key insight:</strong> Output is 5x more expensive than input. 
Rich HTML hits the expensive side.</p></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#61afef;\">Real Cost Examples</h3><table style=\"width:100%;border-collapse:collapse;margin-top:8px;\"><tr style=\"border-bottom:1px solid #3d3d3d;\"><th style=\"text-align:left;padding:10px;color:#7f848e;\">Scenario</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Output Tokens</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Cost</th></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Plain \"Yes, task completed\"</td><td style=\"padding:10px;text-align:right;\">50</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">$0.00375</td></tr><tr style=\"border-bottom:1px solid #3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Basic kontask card</td><td style=\"padding:10px;text-align:right;\">500</td><td style=\"padding:10px;text-align:right;color:#f97316;\">$0.0375</td></tr><tr><td style=\"padding:10px;color:#e0e0e0;\">Rich YOLO-style card</td><td style=\"padding:10px;text-align:right;\">2500</td><td style=\"padding:10px;text-align:right;color:#ef4444;\">$0.1875</td></tr></table><p style=\"color:#9ca3af;margin:12px 0 0 0;font-size:0.9em;\">The YOLO card costs ~50x more than a plain text response in output tokens alone.</p></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#c678dd;\">Haiku vs Opus: The Full Picture</h3><table style=\"width:100%;border-collapse:collapse;margin-top:8px;\"><tr style=\"border-bottom:1px solid #3d3d3d;\"><th style=\"text-align:left;padding:10px;color:#7f848e;\">Model</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Input</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Output</th><th style=\"text-align:right;padding:10px;color:#7f848e;\">Rich Card Cost</th></tr><tr style=\"border-bottom:1px solid 
#3d3d3d;\"><td style=\"padding:10px;color:#e0e0e0;\">Haiku 3.5</td><td style=\"padding:10px;text-align:right;\">$0.80/M</td><td style=\"padding:10px;text-align:right;\">$4/M</td><td style=\"padding:10px;text-align:right;color:#22c55e;\">~$0.01</td></tr><tr><td style=\"padding:10px;color:#e0e0e0;\">Opus 4.5</td><td style=\"padding:10px;text-align:right;\">$15/M</td><td style=\"padding:10px;text-align:right;\">$75/M</td><td style=\"padding:10px;text-align:right;color:#ef4444;\">~$0.19</td></tr></table><p style=\"color:#9ca3af;margin:12px 0 0 0;font-size:0.9em;\"><strong>Opus is ~19x more expensive per rich card.</strong> But Haiku often can't produce the rich output anyway.</p></div><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><h3 style=\"margin:0 0 12px 0;color:#98c379;\">Rate Limits</h3><p style=\"color:#e0e0e0;line-height:1.7;margin:0 0 12px 0;\">Rich output affects rate limits in two ways:</p><ol style=\"color:#e0e0e0;line-height:1.8;margin:0;padding-left:24px;\"><li><strong>Tokens per minute (TPM)</strong> - More output = faster exhaustion</li><li><strong>Requests per minute (RPM)</strong> - Unchanged by output size</li></ol><p style=\"color:#9ca3af;margin:12px 0 0 0;font-size:0.9em;\">With rich output, you'll hit TPM limits before RPM limits.</p></div><div style=\"background:#1a365d;padding:16px;border-radius:6px;border-left:4px solid #3b82f6;\"><h4 style=\"margin:0 0 8px 0;color:#fff;\">💡 Optimization Strategies</h4><ul style=\"color:#e0e0e0;line-height:1.8;margin:0;padding-left:20px;\"><li><strong>Use Haiku for simple tasks</strong> - Status checks, yes/no questions</li><li><strong>Reserve Opus for rich output</strong> - When visual quality matters</li><li><strong>Cache aggressively</strong> - CLAUDE.md templates get cached, reducing input costs</li><li><strong>Consider hybrid</strong> - Haiku for logic, Opus for presentation</li></ul></div><div 
style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-top:16px;\"><h4 style=\"margin:0 0 8px 0;color:#fff;\">Summary</h4><p style=\"color:#9ca3af;margin:0;line-height:1.7;\">Rich HTML kontask output is <strong>7-15x more tokens</strong> than plain text. With Opus at $75/M output tokens, a fancy card costs ~$0.15-0.20 vs $0.004 for plain text. The visual quality may be worth it for user-facing output, but use Haiku ($4/M output) for high-volume internal tasks.</p></div></div>",
    "requestedAt": "2026-01-07T02:32:00.000Z",
    "requestId": "01715e82-7a9c-4bdf-b2ea-7e127c299977",
    "choices": [
      {
        "label": "Show my usage",
        "value": "Show me my actual token usage and costs from recent sessions",
        "primary": true
      },
      {
        "label": "Optimize templates",
        "value": "How can we make kontask templates more token-efficient?"
      },
      {
        "label": "Haiku for simple",
        "value": "Configure konsole to use Haiku for simple queries automatically"
      }
    ],
    "turnTiming": {
      "totalMs": 46572,
      "ttfeMs": 49,
      "thinkingMs": 46523,
      "toolExecutionMs": 0,
      "toolCallCount": 0,
      "thinkingPct": 100,
      "toolsPct": 0,
      "ttfePct": 0
    }
  },
  "createdBy": "claude",
  "createdAt": "2026-01-06T22:48:53.748Z",
  "updatedAt": "2026-01-06T22:48:59.946Z",
  "requestId": "01715e82-7a9c-4bdf-b2ea-7e127c299977",
  "scope": "vibetools",
  "tags": [
    "costs",
    "tokens",
    "opus",
    "performance"
  ],
  "targetUser": "claude"
}