Understanding AI Tokens & Costs

Tip 1: Write a Brief, Not a Chat

Imagine hiring an expert contractor. You would send a detailed brief upfront — not a chain of back-and-forth emails asking one question at a time.

Expensive: 6 emails back and forth

Cost-effective: 1 detailed brief

Use a free tool (like ChatGPT free tier or Gemini) to iterate on your prompt. Refine the wording, test approaches, get it right. Then paste the final version into your paid tool for execution.

Free tool (iterate) → Refine → Paid tool (execute once)

When using Cowork or Claude Code, ask the AI to create a plan before doing work. Review carefully. Only approve once you are satisfied.

1. Plan → 2. Review checkpoint → 3. Execute

Key takeaway: One well-crafted prompt almost always costs less than a back-and-forth conversation that arrives at the same result.
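To see why the brief tends to win, here is a minimal sketch of the arithmetic. It assumes every new message re-sends the whole conversation so far as input, and it uses the Sonnet prices quoted in this guide ($3 input, $15 output per million tokens). The function name and the per-message token counts are hypothetical, chosen only for illustration.

```python
# Sonnet prices quoted in this guide, in dollars per million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def conversation_cost(turns):
    """Return (input_tokens, output_tokens, dollars) for a list of turns.

    Each turn is (user_tokens, reply_tokens). Assumption: every request
    re-sends all earlier messages and replies as input.
    """
    history = 0          # tokens already in the conversation
    total_input = 0
    total_output = 0
    for user_tokens, reply_tokens in turns:
        total_input += history + user_tokens   # old history + new message
        total_output += reply_tokens
        history += user_tokens + reply_tokens  # both become history
    dollars = (total_input * INPUT_PRICE_PER_MTOK
               + total_output * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return total_input, total_output, dollars


# Hypothetical numbers: six short exchanges vs. one detailed brief.
back_and_forth = [(50, 300)] * 6
single_brief = [(400, 600)]

chat = conversation_cost(back_and_forth)
brief = conversation_cost(single_brief)
print(f"back-and-forth: {chat}")
print(f"single brief:   {brief}")
```

With these made-up numbers the six-message chat sends 5,550 input tokens against 400 for the brief, even though the user only typed 300 tokens across the whole chat. The gap comes from re-sending history, so it widens as the conversation grows.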

Tip 2: Tell the AI How Long to Talk

Output tokens cost 5x as much as input tokens on Sonnet ($15 per million vs. $3 per million). Most people never specify a length, so the AI writes 400 words when 3 sentences would do.

Why output costs dominate your bill

What you type (input): $3 per million tokens

What AI writes (output): $15 per million tokens (5x as much)

Key takeaway: Output tokens cost 5x as much as input tokens. A short instruction like "keep it under 50 words" can save you roughly 80% on output costs.
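The saving from a length limit is simple to work out. The sketch below uses the Sonnet output price from this section and assumes roughly 1.3 tokens per English word, which is only a rule of thumb; real counts depend on the tokenizer. The function name is hypothetical.

```python
OUTPUT_PRICE_PER_MTOK = 15.00  # Sonnet output price quoted above
TOKENS_PER_WORD = 1.3          # assumption: rough average for English text


def output_cost(words):
    """Approximate dollar cost of a reply of the given length in words."""
    tokens = words * TOKENS_PER_WORD
    return tokens * OUTPUT_PRICE_PER_MTOK / 1_000_000


unbounded = output_cost(400)  # the 400-word reply mentioned above
capped = output_cost(50)      # reply held to "under 50 words"
saving = 1 - capped / unbounded

print(f"400 words: ${unbounded:.6f}")
print(f"50 words:  ${capped:.6f}")
print(f"saving:    {saving:.1%}")
```

The percentage saved depends only on the ratio of the two lengths, so it holds whatever the true tokens-per-word figure turns out to be.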

Tip 3: Be Specific, Cut Filler

Write like you are emailing a busy executive — every word earns its place. The real savings come from removing filler paragraphs and adding specific requirements upfront.

This is not about being rude. "Please summarize" and "Summarize" differ by 1 token (a fraction of a cent). What matters is cutting filler paragraphs and stating specific requirements so the AI does not have to guess.

Key takeaway: Specific prompts save tokens twice. Your input is shorter, and the AI's output is more focused because it knows exactly what you want.

Tip 4: Batch Related Questions

Do not send 3 separate emails when one email with 3 numbered questions will do.

When NOT to batch: If your questions are about completely different topics, use separate sessions instead. Mixing unrelated topics means carrying irrelevant context. See Tip 5.

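Key takeaway: Batching related questions into one message avoids paying the re-read tax multiple times for the same context. The re-read tax is the cost of the AI re-processing the whole conversation with every new message.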
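A rough way to picture the difference is to count input tokens only. The sketch below assumes three related questions about one shared piece of context, and that the context plus all earlier questions and answers are re-sent with each new message. All sizes and function names are hypothetical.

```python
def input_tokens_separate(context, questions, answers):
    """Input tokens billed when each question is its own message.

    Assumption: the shared context and all earlier questions and answers
    are re-sent with every new message in the conversation.
    """
    total = 0
    history = context
    for q, a in zip(questions, answers):
        total += history + q   # everything so far, plus the new question
        history += q + a       # question and answer join the history
    return total


def input_tokens_batched(context, questions):
    """Input tokens billed when all questions go in one message."""
    return context + sum(questions)


# Hypothetical sizes in tokens.
context = 2000             # e.g. a pasted report
questions = [30, 30, 30]   # three related questions
answers = [200, 200, 200]  # one answer per question

separate = input_tokens_separate(context, questions, answers)
batched = input_tokens_batched(context, questions)
print(f"separate: {separate} input tokens")
print(f"batched:  {batched} input tokens")
```

In this example the shared context is paid for three times when the questions are sent separately and once when they are batched.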

Tip 5: New Task = New Session

When you start a new meeting, you do not bring the transcript of the last meeting and read it aloud first. The same applies to AI conversations.

Same session, new topic: room full of old papers

Fresh session: clean slate

Decision guide: Should you start a new session?

1. Are you changing topics?
   YES: Start a new session.
   NO: Go to question 2.
2. Is the conversation getting long (10+ messages)?
   YES: Start fresh with a summary of key decisions.
   NO: Continue the conversation.

Key takeaway: When you switch to a new topic, start a new session. Every token of old, irrelevant history is money wasted.
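The waste can be put into numbers. The sketch below splits the input tokens sent while working on a new topic into "wasted" (old-topic history re-sent each time) and "useful" (new-topic content). It assumes the whole history is re-sent with every message. The function name and token counts are hypothetical.

```python
def waste_report(old_topic_tokens, new_topic_turns):
    """Split the input tokens sent on a new topic into wasted and useful.

    old_topic_tokens: history left over from the previous topic.
    new_topic_turns: list of (user_tokens, reply_tokens) for the new topic.
    Assumption: the whole history is re-sent as input on every message.
    """
    wasted = 0
    useful = 0
    new_history = 0
    for user_tokens, reply_tokens in new_topic_turns:
        wasted += old_topic_tokens            # old topic re-sent again
        useful += new_history + user_tokens   # new-topic history + message
        new_history += user_tokens + reply_tokens
    total = wasted + useful
    return {
        "total": total,
        "wasted": wasted,
        "useful": useful,
        "waste_ratio": wasted / total if total else 0.0,
    }


# Hypothetical: 3,000 tokens of old history, then four new-topic exchanges.
report = waste_report(3000, [(50, 250)] * 4)
print(report)
```

Calling `waste_report(0, ...)` models the fresh session: the same four exchanges with no old history have a waste ratio of zero.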

Tip 6: Pick the Right Model for the Job

You would not hire a lawyer to proofread a lunch order. The same logic applies to AI models. Simpler tasks should use cheaper, faster models.

Haiku — The Intern

Fast, good enough for routine work. Quick lookups, simple rewrites, format conversions, brainstorming.

$0.80 / $4.00 per MTok (input / output)

Sonnet — The Generalist

Handles 90% of your work. Writing emails, summarizing reports, code help, analysis.

$3.00 / $15.00 per MTok (input / output)

Opus — The Specialist

Expensive, save for hard problems. Complex strategy, nuanced legal review, multi-step reasoning.

$15.00 / $75.00 per MTok (input / output)

Key takeaway: Use the cheapest model that can handle the task. Most routine work belongs on Haiku or Sonnet. Save Opus for genuinely complex problems.
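To compare the tiers on the same job, the per-model prices listed in this section can go into a small lookup. This is a sketch only: the job size (2,000 tokens in, 500 out) and the function name are hypothetical.

```python
# Prices from this guide, dollars per million tokens: (input, output).
PRICES = {
    "Haiku": (0.80, 4.00),
    "Sonnet": (3.00, 15.00),
    "Opus": (15.00, 75.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


# Hypothetical job: 2,000 tokens in, 500 tokens out.
costs = {model: job_cost(model, 2000, 500) for model in PRICES}
for model, dollars in costs.items():
    print(f"{model}: ${dollars:.4f}")
```

On this example job Opus costs roughly 19 times as much as Haiku and 5 times as much as Sonnet, so the choice of tier can matter more than any amount of prompt trimming.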

Tip 7: Watch Your File and Image Uploads

Uploading an entire file is like bringing a filing cabinet to a meeting when you only need one page. Everything in the cabinet goes on your token bill.

Rule of thumb: Images and files get converted to tokens too. A single high-res screenshot can cost as much as 800 words of text. Before uploading: crop, extract only what you need, and prefer pasting text over screenshots of text.

Key takeaway: Only send the AI what it needs to see. Crop images, extract relevant pages, and paste text instead of images of text.
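For text files, "extract only what you need" can be as simple as slicing out one section before pasting it. The sketch below assumes about 4 characters per token, a common rough heuristic for English that real tokenizers will not match exactly. The document, the section markers, and the function names are all hypothetical.

```python
CHARS_PER_TOKEN = 4  # assumption: common rough heuristic for English text


def estimate_tokens(text):
    """Very rough token estimate; real tokenizers will differ."""
    return len(text) // CHARS_PER_TOKEN


def extract_section(document, start_marker, end_marker):
    """Return the text from start_marker up to (not including) end_marker."""
    start = document.index(start_marker)
    end = document.index(end_marker, start)
    return document[start:end]


# Hypothetical document: three sections, only one of which is needed.
document = (
    "## Background\n" + "history " * 2000
    + "## Q3 Results\n" + "revenue " * 200
    + "## Appendix\n" + "tables " * 3000
)
excerpt = extract_section(document, "## Q3 Results", "## Appendix")

full_tokens = estimate_tokens(document)
excerpt_tokens = estimate_tokens(excerpt)
print(f"whole file: ~{full_tokens} tokens")
print(f"excerpt:    ~{excerpt_tokens} tokens")
```

The same idea applies to PDFs and spreadsheets: pull out the pages or rows the question is about, and send only those.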

Quick Reference Cheat Sheet

AI Token Cost-Saving Cheat Sheet

Model Pricing Reference

Model	Input / MTok	Output / MTok	Output vs. input price
Claude Haiku	$0.80	$4.00	5x
Claude Sonnet	$3.00	$15.00	5x
Claude Opus	$15.00	$75.00	5x