You type in plain English. But the AI does not read words the way you do. It breaks your text into small chunks called tokens — and every single token costs money.
What you type vs. what the AI sees
What you type: Summarize the attached project report for the steering committee.
What the AI sees: the same sentence, split into tokens. You read words; the AI reads tokens.
A token is roughly 3–4 characters. Common short words are one token. Longer or unusual words get split into multiple tokens. Every token the AI reads or writes adds to your bill.
The token counts and costs in this guide are approximations for learning purposes. Actual tokenization typically uses an algorithm called BPE (Byte Pair Encoding), and exact counts vary by model. The goal here is to build your intuition, not provide exact numbers.
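A rough estimate of this kind can be sketched in a few lines of Python. This is only a character-count heuristic, not a real tokenizer: the 4-characters-per-token figure, the function names, and the assumption that the reply is twice as long as the prompt are illustrative choices. The prices are the Sonnet figures quoted in this guide.

```python
import math

# Assumption: about 4 characters per token (upper end of the 3-4 range).
CHARS_PER_TOKEN = 4

# Sonnet prices quoted in this guide, in dollars per million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_tokens(text: str) -> int:
    """Rough token count from character length; not a real tokenizer."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the prices above."""
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000


prompt = "Summarize the attached project report for the steering committee."
input_tokens = estimate_tokens(prompt)
# Hypothetical assumption: the reply is about twice as long as the prompt.
output_tokens = input_tokens * 2
cost = estimate_cost(input_tokens, output_tokens)
print(input_tokens, output_tokens, f"${cost:.6f}")
```

For exact counts you would need the tokenizer of the specific model you are using; the heuristic is only meant to give a feel for the order of magnitude.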
The Re-Read Tax: Why Message #10 Costs More Than Message #1
Every time you send a message in a conversation, the AI re-reads every previous message from the start. It is like writing a longer and longer letter each time, copying everything said before and adding your new sentence at the end.
Key insight
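Every message you send includes the ENTIRE conversation history. The AI re-reads everything from the start, every single time. Because each turn re-reads all the turns before it, the total number of tokens grows roughly with the square of the number of turns. This is why a 20-message conversation costs far more than 20 times the cost of a single message.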
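To see the re-read tax in numbers, here is a minimal simulation. The message sizes (100 tokens per user message, 200 per reply) are hypothetical, the prices are the Sonnet figures quoted in this guide, and the sketch ignores any caching discounts a provider may offer.

```python
# Assumptions (hypothetical sizes): every user message is 100 tokens and
# every reply is 200 tokens. Prices are dollars per million tokens (Sonnet).
USER_TOKENS = 100
REPLY_TOKENS = 200
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def conversation_cost(turns: int) -> float:
    """Total dollar cost when each turn re-sends the whole history."""
    history = 0  # tokens of all earlier messages and replies
    total = 0.0
    for _ in range(turns):
        input_tokens = history + USER_TOKENS  # full history plus the new message
        total += (input_tokens * INPUT_PRICE_PER_MTOK
                  + REPLY_TOKENS * OUTPUT_PRICE_PER_MTOK) / 1_000_000
        history += USER_TOKENS + REPLY_TOKENS  # the reply joins the history too
    return total


single = conversation_cost(1)
twenty = conversation_cost(20)
print(f"1 turn: ${single:.4f}, 20 turns: ${twenty:.4f}, "
      f"ratio: {twenty / single:.1f}x")
```

With these assumed sizes, the 20-turn conversation costs roughly 72 times as much as a single turn, not 20 times.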
7 Ways to Cut Your AI Costs
Now that you understand how tokens and the re-read tax work, here are concrete strategies to spend less while getting better results.
Tip 1: Write a Brief, Not a Chat
Imagine hiring an expert contractor. You would send a detailed brief upfront — not a chain of back-and-forth emails asking one question at a time.
Expensive: 6 emails back and forth
Cost-effective: 1 detailed brief
Use a free tool (like ChatGPT free tier or Gemini) to iterate on your prompt. Refine the wording, test approaches, get it right. Then paste the final version into your paid tool for execution.
Free tool (iterate) → Refine → Paid tool (execute once)
When using Cowork or Claude Code, ask the AI to create a plan before doing work. Review carefully. Only approve once you are satisfied.
1. Plan → 2. Review checkpoint → 3. Execute
Key takeaway: One well-crafted prompt almost always costs less than a back-and-forth conversation that arrives at the same result.
Tip 2: Tell the AI How Long to Talk
Output tokens cost 5x as much as input tokens on Sonnet ($15 per million vs. $3 per million). Most people never specify a length, so the AI writes 400 words when 3 sentences would do.
Why output costs dominate your bill
What you type (input): $3 per million tokens
What the AI writes (output): $15 per million tokens (5x the input price)
Key takeaway: Output tokens cost 5x as much as input tokens. A short instruction like "keep it under 50 words" can cut your output costs by 80% or more when the AI would otherwise write several hundred words.
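The arithmetic behind that saving can be checked with a short sketch. It uses the 400-word default reply and the 50-word cap mentioned in this tip. The conversion of about 4/3 tokens per English word is an assumed rule of thumb, and the function name is illustrative.

```python
# Assumptions: about 4/3 tokens per English word (a rule of thumb, not an
# exact figure) and the Sonnet output price of $15 per million tokens.
TOKENS_PER_WORD = 4 / 3
OUTPUT_PRICE_PER_MTOK = 15.00


def output_cost(words: int) -> float:
    """Approximate dollar cost of a reply of the given length in words."""
    return words * TOKENS_PER_WORD * OUTPUT_PRICE_PER_MTOK / 1_000_000


unbounded = output_cost(400)  # the AI writes 400 words by default
capped = output_cost(50)      # after adding "keep it under 50 words"
saving = 1 - capped / unbounded
print(f"{saving:.1%} saved on output")
```

The percentage depends only on the ratio of the two lengths, so it holds whatever words-to-tokens conversion you assume.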
Tip 3: Be Specific, Cut Filler
Write like you are emailing a busy executive — every word earns its place. The real savings come from removing filler paragraphs and adding specific requirements upfront.
This is NOT about dropping politeness. "Please summarize" and "Summarize" differ by 1 token (a fraction of a cent). The real savings come from removing filler paragraphs and adding specific requirements so the AI does not have to guess.
Key takeaway: Specific prompts save tokens twice: your input is shorter, AND the AI's output is more focused because it knows exactly what you want.
Tip 4: Batch Related Questions
Do not send 3 separate emails when one email with 3 numbered questions will do.
When NOT to batch: If your questions are about completely different topics, use separate sessions instead. Mixing unrelated topics means carrying irrelevant context. See Tip 5.
Key takeaway: Batching related questions into one message avoids paying the re-read tax multiple times for the same context.
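As a rough illustration of the difference, the sketch below counts input tokens for three questions about the same material, sent either as three turns of one conversation or as one batched message. All sizes (a 2,000-token shared context, 30-token questions, 150-token answers) and both function names are hypothetical.

```python
def input_tokens_separate(context: int, question: int, answer: int, n: int) -> int:
    """Input tokens when n questions are sent as n turns of one conversation."""
    history = context  # shared material, sent along with the first question
    total = 0
    for _ in range(n):
        total += history + question   # everything so far is re-read
        history += question + answer  # this exchange joins the history
    return total


def input_tokens_batched(context: int, question: int, n: int) -> int:
    """Input tokens when all n questions go into a single message."""
    return context + n * question


# Hypothetical sizes: 2,000-token context, 30-token questions, 150-token answers.
separate = input_tokens_separate(2_000, 30, 150, 3)
batched = input_tokens_batched(2_000, 30, 3)
print(separate, batched)
```

The output tokens are the same in both cases (three answers), so the whole difference comes from re-reading the shared context.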
Tip 5: New Task = New Session
When you start a new meeting, you do not bring the transcript of the last meeting and read it aloud first. The same applies to AI conversations.
Same session, new topic: a room full of old papers
Fresh session: a clean slate
Decision guide: Should you start a new session?
Are you changing topics?
YES: Start a new session.
NO: Is the conversation getting long (10+ messages)?
    YES: Start fresh with a summary of key decisions.
    NO: Continue the conversation.
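The decision guide above can be written as a small function. This is only a direct translation of the two questions; the function and parameter names are illustrative, and the threshold is the "10+ messages" figure from the guide.

```python
LONG_CONVERSATION = 10  # the "10+ messages" threshold from the decision guide


def next_step(changing_topics: bool, message_count: int) -> str:
    """Return the recommended action for the current conversation."""
    if changing_topics:
        return "Start a new session"
    if message_count >= LONG_CONVERSATION:
        return "Start fresh with a summary of key decisions"
    return "Continue the conversation"


print(next_step(changing_topics=False, message_count=12))
```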
Key takeaway: When you switch to a new topic, start a new session. Every token of old, irrelevant history is money wasted.
Tip 6: Pick the Right Model for the Job
You would not hire a lawyer to proofread a lunch order. The same logic applies to AI models. Simpler tasks should use cheaper, faster models.
Haiku — The Intern
Fast, good enough for routine work. Quick lookups, simple rewrites, format conversions, brainstorming.
$0.80 / $4.00 per MTok (input / output)
Sonnet — The Generalist
Handles 90% of your work. Writing emails, summarizing reports, code help, analysis.
$3.00 / $15.00 per MTok (input / output)
Opus — The Specialist
Expensive, save for hard problems. Complex strategy, nuanced legal review, multi-step reasoning.
$15.00 / $75.00 per MTok (input / output)
Key takeaway: Use the cheapest model that can handle the task. Most routine work belongs on Haiku or Sonnet. Save Opus for genuinely complex problems.
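To compare the tiers on the same job, the sketch below applies the per-million-token prices listed in this tip to one request. The job size (a 2,000-token prompt with a 500-token reply) and the function name are hypothetical.

```python
# Prices quoted in this guide, in dollars per million tokens: (input, output).
PRICES = {
    "Haiku": (0.80, 4.00),
    "Sonnet": (3.00, 15.00),
    "Opus": (15.00, 75.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


# Hypothetical job: a 2,000-token prompt with a 500-token reply.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 2_000, 500):.4f}")
```

For this job size, the same request costs about 19 times as much on Opus as on Haiku, which is why routine work belongs on the cheaper tiers.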
Tip 7: Watch Your File and Image Uploads
Uploading an entire file is like bringing a filing cabinet to a meeting when you only need one page. Every extra page adds to your token bill.
Rule of thumb: Images and files get converted to tokens too. A single high-res screenshot can cost as much as 800 words of text. Before uploading: crop, extract only what you need, and prefer pasting text over screenshots of text.
Key takeaway: Only send the AI what it needs to see. Crop images, extract relevant pages, and paste text instead of images of text.
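As a worked example, the sketch below compares uploading a full 30-page PDF with uploading only the 2 pages you need. The figure of 1,500 tokens per page is a hypothetical assumption: real counts depend on the model, the layout, and whether pages are processed as text, images, or both.

```python
# Hypothetical assumption: each PDF page converts to about 1,500 tokens.
TOKENS_PER_PAGE = 1_500
INPUT_PRICE_PER_MTOK = 3.00  # Sonnet input price quoted in this guide


def upload_cost(pages: int) -> float:
    """Approximate dollar cost of sending the given number of pages once."""
    return pages * TOKENS_PER_PAGE * INPUT_PRICE_PER_MTOK / 1_000_000


full = upload_cost(30)   # the whole document
needed = upload_cost(2)  # only the relevant pages
print(f"full: ${full:.3f}, needed: ${needed:.3f}, ratio: {full / needed:.0f}x")
```

The ratio depends only on the page counts, and because of the re-read tax the upload is paid for again on every later turn of the same conversation.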
Quick Reference Cheat Sheet
A one-page summary. Print it and keep it at your desk.
AI Token Cost-Saving Cheat Sheet
1. Write a brief, not a chat
One detailed prompt beats 6 back-and-forth messages. Save up to 79% of tokens.
2. Tell the AI how long to talk
Output costs 5x as much as input. "Keep it under 50 words" can save 80%+ on output.
3. Be specific, cut filler
Remove vague filler. Add specific requirements. Shorter input, more focused output.
4. Batch related questions
One message with 3 questions beats 3 separate messages. Avoid the re-read tax.
5. New task = new session
Switching topics? Start fresh. Do not carry irrelevant history forward.
6. Pick the right model
Haiku for simple tasks, Sonnet for most work, Opus only for complex reasoning.
7. Watch file and image uploads
Crop images, extract relevant pages, paste text instead of screenshots. A full PDF can cost 15x more than the 2 pages you need.