Stop burning tokens. Use this checklist before every session.
Start every session with:
Build X with these constraints:
- Respond concisely, code only
- No markdown files unless I ask
- No summaries or documentation
- Small commits as you go
- Turn off thinking mode for simple tasks
Bad: 500+ line CLAUDE.md with every possible rule
Good: <100 lines with only critical project-specific rules
Every line in CLAUDE.md is sent with EVERY request.
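To put a rough number on that, this guide's own estimate of about +10k tokens per request for a 500-line CLAUDE.md works out to roughly 20 tokens per line. Here is a minimal sketch under that assumption; the function name and the per-line constant are illustrative, not measured values:

```typescript
// Assumption: ~20 tokens per non-empty line, derived from this guide's
// estimate that a 500-line CLAUDE.md adds about 10k tokens per request.
const TOKENS_PER_LINE = 20;

interface ClaudeMdOverhead {
  lines: number;
  tokensPerRequest: number;
  tokensPerSession: number;
}

// Hypothetical helper: estimates what a CLAUDE.md costs over a session.
function estimateClaudeMdOverhead(
  content: string,
  requestsPerSession: number
): ClaudeMdOverhead {
  // Count only non-empty lines; blank lines contribute almost nothing.
  const lines = content
    .split("\n")
    .filter((line) => line.trim().length > 0).length;
  const tokensPerRequest = lines * TOKENS_PER_LINE;
  return {
    lines,
    tokensPerRequest,
    tokensPerSession: tokensPerRequest * requestsPerSession,
  };
}
```

Under these assumptions, a 500-line file over a 30-request session comes to about 300k tokens spent on instructions alone.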
Each MCP server costs tokens even when not used:
- Supabase + GitHub + Chrome = ~75k tokens per request
- Only enable MCPs you're actively using
- Disable them when done
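The overhead compounds because it is paid on every request. The sketch below assumes the ~75k figure above splits evenly across the three servers (about 25k each); real servers differ, and the function and constant names are hypothetical:

```typescript
// Assumption: ~25k tokens per enabled server per request, i.e. the guide's
// ~75k for three servers divided evenly. Actual costs vary by server.
const TOKENS_PER_IDLE_SERVER = 25000;

// Hypothetical helper: tokens wasted in a session by servers that are
// enabled but never used.
function idleMcpOverhead(
  enabled: string[],
  inUse: string[],
  requests: number
): number {
  const idle = enabled.filter((server) => !inUse.includes(server));
  return idle.length * TOKENS_PER_IDLE_SERVER * requests;
}

// Three servers enabled, only GitHub used, 10 requests in the session.
const wasted = idleMcpOverhead(["supabase", "github", "chrome"], ["github"], 10);
console.log(`Estimated idle MCP overhead: ${wasted} tokens`);
```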
Use thinking mode for:
- Complex architecture decisions
- Hard debugging problems
- Performance optimization
- Unclear requirements
Don't use thinking mode for:
- Writing boilerplate
- Simple refactors
- Documentation
- Bug fixes with clear solutions
- Implementing well-defined features
Thinking mode roughly doubles token cost.
When context gets full, Claude offers to "compress" (compact) the conversation.
Always say NO. Start a new chat instead.
Compression burns massive tokens and loses quality.
Bad:
Here's the error:
[paste 100 lines of stack trace]
Good: Screenshot the error and drag it into the chat.
A single image usually costs fewer tokens than a long block of pasted text.
Bad (wastes tokens on back-and-forth):
"Add the missing features"
Good (one shot):
"Add timeout to runner.ts using Promise.race. Accept timeoutMs param, reject after timeout."
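A prompt this specific leaves little to interpret. For illustration, a request like the one above could plausibly produce something along these lines; `withTimeout` is a hypothetical name, and the real runner.ts is not shown in this guide, so treat this as a sketch of the technique rather than the actual file:

```typescript
// Hypothetical helper: races a task against a timer and rejects if the
// timer fires first. Only Promise.race and timeoutMs come from the prompt.
async function withTimeout<T>(task: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // This promise never resolves; it only rejects once timeoutMs elapses.
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });

  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([task, timeout]);
  } finally {
    // Clear the timer so it does not keep the process alive after the race.
    clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing task; it only stops waiting for it. If the underlying work must be aborted, that needs a separate mechanism.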
When Claude creates agents (subagents), each one runs in its own context window, so its work does not fill up the main conversation.
Let it create agents for:
- Code review
- Documentation generation
- Test writing
- Refactoring
Add to CLAUDE.md or say upfront:
Do not create:
- README files
- Documentation
- Summary files
- .md files
Unless I explicitly ask.
Opus costs significantly more than Sonnet.
Use Opus for:
- Complex system design
- Hard architectural decisions
- Critical debugging
Use Sonnet for:
- Feature implementation
- Refactoring
- Bug fixes
- Most coding tasks
Don't let context grow to 200k tokens.
Start a new chat when:
- Switching to a different feature
- Context feels bloated
- Claude starts repeating itself
- You've been in the same chat for more than 30 minutes
Extra reasoning tools such as sequential thinker and code reasoner cost extra tokens.
Only use for genuinely complex problems.
Add [feature] to [file]:
- [specific requirement 1]
- [specific requirement 2]
- [specific requirement 3]
Keep response concise. Code only, no explanations unless I ask.
Bug: [one sentence description]
File: [filename]
Expected: [behavior]
Actual: [behavior]
[screenshot or minimal code snippet]
Fix it.
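If you file many bug prompts, the template above can be filled in programmatically so every report has the same compact shape. This is a minimal sketch; the `BugReport` interface and `buildBugPrompt` function are hypothetical names, not part of any tool:

```typescript
// Hypothetical shape mirroring the fields of the bug template above.
interface BugReport {
  description: string; // one sentence
  file: string;
  expected: string;
  actual: string;
  evidence?: string; // minimal code snippet, if not attaching a screenshot
}

// Renders the template as a single prompt string, one field per line.
function buildBugPrompt(bug: BugReport): string {
  const lines = [
    `Bug: ${bug.description}`,
    `File: ${bug.file}`,
    `Expected: ${bug.expected}`,
    `Actual: ${bug.actual}`,
  ];
  if (bug.evidence) {
    lines.push(bug.evidence);
  }
  lines.push("Fix it.");
  return lines.join("\n");
}
```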
Refactor [file/function]:
- [specific change 1]
- [specific change 2]
No explanations needed.
- ❌ Let context compress
- ❌ Use thinking mode for everything
- ❌ Keep unused MCP servers enabled
- ❌ Let Claude write documentation unprompted
- ❌ Use vague prompts that require back-and-forth
- ❌ Paste huge error logs (use images)
- ❌ Stay in same chat for hours
- ❌ Use Opus for simple tasks
- ❌ Have 500+ line CLAUDE.md files
| Action | Approximate Cost |
|---|---|
| Simple feature with thinking | 50-100k tokens |
| Simple feature without thinking | 20-40k tokens |
| Documentation generation | 30-50k tokens |
| Context compression | 100-200k tokens |
| Long CLAUDE.md (500 lines) | +10k per request |
| 3 unused MCP servers | +75k per request |
| Using Opus vs Sonnet | 3-5x cost |
- Turn off thinking mode (saves 50%)
- Keep CLAUDE.md under 100 lines (saves 10k/request)
- Disable unused MCPs (saves 25k-75k/request)
- Be specific in prompts (saves back-and-forth)
- Start new chats frequently (avoid compression)
Before starting:
- CLAUDE.md is minimal (<100 lines)
- Only necessary MCPs enabled
- Thinking mode off by default
- Clear, specific task defined
During session:
- Using images for errors
- Being specific in requests
- Not letting context grow too large
- Using agents for subtasks
After session:
- Disable MCPs you won't need next time
- Clean up any auto-generated docs
Bottom line: Be intentional. Every token costs money. Most people burn 2-3x more tokens than necessary through poor habits.
Use this guide. Save tokens. Build more.