Rodrigotari1/optimize-claude

Claude Code Token Optimization Guide

Stop burning tokens. Use this checklist before every session.

Before Starting a Session

1. Set Clear Rules Upfront

Start every session with:

Build X with these constraints:
- Respond concisely, code only
- No markdown files unless I ask
- No summaries or documentation
- Small commits as you go
- Turn off thinking mode for simple tasks

2. Keep CLAUDE.md Minimal

Bad: 500+ line CLAUDE.md with every possible rule

Good: <100 lines with only critical project-specific rules

Every line in CLAUDE.md is sent with EVERY request.
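
To make the per-request cost concrete, here is a rough back-of-the-envelope sketch. The figure of about 20 tokens per line is an assumption derived from this guide's own estimate (a 500-line CLAUDE.md adding roughly 10k tokens per request); real files will vary, and `claudeMdOverhead` is a hypothetical helper, not part of any tool.

```typescript
// Assumption: ~20 tokens per line, derived from the guide's estimate that
// a 500-line CLAUDE.md adds about 10k tokens to every request.
const TOKENS_PER_LINE = 10000 / 500;

// Estimated tokens spent on CLAUDE.md alone across a number of requests.
function claudeMdOverhead(lineCount: number, requests: number): number {
  return lineCount * TOKENS_PER_LINE * requests;
}

// Hypothetical 50-request session with a bloated file versus a trimmed one.
const bloated = claudeMdOverhead(500, 50);
const trimmed = claudeMdOverhead(100, 50);
console.log(`Estimated tokens saved over 50 requests: ${bloated - trimmed}`);
```

Under these assumptions the overhead scales linearly with both file length and request count, which is why trimming the file pays off most in long sessions.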

3. Disable Unnecessary MCP Servers

Each enabled MCP server adds its tool definitions to the context, so it costs tokens even when you never call it:

  • Supabase + GitHub + Chrome = ~75k tokens per request
  • Only enable MCPs you're actively using
  • Disable them when done

4. Turn Off Thinking Mode by Default

Use thinking mode for:

  • Complex architecture decisions
  • Hard debugging problems
  • Performance optimization
  • Unclear requirements

Don't use thinking mode for:

  • Writing boilerplate
  • Simple refactors
  • Documentation
  • Bug fixes with clear solutions
  • Implementing well-defined features

Thinking mode roughly doubles the token cost of a task.

During the Session

5. Never Let Claude Compress Context

When the context window fills up, Claude offers to "compress" (compact) the conversation.

Always say NO. Start a new chat instead.

Compression burns massive tokens and loses quality.

6. Use Images Instead of Text

Bad:

Here's the error:
[paste 100 lines of stack trace]

Good: Screenshot the error, drag into chat.

A cropped screenshot usually costs fewer tokens than a long text paste.
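
The comparison can be sketched numerically. This is an estimate under stated assumptions, not an exact accounting: the image formula (width × height / 750) is the rough approximation given in Anthropic's vision documentation, the 4-characters-per-token figure is a common rule of thumb for English text, and the stack trace and screenshot sizes are hypothetical.

```typescript
// Assumption: rough estimate from Anthropic's vision docs,
// tokens ≈ (width px * height px) / 750.
function estimateImageTokens(widthPx: number, heightPx: number): number {
  return Math.ceil((widthPx * heightPx) / 750);
}

// Assumption: ~4 characters per token, a rule of thumb for English text.
function estimateTextTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical 100-line stack trace with 120 characters per line.
const stackTrace = Array.from({ length: 100 }, () => "x".repeat(120)).join("\n");

const textCost = estimateTextTokens(stackTrace);
const imageCost = estimateImageTokens(1000, 600); // cropped screenshot
console.log({ textCost, imageCost });
```

Under the same formula, a full-screen capture around 1568 × 1568 pixels would come to over 3,000 tokens, so the saving depends on cropping the screenshot to the relevant lines.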

7. Be Specific in Prompts

Bad (wastes tokens on back-and-forth):

"Add the missing features"

Good (one shot):

"Add timeout to runner.ts using Promise.race. Accept timeoutMs param, reject after timeout."

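For illustration, a prompt this specific leaves little room for interpretation. The result might look something like the sketch below; `withTimeout` is a hypothetical name, and the actual contents of runner.ts are not shown in this guide.

```typescript
// Hypothetical helper: rejects if `task` does not settle within `timeoutMs`.
function withTimeout<T>(task: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // A promise that never resolves and rejects once the timeout elapses.
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });

  // Whichever promise settles first wins; always clear the timer afterwards
  // so a finished task does not keep the process alive.
  return Promise.race([task, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Note that `Promise.race` does not cancel the losing task; it only stops waiting for it.
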
8. Use Agents for Subtasks

When Claude creates agents (subagents), each one runs in its own context window, so its work does not fill up the main conversation.

Let it create agents for:

  • Code review
  • Documentation generation
  • Test writing
  • Refactoring

9. Prevent Auto-Documentation

Add to CLAUDE.md or say upfront:

Do not create:
- README files
- Documentation
- Summary files
- .md files

Unless I explicitly ask.

10. Don't Overuse Opus

Opus costs significantly more than Sonnet.

Use Opus for:

  • Complex system design
  • Hard architectural decisions
  • Critical debugging

Use Sonnet for:

  • Feature implementation
  • Refactoring
  • Bug fixes
  • Most coding tasks

11. Start New Chats Frequently

Don't let context grow to 200k tokens.

Start a new chat when:

  • Switching to a different feature
  • Context feels bloated
  • Claude starts repeating itself
  • You've been in the same chat for more than 30 minutes
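
The signals above can be written down as a simple check. This is only an illustrative sketch: the 200k-token and 30-minute figures come from this section, while the 75% safety margin, the `SessionState` shape, and the function name are assumptions made for the example.

```typescript
// Hypothetical snapshot of the current chat.
interface SessionState {
  contextTokens: number;    // approximate tokens currently in context
  minutesInChat: number;    // time spent in this chat
  switchedFeature: boolean; // moving on to a different feature
  repeatingItself: boolean; // Claude has started repeating itself
}

const CONTEXT_LIMIT = 200000; // from the guide
const MAX_MINUTES = 30;       // from the guide
const SAFETY_FRACTION = 0.75; // assumption: restart well before the limit

// Returns true if any of the "start a new chat" signals applies.
function shouldStartNewChat(s: SessionState): boolean {
  return (
    s.switchedFeature ||
    s.repeatingItself ||
    s.minutesInChat > MAX_MINUTES ||
    s.contextTokens >= CONTEXT_LIMIT * SAFETY_FRACTION
  );
}
```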

12. Avoid Reasoning MCPs for Simple Tasks

Reasoning MCPs such as sequential thinking or code reasoner add extra tokens to every task.

Only use for genuinely complex problems.

Prompt Templates

Efficient Feature Request

Add [feature] to [file]:
- [specific requirement 1]
- [specific requirement 2]
- [specific requirement 3]

Keep response concise. Code only, no explanations unless I ask.

Efficient Debugging

Bug: [one sentence description]
File: [filename]
Expected: [behavior]
Actual: [behavior]

[screenshot or minimal code snippet]

Fix it.

Efficient Refactor

Refactor [file/function]:
- [specific change 1]
- [specific change 2]

No explanations needed.

What NOT to Do

❌ Let context compress
❌ Use thinking mode for everything
❌ Keep unused MCP servers enabled
❌ Let Claude write documentation unprompted
❌ Use vague prompts that require back-and-forth
❌ Paste huge error logs (use images)
❌ Stay in same chat for hours
❌ Use Opus for simple tasks
❌ Have 500+ line CLAUDE.md files

Token Cost Examples

| Action | Approximate Cost |
|---|---|
| Simple feature with thinking | 50-100k tokens |
| Simple feature without thinking | 20-40k tokens |
| Documentation generation | 30-50k tokens |
| Context compression | 100-200k tokens |
| Long CLAUDE.md (500 lines) | +10k per request |
| 3 unused MCP servers | +75k per request |
| Using Opus vs Sonnet | 3-5x cost |
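
As a worked example, the figures in the table can be turned into savings estimates. The numbers below are copied from the table; the 20-request session length and the helper name `savingsFraction` are assumptions for illustration only.

```typescript
// Figures copied from the table above (tokens).
const withThinking = { low: 50000, high: 100000 };
const withoutThinking = { low: 20000, high: 40000 };
const unusedMcpOverheadPerRequest = 75000; // 3 unused MCP servers

// Fraction of tokens saved when going from `before` to `after`.
function savingsFraction(before: number, after: number): number {
  return (before - after) / before;
}

// Savings from turning thinking off, at both ends of the quoted ranges.
const thinkingSavingsLow = savingsFraction(withThinking.low, withoutThinking.low);
const thinkingSavingsHigh = savingsFraction(withThinking.high, withoutThinking.high);

// Tokens spent on unused MCP servers over a hypothetical 20-request session.
const mcpWaste = unusedMcpOverheadPerRequest * 20;

console.log({ thinkingSavingsLow, thinkingSavingsHigh, mcpWaste });
```

With the table's ranges, turning thinking off saves about 60% at both ends, and three unused MCP servers add up to 1.5 million tokens over 20 requests.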

Quick Wins

  1. Turn off thinking mode (saves 50%)
  2. Keep CLAUDE.md under 100 lines (saves 10k/request)
  3. Disable unused MCPs (saves 25k-75k/request)
  4. Be specific in prompts (saves back-and-forth)
  5. Start new chats frequently (avoid compression)

Session Checklist

Before starting:

  • CLAUDE.md is minimal (<100 lines)
  • Only necessary MCPs enabled
  • Thinking mode off by default
  • Clear, specific task defined

During session:

  • Using images for errors
  • Being specific in requests
  • Not letting context grow too large
  • Using agents for subtasks

After session:

  • Disable MCPs you won't need next time
  • Clean up any auto-generated docs

Bottom line: Be intentional. Every token costs money. Most people burn 2-3x more tokens than necessary through poor habits.

Use this guide. Save tokens. Build more.
