Idea for implementing more token efficient editing #131
Replies: 5 comments 12 replies
---
Hi @chknd1nner, really appreciate your input, thanks for it! Quick analysis of the current status quo and our work on editing:

We are currently working on several large improvements to editing, which will be merged within the next few days. The main improvements will come from easier identification of symbols in editing tools (using qualified names instead of locations). Your approach is a welcome extension that we have also been discussing. We generally want to get rid of the ReplaceLinesTool; it is too fragile, and LLMs are too dumb to use it properly. Claude Code and other similar systems often use string-matching-based replacement, giving the entire old and new string. This is much less fragile but very token-consuming. Symbol-based replacement doesn't need the old symbol, so it's already better in terms of consumption. We need to see how well it works. We are moving fast towards first quantitative evaluations, where we will use subsets of SWE-verified. Once these are set up, we can experimentally see which editing approaches really work best.

Your approach: the editing you propose still has a place even if symbol-based editing starts working perfectly, as it could help us get rid of the ReplaceLinesTool while still permitting efficient, surgical operations. So I think it should be implemented and tested. If you want to do that, feel free to go ahead! I'd suggest you write a new editing tool, disable the ReplaceLinesTool, and experiment manually to see whether you observe improvements. Otherwise, we will likely implement something along these lines ourselves soon. In the implementation you need to make sure that there is exactly one match of the surrounding context, and that all whitespace and the like is written correctly.

On experimenting: a very simple way for you to test things out would be to write a new mode file in
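The string-matching replacement mentioned above (which must find exactly one occurrence of the old text to be safe) can be sketched as follows. This is a simplified illustration, not Serena's or Claude Code's actual implementation:

```python
def replace_unique(text: str, old: str, new: str) -> str:
    """Replace old with new, but only if old occurs exactly once."""
    count = text.count(old)
    if count != 1:  # ambiguous or missing match: refuse to edit
        raise ValueError(f"expected exactly one occurrence, found {count}")
    return text.replace(old, new)
```

Note that the caller must supply both the full old and new strings, which is what makes this approach token-consuming compared to symbol-based replacement.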
---
What I meant is that symbol-body replacement is often better than what Claude Code is doing because, even though it replaces the whole symbol, it doesn't need to also output the old code. So for larger operations it's definitely better. But for performing smaller edits within a large symbol it's not suited, I agree. Our hope was that the LLM would be good at using replace-lines for that, but it doesn't work very well. That's precisely why I think your approach is promising.

For larger edits or refactorings, symbol replacement will still be the tool of choice. But for smaller edits, I think something like your proposal will be best. One thing you could consider is allowing a symbol name to be passed to your tool, to restrict the edits to a single symbol body.

How would you prefer to proceed: do you want to prepare a PR and continue from there? Otherwise I'll take a look at your fork and check how to incorporate the new editing tool. Btw, the branch `better-symbol-editing` is essentially ready and will likely be merged today.
---
Got it, that makes sense. After my last experience with a PR, I think I should let you examine the code in the feature branch on my fork first. You're the expert on your own project lol. It's not ready for a PR; I just noticed Claude left a lot of sloppy comments in the new class from stages 0-3 of the implementation lol.

Anyway, I can't do any work for now, as your last 41 commits broke my Serena! I restarted Claude and was greeted by a weird log window that I'd never seen before. The config YAML template format is now totally different. As a result, my Serena MCP server is completely broken until I can unravel what's wrong, which I don't have time for today.

I'd prefer not to add a restriction on symbol, as that would defeat the purpose of the tool, which is to make small line edits across a whole file with token efficiency. The way I was working:
I take your point about major refactoring. My workflow above is definitely suited to (and more efficient for) small, disparate edits as opposed to massive changes; for those, I think the replace_symbol_body tool would be better.
---
@chknd1nner, you would think so, but it turns out models (particularly Sonnet 4) are not smart enough to understand the concepts behind more "exotic" replace tools.
---
Closing this, since we recently implemented various measures to improve editing, including a regex-based tool. Further discussion should happen in a separate thread.
---
I've been working on this with Claude for a while. I had a whole PRD and technical requirements document written up, with code listings for the implementation all ready to go.
But after you showed how much better you understand the Serena codebase than I do, I'll just give you a summary of the idea I came up with and let you run with it if you see merit in the concept.
Let me know if you want to see the full documentation I had prepared; I made the validation method particularly comprehensive.
The Problem
Current LLM coding agents are token-hungry monsters when editing code. To change a single line in a 200-line class, they must emit the entire symbol body again as the replacement.
Result: 6,000 tokens for a 1-line change 💸
The Solution
Chunk-based differential editing - send only what changed, not the entire symbol.
Instead of full replacement:
Use targeted chunks:
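The contrast might look like this (a hypothetical sketch of the chunk format; all field names are illustrative, not from the actual proposal):

```python
# Full replacement: the agent must emit the entire 200-line symbol again.
# Chunk-based: it emits only the changed lines plus a few lines of context.
# All field names below are illustrative, not from the actual proposal.
chunk = {
    "context_before": ["    def total(self) -> float:"],  # anchor: must match exactly once
    "old_lines": ["        return sum(self.items)"],       # exact lines being replaced
    "new_lines": ["        return sum(self.items) * (1 + self.tax_rate)"],
}
# Tokens sent scale with the size of the edit, not the size of the symbol.
```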
Why It's Brilliant
- Context-based positioning: plays to LLM strengths in pattern matching and semantic understanding, and avoids the line counting required by traditional unified diff formats
- 80-95% token reduction in typical scenarios
- LLM-friendly format designed specifically for AI code generation
Atomic Validation - The Secret Sauce
The real innovation is comprehensive validation before any changes:
Phase 1: Validate ALL chunks before touching anything
- Context matching (flexible whitespace handling)
- Old-lines verification (exact content matching)
- Symbol boundary checks (no edits outside the symbol range)
- Cross-platform line-ending consistency (detects and preserves CRLF/LF/CR)
- End-of-symbol context validation
- Insertion/deletion logic verification
Phase 2: Apply all changes atomically
- Only executes if ALL chunks pass validation
- Single write operation with preserved line endings
- All-or-nothing guarantee: no partial, corrupted edits
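The validate-then-apply flow described above could be sketched like this (a minimal illustration under the hypothetical chunk format with `context_before`, `old_lines`, and `new_lines` fields; not the proposal's actual code, and it checks only match uniqueness, not every validation listed above):

```python
def apply_chunks(lines, chunks):
    """Phase 1: validate every chunk; Phase 2: apply only if all passed."""
    plan = []
    for chunk in chunks:
        ctx, old = chunk["context_before"], chunk["old_lines"]
        pattern = ctx + old
        matches = [
            i for i in range(len(lines) - len(pattern) + 1)
            if lines[i:i + len(pattern)] == pattern
        ]
        if len(matches) != 1:  # must match exactly once, else reject everything
            raise ValueError(f"chunk matched {len(matches)} times, expected 1")
        start = matches[0] + len(ctx)
        plan.append((start, len(old), chunk["new_lines"]))
    # Phase 2: apply from bottom to top so earlier offsets stay valid
    for start, n_old, new in sorted(plan, reverse=True):
        lines[start:start + n_old] = new
    return lines
```

Because validation raises before any slice assignment happens, a failing chunk leaves the file untouched, which is the all-or-nothing guarantee.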
The line-ending detection is particularly robust: it handles Windows CRLF, Unix LF, and legacy Mac CR formats, preventing the "massive diff due to line-ending changes" problem that breaks git workflows.
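Detecting the dominant line-ending style before writing can be sketched like this (a minimal sketch, not the proposal's actual code):

```python
def detect_line_ending(text: str) -> str:
    """Return the most common line ending in text, defaulting to LF."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf   # bare LFs, excluding the LF inside CRLF
    cr = text.count("\r") - crlf   # bare CRs (legacy Mac), excluding CRLF
    counts = {"\r\n": crlf, "\n": lf, "\r": cr}
    return max(counts, key=counts.get) if any(counts.values()) else "\n"
```

Normalizing to `\n` internally and rejoining with the detected ending on write keeps the diff limited to the lines that were actually edited.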
The Impact
- Longer coding sessions without hitting token limits
- Dramatically lower API costs
- Ability to work with larger codebases
- Zero risk of corrupted partial edits, thanks to atomic validation
- Cross-platform compatibility without line-ending chaos