Conversation
all-hands-bot left a comment
Review Summary
This PR introduces a solid foundation for persistent LLM profiles. The implementation is generally well-structured with good test coverage. However, there are several issues that should be addressed:
Critical Issues:
- Profile files committed to repository root should be relocated
- Missing error handling in lazy loading logic could cause incorrect exceptions
Important Issues:
- Deprecated property still exposes mutable internal state
- Incomplete cache invalidation
- Backward compatibility concerns with list_usage_ids()
See inline comments for details.
```python
    removed_in="1.7.0",
    details="The method is deprecated and will be removed in next versions",
)
def usage_to_llm(self) -> dict[str, LLM]:
```
🟠 Important: Deprecated property returns mutable internal state.
The deprecation note says "a copy or proxy map should be returned" but the implementation still returns the internal dict directly. This allows callers to mutate _usage_to_llm.
Also, the deprecation message says "The method is deprecated" but this is a property, not a method.
Suggested change:
```python
@deprecated(
    deprecated_in="1.6.0",
    removed_in="1.7.0",
    details="This property is deprecated and will be removed in future versions. Use list_usage_ids() and get() instead.",
)
def usage_to_llm(self) -> dict[str, LLM]:
    """Access a copy of the internal usage-ID-to-LLM mapping."""
    return self._usage_to_llm.copy()
```
For clarity, usage_id is an id used for the LLM when computing conversation stats, e.g. we wanted to be able to say "the agent LLM used 120k tokens total", "the condenser LLM used 45k tokens total".
I think they're not profile ids 🤔 WDYT?
My point was that allowing an external caller to change an internal state breaks the API contract.
Could you point me to an example where this property is used? I searched the repo and it seems it's never referenced.
If you still want to keep this property, I suggest returning:
- a copy of the dictionary (ok but not ideal)
- a `MappingProxyType` (better solution; see the sketch below)
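A minimal sketch of the `MappingProxyType` option, assuming the property lives on `LLMRegistry` and wraps `_usage_to_llm` as discussed above:

```python
from collections.abc import Mapping
from types import MappingProxyType

@property
def usage_to_llm(self) -> Mapping[str, LLM]:
    """Read-only, live view of the internal usage-ID-to-LLM mapping."""
    # Callers can read but not mutate; later changes to _usage_to_llm still show through.
    return MappingProxyType(self._usage_to_llm)
```

Unlike `.copy()`, the proxy stays in sync with later registry changes while still blocking external mutation.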
Ah ok, I don't think we care about the property, if it's not used let's kill it!
```python
    llm_instance = self._usage_to_llm[usage_id]
elif self.profile_manager and usage_id in self.profile_manager.get_id_usages():
    profile_name = next(
        p.name
        for p in self.profile_manager.config.profiles
        if p.usage_id == usage_id
    )
    llm_instance = self.profile_manager.load_profile(profile_name)
```
🔴 Critical: Missing error handling for lazy loading.
If no profile matches the usage_id, this next() call will raise StopIteration instead of the intended KeyError. The in check above should prevent this, but it's better to be defensive.
Consider using:
```python
profile_name = next(
    (p.name for p in self.profile_manager.config.profiles if p.usage_id == usage_id),
    None
)
if profile_name:
    llm_instance = self.profile_manager.load_profile(profile_name)
    self._usage_to_llm[usage_id] = llm_instance
```

```python
    if self.profile_manager is not None
    else set()
)
```
🟠 Important: Backward compatibility concern.
This method now returns both in-memory AND on-disk usage IDs. This is a behavior change that could break existing code expecting only in-memory profiles. Consider:
- Documenting this change clearly in the docstring
- Mentioning it in the PR description as a breaking change
- Adding a parameter to control this behavior if needed for compatibility
I see that there is a corner case where this can happen. However, the user can set `load_from_disk = False` and the code will return just in-memory profiles.
Note that the default for `load_from_disk` is False.
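If I read this right, the gating boils down to whether a `profile_manager` exists at all; a rough sketch (attribute and method names taken from the diff above, everything else is an assumption):

```python
def list_usage_ids(self) -> list[str]:
    # When load_from_disk is False (the default), no profile_manager is created,
    # so only the in-memory usage IDs are returned -- the old behavior.
    on_disk = (
        self.profile_manager.get_id_usages()
        if self.profile_manager is not None
        else set()
    )
    return sorted(set(self._usage_to_llm) | on_disk)
```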
```python
@cache
def get_default_profile(self) -> LLM | None:
    """Return the default LLM profile."""
    default_profile_name = self.config.default_profile
    if default_profile_name is None:
        return None
    return self.load_profile(default_profile_name)
```
🟡 Suggestion: Cache invalidation is incomplete.
The @cache decorator caches the default profile, but the cache is only cleared in set_default_profile(). It should also be cleared when:
- The default profile is deleted (currently forbidden, but good for future-proofing)
- The default profile is updated via `save_profile()`
Consider clearing the cache in save_profile() and delete_profile() if they affect the default profile.
While delete_profile() doesn't require cache eviction, it’s best to defensively clear the cache in save_profile().
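Something along these lines, for example (a sketch only; the helper `_write_profile` and the exact `save_profile` signature are assumptions):

```python
def save_profile(self, name: str, llm: LLM) -> Path:
    path = self._write_profile(name, llm)  # hypothetical write helper
    # functools.cache exposes cache_clear(); evict defensively in case the
    # saved profile is (or later becomes) the default one.
    self.get_default_profile.cache_clear()
    return path
```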
```
Note that the profile manager also uses a configuration file containing metadata
about the profiles. This configuration file is primarily used to lazily load
profiles from disk.
```
🟠 Important: Concurrency not addressed.
The PR description mentions: "concurrency issues could arise" when multiple ProfileManager instances work in the same directory. While you note this is a corner case, consider:
- Adding a clear warning in the docstring
- At minimum, implementing file-based locking for config writes
- Or documenting that this is explicitly NOT thread-safe
This is especially important for CLI usage where users might run multiple commands concurrently.
Suggested docstring addition:
**Thread Safety**: This class is NOT thread-safe. Multiple ProfileManager
instances operating on the same directory concurrently may cause race conditions
when reading/writing the config file. Ensure only one instance per directory
is active at a time.
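If we ever go the locking route, a rough POSIX-only sketch (illustrative; `fcntl` is unavailable on Windows, and the function name is made up):

```python
import fcntl
import json
from pathlib import Path

def write_config_locked(config_path: Path, payload: dict) -> None:
    """Serialize config writes across processes with an exclusive advisory lock."""
    lock_path = config_path.with_suffix(".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is acquired
        try:
            config_path.write_text(json.dumps(payload, indent=2))
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```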
```python
        self.config_path.write_text(json.dumps(config.model_dump(), indent=2))

    def _load_config(self) -> ProfilesConfig:
        return ProfilesConfig.from_json(self.config_path)
```
I wonder if a config file is necessary? We could maybe just load the profiles from their directory.
Yes, this could be an option, but I think this depends on what we want to solve and which use case this class wants to address.
Here's what I'm thinking:
- we have generally avoided making config files in the SDK
- we do save/load things (conversations, events, hooks I think), though enabled by user choice - I believe saving/loading LLM profiles is fine
- but I think maybe a config file could be left to applications, not in the SDK.
I'd be happy to be proven wrong though. There's an argument that doing it here would assure consistency between applications (CLI and web apps), but I don't know if this is the time to change that... WDYT?
| "enable_encrypted_reasoning": true, | ||
| "prompt_cache_retention": "24h", | ||
| "extended_thinking_budget": 200000, | ||
| "usage_id": "gpt_agent", |
I think this could be condenser or any other LLM
```python
code_agent = Agent(llm=code_llm, tools=[Tool(name=TerminalTool.name)])  # noqa: F841

summary_llm = registry.get("gpt_agent")
summary_agent = Agent(llm=summary_llm, tools=[Tool(name=TerminalTool.name)])  # noqa: F841
```
I see, I think this is confusing in the codebase, but it seems to me that usage_ids are orthogonal to profiles. It's a different feature, and if we merge them into one I'm not sure we can achieve both
- stats per "usage" (e.g. total stats for the agent, no matter what LLMs it used)
- the user can choose any profile for agent
Did that make sense?
Yes, I had some thoughts about that too.
The problem is that there is no 1-to-1 correspondence between usage_ids and profiles, i.e., you could have one profile with multiple usage ids.
For that reason the usage_id in the ProfileMetadata is optional. On the other hand, the profiles config asks for a list of profiles metadata with different names as the name is used to retrieve the correct profile.
@OpenHands please do a /codereview-roasted and look up ALL the usage of usage_ids (including all LLM-specific IDs), then zoom out a bit and understand the purpose of this PR. Do you think if the current implementation in this PR is a good idea? If not, what do you think would be the best approach?
Currently it seems with this PR, we will have LLMRegistry -> LLMProfile -> LLM and each of this concept have their own notion of "IDs". Any way to simplify this hierarchy? I imagine a profile is just a thin wrapper around an LLM (with a user-specified name), and LLMRegistry should just be a collection of LLM | LLMProfile - do you think if this is a good idea?
I'm on it! xingyaoww can track my progress at all-hands.dev
Code Review Summary: PR #1882 "feat: Persistent LLM profiles"
Taste Rating: 🔴 Needs Improvement
I performed a comprehensive code review of the PR and analyzed all usages of usage_id, profile_id, and related IDs across the codebase. Here's my assessment:
The ID Hierarchy Problem
The PR introduces a confusing hierarchy with multiple overlapping ID concepts:
| ID | Location | Purpose |
|---|---|---|
| `usage_id` | `LLM` | Role identification (agent, condenser, etc.) |
| `profile_id` | `LLM` (new) | Persisted configuration name |
| `registry_id` | `LLMRegistry` | Instance UUID for logging |
The core issue: These concepts are orthogonal but the PR conflates them. Specifically, ensure_default_profile() uses usage_id as the default profile_id, creating false coupling.
Critical Issues
- `ensure_default_profile()` conflates `usage_id` with `profile_id` - Multiple usages can share one profile (agent and condenser both using gpt-5-mini), and one usage can switch between profiles (see the snippet below).
- `profile_ref` boolean flag is a code smell - If you need a flag to track how an object was created, your serialization logic is too complex.
- Manual discriminated union in the `LLM` validator - The `kind: profile_ref` pattern is implemented manually, creating tight coupling between `LLM` and `LLMRegistry`.
- `ConversationState` breaks encapsulation - It manually patches serialized payloads after `model_dump()`.
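To make the first point concrete, a hypothetical snippet (assumes `LLM` is a Pydantic model carrying `usage_id`, and `load_profile()` as in the PR):

```python
# One persisted profile, two usages -- so usage_id cannot double as profile_id.
profile = profile_manager.load_profile("gpt-5-mini")
agent_llm = profile.model_copy(update={"usage_id": "agent"})
condenser_llm = profile.model_copy(update={"usage_id": "condenser"})
```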
Recommended Simplification
Your intuition is correct: a profile should be a thin wrapper around an LLM. I'd go further:
A profile IS an LLM configuration stored on disk. Period.
```python
# Proposed simpler API:
class LLMProfileStore:
    """Standalone utility for persisting LLM configurations."""

    def save(self, name: str, llm: LLM, include_secrets: bool = False) -> Path: ...
    def load(self, name: str) -> LLM: ...
    def list(self) -> list[str]: ...


# Registry stays simple - just in-memory by usage_id:
registry = LLMRegistry()
registry.add(llm)  # keyed by llm.usage_id
```

This keeps:
- `LLM` simple (no `profile_ref` flag, no `kind` discriminator)
- `LLMRegistry` simple (just in-memory lookup by `usage_id`)
- Profile persistence explicit and separate
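Hypothetical usage of that split (same assumed names as the sketch above):

```python
store = LLMProfileStore()                # profile persistence: explicit and separate
store.save("gpt-5-mini", llm)            # write the LLM configuration to disk

registry = LLMRegistry()                 # stays a plain in-memory lookup
registry.add(store.load("gpt-5-mini"))   # load explicitly, registered under its usage_id
```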
Verdict
❌ Needs rework - The fundamental design conflates orthogonal concepts. The PR should:
- Keep `usage_id` and `profile_id` strictly separate - they serve different purposes
- Make profile persistence a standalone utility - not embedded in `LLMRegistry` or `LLM`
- Use explicit load/save rather than magic serialization - `store.load("gpt-5-mini")` is clearer than `kind: profile_ref` discriminators
Note: This was a review-only task. No code changes were made.
^ I kinda like this idea - @VascoSch92 @enyst wdyt?
```json
},
{
    "name": "gpt4",
    "file": "gpt4.json",
```
If we use the rule "profile_id.json is the filename", then maybe we don't need a config file?
Yes. The only open point is where we store the info about the default_profile, i.e., how do we know which one is the default profile?
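One hedged option that keeps the "`profile_id`.json is the filename" rule and still answers the default-profile question (file and function names are made up for illustration):

```python
from pathlib import Path

def list_profiles(profiles_dir: Path) -> list[str]:
    """Discover profiles purely from the directory contents."""
    return sorted(p.stem for p in profiles_dir.glob("*.json"))

def get_default_profile_name(profiles_dir: Path) -> str | None:
    """Read the default profile name from a tiny marker file, if present."""
    marker = profiles_dir / "default_profile"
    return marker.read_text().strip() if marker.exists() else None
```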
Hey @enyst, thank you very much for the feedback! Here are a couple of thoughts to help clarify the situation. I’d love to know what you think about the following:

Goal: We want a solution that allows the user to load or switch profiles on the fly.

Context: A solution proposed by @xingyaoww [here] suggested allowing the However, the

A possible solution: We could update the We can save the same LLM with different This would require updating the

Makes sense?
Thank you @VascoSch92 for thinking it through! To be honest, this can get complicated fast. I think what matters is:

where If we can do that, we're good IMHO!

Now, the same goes for other buckets of My current conclusion playing with it, is that it's simpler to:

Then, it seems to me, we have the question: how about remote?
```python
)


def main() -> None:
```
Let's break the main function into scripts directly and save a level of indentation.
@OpenHands please implement this change and /remember this preference in AGENTS.md
I'm on it! xingyaoww can track my progress at all-hands.dev
Summary
I've implemented the requested changes and pushed them to the agent-sdk-18-profile-manager branch:
Checklist
- Removed the `if __name__ == "__main__":` guard from `examples/01_standalone_sdk/36_llm_profiles.py` - the code now runs directly at module level, saving one level of indentation (consistent with other examples like `01_hello_world.py`)
- Documented the preference in AGENTS.md - added a new "Example Scripts" section under `<CODE>` that states: "Example scripts in `examples/` should run code directly at module level without wrapping in `if __name__ == "__main__":` guards. This saves a level of indentation and keeps examples concise."
- Pre-commit checks passed - all linting, formatting, and type checks passed
- Changes pushed to the PR branch
The changes are concise and focused only on what was requested. The PR has been updated with commit 572e7196.
Summary
The PR proposes a solution for issue #781.
The class `ProfileManager` is introduced, whose sole job is to save, store, and update profiles on disk. Every time we work with persistent profiles, it should be done through this API to ensure a consistent and reliable way to store, load, and update profiles.

The class can then be used to implement a profile selection solution for the Openhands-CLI and in `LLMRegistry` to save or load registries coming from different experiments or runs.

Changed Scripts
manager.py

The script contains the `ProfileManager` class. Note that it assumes a single instance per profile directory: if multiple `ProfileManager` instances work in the same directory (e.g., Openhands-CLI), concurrency issues could arise. Since I assume these are mostly corner cases, I haven't implemented any countermeasures yet. Does that make sense?

config.py
The script contains `Pydantic` models used to validate the configuration file associated with a profile store. These models are backward compatible.

llm_registry.py
Several modifications were made to the `LLMRegistry` class; however, the API of the class did not change, ensuring backward compatibility. In short:

Two new parameters were added: `load_from_disk` and `profile_manager_path`. `load_from_disk` is a boolean flag that determines whether to use profiles persisted on disk. If `load_from_disk` is True, `profile_manager_path` specifies the location of the profiles. Two variables were chosen instead of one because of the default path for the profile manager.
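As a rough illustration of the intended usage (the exact constructor signature and the default path shown here are assumptions on my side):

```python
from pathlib import Path

registry = LLMRegistry(
    load_from_disk=True,
    profile_manager_path=Path("~/.openhands/profiles").expanduser(),  # path is made up
)
llm = registry.get("agent")  # lazily loads the matching profile from disk if needed
```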
The property `usage_to_llm` is deprecated. Currently, it returns the internal state `_usage_to_llm` rather than a copy, which is not ideal. If access is needed, a copy or proxy map should be returned. The property was used only in a few tests, and since we do not want to change the API, it was simply deprecated.

The `get` method now lazily loads profiles from the store when necessary.

A new method, `export_profiles`, was added to save the registry. This is very helpful when experimenting or sharing LLM profiles between coworkers.

Checklist