-
Notifications
You must be signed in to change notification settings - Fork 17
#2487 Saving the msg before run the pipelines #2602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…rn 2 pages for 23 msg returning error when highlight_message_id is defined
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! 🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughThis pull request adds message URL generation capabilities to chat messages and threads message identity (ID and URL) through the pipeline system to enable message reuse and proper tracking. It modifies Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/experiments/views/experiment.py (1)
911-916: Add test coverage and consider refining the page calculation to account for orphans.The
min(page, paginator.num_pages)approach safely preventsEmptyPageexceptions on the initial page calculation, which is the immediate issue. However, the page calculation at lines 911-913 doesn't account for theorphans=3parameter, which causes Django's Paginator to merge small remainder pages and reduce the total page count.With
page_size=10andorphans=3: 23 messages produce 2 pages (10 + 13), but the naive calculation(messages_before // page_size) + 1would compute 3 pages. Themin()prevents a crash by capping to the last available page, but the highlighted message may not actually be visible if it's on an earlier page.Consider:
- Adding test coverage for the highlight behavior with various message counts
- Refining the calculation to properly account for orphans, or validating that the capped page contains the highlighted message
- This change currently has no test coverage.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
apps/chat/bots.pyapps/chat/models.pyapps/chat/tests/test_chat.pyapps/experiments/views/experiment.pyapps/pipelines/nodes/base.pyapps/pipelines/nodes/history_middleware.pyapps/pipelines/nodes/llm_node.pyapps/pipelines/nodes/nodes.pyapps/pipelines/tests/test_nodes.pyapps/pipelines/tests/test_pipeline_history.pyapps/pipelines/tests/test_template_node.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use Ruff for Python linting and formatting with line length of 120 characters
Files:
apps/chat/tests/test_chat.pyapps/pipelines/nodes/base.pyapps/pipelines/nodes/history_middleware.pyapps/pipelines/tests/test_template_node.pyapps/pipelines/tests/test_nodes.pyapps/chat/bots.pyapps/experiments/views/experiment.pyapps/pipelines/nodes/llm_node.pyapps/chat/models.pyapps/pipelines/nodes/nodes.pyapps/pipelines/tests/test_pipeline_history.py
⚙️ CodeRabbit configuration file
**/*.py: Do not review for Python compatibility below 3.13
Flag heavy AI/ML imports at module level (langchain_*, openai, anthropic, google, boto3, pandas, numpy). These should be lazy-loaded inside methods to keep Django startup fast.
Files:
apps/chat/tests/test_chat.pyapps/pipelines/nodes/base.pyapps/pipelines/nodes/history_middleware.pyapps/pipelines/tests/test_template_node.pyapps/pipelines/tests/test_nodes.pyapps/chat/bots.pyapps/experiments/views/experiment.pyapps/pipelines/nodes/llm_node.pyapps/chat/models.pyapps/pipelines/nodes/nodes.pyapps/pipelines/tests/test_pipeline_history.py
**/{apps,config}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/{apps,config}/**/*.py: Use lazy imports for heavy AI/ML libraries (langchain_google_vertexai, langchain_google_genai, etc.) to maintain fast Django startup time
UseTYPE_CHECKINGimports for type hints when lazy-loading heavy libraries
Files:
apps/chat/tests/test_chat.pyapps/pipelines/nodes/base.pyapps/pipelines/nodes/history_middleware.pyapps/pipelines/tests/test_template_node.pyapps/pipelines/tests/test_nodes.pyapps/chat/bots.pyapps/experiments/views/experiment.pyapps/pipelines/nodes/llm_node.pyapps/chat/models.pyapps/pipelines/nodes/nodes.pyapps/pipelines/tests/test_pipeline_history.py
**/tests/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/tests/*.py: Organize tests intests/directories with separate files fortest_models.py,test_views.py, andtest_api.py
Check for existing fixtures inconftest.pyfiles and use existing factories inapps/utils/factories/before creating new test data generation
Mock external services (LLM providers, etc.) in tests rather than making live API calls
Files:
apps/chat/tests/test_chat.pyapps/pipelines/tests/test_template_node.pyapps/pipelines/tests/test_nodes.pyapps/pipelines/tests/test_pipeline_history.py
**/{apps,config}/**/models.py
📄 CodeRabbit inference engine (AGENTS.md)
**/{apps,config}/**/models.py: Inherit team-scoped models fromBaseTeamModelwhich automatically includes team FK and audit timestamps
UseVersionsMixinandVersionsObjectManagerMixinfor models requiring version tracking
Use@audit_fieldsdecorator withAuditingManagerfor tracking field changes on important models
Usepublic_id = models.UUIDField(default=uuid.uuid4, unique=True)for external API identifiers
UseBaseModelfor non-team-scoped models when team-scoping is not required
All data must be scoped to teams viaBaseTeamModelwith no cross-team data access without explicit permission
Files:
apps/chat/models.py
🧠 Learnings (10)
📚 Learning: 2024-10-18T07:02:17.007Z
Learnt from: SmittieC
Repo: dimagi/open-chat-studio PR: 735
File: apps/chat/bots.py:291-291
Timestamp: 2024-10-18T07:02:17.007Z
Learning: In `apps/chat/bots.py`, the `PipelineBot.process_input` method returns a `PipelineState` TypedDict where the 'ai_message_id' key always exists.
Applied to files:
apps/pipelines/nodes/base.pyapps/pipelines/tests/test_template_node.pyapps/chat/bots.pyapps/pipelines/nodes/llm_node.py
📚 Learning: 2024-10-18T07:00:25.261Z
Learnt from: SmittieC
Repo: dimagi/open-chat-studio PR: 735
File: apps/pipelines/models.py:183-190
Timestamp: 2024-10-18T07:00:25.261Z
Learning: In `apps/pipelines/models.py`, within the `Pipeline` model's `create_new_version` method, only the `pipeline` field needs to be updated for node versions using `bulk_update`, as per the design.
Applied to files:
apps/pipelines/nodes/base.pyapps/pipelines/tests/test_pipeline_history.py
📚 Learning: 2024-11-18T14:52:42.041Z
Learnt from: SmittieC
Repo: dimagi/open-chat-studio PR: 876
File: apps/service_providers/llm_service/state.py:181-187
Timestamp: 2024-11-18T14:52:42.041Z
Learning: In `apps/service_providers/llm_service/state.py`, the `ChatExperimentState` class is intentionally different from `ExperimentAssistantState`, and their method signatures for `save_message_to_history` may differ.
Applied to files:
apps/pipelines/nodes/history_middleware.pyapps/chat/bots.pyapps/pipelines/nodes/llm_node.pyapps/pipelines/tests/test_pipeline_history.py
📚 Learning: 2024-10-18T08:35:40.277Z
Learnt from: snopoke
Repo: dimagi/open-chat-studio PR: 740
File: apps/pipelines/models.py:252-262
Timestamp: 2024-10-18T08:35:40.277Z
Learning: In `apps/pipelines/models.py`, the `PipelineChatMessages.as_langchain_messages()` method intentionally returns messages in reverse order, so the sequence is correct as implemented.
Applied to files:
apps/pipelines/tests/test_nodes.pyapps/chat/models.py
📚 Learning: 2024-10-18T08:41:26.808Z
Learnt from: snopoke
Repo: dimagi/open-chat-studio PR: 740
File: apps/pipelines/models.py:252-252
Timestamp: 2024-10-18T08:41:26.808Z
Learning: In the `as_langchain_messages` method of `PipelineChatMessages`, the messages are returned in the reverse order `[AIMessage, HumanMessage, SystemMessage]` due to where this method is called. The `SystemMessage` represents the conversation summary and is only included if it exists.
Applied to files:
apps/chat/bots.pyapps/chat/models.py
📚 Learning: 2024-10-08T09:45:11.463Z
Learnt from: snopoke
Repo: dimagi/open-chat-studio PR: 698
File: apps/experiments/models.py:598-603
Timestamp: 2024-10-08T09:45:11.463Z
Learning: In the `get_version` method of the `Experiment` class in `apps/experiments/models.py`, minor improvements to error handling that only enhance error messages without adding significant value may not be necessary.
Applied to files:
apps/experiments/views/experiment.py
📚 Learning: 2024-10-18T08:35:37.836Z
Learnt from: snopoke
Repo: dimagi/open-chat-studio PR: 740
File: apps/pipelines/models.py:9-9
Timestamp: 2024-10-18T08:35:37.836Z
Learning: In `apps/pipelines/models.py`, both `langchain_core.messages` and `langchain.schema` are acceptable import paths for `AIMessage` and related classes.
Applied to files:
apps/chat/models.py
📚 Learning: 2024-10-18T06:48:49.750Z
Learnt from: SmittieC
Repo: dimagi/open-chat-studio PR: 735
File: apps/experiments/models.py:720-724
Timestamp: 2024-10-18T06:48:49.750Z
Learning: In the `apps/experiments/models.py` file, exception handling is not required in the `_copy_pipeline_to_new_version` method, and exceptions can be allowed to propagate.
Applied to files:
apps/pipelines/tests/test_pipeline_history.py
📚 Learning: 2024-12-04T07:25:35.489Z
Learnt from: SmittieC
Repo: dimagi/open-chat-studio PR: 950
File: apps/service_providers/tests/test_runnables_cancellation.py:91-98
Timestamp: 2024-12-04T07:25:35.489Z
Learning: In `apps/service_providers/llm_service/history_managers.py`, the `ExperimentHistoryManager.for_llm_chat` method intentionally accepts an `experiment` parameter separate from `session.experiment` because the experiment passed to the history manager may not always be the same as `session.experiment`.
Applied to files:
apps/pipelines/tests/test_pipeline_history.py
📚 Learning: 2025-12-19T12:29:01.450Z
Learnt from: CR
Repo: dimagi/open-chat-studio PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-19T12:29:01.450Z
Learning: Applies to **/tests/*.py : Mock external services (LLM providers, etc.) in tests rather than making live API calls
Applied to files:
apps/pipelines/tests/test_pipeline_history.py
🧬 Code graph analysis (6)
apps/pipelines/nodes/history_middleware.py (1)
apps/pipelines/nodes/nodes.py (1)
get_history(280-303)
apps/pipelines/tests/test_template_node.py (1)
apps/pipelines/nodes/nodes.py (1)
RenderTemplate(108-160)
apps/pipelines/tests/test_nodes.py (2)
apps/chat/models.py (1)
get_langchain_messages_until_marker(57-70)apps/pipelines/nodes/nodes.py (1)
get_history_mode(277-278)
apps/chat/bots.py (2)
apps/chat/models.py (2)
ChatMessageType(87-115)get_absolute_url(290-301)apps/service_providers/tracing/service.py (1)
set_input_message_id(268-270)
apps/chat/models.py (6)
apps/teams/utils.py (1)
get_slug_for_team(87-101)apps/web/templatetags/default_tags.py (1)
absolute_url(17-19)apps/pipelines/models.py (1)
get_absolute_url(175-176)apps/documents/models.py (1)
get_absolute_url(245-246)apps/trace/models.py (1)
get_absolute_url(53-54)apps/assistants/models.py (1)
get_absolute_url(83-84)
apps/pipelines/nodes/nodes.py (2)
apps/chat/models.py (1)
get_langchain_messages_until_marker(57-70)apps/pipelines/models.py (1)
get_langchain_messages_until_marker(584-590)
🪛 Ruff (0.14.10)
apps/pipelines/nodes/nodes.py
280-280: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Seer Code Review
🔇 Additional comments (15)
apps/pipelines/tests/test_nodes.py (1)
124-126: LGTM! Test correctly validates exclude_message_id parameter.The test appropriately verifies that
get_langchain_messages_until_markeris called withexclude_message_id=Nonefor the GLOBAL history type, matching the updated signature inapps/chat/models.py.apps/chat/models.py (4)
7-7: LGTM! Imports support the new URL generation feature.The added imports (
reverse,get_slug_for_team,absolute_url) are appropriate for implementing message URL generation functionality.Also applies to: 15-15, 18-18
57-57: LGTM! Backward-compatible signature extension.The addition of the optional
exclude_message_idparameter with a default value ofNonemaintains backward compatibility while enabling message exclusion functionality.
72-76: LGTM! Clean message exclusion logic.The exclusion logic correctly filters out the specified message when
exclude_message_idis provided, maintaining the existing iteration behavior otherwise.
226-226: Consider the implications of None in message_url.The
get_absolute_url()method can returnNonewhen the message hasn't been saved, lacks a chat/team, or isn't associated with an experiment session (lines 290-292). ThisNonevalue will be included in the LangChain dict'sadditional_kwargs.Verify that downstream consumers of this LangChain dict can handle
Nonevalues formessage_urlappropriately. The test updates inapps/chat/tests/test_chat.py(lines 49, 68) showNoneis expected for unsaved messages, which is correct.apps/pipelines/nodes/nodes.py (2)
131-132: LGTM! Template context correctly extended with message metadata.The addition of
input_message_idandinput_message_urlto the template rendering context enables templates to reference the originating message, which aligns with the PR objectives.
292-294: LGTM! Correctly forwards exclude_message_id to session history.The parameter is properly forwarded to
get_langchain_messages_until_marker, enabling message exclusion for global/session history types.apps/pipelines/tests/test_template_node.py (1)
35-35: LGTM! Test comprehensively validates input_message_url in templates.The test correctly:
- Passes
input_message_urlin thePipelineState- References it in the template string
- Asserts its presence in the rendered output
This validates the end-to-end flow of message URL propagation through the template rendering system.
Also applies to: 40-41, 48-49
apps/pipelines/nodes/history_middleware.py (1)
40-40: LGTM! Correctly excludes current input message from history.This change prevents duplicate messages by excluding the current input message (identified by
input_message_id) when reconstructing history in thebefore_agentphase. This aligns with the PR objective to avoid duplicate messages in the LLM node.apps/chat/tests/test_chat.py (1)
49-49: LGTM! Tests correctly expect None for unsaved messages.The test expectations for
"message_url": Noneare correct since:
- These messages are not saved to the database (no
chat_id)ChatMessage.get_absolute_url()returnsNonefor unsaved messages (per line 291 inapps/chat/models.py)This validates the expected behavior for in-memory message objects.
Also applies to: 68-68
apps/pipelines/nodes/llm_node.py (2)
27-27: LGTM! StateSchema extended with input_message_id.The addition of the optional
input_message_idfield toStateSchemaenables the agent to track and propagate the input message identifier, which is then used by the history middleware to prevent duplicate messages.
43-43: LGTM! Correctly propagates input_message_id to agent.The
input_message_idis properly extracted from the pipeline state and passed to the agent inputs, enabling downstream components (likehistory_middleware.pyline 40) to exclude this message from history retrieval.apps/pipelines/tests/test_pipeline_history.py (1)
184-190: Verify why global history tests require expanded available_apps.The distinction between test types is clear—
test_global_historyandtest_llm_with_global_historyusePipelineBotwhile other tests userunnable.invoke()directly. However, the original explanation referencingget_absolute_url()andget_slug_for_team()cannot be verified in the codebase.Confirm whether
apps.usersandapps.oauthare actually necessary for these tests or if they can be removed. If they are required, document the specific reason (e.g., migrations, model constraints, or execution path dependencies).apps/chat/bots.py (2)
81-93: LGTM!The early message save pattern is well-implemented. Saving the human message before pipeline execution enables downstream nodes to access
input_message_idandinput_message_urlfor features like template rendering. The conditional checks are appropriate.
178-204: LGTM!The refactored
_save_outputscleanly handles both scenarios:
- Updating metadata on an existing message (avoiding duplicate message creation)
- Falling back to creating a new message when
existing_human_messageis NoneThe selective save with
update_fields=["metadata"]is efficient, and the trace service integration correctly covers both code paths.
| def get_absolute_url(self): | ||
| if not self.chat_id or not self.chat.team_id or not hasattr(self.chat, "experiment_session"): | ||
| return None | ||
| uri = reverse( | ||
| "chatbots:chatbot_session_view", | ||
| kwargs={ | ||
| "team_slug": get_slug_for_team(self.chat.team_id), | ||
| "experiment_id": self.chat.experiment_session.experiment.public_id, | ||
| "session_id": self.chat.experiment_session.external_id, | ||
| }, | ||
| ) | ||
| return absolute_url(uri + f"?message_id={self.id}", is_secure=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive error handling for URL generation.
The get_absolute_url() method has a few potential issues:
-
Line 291: Using
hasattr(self.chat, "experiment_session")may not catch all cases whereexperiment_sessionisNone. Consider using a try-except block or checkinggetattr(self.chat, 'experiment_session', None). -
Line 296:
get_slug_for_team()can raiseValueErrorifteam_idisNoneor negative (perapps/teams/utils.pylines 86-100). While the line 291 check forteam_idshould prevent this, consider wrapping in try-except for robustness. -
Query parameter: Line 301 uses
self.idwhich should always be set at this point due to thechat_idcheck, but consider adding a comment clarifying this assumption.
🔎 Suggested improvements
def get_absolute_url(self):
- if not self.chat_id or not self.chat.team_id or not hasattr(self.chat, "experiment_session"):
+ if not self.chat_id or not self.chat.team_id:
+ return None
+
+ experiment_session = getattr(self.chat, 'experiment_session', None)
+ if not experiment_session:
return None
- uri = reverse(
- "chatbots:chatbot_session_view",
- kwargs={
- "team_slug": get_slug_for_team(self.chat.team_id),
- "experiment_id": self.chat.experiment_session.experiment.public_id,
- "session_id": self.chat.experiment_session.external_id,
- },
- )
- return absolute_url(uri + f"?message_id={self.id}", is_secure=True)
+
+ try:
+ uri = reverse(
+ "chatbots:chatbot_session_view",
+ kwargs={
+ "team_slug": get_slug_for_team(self.chat.team_id),
+ "experiment_id": experiment_session.experiment.public_id,
+ "session_id": experiment_session.external_id,
+ },
+ )
+ # self.id is guaranteed to exist at this point due to chat_id check
+ return absolute_url(uri + f"?message_id={self.id}", is_secure=True)
+ except (ValueError, AttributeError):
+ # Handle cases where team_id is invalid or experiment_session attributes are missing
+ return None🤖 Prompt for AI Agents
In apps/chat/models.py around lines 290 to 301, replace the fragile hasattr
check and unguarded slug generation with defensive handling: use
getattr(self.chat, "experiment_session", None) and verify it is not None; wrap
the get_slug_for_team(...) call in a try/except catching ValueError and return
None on failure (or log and return None) so invalid team IDs don't raise; keep
returning absolute_url(...) when everything is valid and add a short comment
noting that self.id is expected to be set because chat_id exists.
| input_message_id: int | None | ||
| input_message_url: int | None | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect type annotation for input_message_url.
input_message_url is typed as int | None, but URLs are strings. Based on the related code in apps/chat/models.py, get_absolute_url() returns a string.
🔎 Proposed fix
input_message_id: int | None
- input_message_url: int | None
+ input_message_url: str | None📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| input_message_id: int | None | |
| input_message_url: int | None | |
| input_message_id: int | None | |
| input_message_url: str | None | |
🤖 Prompt for AI Agents
In apps/pipelines/nodes/base.py around lines 115 to 117, the type annotation for
input_message_url is incorrect (currently int | None); change it to str | None
(or Optional[str]) to reflect that URLs are strings (get_absolute_url() returns
a string). Update the annotation accordingly and run type checks to ensure no
other code expects an int.
| return self.history_mode or PipelineChatHistoryModes.SUMMARIZE | ||
|
|
||
| def get_history(self, session: ExperimentSession) -> list[BaseMessage]: | ||
| def get_history(self, session: ExperimentSession, exclude_message_id: int = None) -> list[BaseMessage]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix type hint for exclude_message_id parameter.
The exclude_message_id parameter should be explicitly typed as int | None rather than just int to match its default value of None and avoid implicit Optional as flagged by Ruff (RUF013).
🔎 Proposed fix
- def get_history(self, session: ExperimentSession, exclude_message_id: int = None) -> list[BaseMessage]:
+ def get_history(self, session: ExperimentSession, exclude_message_id: int | None = None) -> list[BaseMessage]:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def get_history(self, session: ExperimentSession, exclude_message_id: int = None) -> list[BaseMessage]: | |
| def get_history(self, session: ExperimentSession, exclude_message_id: int | None = None) -> list[BaseMessage]: |
🧰 Tools
🪛 Ruff (0.14.10)
280-280: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In apps/pipelines/nodes/nodes.py around line 280, the function signature for
get_history declares exclude_message_id: int = None which mismatches its
default; change the type hint to an explicit union (exclude_message_id: int |
None = None) so the annotation reflects that None is allowed (or use
Optional[int] if project style prefers), then run linters/type-checks to ensure
no other annotations or imports need adjustment.
…rn 2 pages for 23 msg returning error when highlight_message_id is defined
Technical Description
Migrations
Demo
Docs and Changelog