Skip to content

Fix #570: Bug: encode_node mutates its argument#641

Open
JiwaniZakir wants to merge 1 commit intoFalkorDB:stagingfrom
JiwaniZakir:fix/570-bug-encode-node-mutates-its-argument
Open

Fix #570: Bug: encode_node mutates its argument#641
JiwaniZakir wants to merge 1 commit intoFalkorDB:stagingfrom
JiwaniZakir:fix/570-bug-encode-node-mutates-its-argument

Conversation

@JiwaniZakir
Copy link
Copy Markdown

@JiwaniZakir JiwaniZakir commented Apr 2, 2026

Closes #570

Fix encode_node in api/entities/entity_encoder.py to return a copy of the node's data rather than mutating the original object.

Changes

api/entities/entity_encoder.pyencode_node

  • Replaced n.labels.remove('Searchable') + return vars(n) with a shallow copy via vars(n).copy(), then filters 'Searchable' out of result['labels'] using a list comprehension.
  • The original Node object is no longer touched.

tests/test_entity_encoder.py (new file)

  • Adds four focused tests for encode_node covering the corrected behavior.

Motivation

The original implementation called n.labels.remove('Searchable') directly on the node object passed in, permanently stripping the label from the caller's data. Two concrete failure modes resulted:

  1. Silent data corruption — any code holding a reference to the same Node would observe the label gone after a single call to encode_node.
  2. ValueError on second encode — calling encode_node on the same node twice raised ValueError: list.remove(x): x not in list because 'Searchable' no longer existed in n.labels.

The fix builds the output dict from a copy, leaving the source node intact.

Testing

Four pytest cases in tests/test_entity_encoder.py verify the fix directly:

  • test_encode_node_removes_searchable — result dict excludes 'Searchable' and retains other labels (e.g. 'File').
  • test_encode_node_does_not_mutate_originaln.labels still contains 'Searchable' after encoding.
  • test_encode_node_twice_does_not_raise — second call completes without ValueError and returns a clean result.
  • test_encode_node_without_searchable — nodes with no 'Searchable' label pass through unchanged.

All four tests fail against the original implementation and pass with the fix applied.


This PR was created with AI assistance (Claude). The changes were reviewed by quality gates and a critic model before submission.

Summary by CodeRabbit

  • Bug Fixes

    • Entity encoder no longer mutates the original node's labels during encoding, preserving data integrity.
  • Tests

    • Added comprehensive test coverage for entity encoding behavior, including label filtering and multiple invocation scenarios.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 2, 2026

📝 Walkthrough

Walkthrough

The encode_node function in api/entities/entity_encoder.py was modified to avoid mutating the input node's labels attribute by creating a defensive copy of the node's state before filtering out the 'Searchable' label. Corresponding test coverage was added to validate the non-mutating behavior across multiple scenarios.

Changes

Cohort / File(s) Summary
encode_node Implementation
api/entities/entity_encoder.py
Modified encode_node to create a copy of vars(n) and filter the 'Searchable' label into a new list instead of mutating the original node's labels directly.
Test Coverage
tests/test_entity_encoder.py
Added comprehensive test suite validating that encode_node removes the 'Searchable' label from encoded output without mutating the input node, handles repeated calls correctly, and preserves labels when 'Searchable' is absent.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A node once changed with careless hand,
Its labels scrambled, oh what a brand!
But now with copies, safe and sound,
The original node stays safe and profound!
Hop on, dear coder, mutation's gone! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main fix: preventing mutation of the input node argument in encode_node, which directly addresses issue #570.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/test_entity_encoder.py (1)

2-2: Remove unused sys import.

The sys module is imported but never used in this file.

Proposed fix
 import importlib.util
-import sys
 import os
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_entity_encoder.py` at line 2, Remove the unused import of the sys
module in tests/test_entity_encoder.py: delete the line importing `sys` (the
unused symbol `sys`) so the file no longer contains an unused import.
api/entities/entity_encoder.py (1)

5-5: Rename ambiguous variable l to label.

Ruff E741 flags l as ambiguous because it can be confused with 1 or I in certain fonts.

Proposed fix
-    result['labels'] = [l for l in n.labels if l != 'Searchable']
+    result['labels'] = [label for label in n.labels if label != 'Searchable']
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/entities/entity_encoder.py` at line 5, Replace the ambiguous loop
variable "l" in the list comprehension that builds result['labels'] from
n.labels with a clearer name like "label": update the expression that reads
result['labels'] = [l for l in n.labels if l != 'Searchable'] to use "label"
(e.g., result['labels'] = [label for label in n.labels if label !=
'Searchable']) so the comprehension and filter reference n.labels and the string
'Searchable' unchanged but no longer use the ambiguous identifier "l".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_entity_encoder.py`:
- Around line 7-14: Replace the dynamic import block that loads entity_encoder
via importlib with a normal import: use from api.entities.entity_encoder import
encode_node; update references that currently use encode_node from the _mod
variable to use the direct import, and ensure entity_encoder.py (and its
dependencies Node/Edge/Path from falkordb) continue to import normally; you can
ignore api/__init__.py and api/info.py side effects because their logging/Redis
logic does not run on import.

---

Nitpick comments:
In `@api/entities/entity_encoder.py`:
- Line 5: Replace the ambiguous loop variable "l" in the list comprehension that
builds result['labels'] from n.labels with a clearer name like "label": update
the expression that reads result['labels'] = [l for l in n.labels if l !=
'Searchable'] to use "label" (e.g., result['labels'] = [label for label in
n.labels if label != 'Searchable']) so the comprehension and filter reference
n.labels and the string 'Searchable' unchanged but no longer use the ambiguous
identifier "l".

In `@tests/test_entity_encoder.py`:
- Line 2: Remove the unused import of the sys module in
tests/test_entity_encoder.py: delete the line importing `sys` (the unused symbol
`sys`) so the file no longer contains an unused import.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8839a9cc-a893-44b1-b4de-8ec3cad9bc90

📥 Commits

Reviewing files that changed from the base of the PR and between ff57c2a and 9d2fcf7.

📒 Files selected for processing (2)
  • api/entities/entity_encoder.py
  • tests/test_entity_encoder.py

Comment on lines +7 to +14
# Import entity_encoder directly without triggering api/__init__.py
_spec = importlib.util.spec_from_file_location(
"entity_encoder",
os.path.join(os.path.dirname(__file__), "..", "api", "entities", "entity_encoder.py"),
)
_mod = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_mod)
encode_node = _mod.encode_node
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check what api/__init__.py does that would cause issues when importing entity_encoder
cat api/__init__.py 2>/dev/null || echo "File not found"

# Check if other test files also use this pattern
rg -l "importlib.util.spec_from_file_location" tests/

Repository: FalkorDB/code-graph

Length of output: 336


🏁 Script executed:

# Check what api/entities/__init__.py does
cat api/entities/__init__.py 2>/dev/null || echo "File not found"

# Check the test file itself to understand the context
head -30 tests/test_entity_encoder.py

# Check if there are any initialization or side effects in the modules imported by api/__init__.py
ls -la api/*.py

Repository: FalkorDB/code-graph

Length of output: 1641


🏁 Script executed:

# Check entity_encoder.py to see what it actually imports
head -30 api/entities/entity_encoder.py

# Check if graph.py or llm.py have problematic initialization
grep -n "^import\|^from" api/graph.py | head -20
grep -n "^import\|^from" api/llm.py | head -20

# Check api/info.py as it's imported first
head -20 api/info.py

Repository: FalkorDB/code-graph

Length of output: 1614


🏁 Script executed:

# Check if api/info.py has module-level code that executes connections
cat api/info.py

# Also check if any other modules in api/ might have initialization code
grep -n "redis\|Redis\|connection\|connect" api/info.py

Repository: FalkorDB/code-graph

Length of output: 5453


Remove the importlib workaround and use standard imports instead.

The dynamic import pattern is unnecessary. entity_encoder.py only depends on falkordb (imports Node, Edge, Path), and api/__init__.py has no problematic side effects at import time—it merely imports modules and calls logging.basicConfig(). The Redis connection logic in api/info.py only executes when functions are called, not during module import.

Use: from api.entities.entity_encoder import encode_node

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_entity_encoder.py` around lines 7 - 14, Replace the dynamic import
block that loads entity_encoder via importlib with a normal import: use from
api.entities.entity_encoder import encode_node; update references that currently
use encode_node from the _mod variable to use the direct import, and ensure
entity_encoder.py (and its dependencies Node/Edge/Path from falkordb) continue
to import normally; you can ignore api/__init__.py and api/info.py side effects
because their logging/Redis logic does not run on import.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: encode_node mutates its argument

1 participant