Skip to content

Conversation

@AumPatel1
Copy link
Contributor

Completed a TODO where llm is called for dataset generation keeping the orginal manual logic there as fallback

…arator instances

- Add _MetricDefinition class to separate metric type (definition) from metric value (measurement)
- Implement comparator caching via _get_shared_comparator() to ensure only one comparator object exists per unique (method, target, epsilon) combination
- All solutions using the same comparison method now share the same comparator instance, reducing memory usage
- Maintain full backward compatibility - no changes to Metric class API
- Remove TODO comment as the refactoring addresses the concern about mixing metric and metric value concerns
- Add epsilon to __eq__ method to ensure metrics with different epsilon values are correctly differentiated
- Add epsilon to __hash__ method to maintain hash contract (must include all fields used in __eq__)
- Fix redundant condition in Metric.__gt__ method

This fixes critical bugs identified in code review that could cause incorrect metric comparisons.
- Reformat code to comply with black formatting standards
- Fixes CI formatting check failure
@greptile-apps
Copy link

greptile-apps bot commented Dec 2, 2025

Greptile Overview

Greptile Summary

This PR implements LLM-based synthetic data generation for test inputs and refactors the metric system to separate metric definitions from values.

Key changes:

  • plexe/tools/datasets.py: Completed TODO by adding LLM calls to create_input_sample() for generating realistic synthetic data with proper fallback to manual logic
  • plexe/internal/models/entities/metric.py: Refactored to separate _MetricDefinition from Metric class and added comparator caching to reduce memory usage
  • Documentation: Added comprehensive codebase structure reference and model building walkthrough

Critical issue:

  • The _MetricDefinition.__eq__ and __hash__ methods don't include epsilon, which will cause metrics with different epsilon values to be incorrectly treated as equal

Confidence Score: 3/5

  • This PR has one critical logic bug that needs to be fixed before merging
  • The LLM integration in datasets.py is well-implemented with proper error handling and fallback logic. However, the metric refactoring in metric.py has a critical bug where epsilon is missing from equality/hash methods, which will cause incorrect behavior when comparing metrics with different epsilon values. The documentation additions are excellent and safe.
  • plexe/internal/models/entities/metric.py:163-175 - Fix missing epsilon in equality and hash methods

Important Files Changed

File Analysis

Filename Score Overview
plexe/tools/datasets.py 4/5 Added LLM-based synthetic data generation with fallback to manual logic - implementation looks good with proper error handling
plexe/internal/models/entities/metric.py 3/5 Refactored to separate metric definition from value with comparator caching, but epsilon is missing from _MetricDefinition equality/hash

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +163 to +175
def __eq__(self, other) -> bool:
"""Check if two metric definitions are equal."""
if not isinstance(other, _MetricDefinition):
return False
return (
self.name == other.name
and self.comparator.comparison_method == other.comparator.comparison_method
and self.comparator.target == other.comparator.target
)

def __hash__(self) -> int:
"""Hash the metric definition."""
return hash((self.name, self.comparator.comparison_method, self.comparator.target))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: epsilon is missing from both __eq__ and __hash__ methods. Two _MetricDefinition instances with different epsilon values will be treated as equal and have the same hash, causing incorrect comparator sharing.

Suggested change
def __eq__(self, other) -> bool:
"""Check if two metric definitions are equal."""
if not isinstance(other, _MetricDefinition):
return False
return (
self.name == other.name
and self.comparator.comparison_method == other.comparator.comparison_method
and self.comparator.target == other.comparator.target
)
def __hash__(self) -> int:
"""Hash the metric definition."""
return hash((self.name, self.comparator.comparison_method, self.comparator.target))
def __eq__(self, other) -> bool:
"""Check if two metric definitions are equal."""
if not isinstance(other, _MetricDefinition):
return False
return (
self.name == other.name
and self.comparator.comparison_method == other.comparator.comparison_method
and self.comparator.target == other.comparator.target
and self.comparator.epsilon == other.comparator.epsilon
)
def __hash__(self) -> int:
"""Hash the metric definition."""
return hash((self.name, self.comparator.comparison_method, self.comparator.target, self.comparator.epsilon))
Prompt To Fix With AI
This is a comment left during a code review.
Path: plexe/internal/models/entities/metric.py
Line: 163:175

Comment:
**logic:** `epsilon` is missing from both `__eq__` and `__hash__` methods. Two `_MetricDefinition` instances with different epsilon values will be treated as equal and have the same hash, causing incorrect comparator sharing.

```suggestion
    def __eq__(self, other) -> bool:
        """Check if two metric definitions are equal."""
        if not isinstance(other, _MetricDefinition):
            return False
        return (
            self.name == other.name
            and self.comparator.comparison_method == other.comparator.comparison_method
            and self.comparator.target == other.comparator.target
            and self.comparator.epsilon == other.comparator.epsilon
        )
    
    def __hash__(self) -> int:
        """Hash the metric definition."""
        return hash((self.name, self.comparator.comparison_method, self.comparator.target, self.comparator.epsilon))
```

How can I resolve this? If you propose a fix, please make it concise.

@AumPatel1 AumPatel1 closed this Dec 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant