-
Notifications
You must be signed in to change notification settings - Fork 221
LLM use for generating sythetic data #156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…arator instances - Add _MetricDefinition class to separate metric type (definition) from metric value (measurement) - Implement comparator caching via _get_shared_comparator() to ensure only one comparator object exists per unique (method, target, epsilon) combination - All solutions using the same comparison method now share the same comparator instance, reducing memory usage - Maintain full backward compatibility - no changes to Metric class API - Remove TODO comment as the refactoring addresses the concern about mixing metric and metric value concerns
- Add epsilon to __eq__ method to ensure metrics with different epsilon values are correctly differentiated - Add epsilon to __hash__ method to maintain hash contract (must include all fields used in __eq__) - Fix redundant condition in Metric.__gt__ method This fixes critical bugs identified in code review that could cause incorrect metric comparisons.
- Reformat code to comply with black formatting standards - Fixes CI formatting check failure
Greptile OverviewGreptile SummaryThis PR implements LLM-based synthetic data generation for test inputs and refactors the metric system to separate metric definitions from values. Key changes:
Critical issue:
Confidence Score: 3/5
Important Files ChangedFile Analysis
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 files reviewed, 1 comment
| def __eq__(self, other) -> bool: | ||
| """Check if two metric definitions are equal.""" | ||
| if not isinstance(other, _MetricDefinition): | ||
| return False | ||
| return ( | ||
| self.name == other.name | ||
| and self.comparator.comparison_method == other.comparator.comparison_method | ||
| and self.comparator.target == other.comparator.target | ||
| ) | ||
|
|
||
| def __hash__(self) -> int: | ||
| """Hash the metric definition.""" | ||
| return hash((self.name, self.comparator.comparison_method, self.comparator.target)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: epsilon is missing from both __eq__ and __hash__ methods. Two _MetricDefinition instances with different epsilon values will be treated as equal and have the same hash, causing incorrect comparator sharing.
| def __eq__(self, other) -> bool: | |
| """Check if two metric definitions are equal.""" | |
| if not isinstance(other, _MetricDefinition): | |
| return False | |
| return ( | |
| self.name == other.name | |
| and self.comparator.comparison_method == other.comparator.comparison_method | |
| and self.comparator.target == other.comparator.target | |
| ) | |
| def __hash__(self) -> int: | |
| """Hash the metric definition.""" | |
| return hash((self.name, self.comparator.comparison_method, self.comparator.target)) | |
| def __eq__(self, other) -> bool: | |
| """Check if two metric definitions are equal.""" | |
| if not isinstance(other, _MetricDefinition): | |
| return False | |
| return ( | |
| self.name == other.name | |
| and self.comparator.comparison_method == other.comparator.comparison_method | |
| and self.comparator.target == other.comparator.target | |
| and self.comparator.epsilon == other.comparator.epsilon | |
| ) | |
| def __hash__(self) -> int: | |
| """Hash the metric definition.""" | |
| return hash((self.name, self.comparator.comparison_method, self.comparator.target, self.comparator.epsilon)) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: plexe/internal/models/entities/metric.py
Line: 163:175
Comment:
**logic:** `epsilon` is missing from both `__eq__` and `__hash__` methods. Two `_MetricDefinition` instances with different epsilon values will be treated as equal and have the same hash, causing incorrect comparator sharing.
```suggestion
def __eq__(self, other) -> bool:
"""Check if two metric definitions are equal."""
if not isinstance(other, _MetricDefinition):
return False
return (
self.name == other.name
and self.comparator.comparison_method == other.comparator.comparison_method
and self.comparator.target == other.comparator.target
and self.comparator.epsilon == other.comparator.epsilon
)
def __hash__(self) -> int:
"""Hash the metric definition."""
return hash((self.name, self.comparator.comparison_method, self.comparator.target, self.comparator.epsilon))
```
How can I resolve this? If you propose a fix, please make it concise.
Completed a TODO where llm is called for dataset generation keeping the orginal manual logic there as fallback