CodSpeed Performance Report
Merging #97 will degrade performances by 10.85%
|
I know there might be some performance hit and I'll take a look at that. I will provide details for this PR later, but basically: refactoring and better support for scalar functions like ID, toLower, toUpper, Trim, Type, Coalesce, Size (both in RETURN and on the left side of a condition). Support for better aggregation as well, including Collect, Relationships, and list comprehension syntax. And lastly, conditions on lists such as ANY, ALL, NONE, SINGLE |
Commit 1: Support list predicates all, any, none, single, size, relationships
Support list_expression and the relationships function
The list expression can be the entity_id, which is a list by itself, or come through the relationships function:
list_expression : "relationships"i "(" entity_id ")" -> relationships_function
                | entity_id -> entity_list
Example query:
MATCH (a)-[r*1..3]->(c)
WHERE size(relationships(r)) = 2
RETURN r
Basically, it just returns the value (already a list), but this should be the base sample for future support. This is an extension of ListExpression:
class RelationshipsFunction(ListExpression)
Support all/none/single/any
This works on list_expression. One sample is:
MATCH (a)-[r*2]->(c)
WHERE all(edge IN r WHERE edge.weight > 5)
RETURN r
These classes are just conditions, and they accept nested compound conditions:
MATCH (a)-[r*2]->(c)
WHERE all(edge IN r WHERE edge.weight > 5 AND edge.type = "friend")
RETURN r
SIZE function
We support size to compute the size of a list. In this commit it is an instance of Condition, but it should be a ScalarFunction.
Introduction to Scope
The signatures of these functions are much the same as Condition's, except that they accept a scope (dict) parameter. Consider the following query: the scope variable is "edge", an element of r, which is not
MATCH (a)-[r*2]->(c)
WHERE all(edge IN r WHERE edge.weight > 5 AND edge.type = "friend")
RETURN r |
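To make the scope idea above concrete, here is a minimal sketch of how all/any/none/single could bind a scope variable to each list element before evaluating the inner condition. It is not the code from this PR: `count_satisfying`, the `*_predicate` helpers, and the plain-callable inner condition are hypothetical names chosen for illustration, and edges are modelled as plain attribute dicts.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical stand-in for a parsed inner condition: it receives the scope
# dict and returns True or False.
InnerCondition = Callable[[Dict[str, Any]], bool]


def count_satisfying(variable: str, items: List[Any], condition: InnerCondition,
                     scope: Optional[Dict[str, Any]] = None) -> int:
    """Count the elements of `items` for which `condition` holds.

    Each element is bound to `variable` in a copy of the enclosing scope,
    so a nested predicate can still see outer scope variables.
    """
    outer = dict(scope or {})
    return sum(1 for item in items if condition({**outer, variable: item}))


def all_predicate(variable, items, condition, scope=None) -> bool:
    return count_satisfying(variable, items, condition, scope) == len(items)


def any_predicate(variable, items, condition, scope=None) -> bool:
    return count_satisfying(variable, items, condition, scope) > 0


def none_predicate(variable, items, condition, scope=None) -> bool:
    return count_satisfying(variable, items, condition, scope) == 0


def single_predicate(variable, items, condition, scope=None) -> bool:
    return count_satisfying(variable, items, condition, scope) == 1


# Two edges of a path r as attribute dicts (made-up sample data).
r = [{"weight": 7, "type": "friend"}, {"weight": 9, "type": "colleague"}]


def heavy(scope):
    # edge.weight > 5
    return scope["edge"]["weight"] > 5


def heavy_friend(scope):
    # edge.weight > 5 AND edge.type = "friend"
    return scope["edge"]["weight"] > 5 and scope["edge"]["type"] == "friend"


print(all_predicate("edge", r, heavy))            # True
print(all_predicate("edge", r, heavy_friend))     # False
print(single_predicate("edge", r, heavy_friend))  # True
```

Sharing one counting helper keeps the four predicates consistent with each other; a real implementation would probably short-circuit instead of counting every element.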
Commit 2: refactor ID(A) to Scalar Function
ID and ScalarFunction
We have an intermediate class called ScalarFunction. It is an extension of Condition (idk why I inherited from Condition, but it doesn't really matter for now). Under Scalar we have
For example, in the lookup, we just call the scalar function:
for data_path in data_paths:
    if isinstance(data_path, ScalarFunction):
        # Evaluate scalar function for each match
        ret = []
        for match in true_matches:
            result_value = data_path(
                match,
                self._target_graph,
                self._return_edges,
                scope=None
            )
            ret.append(result_value)
        # Use str(data_path) as key: "ID(A)", "size(r)", etc.
        result[str(data_path)] = ret[offset_limit]
        processed_paths.add(data_path)
        processed_paths.add(str(data_path))
        continue
|
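As a rough illustration of the pattern in the lookup snippet above (a callable scalar function whose `str()` becomes the result key), here is a self-contained sketch. It is an assumption-laden simplification, not the PR's classes: the match is a plain dict from query variable to node id, the host graph is a dict of attribute dicts, and `collect` is a hypothetical helper.

```python
class ScalarFunction:
    """Simplified base class: callable per match, printable as a result key."""

    name = "scalar"

    def __init__(self, argument: str):
        self.argument = argument

    def __call__(self, match, host, return_edges=None, scope=None):
        raise NotImplementedError

    def __str__(self) -> str:
        return f"{self.name}({self.argument})"


class ID(ScalarFunction):
    name = "ID"

    def __call__(self, match, host, return_edges=None, scope=None):
        # Here `match` is assumed to map a query variable to a host node id.
        return match[self.argument]


class ToLower(ScalarFunction):
    name = "toLower"

    def __call__(self, match, host, return_edges=None, scope=None):
        # The argument is assumed to look like "A.name".
        entity, attribute = self.argument.split(".", 1)
        value = host[match[entity]].get(attribute)
        return value.lower() if isinstance(value, str) else None


def collect(data_paths, matches, host):
    """Evaluate every scalar function once per match, keyed by its string form."""
    result = {}
    for data_path in data_paths:
        if isinstance(data_path, ScalarFunction):
            result[str(data_path)] = [
                data_path(match, host, None, scope=None) for match in matches
            ]
    return result


host = {"x": {"name": "ALICE"}, "y": {"name": "BOB"}}
matches = [{"A": "x"}, {"A": "y"}]
print(collect([ID("A"), ToLower("A.name")], matches, host))
# {'ID(A)': ['x', 'y'], 'toLower(A.name)': ['alice', 'bob']}
```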
Commit 3+4: refactor aggregation to classes
Class AggregationFunction and aggregation.
MATCH (n)-[r]->()
RETURN n.name, AVG(r.value), r.value
ORDER BY AVG(r.value) DESC
ScalarFunction and others
def test_string_functions_with_where(self):
    """Test string functions in WHERE clause"""
    host = nx.DiGraph()
    host.add_node("a", name="ALICE")
    host.add_node("b", name="BOB")
    qry = """
    MATCH (n)
    WHERE toLower(n.name) = "alice"
    RETURN n.name
    """
    res = GrandCypher(host).run(qry)
    assert set(res["n.name"]) == {"ALICE"}
Introduction to EntityAttributeGetter
def evaluate(self, match: Match, host: nx.DiGraph,
             return_edges: dict = None, scope: dict = None):
    """
    Evaluate this entity reference against a match.

    Priority order for resolution:
    1. Scope variables (highest priority - for list predicates)
    2. Node mappings (standard case)
    3. Edge mappings (for edge references)
    4. None (not found)

    Args:
        match: The current match containing node mappings
        host: The graph to query
        return_edges: Optional edge mappings for edge references
        scope: Optional scope dictionary for list predicate variables

    Returns:
        The attribute value if found, None otherwise
    """
    # 1. Check scope first (highest priority for list predicates)
    if scope and self.entity in scope:
        element = scope[self.entity]
        if self.attribute:
            # Scope variable with attribute access: e.related
            return element.get(self.attribute) if isinstance(element, dict) else None
        # Simple scope variable: e
        return element
    # 2. Check node mappings (standard case)
    if self.entity in match.node_mappings:
        node_id = match.node_mappings[self.entity]
        if self.attribute:
            # Node with attribute: n.name
            return host.nodes[node_id].get(self.attribute)
        # Simple node reference: n - return full node dictionary
        return dict(host.nodes[node_id])
    # 3. Check edge mappings (for edge references)
    if return_edges and self.entity in return_edges:
        edge_mapping = return_edges[self.entity]
        host_edges = match.mth.edge(*edge_mapping).edges
        return get_edge_from_host(host, host_edges, self.attribute)
    return None
Commit 5: docs
Let's check the docs/architecture docs |
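The resolution priority described in the evaluate docstring above (scope first, then node mappings, then edge mappings, then None) can be tried out with a stripped-down stand-in. This is only a sketch under simplifying assumptions: `resolve` is a hypothetical free function, the host is a plain dict of node attribute dicts, and edges are passed in as an already-resolved `edge_attrs` dict rather than going through `match.mth` and `get_edge_from_host`.

```python
def resolve(entity, attribute, node_mappings, host_nodes,
            edge_attrs=None, scope=None):
    """Resolve `entity` (optionally `entity.attribute`) in priority order."""
    # 1. Scope variables win, so a list-predicate variable can shadow a node.
    if scope and entity in scope:
        element = scope[entity]
        if attribute:
            return element.get(attribute) if isinstance(element, dict) else None
        return element
    # 2. Node mappings (standard case).
    if entity in node_mappings:
        node = host_nodes[node_mappings[entity]]
        return node.get(attribute) if attribute else dict(node)
    # 3. Edge mappings (simplified here to a ready-made attribute dict).
    if edge_attrs and entity in edge_attrs:
        edge = edge_attrs[entity]
        return edge.get(attribute) if attribute else dict(edge)
    # 4. Not found.
    return None


host_nodes = {"a": {"name": "ALICE"}}
node_mappings = {"n": "a"}

print(resolve("n", "name", node_mappings, host_nodes))  # ALICE
print(resolve("n", "name", node_mappings, host_nodes,
              scope={"n": {"name": "shadow"}}))         # shadow
print(resolve("x", "name", node_mappings, host_nodes))  # None
```

Checking scope first is what lets a variable such as `edge` in `all(edge IN r WHERE ...)` resolve even though it never appears in the match's node or edge mappings.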
More work to be done
|
2fdc232 to acc89ca
acc89ca to 4646928
|
@khoale88 sorry I haven't been commenting; I HAVE been following this closely. This is really exciting work, and I think it gets us away from a lot of the hacks this repo was using earlier on and puts us on much firmer ground. THANK YOU!
I'm wondering how you're thinking about approaching functions like
size('hello') // should return 5
It looks like the function should support it, though I think the grammar expects a list right now? I'm super excited about this PR!!! |
|
You are right @j6k4m8, syntax support should be easy, let me add that ... after the break :D. |
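For reference, the evaluation side of the size('hello') case discussed above could look roughly like the sketch below. `cypher_size` is a hypothetical helper, not the PR's Size class; returning None for a None input is an assumption modelled on Cypher's null handling, and the grammar change needed to accept a string literal is not shown.

```python
def cypher_size(value):
    """Return the length of a string or list, passing None through."""
    if value is None:
        # Assumption: mirror Cypher, where size(null) is null.
        return None
    if isinstance(value, (str, list, tuple)):
        return len(value)
    raise TypeError(f"size() expects a string or list, got {type(value).__name__}")


print(cypher_size("hello"))     # 5
print(cypher_size(["a", "b"]))  # 2
```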