Description
The CompactNetwork is ultimately just a CSR sparse matrix. I had no idea at the time, and now I do, and I feel kind of embarrassed. There are probably CSR matrix crates already on crates.io and I didn't even think to look. Ah well, older and wiser now I guess.
Anyway, if we could take in a CSR sparse matrix directly, especially if it's zero copy, I suspect leiden's speed will improve dramatically. We waste so much time copying an edge list with nodes-as-strings out of Python (while holding the GIL) into our own version of Edge, which we then use to build our CompactNetwork. And for some people, their network was already in CSR matrix form AND had integral nodes AND they were already doing their own bookkeeping AND we didn't need to make this many dang copies.
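To make the "it's just CSR" point concrete, here is a rough pure-Python sketch of the work described above: map string node labels to integer ids, then lay the edges out as the three CSR arrays (`indptr`, `indices`, `data`). The function name `edges_to_csr` is hypothetical, and the choices to treat the graph as undirected, store each edge in both directions, and not merge duplicate edges are assumptions for illustration, not a description of what CompactNetwork actually does.

```python
from array import array


def edges_to_csr(edges):
    """Build CSR arrays from (source, target, weight) tuples with string labels.

    Hypothetical helper for illustration only; assumes an undirected graph.
    """
    # Bookkeeping pass: assign each string label a dense integer id,
    # in first-seen order.
    node_ids = {}
    for source, target, _ in edges:
        for label in (source, target):
            node_ids.setdefault(label, len(node_ids))

    # Copy pass: gather (neighbor, weight) pairs per row, storing each
    # undirected edge in both directions (self-loops only once).
    rows = [[] for _ in range(len(node_ids))]
    for source, target, weight in edges:
        s, t = node_ids[source], node_ids[target]
        rows[s].append((t, weight))
        if s != t:
            rows[t].append((s, weight))

    # Flatten into CSR: the neighbors of node i live in
    # indices[indptr[i]:indptr[i + 1]], with matching weights in data.
    indptr = array("q", [0])
    indices = array("q")
    data = array("d")
    for neighbors in rows:
        for neighbor, weight in sorted(neighbors):
            indices.append(neighbor)
            data.append(weight)
        indptr.append(len(indices))
    return node_ids, indptr, indices, data


edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 0.5)]
node_ids, indptr, indices, data = edges_to_csr(edges)
# Neighbors of "b" and the weights of the edges to them.
b = node_ids["b"]
print(list(indices[indptr[b]:indptr[b + 1]]), list(data[indptr[b]:indptr[b + 1]]))
```

A caller who already holds integer node ids and these three arrays has done every step in this function already, which is the copying a CSR entry point would let us skip.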
Anyway.
Success Gates:
- `leiden(...)` or `leiden_csr(...)` and `modularity(...)` callable that uses `scipy.sparse.csr_matrix` as an input type
- cannot require a specific version of numpy or scipy (ideally we can support multiple major versions of each, with unbounded minor/bugfix versioning). In other words, numpy 1.x and numpy 2.x.
- zero copy of the matrix, while still not running afoul of the gil. unsure if this is possible, but it is the goal.
- zero copy to use the csr matrix as the CompactNetwork in our current leiden/hierarchical_leiden/modularity
- Performance benchmarks including:
  - time for `graspologic.leiden` to compute a network partitioning from a network that was originally in `networkx.Graph()` form with string nodes (actual node values should be integer values)
  - time for `graspologic_native.leiden` to compute a network partitioning from a network that was originally in Vector form with string nodes (actual node values should be integer values)
  - time for `graspologic_native.leiden_csr` to compute a network partitioning from a network that is entirely csr sparse matrix (no strings involved!)
  - time for