Support CSR sparse matrices from scipy.sparse #56

@daxpryce

The CompactNetwork is ultimately just a CSR sparse matrix. I had no idea at the time, and now I do, and I feel kind of embarrassed. There are probably CSR matrix crates already on crates.io and I didn't even think to look. Ah well, older and wiser now I guess.
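For anyone unfamiliar with the layout, here is a minimal pure-Python sketch of CSR for an undirected weighted graph. The names `edges_to_csr` and `neighbors` are made up for illustration and are not part of graspologic_native; `indptr`, `indices`, and `data` mirror the attribute names scipy uses. This is only meant to show the shape of the data, not how CompactNetwork is actually built.

```python
def edges_to_csr(num_nodes, edges):
    """Build CSR arrays for an undirected graph from (source, target, weight) tuples.

    Hypothetical helper for illustration only. Nodes are integers in [0, num_nodes).
    """
    # Count stored entries per row; an undirected edge appears in both rows.
    counts = [0] * num_nodes
    for source, target, _ in edges:
        counts[source] += 1
        if source != target:
            counts[target] += 1

    # indptr[i]..indptr[i + 1] delimits the slots that belong to row i.
    indptr = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        indptr[node + 1] = indptr[node] + counts[node]

    indices = [0] * indptr[-1]   # column (neighbor) of each stored entry
    data = [0.0] * indptr[-1]    # weight of each stored entry
    cursor = indptr[:-1].copy()  # next free slot in each row

    for source, target, weight in edges:
        indices[cursor[source]] = target
        data[cursor[source]] = weight
        cursor[source] += 1
        if source != target:
            indices[cursor[target]] = source
            data[cursor[target]] = weight
            cursor[target] += 1
    return indptr, indices, data


def neighbors(indptr, indices, data, node):
    """Return (neighbor, weight) pairs for one node by slicing its row."""
    start, end = indptr[node], indptr[node + 1]
    return list(zip(indices[start:end], data[start:end]))


indptr, indices, data = edges_to_csr(3, [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 3.0)])
print(neighbors(indptr, indices, data, 1))
```

Note that this sketch does not sort column indices within a row; whether that matters depends on what the consumer expects.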

Anyway, if we could take in a CSR sparse matrix directly, especially if it's zero copy, I suspect leiden's speed will improve dramatically. We waste so much time copying one set of edgelist-with-nodes-as-strings from Python's GIL into our own version of Edge, which we then use to build our CompactNetwork. For some people, their network was already in CSR matrix form AND had integral nodes AND they were already doing their own bookkeeping, so we didn't need to make this many dang copies.
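To make "doing their own bookkeeping" concrete, this is roughly what a caller with string nodes would do once, on their side, so that everything downstream only sees integers. Both function names are hypothetical, and the assumption that a partition comes back as a list indexed by integer node id is mine, not something the current API promises.

```python
def index_nodes(string_edges):
    """Map string node labels to contiguous integer ids (hypothetical helper).

    Returns (labels, int_edges), where labels[i] is the original label of node i.
    """
    node_ids = {}
    labels = []
    int_edges = []
    for source, target, weight in string_edges:
        for label in (source, target):
            if label not in node_ids:
                node_ids[label] = len(labels)
                labels.append(label)
        int_edges.append((node_ids[source], node_ids[target], weight))
    return labels, int_edges


def relabel(partition, labels):
    """Translate an integer-indexed partition back to the original labels.

    Assumes partition[i] is the community of integer node i.
    """
    return {labels[node]: community for node, community in enumerate(partition)}


labels, int_edges = index_nodes([("a", "b", 1.0), ("b", "c", 2.0)])
print(int_edges)
print(relabel([0, 0, 1], labels))
```

If the caller already holds this mapping, the string round trip inside the library is pure overhead.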

Anyway.

Success Gates:

  • leiden(...) or leiden_csr(...) and modularity(...) callables that accept scipy.sparse.csr_matrix as an input type
  • Cannot require a specific version of numpy or scipy (ideally we can support multiple major versions of each, with unbounded minor/bugfix versioning). In other words, both numpy 1.x and numpy 2.x.
  • Zero copy of the matrix, while still not running afoul of the GIL. Unsure if this is possible, but it is the goal.
  • Zero copy to use the CSR matrix as the CompactNetwork in our current leiden/hierarchical_leiden/modularity
  • Performance benchmarks including:
    • time for graspologic.leiden to compute a network partitioning from a network that was originally in networkx.Graph() form with string nodes (actual node values should be integer values)
    • time for graspologic_native.leiden to compute a network partitioning from a network that was originally in Vector form with string nodes (actual node values should be integer values)
    • time for graspologic_native.leiden_csr to compute a network partitioning from a network that is entirely CSR sparse matrix (no strings involved!)
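As a rough illustration of the first two gates, here is a pure-Python sketch of modularity computed straight from CSR arrays. It is duck-typed: it only reads `.indptr`, `.indices`, and `.data`, which scipy.sparse.csr_matrix exposes, so nothing in it is tied to a numpy or scipy version. The name `modularity_csr`, its signature, and the `resolution` parameter are assumptions for illustration, not the existing API, and it assumes a symmetric adjacency matrix with both directions of each edge stored. The real thing would live in Rust; this is just the shape of the computation.

```python
from types import SimpleNamespace


def modularity_csr(matrix, communities, resolution=1.0):
    """Modularity of a partition, read directly from CSR arrays (sketch).

    matrix: any object with .indptr, .indices, .data (e.g. a scipy csr_matrix).
    communities: communities[i] is the community of node i.
    Assumes a symmetric adjacency, so sum(data) equals twice the total edge weight.
    """
    indptr, indices, data = matrix.indptr, matrix.indices, matrix.data
    total_weight = float(sum(data))  # 2m for a symmetric adjacency
    if total_weight == 0.0:
        return 0.0

    internal = {}    # weight of entries with both endpoints in the community
    degree_sum = {}  # summed weighted degree of the community's nodes
    for node in range(len(indptr) - 1):
        community = communities[node]
        for slot in range(indptr[node], indptr[node + 1]):
            weight = data[slot]
            degree_sum[community] = degree_sum.get(community, 0.0) + weight
            if communities[indices[slot]] == community:
                internal[community] = internal.get(community, 0.0) + weight

    return sum(
        internal.get(c, 0.0) / total_weight
        - resolution * (total / total_weight) ** 2
        for c, total in degree_sum.items()
    )


# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3, all weights 1.0.
two_triangles = SimpleNamespace(
    indptr=[0, 2, 4, 7, 10, 12, 14],
    indices=[1, 2, 0, 2, 0, 1, 3, 2, 4, 5, 3, 5, 3, 4],
    data=[1.0] * 14,
)
q = modularity_csr(two_triangles, [0, 0, 0, 1, 1, 1])
print(q)  # 5/14, roughly 0.357
```

The same harness shape (build the input once, then time only the call) would apply to the three benchmarks listed above.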

Labels: enhancement, help wanted
