Optimize Variant Classification Functions to Avoid ORM Relationship Loading #657

@bencap

Description

The variant classification functions in classification.py currently use ORM relationship membership checks (variant in functional_range.variants) to determine which functional classification range a variant belongs to. This approach has significant performance implications:

  • Eager Loading Required: The score set API must eagerly load the variants relationship on all ScoreCalibrationFunctionalClassification objects to avoid N+1 queries
  • Memory Overhead: Loading all variants for all ranges means loading potentially thousands of variant records into memory even when only checking a single variant
  • O(n) Lookup: The membership check requires iterating through all variants in a range, resulting in O(n) time complexity
  • API Performance Degradation: Score set API endpoints are slowed down due to the required relationship loading

It seems prudent to pre-resolve these variant classifications, which would reduce DB fetches and minimize the amount of variant data held in memory.
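As a rough illustration of what pre-resolving could look like, the sketch below builds a variant-to-range index once and then answers each classification with a dictionary lookup instead of a `variant in functional_range.variants` check. Everything in it is hypothetical: `FunctionalRange` is a plain stand-in for `ScoreCalibrationFunctionalClassification`, and the `(range_id, variant_id)` pairs are assumed to come from a single query against whatever table backs the `variants` relationship. It is not the actual code in classification.py.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class FunctionalRange:
    """Hypothetical stand-in for ScoreCalibrationFunctionalClassification."""

    id: int
    label: str


def build_classification_index(
    ranges: Iterable[FunctionalRange],
    memberships: Iterable[Tuple[int, int]],
) -> Dict[int, FunctionalRange]:
    """Map variant id -> functional range from (range_id, variant_id) pairs.

    Assumption: the pairs are fetched in one query, so no variant rows
    (only integer ids) have to be held in memory.
    """
    ranges_by_id = {r.id: r for r in ranges}
    index: Dict[int, FunctionalRange] = {}
    for range_id, variant_id in memberships:
        # Assumption: a variant belongs to at most one range; first one wins.
        index.setdefault(variant_id, ranges_by_id[range_id])
    return index


def classify_variant(
    index: Dict[int, FunctionalRange], variant_id: int
) -> Optional[FunctionalRange]:
    """O(1) lookup that replaces the O(n) relationship membership check."""
    return index.get(variant_id)


# Usage with made-up ids
ranges = [FunctionalRange(1, "normal"), FunctionalRange(2, "abnormal")]
memberships = [(1, 101), (1, 102), (2, 203)]
index = build_classification_index(ranges, memberships)
print(classify_variant(index, 203).label)  # abnormal
```

With an index like this, the score set API would not need to eagerly load the variants relationship just to classify a variant. Whether the index is built per request, cached, or stored as a column on the variant is a separate design decision that this sketch does not address.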

Metadata

    Labels

    app: backend (Task implementation touches the backend)
    type: enhancement (Enhancement to an existing feature)
