Add tiling QC metric for tile-boundary segmentation artifacts#1157
Draft
Add tiling QC metric for tile-boundary segmentation artifacts#1157
Conversation
Cells segmented in tiles get cut at tile borders, producing fragments with artificially straight edges. This adds: - `sq.experimental.tl.calculate_tiling_qc`: per-cell scoring via collinearity-based straight-edge detection (max_straight_edge_ratio, cardinal_alignment_score, cut_score). Scores stored in .obs of a QC AnnData table linked to the labels element via spatialdata_attrs. Algorithm parameters recorded in .uns["tiling_qc"]. - `sq.experimental.pl.tiling_qc`: diagnostic plot via spatialdata-plot (renders labels coloured by score; tile grid emerges from the data). - Cell-aware tiling infrastructure (_tiling.py) for scalable labels-only tile extraction without materialising full arrays. - Test fixture with 400x400 dask-backed ellipsoid cells cut by a 3x3 tile grid, with ground-truth cut/intact classification. - 35 tests (unit, integration, visual regression). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
- Bump fixture from 40 cells on 400x400 to 120 cells on 600x600 for more visible tile-grid pattern in diagnostic plots - Pin spatialdata-plot>=0.3.3 for correct continuous color rendering - Regenerate visual reference images - Use _IMAGE_SIZE constant in centroid bounds test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1157 +/- ##
==========================================
- Coverage 73.56% 72.57% -0.99%
==========================================
Files 44 47 +3
Lines 6929 7359 +430
Branches 1174 1246 +72
==========================================
+ Hits 5097 5341 +244
- Misses 1347 1507 +160
- Partials 485 511 +26
🚀 New features to boost your workflow:
|
- JIT-compile the two-pointer collinearity scan with @njit for ~10-50x speedup on the per-cell hot path - Cap contour points at 500 via arc-length resampling to bound O(n²) - Handle contour closure: scan 3 rotations so straight segments crossing the start/end junction are not split - Vectorise _resample_contour with np.searchsorted (no Python loops) - Replace _zero_non_owned loop with single np.isin pass - Add tqdm progress bar that tracks cells (not tiles), updates on completion for correct parallel reporting - Extract _SCORE_COLUMNS / _NAN_SCORES constants to deduplicate - Precompute segment lengths once across rotations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Drop 29 tests that over-tested private internals. Keep 4 behavioural tests (output structure, metric discrimination, tiling invariant, error handling) and 3 visual regression tests (one per score column). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
- Replace joblib parallelisation with dask.delayed + dask.compute for native integration with dask-backed zarr data. Tiles are scheduled as delayed tasks; the dask scheduler handles chunk caching and worker management. - Add n_jobs parameter (default -1 = all CPUs) using a threaded scheduler, and an optional dask.distributed.Client parameter for cluster execution. Warn via logger when both are specified. - Add affinity-aware cpu_count() to squidpy/_utils.py that respects cgroup limits (SLURM, Docker, taskset) via os.sched_getaffinity, replacing multiprocessing.cpu_count throughout the codebase. - Fix NameError in pl.tiling_qc (**kwargs referenced after removal), keep spatialdata_plot import lazy, remove unused typing imports. - Replace assert with raise ValueError in verify_coverage. - Add nogil=True to numba collinearity scan for thread parallelism. - Use public API (sq.experimental.tl/pl) in tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compresses low scores toward zero so tile-boundary artifacts are more visually prominent without changing the stored metric values. Users can pass norm=Normalize() for a linear scale. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reference images from CI runner (ubuntu, py3.12). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reference images from CI runner (ubuntu, py3.12). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ier, nhood_outlier_fraction) Two-stage spatial-context pipeline after per-tile scoring: - smoothed_cut_score: cut_score x mean(k=10 neighbor cut_scores) - is_outlier: MAD-based threshold on smoothed scores (nmads param, default 3) - nhood_outlier_fraction: fraction of k-neighbors that are outliers Also: update plot defaults to nhood_outlier_fraction/RdYlGn_r, add clean-dataset and few-cells edge case tests, generate visual references for new columns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
…cells) Replace rejection sampling with grid-based placement: deterministic, no collision checking needed, 5x more cells at the same memory footprint. Regenerate all visual references for the denser fixture. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…y plot titles - is_outlier now requires both per-cell cut_score AND spatial smoothed score to exceed their respective MAD thresholds (AND when both enabled) - Separate parameters: outlier_use_cut, outlier_use_smoothed, nmads_cut (default 1.5), nmads_smoothed (default 3) - Validation: error on both gates disabled or non-positive nmads - Plot titles mapped to human-friendly names per score column - Regenerate visual references Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
sq.experimental.tl.calculate_tiling_qc— per-cell scoring that detects cells artificially cut by tile boundaries during segmentation, using collinearity-based straight-edge detection on contourssq.experimental.pl.tiling_qc— diagnostic plot via spatialdata-plot where the tile grid emerges from high-scoring cells without requiring tile-border metadata_tiling.py) for scalable labels-only tile extraction — will be shared with [EXPERIMENTAL]: Integrate cp-measure #982 when merged.obsof a QC AnnData table ({labels_key}_qc) with properspatialdata_attrslinking and algorithm params in.uns["tiling_qc"]Metrics
max_straight_edge_ratiocardinal_alignment_scorecut_score