
Conversation

@Randomidous

PR Description

Following a discussion here and there, this PR implements support for MNE's reject_by_annotation, excluding BAD-annotated time segments from channel quality assessment. Bad segments (e.g., participant movement during breaks) are only temporarily excluded from the statistical analysis, while the original continuous data structure is preserved in the output.

Note:

Merge Checklist

  • the PR has been reviewed and all comments are resolved
  • all CI checks pass
  • (if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
  • (if applicable): the changes are documented in the changelog changelog.rst
  • (if applicable): new contributors have added themselves to the authors list in the CITATION.cff file

@codecov

codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 85.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.73%. Comparing base (34d4773) to head (c01217e).

Files with missing lines Patch % Lines
pyprep/find_noisy_channels.py 85.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #180      +/-   ##
==========================================
- Coverage   98.09%   97.73%   -0.36%     
==========================================
  Files           7        7              
  Lines         734      752      +18     
==========================================
+ Hits          720      735      +15     
- Misses         14       17       +3     


@Randomidous
Author

Randomidous commented Jan 30, 2026

Regarding continuity concerns:

  1. Filtering happens BEFORE segment extraction (lines 98-101):

     if do_detrend:
         self.raw_mne._data = removeTrend(
             self.raw_mne.get_data(), self.sample_rate, matlab_strict=matlab_strict
         )

     The high-pass detrending is applied to the full continuous raw_mne object first. Then, at lines 160-163, segments
     are extracted with reject_by_annotation. This avoids filter edge artifacts at concatenation boundaries.
  2. Different use case: the documentation correctly notes this is for excluding known-bad segments (e.g., movement
    artifacts during breaks) from statistical analysis, not for handling actual recording discontinuities. The data
    structure remains continuous.
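The ordering described in point 1 can be sketched with plain NumPy (a toy stand-in: simple mean removal replaces pyprep's removeTrend, and the BAD spans are invented) — detrend the full continuous array first, and only then drop the annotated spans:

```python
import numpy as np

sfreq = 100.0
n_samples = 1000
rng = np.random.default_rng(0)
data = rng.standard_normal((4, n_samples))  # 4 channels of fake EEG

# Step 1: "detrend" the CONTINUOUS data (mean removal stands in for
# pyprep's removeTrend high-pass) -- no concatenation boundaries yet.
data = data - data.mean(axis=1, keepdims=True)

# Step 2: build a keep-mask from BAD annotations (onset, duration in s).
bad_spans = [(2.0, 1.0), (7.5, 0.5)]  # e.g. movement during breaks
keep = np.ones(n_samples, dtype=bool)
for onset, duration in bad_spans:
    keep[int(onset * sfreq):int((onset + duration) * sfreq)] = False

clean = data[:, keep]  # concatenated clean segments for the statistics
print(clean.shape)  # (4, 850) -- 150 BAD samples dropped
```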

Remaining Concerns:

  1. Bandpass filtering in _get_filtered_data() (lines 171-190): This 1-50 Hz filter is applied to self.EEGData, which
    is already the concatenated data. If a BAD annotation spans a significant portion of the recording, the concatenation
    boundaries could produce filter artifacts.
  2. Correlation windows spanning boundaries: If a 1-second correlation window happens to straddle where two clean
    segments were concatenated, the correlation could be affected by the artificial discontinuity (though the impact is
    likely minimal if the detrending was already done).
  3. RANSAC: Signal prediction operates on the concatenated filtered data, potentially affected by the above.

The MATLAB PREP concerns are mostly addressed for the intended use case (excluding known-bad segments like movement
artifacts), because:

  • The primary filtering (detrending) happens on continuous data before extraction
  • You're not actually concatenating separate recordings with true discontinuities
  • The remaining filtering on concatenated data uses relatively narrow-band filters that are less prone to severe edge
    artifacts

However, there's a theoretical edge case where the bandpass filtering on concatenated data could introduce minor
artifacts. This would only be significant if:

  • Large portions of data are marked BAD
  • Multiple small clean segments get concatenated together

^^^
This could be tested.

For typical use (excluding a few isolated bad segments), this should not materially affect channel quality assessment.
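The "this could be tested" point above can be probed with a toy experiment (NumPy only; a moving-average FIR stands in for the 1-50 Hz bandpass, and the BAD span is invented): compare filter-then-drop against drop-then-filter and look at where they disagree.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
kernel = np.ones(25) / 25.0  # toy low-pass FIR (stand-in for the bandpass)

keep = np.ones(x.size, dtype=bool)
keep[800:1200] = False  # one BAD segment in the middle

# Reference: filter the continuous data, then drop the BAD span.
ref = np.convolve(x, kernel, mode="same")[keep]
# What the later bandpass stage actually sees: concatenate first, filter after.
test = np.convolve(x[keep], kernel, mode="same")

err = np.abs(ref - test)
# Away from the join the two pipelines agree exactly; the error is
# confined to ~half a kernel width around the concatenation boundary.
boundary = 800  # kept-array index where the two clean segments meet
print(err[:700].max(), err[boundary - 12:boundary + 13].max() > 0)
```

With a narrow FIR like this the artifact stays local to the boundary, which matches the intuition that only "many small clean segments" (many boundaries) would accumulate into a real problem.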

Comment on lines +337 to +339
# Without rejection, sample count should be unchanged
nd_no_reject = NoisyChannels(raw_tmp, do_detrend=False, reject_by_annotation=None)
assert nd_no_reject.n_samples == original_samples
Owner


This one pretty much covers test_reject_by_annotation_default from above, right? So I vote to remove test_reject_by_annotation_default

Owner

@sappelhoff sappelhoff left a comment


nice work @Randomidous !

Few comments apart from the inline one(s).

  1. We should explain what we mean by "bad" segments by linking to https://mne.tools/stable/auto_tutorials/preprocessing/20_rejecting_bad_data.html#the-reject-by-annotation-parameter in the docs via something like
See the MNE tutorial on rejecting data spans:
:ref:`mne:tut-reject-data-spans`.

(this should work due to intersphinx)

And in the docstrings we can be more brief and say that it's any annotation starting with BAD or bad

  2. Thanks for your "caveats" comment. I think we should add a check to pyprep that warns a user IF (i) reject_by_annotation is not None AND (ii) the bad segments are "plenty", i.e., not a few "long" bad segments (block breaks), but potentially loads of small bad segments (e.g., where somebody tried to filter out muscle activations via https://mne.tools/stable/generated/mne.preprocessing.annotate_muscle_zscore.html#mne.preprocessing.annotate_muscle_zscore)
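A possible shape for that warning (the function name, thresholds, and message are all invented for illustration — this is not pyprep API):

```python
import warnings

def warn_if_fragmented_bads(bad_durations, max_count=20, short_sec=1.0):
    """Warn when BAD annotations are numerous and mostly short.

    bad_durations: durations (in seconds) of BAD-annotated segments.
    The thresholds are illustrative, not tuned.
    """
    short = [d for d in bad_durations if d < short_sec]
    if len(bad_durations) > max_count and len(short) > len(bad_durations) / 2:
        warnings.warn(
            "Many short BAD segments detected; concatenating numerous "
            "small clean segments may introduce filter artifacts.",
            RuntimeWarning,
        )
        return True
    return False

print(warn_if_fragmented_bads([30.0, 45.0]))  # False: a few long block breaks
print(warn_if_fragmented_bads([0.2] * 50))    # True: muscle-style annotations
```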

Owner

@sappelhoff sappelhoff left a comment


One more point I forgot: I don't think we need the "NaN" option 🤔. In fact, I think it's unnecessarily complicating this. "Omit" should be sufficient.

Oh and of course let's test this a little before merging it. I suggest you run it on your dataset that you are currently working on and compare the performance / sanity WITH and without reject_by_annotation.
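For that WITH/without comparison, a tiny helper could summarize how the two NoisyChannels runs disagree (the helper and channel names are illustrative; the actual runs need a real dataset and are not performed here):

```python
def compare_bad_channels(bads_without, bads_with):
    """Summarize how two bad-channel lists differ."""
    without, with_ = set(bads_without), set(bads_with)
    return {
        "both": sorted(without & with_),
        "only_without_rejection": sorted(without - with_),
        "only_with_rejection": sorted(with_ - without),
    }

# Channels flagged only WITHOUT rejection are likely driven by the BAD
# segments themselves (channel names here are made up).
summary = compare_bad_channels(["Cz", "Fp1", "O2"], ["Cz", "O2", "P7"])
print(summary["only_without_rejection"])  # ['Fp1']
```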


Suggested change
This parameter has no equivalent in MATLAB PREP and is not affected by the
``matlab_strict`` setting.
This parameter has no equivalent in MATLAB PREP. When ``matlab_strict`` is set to ``True``, ``reject_by_annotation`` is automatically set to ``None``.
Owner


forgot to say, but this will of course need code adjustments 👼

Author


Way ahead of you 😁

Comment on lines +124 to +125
"reject_by_annotation is not available in MATLAB PREP. "
f"Setting reject_by_annotation to None (was '{reject_by_annotation}')."
Owner


Suggested change
"reject_by_annotation is not available in MATLAB PREP. "
f"Setting reject_by_annotation to None (was '{reject_by_annotation}')."
"reject_by_annotation is not available in MATLAB PREP. "
f"Setting reject_by_annotation to None (was '{reject_by_annotation}'). "
"To use reject_by_annotation, matlab_strict must be set to False."

Author


This sounds more like an error message than a warning, aye?
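Putting the suggested message together with the guard, the adjustment could look roughly like this (a sketch of the discussed behavior, not pyprep's actual __init__ code):

```python
import warnings

def resolve_reject_by_annotation(reject_by_annotation, matlab_strict):
    """matlab_strict wins over reject_by_annotation.

    Silently forcing None would hide the conflict, so the user is warned
    (or, per the comment above, this could arguably raise instead).
    """
    if matlab_strict and reject_by_annotation is not None:
        warnings.warn(
            "reject_by_annotation is not available in MATLAB PREP. "
            f"Setting reject_by_annotation to None (was '{reject_by_annotation}'). "
            "To use reject_by_annotation, matlab_strict must be set to False."
        )
        return None
    return reject_by_annotation

print(resolve_reject_by_annotation("omit", matlab_strict=False))  # omit
```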



Development

Successfully merging this pull request may close these issues.

[Feature suggestion] Allow for relevant annotation selection during processing. Add a "reject_by_annotation" parameter for finding bad channels
