ADD: bad_by_PSD method #179

Randomidous · 2026-01-29T11:06:32Z

PR Description

Building on @nabilalibou's work (#145), this PR adds functionality for finding bad channels using Power Spectral Density (PSD). This method is not part of the original MATLAB PREP pipeline and only runs when matlab_strict=False.

Detection Criteria

A channel is flagged as "bad-by-PSD" if either:

1. Abnormally high band power: The channel has an outlier z-score (>3.0) in any of three frequency bands:
   - Low (1-15 Hz): delta, theta, alpha
   - Mid (15-30 Hz): beta
   - High (30-45 Hz): gamma
2. 1/f violation: The high-frequency band has more power than the low-frequency band, which violates the typical 1/f spectral profile of EEG and often indicates muscle artifact or poor electrode contact.
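
As a rough illustration of the 1/f criterion, the sketch below splits a PSD into the three bands listed above and compares the high band against the low band. The function names, the use of mean log10 power within a band, and the half-open band edges are assumptions made for this example; they are not taken from the code in this PR.

```python
import numpy as np

# Band edges in Hz, taken from the PR description.
BANDS = {"low": (1.0, 15.0), "mid": (15.0, 30.0), "high": (30.0, 45.0)}


def band_log_power(psd, freqs, bands=BANDS):
    """Return {band: mean log10 power per channel}.

    psd has shape (n_channels, n_freqs); freqs has shape (n_freqs,).
    Averaging log power within a band is an assumption of this sketch.
    """
    log_psd = np.log10(psd)
    out = {}
    for name, (fmin, fmax) in bands.items():
        # Half-open interval so that neighbouring bands do not share a bin.
        mask = (freqs >= fmin) & (freqs < fmax)
        out[name] = log_psd[:, mask].mean(axis=1)
    return out


def violates_one_over_f(band_power):
    """Boolean mask of channels whose high band exceeds their low band."""
    return band_power["high"] > band_power["low"]


# Synthetic PSDs: one channel with a 1/f profile, one with excess
# high-frequency power (hypothetical values, not real EEG).
freqs = np.arange(1.0, 45.0, 1.0)
typical = 1.0 / freqs
noisy = np.ones_like(freqs)
noisy[freqs >= 30] = 5.0
psd = np.vstack([typical, noisy])

flags = violates_one_over_f(band_log_power(psd, freqs))
```

Here only the second channel is flagged, because its high band carries more power than its low band.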

Implementation Details

PSD is computed using Welch's method over 1-45 Hz (configurable via fmin/fmax)
Default frequency range excludes 50/60 Hz line noise
Uses MAD-based robust z-scoring (scaled by 1.4826 to convert to SD units)
Only flags positive z-scores (excess power), as abnormally low power may reflect normal topographic variation
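
A minimal sketch of the MAD-based z-scoring and the positive-only rule described above. `robust_zscore` and `high_power_outliers` are hypothetical helper names used for illustration, and the input values are made up; the actual implementation in `pyprep/find_noisy_channels.py` may differ.

```python
import numpy as np


def robust_zscore(values):
    """Z-score using the median and the MAD scaled to SD units.

    A MAD of zero (more than half of the values identical) would need
    special handling, which this sketch leaves out.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return (values - median) / (1.4826 * mad)


def high_power_outliers(band_power, threshold=3.0):
    """Flag channels with abnormally high power; low power is ignored."""
    return robust_zscore(band_power) > threshold


# Six channels: four typical, one with excess power, one with low power.
band_power = np.array([1.0, 1.1, 0.9, 1.0, 3.0, -1.0])
flags = high_power_outliers(band_power)
```

The channel with excess power is flagged, while the channel with equally extreme but low power is not, because only positive z-scores are compared against the threshold.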

Changes from previous PR (#145)

Detects only high PSD (not low), since low power can be normal
Uses MAD-based robust statistics instead of standard z-scoring
Splits spectrum into three frequency bands instead of total power
Adds 1/f violation criterion to catch muscle artifacts
Includes bad_by_psd in the bad channels dictionary
Fixes KeyError in Reference.robust_reference() by adding bad_by_psd to the noisy channels tracking dict

closes: Add bad_by_PSD() #145

Merge Checklist

the PR has been reviewed and all comments are resolved
all CI checks pass
(if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
(if applicable): the changes are documented in the changelog changelog.rst
(if applicable): new contributors have added themselves to the authors list in the CITATION.cff file

docs/changelog.rst

sappelhoff · 2026-01-29T13:45:50Z

pyprep/find_noisy_channels.py

+        # Sum log PSD across frequencies for each channel to get total power
+        total_log_psd = np.sum(log_psd, axis=1)


I think in this step we lose some info. Imagine a channel that has an abnormally high PSD in low freqs, but lower than normal in high freqs --> the total power will then look like that of a channel that is medium in high and low 🤔

just as a rule of thumb, we could try dividing into 3 "band power" bins (and only within bins summing up to total log psd):

0-15Hz

15-30Hz

30-45Hz

a channel would be bad if the bin in 30-45Hz has more power than that of 0-15Hz.

a channel could also be bad if it is abnormal from other channels in any one of these three bins

a channel could also be bad if any ratio of bands is abnormal (1/2, 1/3, 2/3)
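
To make the information-loss argument concrete, here is a small sketch with made-up log PSD values. A channel with excess low-frequency power and a deficit at high frequencies ends up with the same total as a typical channel, while per-band sums keep the two apart. All values are hypothetical.

```python
import numpy as np

freqs = np.arange(1.0, 45.0, 1.0)
low = freqs < 15
high = freqs >= 30

# Hypothetical log PSD values, two channels by 44 frequency bins.
log_psd = np.zeros((2, freqs.size))  # channel 0: typical everywhere
log_psd[1, low] = 1.5                # channel 1: excess low-frequency power
log_psd[1, high] = -1.4              # ... and a deficit at high frequencies

# Summing over the whole spectrum hides the difference between channels.
total = log_psd.sum(axis=1)

# Summing within bands keeps it visible.
per_band = {
    "low": log_psd[:, low].sum(axis=1),
    "high": log_psd[:, high].sum(axis=1),
}
```

Both totals are (numerically) zero, yet the band sums for channel 1 are about +21 in the low band and about -21 in the high band.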

Removed third criterion

pyprep/find_noisy_channels.py

docs/matlab_differences.rst

Co-authored-by: Stefan Appelhoff <[email protected]>

Randomidous · 2026-01-30T13:37:26Z

After testing with some artifact-laden and some fairly clean data, I figured that the thresholds need to be adapted or some criteria (eyeing bad_by_ratio atm) removed.

See plots below:
Noisy data:

Less noisy data (note still 7 channels removed):

Randomidous · 2026-01-30T13:40:50Z

Removing bad_by_ratio criterion:

What it did: For each channel, it computed the ratio of power between frequency bands (low/mid, low/high, mid/high), then flagged channels where any of these ratios was unusual compared to other channels.

The rationale: The idea was to catch channels with abnormal spectral shape even if absolute power was normal. For example:

A channel with normal total power but unusually flat spectrum (similar power across bands)
A channel with an unusual bump in one band relative to others

Why it's problematic:

There's natural variation in spectral shape across the scalp (frontal vs occipital channels have different alpha power, etc.). Imagine doing workload analysis and using this method.
Three separate ratio tests with OR logic means roughly 3x the chance of false positives (I should have known better)
If a channel passes bad_by_band (absolute power is normal in all bands), the spectral shape is probably fine too; ratio deviations at that point are likely just normal topographic variation
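
The false-positive point can be checked with a quick back-of-the-envelope calculation. The sketch assumes normally distributed z-scores, a two-sided threshold of 3, and three independent tests. The independence assumption is generous, since in log space the mid/high ratio is fully determined by the other two, so treat the result as a rough upper-end illustration only.

```python
import math

# Two-sided tail probability of a standard normal beyond |z| = 3.
p_single = math.erfc(3.0 / math.sqrt(2.0))

# Chance that at least one of three independent tests fires (OR logic).
p_any_of_three = 1.0 - (1.0 - p_single) ** 3

# How much the per-channel false-positive rate grows relative to one test.
inflation = p_any_of_three / p_single
```

Under these assumptions the per-channel false-positive rate rises from about 0.27% to about 0.81%, i.e. close to three times the single-test rate.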

The remaining criteria are more robust:

bad_by_band: Catches genuinely abnormal power (too high or too low)
bad_by_1f_violation: Catches channels where high-freq > low-freq power, which violates the fundamental 1/f characteristic of EEG and indicates noise/artifact

codecov · 2026-01-30T13:52:21Z

Codecov Report

❌ Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 98.08%. Comparing base (34d4773) to head (1292494).

Files with missing lines	Patch %	Lines
pyprep/find_noisy_channels.py	98.07%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #179      +/-   ##
==========================================
- Coverage   98.09%   98.08%   -0.01%     
==========================================
  Files           7        7              
  Lines         734      785      +51     
==========================================
+ Hits          720      770      +50     
- Misses         14       15       +1


Randomidous · 2026-01-30T14:02:00Z

Without the ratio criterion, we still see a lot of channels being removed for noisy data. Some of these are clearly bad but others I'd rather keep in the dataset. So I will dig a bit deeper.

Randomidous · 2026-01-30T14:03:42Z

> Without the ratio criterion, we still see a lot of channels being removed for noisy data. Some of these are clearly bad but others I'd rather keep in the dataset. So I will dig a bit deeper.

I mean, if a cleaning method fails on noisy data, then what's the point 🤔

Randomidous · 2026-01-30T14:08:35Z

Considering this atm:

only mark union of bad channels?
raise z-score thresh (on which grounds?)
only consider positive z-scores? (since lower than average is not too sus)

Randomidous · 2026-01-30T14:12:28Z

Will check only flagging channels with abnormally HIGH power (positive z-scores). Excess power is more reliably indicative of problems (muscle artifact, noise, bad contact causing interference). Low power is ambiguous.
Getting closer and closer to the HF noise criterion.

Randomidous · 2026-01-30T14:27:46Z

This seems to work splendidly!

Below are bad channels only identified by bad_by_PSD in grey:

These are all bads by PSD (as compared to unique for PSD):

Running it on the same clean data from earlier only returns a single bad channel by PSD:

Randomidous · 2026-01-30T15:02:06Z

The current implementation seems promising and will be tested on a complete dataset next week :)

sappelhoff

nice work here @Randomidous looking forward to seeing the test results next week.

sappelhoff

@nabilalibou please feel free to have a look at this PR that is in the same vein of what you worked on in 2024.

Please also supply us with your details so we can add you to CITATION.cff, to give you some credit for getting this started at some point.

Follow this example, please:

    - given-names: Roy Eric
      family-names: Wieske
      affiliation: 'Biopsychology and Neuroergonomics, Technische Universität Berlin, Berlin, Germany'
      orcid: 'https://orcid.org/0009-0006-2018-1074'

nabilalibou and others added 8 commits January 29, 2026 11:06

ADD prototype of bad_by_PSD() method

b5d9f5f

fix bad_by_PSD

2ce9ac1

avoid 50Hz

f5c1f8a

add test for bad_by_PSD

72ae6f8

update matlab diffs

fc9d3ab

add self as author

eff0a71

update changelog

157506f

add to authors

65de129

sappelhoff reviewed Jan 29, 2026: docs/changelog.rst (outdated)

sappelhoff reviewed Jan 29, 2026: pyprep/find_noisy_channels.py

sappelhoff reviewed Jan 29, 2026: docs/matlab_differences.rst

Randomidous and others added 5 commits January 29, 2026 19:11

Add author to change log

6bc4c76

Co-authored-by: Stefan Appelhoff <[email protected]>

split spectrum into three bands; add additional criteria

5a4597d

update tests

c1519a9

refine matlab diff details

80effa5

find bad by PSD by default

8c25932

Randomidous added 2 commits January 30, 2026 14:45

don't use ratio criterion

7a04d83

add to noisy dict; should fix tests

26c82d8

only consider positive z-scores

c4fa6c8

update test to reflect changed criterion

1292494

sappelhoff reviewed Jan 30, 2026
