Fix data split #277

niekdejonge · 2025-08-18T11:09:55Z

The data split was previously done separately for pos and neg. This PR fixes that.

In addition I made some changes to the training pipeline.
I removed the StoreTrainingData class. It previously handled storing and loading the data split, but I now find it pretty confusing and unnecessarily complex. Instead there is now a single split_data function. We no longer store pos and neg separately, but just handle that split during loading.
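As a rough illustration of the idea (this is not the code in this PR), the sketch below splits all spectra once and only filters by ionization mode afterwards. The function names, the dict-based spectra, and the choice of `inchikey` as grouping key are assumptions made for the example.

```python
import random
from collections import defaultdict


def split_data(spectra, key="inchikey", fractions=(0.8, 0.1, 0.1), seed=42):
    """Split spectra into train/val/test in one pass over all spectra.

    Hypothetical sketch: `spectra` is a list of dicts standing in for real
    spectrum objects. All spectra sharing the same `key` value end up in the
    same set, regardless of their ionization mode.
    """
    groups = defaultdict(list)
    for spectrum in spectra:
        groups[spectrum[key]].append(spectrum)

    # Shuffle the unique keys reproducibly, then cut them into three parts.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(len(keys) * fractions[0])
    n_val = int(len(keys) * fractions[1])
    key_sets = {
        "train": keys[:n_train],
        "val": keys[n_train:n_train + n_val],
        "test": keys[n_train + n_val:],
    }
    return {name: [s for k in ks for s in groups[k]] for name, ks in key_sets.items()}


def select_ionmode(spectra, ionmode):
    """Filter an already split set by ionization mode at loading time."""
    return [s for s in spectra if s["ionmode"] == ionmode]
```

Because pos and neg spectra go through the same split, a key that lands in the training set cannot also show up in the test set of the other mode.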

The file names are now stored/generated in SettingsMS2DeepScore, which means the spectrum file name is now passed to SettingsMS2DeepScore as well. I needed a place to store these file names and this seemed the most logical one. An added benefit is that the user can, but doesn't have to, specify any of the results folders. The settings also store the input and output file locations, which is nice if you want to trace back the origin of older models.
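To make the idea concrete, here is a minimal sketch of a settings object that derives its output locations from the input file unless the user overrides them. `SettingsSketch`, its fields, and the default file names are hypothetical and do not reflect the actual SettingsMS2DeepScore API.

```python
import os
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in for a settings class that tracks file locations."""

    spectrum_file_path: str
    # Optional: when left as None, a default next to the input file is used.
    trained_models_folder: Optional[str] = None

    def __post_init__(self):
        if self.trained_models_folder is None:
            input_dir = os.path.dirname(self.spectrum_file_path)
            self.trained_models_folder = os.path.join(input_dir, "trained_models")

    def split_file(self, subset: str) -> str:
        """Return the path where one split (e.g. "test") would be stored."""
        return os.path.join(self.trained_models_folder, f"{subset}_spectra.mgf")


settings = SettingsSketch("data/all_spectra.mgf")
# Saving asdict(settings) next to a model records both input and output paths.
stored = asdict(settings)
```

Storing the result of `asdict(settings)` alongside a trained model is one way to keep the input and output locations traceable later.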

sonarqubecloud · 2025-08-18T11:14:28Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
91.1% Coverage on New Code
0.0% Duplication on New Code



niekdejonge added 5 commits August 18, 2025 11:59

Added spectrum file path to SettingsMS2Deepscore.py.

b4531b3

Replace StoreTrainingData with split data if necessary

557c700

Update tests

adea510

Remove not needed file

ab5055d

update CHANGELOG.md

b6ef2fb

niekdejonge requested a review from florian-huber August 18, 2025 11:09

update README.md

3470716

niekdejonge added 3 commits August 18, 2025 14:12

Change testing_spectra_file_name -> test_spectra_file_name and store inside trained_models_folder

f724ff0

Already create dir during split_data_if_necessary

de3aebf

Remove double trained_models folder

a676153
