The data split was previously done separately for pos and neg. This PR fixes that.
In addition, I made some changes to the training pipeline.
I removed the StoreTrainingData class. It previously handled storing and loading the data split, but I now find it pretty confusing and unnecessarily complex. Instead, there is now a single split_data function. We no longer store pos and neg separately, but just handle the split during loading.
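To illustrate the idea (this is not the actual code in this PR), here is a minimal sketch of splitting the whole pool once and only selecting pos or neg when loading. The signature of `split_data`, the helper `select_ionmode`, the dict-based spectra, the `"ionmode"` key and the split fractions are all assumptions made for the example.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-in for a spectrum object: a dict with an "ionmode" key.
Spectrum = Dict[str, str]


def split_data(
    spectra: Sequence[Spectrum],
    val_fraction: float = 0.1,
    test_fraction: float = 0.1,
    seed: int = 42,
) -> Tuple[List[Spectrum], List[Spectrum], List[Spectrum]]:
    """Split all spectra at once, without looking at pos/neg (illustrative)."""
    shuffled = list(spectra)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


def select_ionmode(spectra: Sequence[Spectrum], ionmode: str) -> List[Spectrum]:
    """Pick pos or neg spectra after the split, i.e. at loading time."""
    return [s for s in spectra if s["ionmode"] == ionmode]
```

With this shape, pos and neg subsets are derived from one shared split instead of being stored as separate files.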
The file names are now stored/generated in SettingsMS2DeepScore. This means the spectrum file name has to be passed to SettingsMS2DeepScore as well. I needed a place to store these file names and this seemed the most logical one. The added benefit is that the user can, but doesn't have to, specify any of the results folders. The settings also store the input and output file locations, which is nice if you want to trace back the origin of older models.
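As a rough sketch of this pattern (not the real SettingsMS2DeepScore API), a settings object can take the spectrum file path and derive the output locations unless the user overrides them. The class name `SettingsSketch`, its attributes, and the `trained_models` and `model.pt` names are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical settings object; not the real SettingsMS2DeepScore."""

    spectrum_file_path: str
    # Optional: when left as None, a default next to the input file is used.
    results_folder: Optional[str] = None

    def __post_init__(self) -> None:
        if self.results_folder is None:
            root = os.path.dirname(self.spectrum_file_path)
            # "trained_models" is an assumed default folder name.
            self.results_folder = os.path.join(root, "trained_models")

    @property
    def model_file_path(self) -> str:
        # "model.pt" is an assumed output file name.
        return os.path.join(self.results_folder, "model.pt")
```

Because both the input path and the derived output paths live on one object, saving that object alongside a model records where the model came from.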