Description
Describe the bug
When using a real-life phosphoproteomics benchmark collection, the alphadia pipeline fails with an error in the transfer learning step.
To Reproduce
Steps to reproduce the behavior:
1. Download the following files from the PRIDE archive (accession PXD014525; the sample that failed is 20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_5uM_03, as named in the error log below):
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_0-5uM_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_0-5uM_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_0-5uM_03.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_5uM_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_5uM_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_5uM_03.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_EGF_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_EGF_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_EGF_03.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_NoSerum_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_NoSerum_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_NoSerum_03.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_0-5uM_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_0-5uM_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_0-5uM_03.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_5uM_01.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_5uM_02.raw
20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_PD0325901_5uM_03.raw
2. Download UP000005640_9606.fasta from UniProt.
3. Create a config file with these key parameters set:
general:
  transfer_step_enabled: true
  use_gpu: true
  ..
search:
  extraction_backend: python
  target_ms1_tolerance: 17
  target_ms2_tolerance: 25
  ..
library_prediction:
  enabled: true
  variable_modifications: Oxidation@M;Acetyl@Protein_N-term;Phospho@STY
  ..
transfer_library:
  enabled: true
  ..
transfer_learning:
  enabled: true
4. Start the pipeline with alphadia --config <yourconfig.yaml>
Expected behavior
A. The pipeline should check whether there are enough precursors to perform transfer learning on.
B. The pipeline should not allow an empty array to be passed to PolynomialFeatures.
C. The pipeline should take appropriate measures to modify the transfer learning step as needed.
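The kind of guard meant in A and B could look roughly like the sketch below. It is only an illustration and not alphadia code: `has_enough_precursors`, `run_transfer_step`, `MIN_PRECURSORS_FOR_TRANSFER` and the `calibrate` callback are hypothetical names, and the threshold of 1 is an assumption chosen to match the minimum stated in the error message.

```python
from typing import Callable, Optional, Sequence

# Hypothetical threshold for illustration; not an alphadia setting.
MIN_PRECURSORS_FOR_TRANSFER = 1


def has_enough_precursors(
    n_precursors: int, minimum: int = MIN_PRECURSORS_FOR_TRANSFER
) -> bool:
    """Return True when there are enough identified precursors to train on."""
    return n_precursors >= minimum


def run_transfer_step(
    precursor_mz: Sequence[float],
    calibrate: Callable[[Sequence[float]], object],
    minimum: int = MIN_PRECURSORS_FOR_TRANSFER,
) -> Optional[object]:
    """Call `calibrate` only when enough precursors are available.

    `calibrate` stands in for whatever fits the calibration model
    (the step that reached PolynomialFeatures in the log).
    Returns None when the step is skipped.
    """
    n = len(precursor_mz)
    if not has_enough_precursors(n, minimum):
        print(f"Skipping transfer learning: {n} precursors, need at least {minimum}")
        return None
    return calibrate(precursor_mz)
```

With an empty input, the step is skipped with a message instead of raising inside the model fit, for example `run_transfer_step([], calibrate=len)` returns `None`.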
Logs
Please provide the log (see the AlphaDIA terminal on where to find it).
8:49:27.440427 INFO: Collecting candidate features
8:50:17.357573 WARNING: intensity_correlation has 3451 NaNs ( 0.01 % out of 30183084)
8:50:17.402002 WARNING: height_correlation has 3 NaNs ( 0.00 % out of 30183084)
8:50:18.544694 INFO: Collecting fragment features
8:50:35.139115 INFO: Finished candidate scoring
8:50:53.413335 INFO: === Performing FDR correction with classifier version 13 ===
8:50:53.437830 INFO: performing precursor FDR with 47 features
8:50:53.437860 INFO: Decoy channel: -1
8:50:53.437876 INFO: competitive: True
8:50:59.184253 INFO: Setting torch num_threads to 2 for FDR classification task
8:51:02.943722 WARNING: dropped 1845 target PSMs due to missing features
8:51:02.943837 WARNING: dropped 1606 decoy PSMs due to missing features
8:51:15.488478 INFO: Estimating optimal hyperparameters - samples: 24,143,706, batch_size: 4,096, learning_rate: 1.00e-03
8:51:15.554330 INFO: Using random state 368999 for train-test-split
9:01:03.822083 INFO: Test AUC: 0.565
9:01:03.834238 INFO: Train AUC: 0.566
9:01:03.834296 INFO: AUC difference: 0.13%
9:01:07.056401 INFO: Resetting torch num_threads to 6
9:01:07.464400 INFO: Removing fragments below FDR threshold
9:01:08.097341 PROGRESS: ============================= Precursor FDR =============================
9:01:08.097517 PROGRESS: Total precursors accumulated: 0
9:01:08.097567 PROGRESS: Target precursors: 0 (0.00%)
9:01:08.097600 PROGRESS: Decoy precursors: 0 (0.00%)
9:01:08.097629 PROGRESS:
9:01:08.097656 PROGRESS: Precursor Summary:
9:01:08.106466 PROGRESS:
9:01:08.106567 PROGRESS: Protein Summary:
9:01:08.106642 PROGRESS: =========================================================================
9:01:08.317947 PROGRESS: === Transfer learning quantification ===
9:01:08.318314 INFO: creating library for charged fragment types: ['b', 'y']
9:01:08.328189 INFO: Calibrating library
9:01:08.328299 INFO: Predicting estimator 'mz' in calibration group 'precursor' ..
9:01:08.328993 ERROR: Search for 20171125_QE7_nLC14_DBJ_SA_DIAphos_RPE1_pilot2_Cobimetinib_5uM_03 failed with error: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by PolynomialFeatures.
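The log above reports "Total precursors accumulated: 0" just before the transfer learning step fails. As a small aid for triaging other runs in the collection, the following sketch scans log text for that summary line. The function name `total_precursors` is hypothetical, and the regular expression assumes the line is formatted exactly as in this log.

```python
import re
from typing import Optional

# Matches the summary line as it appears in the log above.
_TOTAL_RE = re.compile(r"Total precursors accumulated:\s*([\d,]+)")


def total_precursors(log_text: str) -> Optional[int]:
    """Return the last reported precursor total, or None if no such line exists."""
    matches = _TOTAL_RE.findall(log_text)
    if not matches:
        return None
    # Strip thousands separators in case the count is printed with commas.
    return int(matches[-1].replace(",", ""))


sample = (
    "9:01:08.097517 PROGRESS: Total precursors accumulated: 0\n"
    "9:01:08.097567 PROGRESS: Target precursors: 0 (0.00%)\n"
)
print(total_precursors(sample))  # 0
```

A result of 0 for a raw file would indicate that the file reached the transfer learning step without any precursors passing FDR, which is the situation reported here.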
Screenshots
If applicable, add screenshots to help explain your problem.
Version (please complete the following information):
- Installation Type [e.g. One-Click Installer / Pip / Developer]
- If no log is available, provide the following:
- Platform information
- system [Debian]
- release [12]
- version [NA]
- machine [x86_64]
- processor [x86_64]
- cpu count [12, with HT]
- Python information:
- alphadia version 2.0.1-dev0
- Python version: 3.10.10
Additional context
Add any other context about the problem here. Attached log files or upload data files if possible.