Skipped 6000 image pairs
2022-01-08 18:00:28,652 - root - INFO -
2022-01-08 18:00:28,652 - root - INFO - Before pruning:
2022-01-08 18:00:28,652 - root - INFO - Sparsity range: 0.0 -> 0.1
2022-01-08 18:00:28,652 - root - INFO -
Train Ep. #1: 4%|##3 | 465/12274 [05:52<1:17:12, 2.55it/s, loss=9.35, accuracy=nan, lr=0.0005, sparsity=1, network_width_mpl=1]
Is it normal to see accuracy=nan at the beginning of training?