I ran the model on Windows 10 with CPU only, but each epoch takes 4 hours, so 100 epochs would need 400 hours to train the whole model. It is claimed to be faster than biLSTM+CRF, but in my experience it is not.
I also ran BERT+biLSTM+CRF in the same environment (Windows 10 with CPU), and it took only 10 hours; however, its accuracy is 0.92.
Could you please tell me why?
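
For reference, the time estimate above can be sketched as follows. This is only a rough calculation that assumes every epoch takes the same time and that the 10 hours for BERT+biLSTM+CRF is its total training time; `estimate_total_hours` is a hypothetical helper name, not part of either model's code.

```python
def estimate_total_hours(hours_per_epoch: float, epochs: int) -> float:
    """Estimate total training time, assuming a constant time per epoch."""
    return hours_per_epoch * epochs


# Figures reported above: 4 hours per epoch, 100 epochs.
model_total = estimate_total_hours(4, 100)

# Assumption: the 10 hours reported for BERT+biLSTM+CRF covers the whole run.
baseline_total = 10

print(f"Estimated total for this model: {model_total} hours")
print(f"Slowdown relative to BERT+biLSTM+CRF: {model_total / baseline_total}x")
```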