**else: if beam_search: return outputs[0], outputs[1], outputs[2:] # No gradient norm, loss, outputs.** I have a question about this line: why does it return three values when the comment mentions only two (loss and outputs, with no gradient norm)?
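
To make the question concrete, here is a minimal sketch of the quoted return statement in isolation. It assumes `outputs` is a plain Python list; the placeholder strings and the `beam_search_return` helper are hypothetical and only stand in for whatever the real code produces.

```python
# Hypothetical stand-in for `outputs`; the real contents are not known here.
outputs = ["first", "second", "third", "fourth"]


def beam_search_return(outputs):
    """Reproduce only the quoted return statement (assumed helper name)."""
    # Two single elements plus a slice holding everything from index 2 onward.
    return outputs[0], outputs[1], outputs[2:]


result = beam_search_return(outputs)
print(len(result))  # 3
print(result)       # ('first', 'second', ['third', 'fourth'])
```

The caller therefore receives a 3-tuple, while the comment names only two items, which is the mismatch I am asking about.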