The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
An implementation of the paper of the same title, by:
Zhanxing Zhu*, Jingfeng Wu*, Bing Yu, Lei Wu, Jinwen Ma
Two-dimensional experiments
See folder 2Dim.
One hidden layer experiments
See folder OneHiddenLayer.
FashionMNIST experiments
See folder FashionMNIST.
SVHN and CIFAR-10 experiments
See folders SVHN and CIFAR-10.