Annealed Gradient Descent (AGD)
Here, we propose a novel annealed gradient descent (AGD) method for
deep learning. AGD optimizes a sequence of gradually improved, smoother mosaic
functions that approximate the original non-convex objective function according
to an annealing schedule during the optimization process. We present a theoretical
analysis of its convergence properties and learning speed. The proposed AGD
algorithm is applied to learning deep neural networks (DNNs) for image recognition
on MNIST and speech recognition on Switchboard.
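The core idea above (descend on a sequence of smoothed surrogates that sharpen toward the true non-convex objective) can be sketched on a toy 1-D problem. This is an illustrative sketch only, not the paper's mosaic-function construction: it uses Gaussian smoothing with a Monte Carlo score-function gradient, and the objective, schedule, and step size below are made-up examples.

```python
import numpy as np

def smoothed_grad(f, x, sigma, rng, n_samples=400):
    """Monte Carlo gradient of the Gaussian-smoothed surrogate
    f_sigma(x) = E_z[f(x + sigma * z)], z ~ N(0, 1), using the
    score-function identity: grad f_sigma(x) = E_z[f(x + sigma*z) * z] / sigma."""
    z = rng.standard_normal(n_samples)
    return float(np.mean(f(x + sigma * z) * z) / sigma)

def annealed_gd(f, x0, sigmas=(2.0, 1.0, 0.5, 0.1), lr=0.05, steps=300, seed=0):
    """Gradient descent on progressively sharper smoothed surrogates,
    warm-starting each annealing stage from the previous stage's solution."""
    rng = np.random.default_rng(seed)
    x = x0
    for sigma in sigmas:          # annealing schedule: smoothing shrinks over stages
        for _ in range(steps):
            x -= lr * smoothed_grad(f, x, sigma, rng)
    return x

# Toy non-convex objective with several local minima (hypothetical example).
f = lambda x: x**2 + np.cos(5.0 * x)
x_star = annealed_gd(f, x0=3.0)
print(x_star, f(x_star))
```

With a large initial sigma the cosine ripples are smoothed away and the iterate follows the quadratic trend toward the origin; as sigma shrinks, the surrogate recovers the local structure and the iterate settles into a low-lying well rather than a shallow one far from the start.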
projects/asgd/start.1423674803.txt.gz · Last modified: 2015/02/11 17:13 by hj