--- projects:asgd:start [2015/02/11 17:09] – created hj
+++ projects:asgd:start [2015/07/12 14:04] (current) – hj
@@ Line 1: / Line 1: @@
 ====== Annealing SGD ======
-===== Annealing Stochastic Gradient Descent (ASGD) =====
+===== Annealed Stochastic Gradient Descent (AGD) =====
-{{ :projects:asgd:screen_shot_2015-02-11_at_12.08.29_pm.png?0x250 |}}
+{{ :projects:asgd:screen_shot_2015-02-11_at_12.08.29_pm.png?0x500 |}}
+\\
+\\
+Here, we propose a novel annealed gradient descent (AGD) method for
+deep learning. AGD optimizes a sequence of gradually improved, smoother mosaic
+functions that approximate the original non-convex objective function, following
+an annealing schedule during the optimization process. We present a theoretical
+analysis of its convergence properties and learning speed. The proposed AGD
+algorithm is applied to learning deep neural networks (DNNs) for image recognition
+on MNIST and speech recognition on Switchboard.
+\\
+**Reference:** \\
+[1] //Hengyue Pan, Hui Jiang//, "Annealed Gradient Descent for Deep Learning", //Proc. of the 31st Conference on Uncertainty in Artificial Intelligence (UAI 2015)//, July 2015. ([[http://auai.org/uai2015/proceedings/papers/73.pdf|here]])
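
The annealing idea in the summary above (optimize a sequence of smoother approximations of a non-convex objective, moving toward the original as the schedule proceeds) can be illustrated with a small toy sketch. This is not the mosaic-function construction from [1]: as a stand-in, it uses Gaussian smoothing of a hypothetical 1-D objective, chosen because the smoothed gradient has a closed form. The objective, the `sigmas` schedule, the step size, and all function names are assumptions made for illustration only.

```python
import math


def objective(x):
    """Toy non-convex objective (hypothetical, for illustration only)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def smoothed_grad(x, sigma):
    """Gradient of the Gaussian-smoothed objective E[f(x + sigma * z)], z ~ N(0, 1).

    For this toy f the expectation has a closed form: the quadratic term only
    gains a constant, and the sine term is damped by exp(-9 * sigma^2 / 2).
    With sigma = 0 this is the gradient of the original objective.
    """
    damp = math.exp(-4.5 * sigma * sigma)
    return 0.2 * x + 3.0 * damp * math.cos(3.0 * x)


def annealed_descent(x0, sigmas, steps_per_stage=200, lr=0.05):
    """Run gradient descent on each smoothed surrogate in turn.

    `sigmas` is the (assumed) annealing schedule: large values give a very
    smooth surrogate, and the final 0.0 recovers the original objective.
    """
    x = x0
    for sigma in sigmas:
        for _ in range(steps_per_stage):
            x -= lr * smoothed_grad(x, sigma)
    return x


def plain_descent(x0, steps=1000, lr=0.05):
    """Baseline: ordinary gradient descent on the unsmoothed objective."""
    return annealed_descent(x0, [0.0], steps_per_stage=steps, lr=lr)


if __name__ == "__main__":
    schedule = [1.5, 1.0, 0.5, 0.25, 0.0]  # assumed schedule, coarse to exact
    x_plain = plain_descent(4.0)
    x_annealed = annealed_descent(4.0, schedule)
    print("plain GD:    x = %.4f, f(x) = %.4f" % (x_plain, objective(x_plain)))
    print("annealed GD: x = %.4f, f(x) = %.4f" % (x_annealed, objective(x_annealed)))
```

From the same starting point, plain gradient descent settles in the nearest local minimum, while the annealed run first follows the nearly convex surrogate toward the region of the global minimum and then refines its position as the smoothing is reduced. On this toy problem the effect is easy to see; it says nothing about the convergence results or the experiments reported in [1].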