

Convergence of Stochastic Gradient Descent in Deep Neural Network


Zhou Baicun 1, Han Congying 1,2 *, Guo Tiande 1,2
Abstract: Stochastic gradient descent (SGD) is one of the most common optimization algorithms used in pattern recognition and machine learning. This algorithm and its variants are the preferred algorithms for optimizing the parameters of deep neural networks, owing to their low storage requirements and fast computation speed. Previous studies on the convergence of these algorithms were based on some traditional assumptions from optimization problems. However, the deep neural network has its own unique properties, and some of these assumptions are inappropriate for the actual optimization process of this kind of model. In this paper, we modify the assumptions to make them more consistent with the actual optimization process of deep neural networks. Based on the new assumptions, we study the convergence and convergence rate of SGD and two of its common variants. In addition, we carry out numerical experiments with LeNet-5, a common network framework, on the MNIST data set to verify the rationality of our assumptions.
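The record does not reproduce the paper's algorithms. As a minimal illustrative sketch of the updates under study, and assuming the "two common variants" are the classical Polyak heavy-ball and Nesterov momentum schemes cited in references 11 and 12 below, plain SGD and both variants on a toy least-squares problem might look like the following (the problem, step sizes, and all names here are this sketch's assumptions, not taken from the paper):

import numpy as np

# Hypothetical toy problem: minimize 0.5*||Ax - b||^2 / n with noisy
# mini-batch gradients, standing in for the paper's deep-network objective.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))          # synthetic data matrix
x_true = rng.standard_normal(10)            # ground-truth parameters
b = A @ x_true + 0.01 * rng.standard_normal(200)

def stochastic_grad(x, batch=16):
    """Mini-batch estimate of the gradient of 0.5*||Ax - b||^2 / n."""
    idx = rng.integers(0, A.shape[0], size=batch)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch

def sgd(x, lr=0.05, steps=500):
    """Plain SGD: x_{k+1} = x_k - lr * g(x_k)."""
    for _ in range(steps):
        x = x - lr * stochastic_grad(x)
    return x

def heavy_ball(x, lr=0.05, beta=0.9, steps=500):
    """Polyak momentum: v_{k+1} = beta*v_k - lr*g(x_k); x_{k+1} = x_k + v_{k+1}."""
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * stochastic_grad(x)
        x = x + v
    return x

def nesterov(x, lr=0.05, beta=0.9, steps=500):
    """Nesterov momentum: the gradient is evaluated at the look-ahead point x + beta*v."""
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * stochastic_grad(x + beta * v)
        x = x + v
    return x

x0 = np.zeros(10)
for name, method in [("SGD", sgd), ("heavy ball", heavy_ball), ("Nesterov", nesterov)]:
    x = method(x0.copy())
    print(f"{name:>10}: ||x - x*|| = {np.linalg.norm(x - x_true):.4f}")

The only structural difference between the two variants is where the gradient is evaluated: heavy ball uses the current iterate, while Nesterov uses the look-ahead point, which is what yields its accelerated O(1/k^2) rate in the deterministic convex setting of reference 11.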
Source: Acta Mathematicae Applicatae Sinica-English Series, 2021, 37(1): 126-136 [Core Library]
DOI: 10.1007/s10255-021-0991-2
Keywords: stochastic gradient descent; deep neural network; convergence
Affiliations:

1. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049  

2. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, 100190

Language: English
Document type: Research article
ISSN: 0168-9673
Subject: Automation and computer technology
Funding: National Natural Science Foundation of China; the Leading Project of the Chinese Academy of Sciences
CSCD accession number: CSCD:6915543

References (15)

1. Allen-Zhu Z. Optimal black-box reductions between optimization objectives. Advances in Neural Information Processing Systems, 2016: 1614-1622. Cited in CSCD: 1
2. Allen-Zhu Z. Neon2: finding local minima via first-order oracles. Advances in Neural Information Processing Systems, 2018: 3716-3726. Cited in CSCD: 1
3. Bottou L. The tradeoffs of large scale learning. Advances in Neural Information Processing Systems, 2008: 161-168. Cited in CSCD: 1
4. Bottou L. Large-scale machine learning with stochastic gradient descent. Proceedings of the International Conference on Computational Statistics, 2010: 177-186. Cited in CSCD: 1
5. Bottou L. Optimization methods for large-scale machine learning. SIAM Review, 2018, 60(2): 223-311. Cited in CSCD: 67
6. Deng L. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 2012, 29(6): 141-142. Cited in CSCD: 94
7. Greff K. LSTM: a search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 2016, 28(10): 2222-2232. Cited in CSCD: 339
8. He K. Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980-2988. Cited in CSCD: 22
9. LeCun Y. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278-2324. Cited in CSCD: 2289
10. LeCun Y. Deep learning. Nature, 2015, 521(7553): 436-444. Cited in CSCD: 3550
11. Nesterov Y E. A method for solving the convex programming problem with convergence rate O(1/k^2). Doklady Akademii Nauk SSSR, 1983, 269: 543-547. Cited in CSCD: 23
12. Polyak B T. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 1964, 4(5): 1-17. Cited in CSCD: 67
13. Robbins H. A stochastic approximation method. Herbert Robbins Selected Papers, 1985: 102-109. Cited in CSCD: 1
14. Ge R. Escaping from saddle points: online stochastic gradient for tensor decomposition. Proceedings of the Conference on Learning Theory, 2015: 797-842. Cited in CSCD: 1
15. Sainath T N. Deep convolutional neural networks for large-scale speech tasks. Neural Networks, 2015, 64: 39-48. Cited in CSCD: 44
Citing documents (1)

1. Guo Jintao. Prediction of rolling force in hot rolling of wide and heavy plate based on deep learning. Forging & Stamping Technology, 2022, 47(7): 167-174. Cited in CSCD: 3


