基于无标签视频数据的深度预测学习方法综述
A Survey on Deep Predictive Learning Based on Unlabeled Videos

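Authors: Pan Minting (潘敏婷) 1, Wang Yunbo (王韫博) 1 *, Zhu Xiangming (朱祥明) 1, Gao Siyu (高思宇) 1, Long Mingsheng (龙明盛) 2, Yang Xiaokang (杨小康) 1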
Abstract: Deep predictive learning based on video data (hereinafter "deep predictive learning") is a research direction at the intersection of deep learning, computer vision, and reinforcement learning. It is a key component of intelligent prediction and decision-making systems in scenarios such as weather forecasting, autonomous driving, and visual control for robotics, and has become an active area of machine learning research in recent years. Deep predictive learning follows the self-supervised learning paradigm: it mines supervisory signals from the unlabeled video data itself to learn representations of the underlying spatiotemporal patterns. This paper reviews existing research on deep-learning-based video prediction in detail. First, we outline the research scope of deep predictive learning and its cross-disciplinary application areas. Second, we summarize the datasets and evaluation metrics commonly used in video prediction research. Third, we categorize and compare current mainstream deep predictive learning models from three perspectives: video prediction in observation space, video prediction in state space, and model-based visual decision-making. Finally, we analyze open issues in the field and discuss future research trends.
Source: Acta Electronica Sinica (电子学报), 2022, 50(4): 869-886 [CSCD core collection]
DOI: 10.12263/DZXB.20211209
Keywords: deep learning; self-supervised learning; computer vision; video prediction; model-based visual decision-making
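Affiliations:

1. MoE Key Laboratory of Artificial Intelligence, Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai 201109

2. School of Software, Tsinghua University, Beijing 100084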

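Language: Chinese
Document type: Review
ISSN: 0372-2112
Subject: Automation and computer technology
Funding: National Natural Science Foundation of China; Shanghai Municipal Science and Technology Major Project; Shanghai Sailing Program
Accession number: CSCD:7190614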

References (122 in total, 7 pages)

1.  Shi X J. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Proceedings of The Advances in Neural Information Processing Systems, 2015: 802-810. CSCD citations: 3
2.  Chandra R. TraPHic: Trajectory prediction in dense and heterogeneous traffic using weighted interactions. Proceedings of The IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 8483-8492. CSCD citations: 4
3.  Castrejon L. Improved conditional VRNNs for video prediction. Proceedings of The IEEE/CVF International Conference on Computer Vision, 2019: 7608-7617. CSCD citations: 1
4.  Zhang J. DNN-based prediction model for spatio-temporal data. Proceedings of The 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2016: 1-4. CSCD citations: 9
5.  Ebert F. Self-supervised visual planning with temporal skip connections. Proceedings of The 1st Annual Conference on Robot Learning, 2017: 344-356. CSCD citations: 2
6.  Ha D. World models, 2018. CSCD citations: 5
7.  Hafner D. Dream to control: Learning behaviors by latent imagination, 2019. CSCD citations: 1
8.  Wang Y. DualSMC: Tunneling differentiable filtering and planning under continuous POMDPs. Proceedings of The Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020: 4190-4198. CSCD citations: 1
9.  LeCun Y. Gradient-based learning applied to document recognition. Proceedings of The IEEE, 1998, 86(11): 2278-2324. CSCD citations: 2293
10.  Jain V. Supervised learning of image restoration with convolutional networks. Proceedings of The 11th International Conference on Computer Vision, 2007: 1-8. CSCD citations: 1
11.  Mathieu M. Deep multiscale video prediction beyond mean square error, 2015. CSCD citations: 1
12.  Oh J. Action-conditional video prediction using deep networks in Atari games. Proceedings of The Advances in Neural Information Processing Systems, 2015: 2863-2871. CSCD citations: 2
13.  Vukotic V. One-step time-dependent future video frame prediction with a convolutional encoder-decoder neural network. International Conference on Image Analysis and Processing, 2017: 140-151. CSCD citations: 1
14.  Jia X. Dynamic filter networks. Proceedings of The Advances in Neural Information Processing Systems, 2016: 667-675. CSCD citations: 1
15.  Xue T. Visual dynamics: Stochastic future generation via layered cross convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(9): 2236-2250. CSCD citations: 1
16.  Xu J. Structure preserving video prediction. Proceedings of The IEEE Conference on Computer Vision and Pattern Recognition, 2018: 1460-1469. CSCD citations: 1
17.  Jaderberg M. Spatial transformer networks. Proceedings of The Advances in Neural Information Processing Systems, 2015: 2017-2025. CSCD citations: 1
18.  Jin B. Exploring spatial-temporal multi-frequency analysis for high-fidelity and temporal-consistency video prediction. Proceedings of The IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 4554-4563. CSCD citations: 1
19.  Tran D. Learning spatiotemporal features with 3D convolutional networks. Proceedings of The IEEE International Conference on Computer Vision, 2015: 4489-4497. CSCD citations: 114
20.  Karpathy A. Large-scale video classification with convolutional neural networks. Proceedings of The IEEE Conference on Computer Vision and Pattern Recognition, 2014: 1725-1732. CSCD citations: 1
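Citing articles: 4

1. Luan Jianlin (栾建霖). 基于深度学习模型的船舶碳排放时空预测研究 [Spatiotemporal prediction of ship carbon emissions based on deep learning models]. Science Research Management (科研管理), 2023, 44(3): 75-85. CSCD citations: 0

2. Ma Zhifeng (马志峰). 基于深度学习的短临降水预报综述 [A survey of deep-learning-based precipitation nowcasting]. Computer Engineering & Science (计算机工程与科学), 2023, 45(10): 1731-1753. CSCD citations: 3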
