Depth estimation of monocular infrared images based on attention mechanism and graph convolutional neural network

Abstract: Estimating the depth of objects in a scene is a key problem in autonomous driving, and infrared images help to address it under poor lighting. Because infrared images have unclear texture and sparse edge information, a method combining an attention mechanism with a graph convolutional neural network is proposed for monocular infrared image depth estimation. First, the depth of each pixel is related not only to the depth of its neighboring pixels but also to that of pixels over a much larger range; an attention mechanism can effectively extract these pixel-level global depth associations. Second, the features obtained from these associations can be treated as non-Euclidean data, so a graph convolutional neural network (GCN) is used for further reasoning. Finally, in the training stage the continuous depth regression problem is converted into a classification problem, which makes training more stable and reduces the learning difficulty of the network. Experimental results show that the method performs well on the infrared dataset NUST-SR: on the threshold accuracy metric with threshold 1.25³, accuracy improves by 1.2%, an advantage over other methods.
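The threshold accuracy figure quoted in the abstract can be illustrated with a minimal sketch. It assumes the metric is the one commonly used in monocular depth estimation: the fraction of pixels whose ratio max(predicted/true, true/predicted) falls below 1.25 raised to a power of 1, 2, or 3. The function name `threshold_accuracy` and the sample depth values are hypothetical and are not taken from the paper.

```python
def threshold_accuracy(pred, truth, power=3, base=1.25):
    """Fraction of pixels with max(pred/truth, truth/pred) < base**power.

    Assumes all depths are strictly positive (hypothetical helper,
    not the paper's evaluation code).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    limit = base ** power
    hits = 0
    for p, t in zip(pred, truth):
        ratio = max(p / t, t / p)  # symmetric relative error, always >= 1
        if ratio < limit:
            hits += 1
    return hits / len(pred)


# Made-up depths in metres for four pixels.
pred = [2.0, 4.5, 10.0, 1.0]
truth = [2.0, 5.0, 6.0, 3.0]

for k in (1, 2, 3):
    print(k, threshold_accuracy(pred, truth, power=k))
```

A looser threshold can only admit more pixels, so the score never decreases as the power grows; the 1.25³ figure is therefore the most permissive of the three usual settings.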
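Source: Journal of Applied Optics (应用光学), 2021, 42(1): 49-56 [CSCD extended database]
DOI: 10.5768/jao202142.0102001
Keywords: infrared image; depth estimation; attention mechanism; graph convolutional neural network
Address: School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237

Language: Chinese
Document type: Research article
ISSN: 1002-2082
Subject: Electronic technology, communication technology; automation technology, computer technology
Funding: National Natural Science Foundation of China, General Program
Accession number: CSCD:6919031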

References (17)

1. Silberman N. Indoor segmentation and support inference from RGBD images. Proceedings of the 12th European Conference on Computer Vision, 2012: 740-746. CSCD citations: 1
2. Eigen D. Depth map prediction from a single image using a multi-scale deep network. NIPS'14 Proceedings of the 27th International Conference on Neural Information Processing Systems, 2014: 2366-2374. CSCD citations: 2
3. Eigen D. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 2650-2658. CSCD citations: 7
4. Laina I. Deeper depth prediction with fully convolutional residual networks. 2016 Fourth International Conference on 3D Vision (3DV), 2016: 239-248. CSCD citations: 13
5. He Kaiming. Deep residual learning for image recognition. 2016 IEEE CVF Conference on Computer Vision and Pattern Recognition, 2016: 770-778. CSCD citations: 1
6. Wu Shouchuan. Depth estimation of monocular infrared video based on bidirectional recurrent convolutional neural network. Acta Optica Sinica, 2017, 37(12): 246-254. CSCD citations: 2
7. Gu Tingting. Depth estimation of a single infrared image based on inter-frame information extraction. Laser & Optoelectronics Progress, 2018, 55(6): 163-172. CSCD citations: 2
8. Yu F. Multi-scale context aggregation by dilated convolutions. arXiv, 2016: 1511.07122. CSCD citations: 1
9. Li Bo. Monocular depth estimation with hierarchical fusion of dilated CNNs and soft-weighted-sum inference. Pattern Recognition, 2018, 83: 328-339. CSCD citations: 14
10. Fu H. Deep ordinal regression network for monocular depth estimation. 2018 IEEE CVF Conference on Computer Vision and Pattern Recognition, 2018: 2002-2011. CSCD citations: 2
11. Bahdanau D. Neural machine translation by jointly learning to align and translate. arXiv, 2015: 1409.0473. CSCD citations: 1
12. Xu Dan. Structured attention guided convolutional neural fields for monocular depth estimation. 2018 IEEE CVF Conference on Computer Vision and Pattern Recognition, 2018: 3917-3925. CSCD citations: 1
13. Li R B. Deep attention-based classification network for robust depth prediction. Computer Vision-ACCV 2018, 2019: 663-678. CSCD citations: 1
14. Chen Yuru. Scene depth estimation based on an adaptive pixel-level attention model. Journal of Applied Optics, 2020, 41(3): 490-499. CSCD citations: 2
15. Fu Junwei. Monocular depth estimation based on multi-scale graph convolution networks. IEEE Access, 2020(8): 997-1009. CSCD citations: 4
16. Xu Keyulu. Representation learning on graphs with jumping knowledge networks. arXiv, 2018: 1806.03536. CSCD citations: 1
17. Simonyan K. Very deep convolutional networks for large-scale image recognition. arXiv, 2014: 1409.1556. CSCD citations: 5
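Citing documents (2)

1. Shi Zonghan. Hyperspectral object tracking based on attention and angular margin loss. Journal of Applied Optics, 2022, 43(5): 893-903. CSCD citations: 0

2. Shi Qin. Unsupervised monocular infrared image depth estimation based on a local planar guidance layer. Automotive Engineering, 2023, 45(12): 2291-2298. CSCD citations: 0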
