
基于双融合框架的多模态3D目标检测算法
A Multimodal 3D Object Detection Method Based on Double-Fusion Framework


Ge Tong'ao (葛同澳) 1, Li Hui (李辉) 1 *, Guo Ying (郭颖) 1, Wang Junyin (王俊印) 2, Zhou Di (周迪) 1
Abstract Multimodal 3D object detection that fuses camera and LiDAR data can exploit the complementary strengths of the two sensors to improve detection accuracy and robustness. However, the complexity of real environments and the inherent differences between the modalities mean that 3D object detection still faces many challenges. This paper proposes a multimodal 3D object detection algorithm built on a double-fusion framework. The framework fuses features at both the voxel level and the grid level, which effectively alleviates the semantic differences between modalities during fusion. An ABFF (Adaptive Bird-eye-view Features Fusion) module is proposed to strengthen the algorithm's perception of small-object features. With voxel-level global fusion information guiding grid-level local fusion, a Transformer-based multimodal grid feature encoder is proposed to extract richer contextual information from 3D detection scenes and to improve the algorithm's runtime efficiency. Experimental results on the KITTI standard dataset show that the proposed algorithm reaches an average detection precision of 78.79%, indicating better 3D object detection performance.
Source Acta Electronica Sinica (电子学报), 2023, 51(11): 3100-3110 [Core collection]
DOI 10.12263/DZXB.20230414
Keywords deep learning; 3D object detection; LiDAR; camera; multimodal information fusion
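Addresses

1. School of Data Science, Qingdao University of Science and Technology, Qingdao, Shandong 266000

2. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, Hubei 430000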

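Language Chinese
Document type Research article
ISSN 0372-2112
Subject Automation technology, computer technology
Funding China University Industry-University-Research Innovation Fund; National Natural Science Foundation of China; Shandong Provincial Higher Education Youth Innovation Science and Technology Support Program
CSCD accession number CSCD:7641765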

