A novel cross-modal hashing algorithm based on multimodal deep learning
Abstract
|
With the growing popularity of multimodal data on the Web, cross-modal retrieval on large-scale multimedia databases has become an important research topic. Hashing-based cross-modal retrieval methods assume that multimodal features share a latent space. To model the relationships among heterogeneous data, most existing methods embed the data into a joint abstraction space through linear projections. However, these approaches are sensitive to noise in the data and cannot make use of unlabeled data or multimodal data with missing values in real-world applications. To address these challenges, we propose a novel multimodal deep-learning-based hash (MDLH) algorithm. Specifically, MDLH uses a deep neural network to encode heterogeneous features into a compact common representation and learns the hash functions from that representation. The parameters of the whole model are then fine-tuned in a supervised training stage. Experiments on two standard datasets show that MDLH is more effective than the compared methods in cross-modal retrieval. |
Source
|
Science China. Information Science
, 2017, 60(9): 092104-1-092104-14 [Core collection]
|
DOI
|
10.1007/s11432-015-0902-2
|
Keywords
|
hashing
;
cross-modal retrieval
;
cross-modal hashing
;
multimodal data analysis
;
deep learning
|
Affiliations
|
1.
School of Information Science and Engineering, Northeastern University, Shenyang, 110819
2.
School of Information Science and Engineering, Northeastern University, Key Laboratory of Medical Image Computing, MOE, Shenyang, 110819
|
Language
|
English |
Document type
|
Research article |
ISSN
|
1674-733X |
Subject
|
Automation technology, computer technology |
Funding
|
National Natural Science Foundation of China
;
Fundamental Research Funds for the Central Universities of China
|
Accession number
|
CSCD:6087845
|
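The abstract in this record describes learning hash functions on a common representation shared by several modalities. As a rough illustration of how such codes are typically used at query time, the sketch below binarizes real-valued common representations and ranks items by Hamming distance. It is not the MDLH model: the zero-threshold rule, the function names, and the toy vectors are all assumptions made for this example, and the step that produces the common representation (a deep network in the paper) is not shown.

```python
from typing import List, Sequence


def binarize(representation: Sequence[float]) -> List[int]:
    """Threshold a real-valued vector at zero to get hash bits (assumed rule)."""
    return [1 if value >= 0 else 0 for value in representation]


def hamming_distance(code_a: Sequence[int], code_b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def rank_by_hamming(query_code: Sequence[int],
                    database_codes: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda index: hamming_distance(query_code, database_codes[index]),
    )


# Hypothetical common representations: one text query, three images.
text_query = [0.8, -0.3, 0.1, -0.9]
image_database = [
    [-0.5, 0.4, -0.2, 0.7],   # every sign differs from the query
    [0.6, -0.1, 0.3, -0.4],   # every sign matches the query
    [0.9, 0.2, 0.5, -0.1],    # one sign differs from the query
]

query_code = binarize(text_query)
image_codes = [binarize(vector) for vector in image_database]
ranking = rank_by_hamming(query_code, image_codes)
print(ranking)
```

Because the codes are short bit vectors, comparing a text query against image codes costs only a bit-count per item, which is the usual motivation for hashing in large-scale cross-modal retrieval.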