基于高效用神经网络的文本分类方法
High Utility Neural Networks for Text Classification


Abstract  Existing deep-learning-based text classification methods do not take into account the importance of text features or the associations among them, which limits classification accuracy. To address this problem, this paper proposes a text classification model based on High Utility Neural Networks (HUNN) that can effectively represent the importance of text features and their associations. A Mining High Utility Itemsets (MHUI) algorithm obtains the importance and the co-occurrence frequency of each feature in the dataset; the co-occurrence frequency reflects, to some extent, the association between features. MHUI serves as the mining layer of HUNN and mines, for each class, the text features with strong importance and association. These features are then fed into the neural network, where a convolutional layer further distills high-level text features with stronger class-discriminative power, improving the model's classification accuracy. Experiments on six public benchmark datasets show that the proposed algorithm outperforms five baseline algorithms: Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Recurrent Convolutional Neural Networks (RCNN), the Fast Text Classifier (FAST), and Hierarchical Attention Networks (HAN).
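The mining-layer idea in the abstract (score each class's features by importance and co-occurrence, keep the strongest ones, then pass them to the network) can be sketched with a toy example. This is an illustrative simplification, not the paper's method: "utility" here is just co-occurrence frequency times a placeholder per-token weight (token length stands in for a domain-supplied importance weight), whereas the paper uses a full MHUI itemset miner followed by convolutional layers.

```python
from collections import Counter

def mine_high_utility_features(docs_by_class, k=3):
    """Toy stand-in for the MHUI mining layer: score each token of a
    class by utility = co-occurrence frequency * an external weight,
    then keep the top-k tokens per class.  (Hypothetical simplification
    of the paper's high-utility itemset mining.)"""
    selected = {}
    for label, docs in docs_by_class.items():
        # Co-occurrence frequency of each token within the class.
        freq = Counter(tok for doc in docs for tok in doc.split())
        # Placeholder external utility: token length stands in for a
        # domain-supplied importance weight.
        utility = {tok: n * len(tok) for tok, n in freq.items()}
        selected[label] = [t for t, _ in sorted(
            utility.items(), key=lambda kv: -kv[1])[:k]]
    return selected

docs = {
    "sports": ["the match was great", "great goal in the match"],
    "tech":   ["the new chip is fast", "fast chip benchmarks"],
}
features = mine_high_utility_features(docs, k=2)
print(features)
```

In the full model, the per-class feature sets selected this way would form the input representation of the neural network, whose convolutional layer then learns higher-level class-discriminative features.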
Source  Acta Electronica Sinica (电子学报), 2020, 48(2): 279-284 [Core collection]
DOI 10.3969/j.issn.0372-2112.2020.02.008
Keywords  data mining; association rules; high utility itemsets; natural language processing; text classification; neural networks
Address  School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China

Language  Chinese
Document type  Research article
ISSN  0372-2112
Subject  Automation and computer technology
Funding  National 973 Program; National Natural Science Foundation of China; Fundamental Research Funds for the Central Universities
CSCD accession number  CSCD:6770064

References (20 of 30 shown)

1. Hu X. Spam classification algorithm based on active learning and negative selection. Acta Electronica Sinica, 2018, 46(1): 203-209. CSCD citations: 9
2. Kim Y. Convolutional neural networks for sentence classification. Conference on Empirical Methods in Natural Language Processing, 2014: 1746-1751. CSCD citations: 10
3. Mullen T. Sentiment analysis using support vector machines with diverse information sources. Conference on Empirical Methods in Natural Language Processing, 2004: 412-418. CSCD citations: 2
4. Tan S. Adapting naive Bayes to domain adaptation for sentiment analysis. The 31st European Conference on IR Research, 2009: 337-349. CSCD citations: 1
5. Wawre S. Sentiment classification using machine learning techniques. International Journal of Science and Research, 2016, 5(4): 819-821. CSCD citations: 1
6. Maas A. Learning word vectors for sentiment analysis. The 49th Annual Meeting of the Association for Computational Linguistics, 2011: 142-150. CSCD citations: 1
7. Trstenjak B. KNN with TF-IDF based framework for text categorization. Procedia Engineering, 2014, 69(1): 1356-1364. CSCD citations: 14
8. Johnson R. Deep pyramid convolutional neural networks for text categorization. The 55th Annual Meeting of the Association for Computational Linguistics, 2017: 562-570. CSCD citations: 1
9. Kadhim A I. Survey on supervised machine learning techniques for automatic text classification. Artificial Intelligence Review, 2019, 52(1): 273-292. CSCD citations: 3
10. Johnson R. Effective use of word order for text categorization with convolutional neural networks. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015: 103-112. CSCD citations: 1
11. Yih W. Semantic parsing for single-relation question answering. The 52nd Annual Meeting of the Association for Computational Linguistics, 2014: 643-648. CSCD citations: 1
12. Shen Y. Learning semantic representations using convolutional neural networks for web search. Proceedings of the 23rd International Conference on World Wide Web, 2014: 373-374. CSCD citations: 16
13. Tang D. Learning semantic representations of users and products for document level sentiment classification. The 53rd Annual Meeting of the Association for Computational Linguistics, 2015: 1014-1023. CSCD citations: 1
14. Batmaz Z. A review on deep learning for recommender systems: challenges and remedies. Artificial Intelligence Review, 2018, 52(1): 1-37. CSCD citations: 18
15. Zhang X. Character-level convolutional networks for text classification. Advances in Neural Information Processing Systems, 2015: 649-657. CSCD citations: 45
16. Joulin A. Bag of tricks for efficient text classification. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017: 427-431. CSCD citations: 62
17. Liu P. Recurrent neural network for text classification with multi-task learning. The 26th International Joint Conference on Artificial Intelligence, 2017: 1480-1489. CSCD citations: 1
18. Lai S. Recurrent convolutional neural networks for text classification. Conference of the Association for the Advancement of Artificial Intelligence, 2015: 2267-2273. CSCD citations: 1
19. Yang Z. Hierarchical attention networks for document classification. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016: 1480-1489. CSCD citations: 1
20. Mikolov T. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 2013: 3111-3119. CSCD citations: 340
Citing papers (2 of 6 shown)

1. Chen C. Early detection method for ransomware based on API short sequences. Acta Electronica Sinica, 2021, 49(3): 586-595. CSCD citations: 3
2. Feng S. Text classification method for rice knowledge based on deep convolutional neural networks. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(3): 257-264. CSCD citations: 8


