
A Combined-Convolutional Neural Network for Chinese News Text Classification
(Original title: 基于组合-卷积神经网络的中文新闻文本分类)


Authors	Zhang Yu (张昱) 1,2; Liu Kaifeng (刘开峰) 1,*; Zhang Quanxin (张全新) 3; Wang Yange (王艳歌) 1; Gao Kailong (高凯龙) 1

Abstract	Most current research on news classification targets English, and the traditional machine learning methods in common use extract local text-block features incompletely when processing long texts. To address the lack of a dedicated term set for Chinese news classification, a vocabulary suited to Chinese news classification is built by constructing a data index and is combined with word2vec pre-trained word vectors to construct text features. To address incomplete feature extraction, the structure of the classical convolutional neural network model is modified and the effects of different convolution and pooling operations on the classification results are studied. To improve the precision of news text classification, this paper proposes and implements a combined-convolutional neural network model and designs effective model regularization and optimization methods. Experimental results show that the combined-convolutional neural network model reaches a precision of 93.69% on Chinese news text classification, which is 6.34% and 1.19% higher than the best traditional machine learning method and the classical convolutional neural network model, respectively, and that it also outperforms the comparison models in recall and F-measure.
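The abstract names three steps: building a vocabulary by constructing a data index, mapping tokens to word vectors, and applying convolution and pooling operations. The record does not give the architecture itself, so the following is only a minimal sketch of the general pattern (a token index, an embedding lookup, and parallel convolutions of several widths with max-over-time pooling), not the authors' model. All function names, the kernel widths (2, 3, 4), the vocabulary size, and the random embeddings standing in for word2vec vectors are assumptions made for illustration.

```python
import random
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"  # hypothetical reserved tokens

def build_vocab(tokenized_docs, max_size=5000):
    """Index the most frequent tokens; ids 0 and 1 are reserved for PAD and UNK."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    words = [w for w, _ in counts.most_common(max_size - 2)]
    return {w: i for i, w in enumerate([PAD, UNK] + words)}

def encode(doc, vocab, seq_len):
    """Map tokens to ids, truncating or right-padding to seq_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in doc[:seq_len]]
    return ids + [vocab[PAD]] * (seq_len - len(ids))

def conv_max_pool(embedded, kernel):
    """Slide one kernel over the sequence, apply ReLU, keep the maximum."""
    width = len(kernel)
    best = 0.0  # ReLU floor: the pooled activation is never below zero
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(w * x for k_row, vec in zip(kernel, window)
                    for w, x in zip(k_row, vec))
        best = max(best, score)
    return best

def extract_features(ids, embeddings, kernels):
    """Concatenate the pooled output of every kernel into one feature vector."""
    embedded = [embeddings[i] for i in ids]
    return [conv_max_pool(embedded, k) for k in kernels]

rng = random.Random(0)
docs = [["股市", "上涨", "股市"], ["球队", "获胜"]]  # toy pre-segmented documents
vocab = build_vocab(docs)
dim, seq_len = 8, 6

# Random vectors stand in for pre-trained word2vec embeddings (assumption).
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in vocab]
embeddings[vocab[PAD]] = [0.0] * dim  # padding contributes nothing

# Two random kernels for each assumed width 2, 3, and 4.
kernels = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for width in (2, 3, 4) for _ in range(2)]

features = extract_features(encode(docs[0], vocab, seq_len), embeddings, kernels)
```

In a trained classifier the kernels and a final softmax layer would be learned, and `features` would feed that layer; here the weights are random, so only the shapes and the data flow are meaningful.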
Source	Acta Electronica Sinica (电子学报), 2021, 49(6): 1059-1067 [Core Collection]
DOI	10.12263/DZXB.20200134
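Keywords	natural language processing; word vector; combined-convolutional neural network; Chinese news; text classification
Address	1. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, 100044; 2. State Key Laboratory for Geomechanics and Deep Underground Engineering, China University of Mining and Technology (Beijing), Beijing, 100083; 3. School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081
Language	Chinese
Document type	Research article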
ISSN	0372-2112
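Subject	Automation technology, computer technology
Funding	Beijing University of Civil Engineering and Architecture Outstanding Lecturer Cultivation Program; National Key Research and Development Program of China; Ministry of Education 2018 Industry-University Cooperative Education Project; Fundamental Research Funds for Beijing Municipal Universities; Beijing University of Civil Engineering and Architecture Graduate Innovation Project
Accession number	CSCD:7018836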

References: 28

Citing documents: 10

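1. Huang Youwen (黄友文). DistillBIGRU: A Text Classification Model Based on Knowledge Distillation. Journal of Chinese Information Processing (中文信息学报), 2022, 36(4): 81-89
Cited 3 times in CSCD

2. Chen Chuangang (陈传刚). Knowledge Graph Modeling and Early-Warning Method for Accident Evolution at Overseas Natural Gas Pipeline Stations under Harsh Environmental Conditions. Journal of Tsinghua University (Science and Technology) (清华大学学报. 自然科学版), 2022, 62(6): 1081-1087
Cited 3 times in CSCD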
