Feature Selection Algorithm Based on Multiple Correlation Measures
(Original Chinese title: 基于多种相关性度量的特征选择方法研究)

Abstract: Data mining and machine learning techniques currently face the challenge of large-sample, high-dimensional data, and feature selection has attracted considerable attention as an important means of dimensionality reduction. However, many existing filter feature selection methods use only a single correlation measure to remove redundant and irrelevant features, and they ignore the interaction between features. This paper therefore proposes a filter feature selection algorithm based on multiple correlation measures that also takes feature interaction into account. The algorithm fuses two correlation measures, each converted to a standard 0-1 form, and introduces a complementary correlation factor between the candidate feature and the already selected features to handle feature interaction. Experiments on eight UCI datasets with three commonly used classifiers verify the effectiveness of the algorithm, and the proposed method achieves better classification results than five representative filter feature selection methods.
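The abstract does not name the two correlation measures, the fusion rule, or the exact form of the complementary factor, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes absolute Pearson correlation and symmetric uncertainty as the two measures (both already lie in [0, 1]), an unweighted average as the fusion, and a normalized conditional mutual information term as the complementary factor. All function names and the scoring formula are hypothetical and are not taken from the paper.

```python
import numpy as np

def discretize(a, bins=5):
    """Map values to integer codes (equal-width bins for floats; an assumption)."""
    a = np.asarray(a)
    if a.dtype.kind in "iub":              # already categorical, e.g. class labels
        return np.unique(a, return_inverse=True)[1]
    edges = np.histogram_bin_edges(a, bins=bins)
    return np.digitize(a, edges[1:-1])     # codes 0 .. bins-1

def entropy(*cols):
    """Shannon entropy in bits of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def abs_pearson(a, b):
    """Absolute Pearson correlation, a linear measure in [0, 1]."""
    if np.std(a) == 0 or np.std(b) == 0:
        return 0.0
    return float(abs(np.corrcoef(a, b)[0, 1]))

def symmetric_uncertainty(da, db):
    """2 * I(A;B) / (H(A) + H(B)) on discrete codes, in [0, 1]."""
    ha, hb = entropy(da), entropy(db)
    if ha + hb == 0:
        return 0.0
    mi = max(ha + hb - entropy(da, db), 0.0)
    return 2.0 * mi / (ha + hb)

def fused(a, b, da, db):
    """Assumed fusion rule: unweighted mean of the two 0-1 measures."""
    return 0.5 * (abs_pearson(a, b) + symmetric_uncertainty(da, db))

def complement(df, ds, dy):
    """Assumed complementary factor: I(f; y | s) / H(y), in [0, 1]."""
    hy = entropy(dy)
    if hy == 0:
        return 0.0
    cmi = entropy(df, ds) + entropy(dy, ds) - entropy(df, dy, ds) - entropy(ds)
    return max(cmi, 0.0) / hy

def select_features(X, y, k, bins=5):
    """Greedy filter selection: relevance - redundancy + complementarity."""
    n_features = X.shape[1]
    D = [discretize(X[:, j], bins) for j in range(n_features)]
    dy = discretize(y)
    relevance = [fused(X[:, j], y, D[j], dy) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            red = np.mean([fused(X[:, j], X[:, s], D[j], D[s]) for s in selected])
            comp = np.mean([complement(D[j], D[s], dy) for s in selected])
            score = relevance[j] - red + comp
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Calling `select_features(X, y, k)` with a float feature matrix `X` and integer class labels `y` returns the indices of the chosen features in selection order. The complementary term is what lets a feature that looks weak on its own still be picked when it carries information about the class jointly with an already selected feature.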
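Source: Journal of Chinese Computer Systems (小型微型计算机系统), 2017, 38(4): 696-700 [Extended Database]
Keywords: feature selection; filter method; correlation; interacting features
Address: Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016

Language: Chinese
Document type: Research article
ISSN: 1000-1220
Subject: Automation technology, computer technology
Funding: Liaoning Province Science and Technology Program Project
Accession number: CSCD:5964404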

Citing documents: 3



iAuthor links (ORCID)
Zhou Xiaofeng (周晓锋) 0000-0001-9837-1261
Li Shuai (李帅) 0000-0002-7375-3551