Abstract
|
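Classification learning that exploits the hierarchical relationships among data categories is widespread in fields such as disease diagnosis and image annotation. However, the high dimensionality of the feature space burdens hierarchical classification learning with high time complexity and heavy storage costs. Moreover, existing studies all assume that the labels of the training set are sufficiently fine-grained, which conflicts with practical hierarchical classification, where assigning fine-grained labels is costly and category labels are semantically ambiguous. To address these problems, a coarse-to-fine hierarchical feature selection algorithm is proposed. The algorithm considers intra-class consistency and differences between sibling nodes to select representative features, and it predicts the unknown fine-grained labels of training samples during feature selection. Experimental results on seven benchmark datasets show that the proposed algorithm achieves better classification performance than several advanced comparison algorithms and can handle labels that are not sufficiently fine-grained. |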
English abstract
|
Classification learning that uses the hierarchy of data categories arises widely in practical applications such as disease diagnosis and image annotation. However, the high dimensionality of the feature space leaves hierarchical classification learning with high time and space complexity. In addition, existing work assumes that the labels of the training set are sufficiently fine-grained, which conflicts with practice: assigning fine-grained labels is costly, and category labels can be semantically ambiguous. To solve these problems, we propose a coarse-to-fine hierarchical feature selection algorithm. We consider intra-class consistency and inter-sibling variability to select representative features, and we predict the unknown fine-grained labels of the training samples during feature selection. Experimental results on seven benchmark datasets show that the proposed algorithm outperforms several advanced comparison algorithms in classification performance and can handle labels that are not sufficiently fine-grained. |
Source
|
Acta Electronica Sinica
, 2022, 50(11): 2778-2789 [Core Collection]
|
DOI
|
10.12263/DZXB.20211263
|
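Keywords
|
feature selection
;
hierarchical classification
;
label hierarchy
;
label granularity
;
recursive regularization
;
sparse optimization
;
global optimal solution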
|
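Addresses
|
1.
School of Computer Science, Minnan Normal University, Fujian, Zhangzhou, 363000
2.
Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Minnan Normal University, Fujian, Zhangzhou, 363000
3.
College of Computer Science and Technology, Huaqiao University, Fujian, Xiamen, 361021
4.
Department of Artificial Intelligence, Xiamen University, Fujian, Xiamen, 361005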
|
Language
|
Chinese |
Document type
|
Research article |
ISSN
|
0372-2112 |
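Subject
|
Automation and computer technology |
Funding
|
National Natural Science Foundation of China (General Program)
;
Natural Science Foundation of Fujian Province (Key Program)
|
Accession number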
|
CSCD:7362403
|