Optimizing the Yinyang K-means algorithm on many-core CPUs
(Original title: 面向众核处理器的阴阳K-means算法优化)

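Authors: Zhou Tianyang (周天阳) 1,2; Wang Qinglin (王庆林) 1,2 *; Li Rongchun (李荣春) 1,2; Mei Songzhu (梅松竹) 1,2; Yin Shangfei (尹尚飞) 1,2; Hao Ruochen (郝若晨) 1,2; Liu Jie (刘杰) 1,2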
Abstract: The traditional Yinyang K-means algorithm is computationally expensive when applied to large-scale clustering problems. Based on the architectural characteristics of typical many-core CPUs, an efficient parallel implementation of the Yinyang K-means algorithm is proposed. The implementation is built on a new in-memory data layout, uses the vector units of many-core CPUs to accelerate the distance calculations in Yinyang K-means, and applies memory-access optimizations targeted at non-uniform memory access (NUMA) characteristics. Compared with the open-source multi-threaded implementation of Yinyang K-means, it achieves speedups of up to about 5.6 and 8.7 on ARMv8 and x86 many-core platforms, respectively. These results show that the optimizations successfully accelerate the Yinyang K-means algorithm on many-core CPUs.
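As a rough illustration of why a data layout matters for vectorized distance calculation, the Python sketch below stores the centroids dimension-major, so that the innermost loop reads neighbouring centroids from contiguous memory, the access pattern a SIMD unit can load in one step. The layout, the `LANES` constant and all function names are assumptions made for this example; the record does not describe the layout actually used in the paper.

```python
# Illustrative sketch only: the layout and all names are assumptions,
# not taken from the paper.
from typing import List

LANES = 4  # hypothetical vector width: centroids processed per "vector"


def to_dimension_major(centroids: List[List[float]]) -> List[List[float]]:
    """Transpose k x d centroids into d rows of k contiguous values."""
    return [list(column) for column in zip(*centroids)]


def squared_distances(point: List[float],
                      centroids_dm: List[List[float]]) -> List[float]:
    """Squared Euclidean distance from one point to every centroid."""
    k = len(centroids_dm[0])
    acc = [0.0] * k
    for block in range(0, k, LANES):        # one "vector register" of centroids
        end = min(block + LANES, k)
        for dim, row in enumerate(centroids_dm):
            x = point[dim]                  # broadcast one coordinate of the point
            for j in range(block, end):     # lane-wise multiply-accumulate
                diff = x - row[j]
                acc[j] += diff * diff
    return acc


def nearest(point: List[float], centroids_dm: List[List[float]]) -> int:
    """Index of the centroid closest to the point."""
    dists = squared_distances(point, centroids_dm)
    return min(range(len(dists)), key=dists.__getitem__)
```

In a compiled implementation the loop over `j` would map onto vector instructions such as NEON or AVX-512; here it only shows the loop order that the transposed layout makes possible.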
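Source: Journal of National University of Defense Technology (国防科技大学学报), 2024, 46(1): 93-102 [CSCD core collection]
DOI: 10.11887/j.cn.202401010
Keywords: K-means; non-uniform memory access; vectorization; many-core CPUs; performance optimization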
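Affiliations:

1. College of Computer Science and Technology, National University of Defense Technology, Changsha, Hunan 410073

2. National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha, Hunan 410073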

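Language: Chinese
Document type: Research article
ISSN: 1001-2486
Subject: Automation technology, computer technology
Funding: National Natural Science Foundation of China
Accession number: CSCD:7660746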
