Research on Adaptation of CFD Software Based on Many-core Architecture of 100P Domestic Supercomputing System
Abstract | The domestic many-core processor provides two many-core-level parallel programming languages that differ considerably in porting difficulty. Because fluid dynamics software packages differ in how well they adapt to the many-core architecture, different packages suit different programming languages during porting and optimization. This paper first introduces the architecture, programming model, and parallel programming languages of the domestic many-core processor. It then analyzes the challenges of running fluid dynamics software on this processor, including the data dependencies introduced by implicit schemes, the solution of large sparse linear systems, multigrid methods, and unstructured grids, all of which limit how well the software adapts to the many-core architecture. A novel optimization algorithm is proposed for each of these problems, and theoretical analysis and experiments yield conclusions on the many-core adaptability of several typical fluid dynamics codes, including their speedup ratios. In practice, most fluid dynamics software adapts well to the domestic many-core processor: it can be ported automatically with the OpenACC compiler and scaled to a million cores while maintaining high parallel efficiency.
Source | Computer Science (计算机科学), 2020, 47(1): 24-30 [extended database]
DOI | 10.11896/jsjkx.181102176
Keywords | domestic; many-core architecture; fluid dynamics software; adaptability; programming language; parallel algorithm
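Addresses |
1. Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083
2. National Laboratory for Computational Fluid Dynamics, Beijing 100191
3. China Ship Scientific Research Center, Wuxi, Jiangsu 214081
4. Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190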
Language | Chinese
Document type | Research article
ISSN | 1002-137X
Subject | Automation technology, computer technology
Funding | Manned Space Engineering technology project; National Natural Science Foundation of China; National 973 Program
Accession number | CSCD:6691966
References | 24 in total, on 2 pages
1. Zheng F. Cooperative Computing Techniques for a Deeply Fused and Heterogeneous Many-Core Processor Architecture. Journal of Computer Science and Technology, 2015, 30(1): 145-162 | CSCD citations: 11
2. Fu H H. The Sunway TaihuLight supercomputer: system and applications. Science China Information Sciences, 2016, 59(7): 72-91 | CSCD citations: 1
3. Yang C. 10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2016: 6-15 | CSCD citations: 1
4. Zhang J. Extreme-Scale Phase Field Simulations of Coarsening Dynamics on the Sunway TaihuLight Supercomputer. International Conference for High Performance Computing, Networking, Storage and Analysis, 2016: 34-45 | CSCD citations: 1
5. Fu H H. Redesigning CAM-SE for Peta-Scale Climate Modeling Performance on Sunway TaihuLight. High Performance Computing, Networking, Storage and Analysis, 2017: 4-12 | CSCD citations: 1
6. Fu H H. 15-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of Realistic 10Hz Scenarios. High Performance Computing, Networking, Storage and Analysis, 2017: 102-117 | CSCD citations: 1
7. Qiao F L. A highly effective global surface wave numerical simulation with ultra-high resolution. High Performance Computing, Networking, Storage and Analysis, 2016: 46-56 | CSCD citations: 1
8. Hou C F. Efficient GPU-accelerated molecular dynamics simulation of solid covalent crystals. Molecular Simulation, 2012, 38(1): 8-15 | CSCD citations: 2
9. Hou C F. Petascale molecular dynamics simulation of crystalline silicon on Tianhe-1A. International Journal of High Performance Computing Applications, 2013, 27(3): 307-317 | CSCD citations: 5
10. Li D. A survey on information diffusion in online social networks. Chinese Journal of Computers, 2014, 37(1): 189-206 | CSCD citations: 1
11. Lin H. Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017 | CSCD citations: 2
12. Lin J. Optimizations of Two Compute-bound Scientific Kernels on SW26010 Many-core Processor. Proceedings of the 46th International Conference on Parallel Processing, 2017 | CSCD citations: 2
13. Xu Z G. Benchmarking Sunway SW26010 Manycore Processor. Proceedings of the Seventh International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) (IPDPS workshop), 2017 | CSCD citations: 1
14. An H. Pipelining Computation and Data Reuse Strategies for Scaling GROMACS on the Sunway Many-core Processor. 18th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP-2018), 2018 | CSCD citations: 1
15. You H T. OpenACC 2.0 VS OpenMP 4.0: Comparation of Two Popular Programming Language Based on Compilation Instructions. High Performance Computing, 2014, 227: 20-25 | CSCD citations: 1
16. He Cangping (何沧平). OpenACC Parallel Programming in Practice (OpenACC并行编程实战), 2016 | CSCD citations: 2
17. Liao J F. Redesigning CAM-SE for Peta-Scale Climate Modeling Performance on Sunway TaihuLight, 2017 | CSCD citations: 2
18. Ao Y L. Research on Key Optimizations of Sparse Matrix and Stencil Computation for the Domestic Large Many-core System, 2017 | CSCD citations: 2
19. Ni H. Research on Heterogeneous parallel computing technology of CFD in unstructured grids, 2018 | CSCD citations: 1
20. Li Z Z. Research on parallel multi grid of unstructured grids, 2012 | CSCD citations: 1