ERI sorting for emerging processor architectures


Ramdas, T; Egan, G K; Abramson, D; Baldridge, K K (2009). ERI sorting for emerging processor architectures. Computer Physics Communications, 180(8):1221-1229.

Abstract

Electron Repulsion Integrals (ERIs) are a common bottleneck in ab initio computational chemistry. It is known that sorted/reordered execution of ERIs results in efficient SIMD/vector processing. This paper shows that reconfigurable computing and heterogeneous processor architectures can also benefit from a deliberate ordering of ERI tasks. However, realizing these benefits as net speedup requires a very rapid sorting mechanism. This paper presents two such mechanisms. Included in this study are analytical, simulation-based, and experimental benchmarking approaches to consider five use cases for ERI sorting, i.e. SIMD processing, reconfigurable computing, limited address spaces, instruction cache exploitation, and data cache exploitation. Specific consideration is given to existing cache-based processors, FPGAs, and the Cell Broadband Engine processor. It is proposed that the analyses conducted in this work should be built upon to aid the development of software autotuners which will produce efficient ab initio computational chemistry codes for a variety of computer architectures.

Citations

5 citations in Web of Science®
5 citations in Scopus®

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Department of Chemistry
Dewey Decimal Classification: 540 Chemistry
Scopus Subject Areas: Physical Sciences > Hardware and Architecture; Physical Sciences > General Physics and Astronomy
Language: English
Date: January 2009
Deposited On: 15 Jan 2010 11:01
Last Modified: 23 Jan 2022 15:21
Publisher: Elsevier
ISSN: 0010-4655
OA Status: Closed
Publisher DOI: https://doi.org/10.1016/j.cpc.2009.01.029
Full text not available from this repository.