The quantum chemical search for novel materials and the issue of data processing: The InfoMol project


Lüthi, Hans P; Heinen, Stefan; Schneider, Gisbert; Glöss, Andreas; Brändle, Martin P; King, Rollin A; Pyzer-Knapp, Edward; Alharbi, Fahhad H; Kais, Sabre (2016). The quantum chemical search for novel materials and the issue of data processing: The InfoMol project. Journal of Computational Science, 15:65-73.

Abstract

In the search for novel materials, quantum chemical modeling and simulation have taken on an important role. Molecular properties are computed with first-principles methods and screened against predefined criteria. Alternatively, the results of these computations serve as source data to enhance the predictions of data-centric models. Whichever modeling strategy is applied, the process involves data-intensive steps. One key bottleneck is that virtually all quantum chemistry codes lack machine-readable output. The results of computations must be extracted manually or with scripts and parsers, instead of being written directly in a machine-readable format that can be imported into a database for archival, analysis and exchange. We present two solutions, implemented in the TURBOMOLE and PSI4 program packages. In addition to the standard output, both codes generate Extensible Markup Language (XML) output files, but in two different ways. Generating machine-readable output in a structured format is easy to implement, and, as long as the data can be transformed, the choice of data format is secondary. The concept is illustrated with two use cases, from method benchmarking and drug design. A third illustration addresses the definition of a data processing and exchange protocol for screening libraries of light-harvesting compounds.
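
The workflow the abstract describes (structured output, import into a database, screening against a criterion) might look roughly like the following minimal Python sketch. Everything specific in it is an assumption made for illustration: the XML element and attribute names, the `results` table layout, the helper names `parse_properties`, `store` and `screen`, and the placeholder numbers. It does not reproduce the XML schemas actually written by TURBOMOLE or PSI4.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML output. The element and attribute names are invented for
# this sketch, and the numbers are placeholders, not computed results.
SAMPLE_XML = """\
<calculation program="example-qc" method="HF" basis="cc-pVDZ">
  <molecule name="water"/>
  <property name="total_energy" unit="hartree">-76.0</property>
  <property name="dipole_moment" unit="debye">2.0</property>
</calculation>
"""


def parse_properties(xml_text):
    """Return one (molecule, method, property, value, unit) row per property."""
    root = ET.fromstring(xml_text)
    molecule = root.find("molecule").get("name")
    method = root.get("method")
    return [
        (molecule, method, prop.get("name"), float(prop.text), prop.get("unit"))
        for prop in root.findall("property")
    ]


def store(rows, connection):
    """Insert parsed rows into a simple results table (assumed layout)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(molecule TEXT, method TEXT, property TEXT, value REAL, unit TEXT)"
    )
    connection.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    connection.commit()


def screen(connection, prop, threshold):
    """Return molecules whose value for `prop` is below `threshold`."""
    cursor = connection.execute(
        "SELECT molecule FROM results WHERE property = ? AND value < ?",
        (prop, threshold),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store(parse_properties(SAMPLE_XML), conn)
    print(screen(conn, "total_energy", -70.0))
```

Because the data arrive in a structured format, no regular-expression parsing of free-form text output is needed; in keeping with the abstract's point that the choice of format is secondary, the same pattern would apply to any format that can be transformed into rows.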

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Department of Chemistry
Dewey Decimal Classification: 540 Chemistry
Scopus Subject Areas: Physical Sciences > Theoretical Computer Science; Physical Sciences > General Computer Science; Physical Sciences > Modeling and Simulation
Uncontrolled Keywords: Quantum chemistry, Modeling and simulation, Materials design, Drug design, Data processing
Language: English
Date: 2016
Deposited On: 10 Nov 2016 09:38
Last Modified: 17 Nov 2023 08:16
Publisher: Elsevier
ISSN: 1877-7503
OA Status: Green
Publisher DOI: https://doi.org/10.1016/j.jocs.2015.10.003
  • Content: Accepted Version
  • Language: English
  • Licence: Creative Commons: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)