The quantum chemical search for novel materials and the issue of data processing: The InfoMol project


Lüthi, Hans P; Heinen, Stefan; Schneider, Gisbert; Glöss, Andreas; Brändle, Martin P; King, Rollin A; Pyzer-Knapp, Edward; Alharbi, Fahhad H; Kais, Sabre (2016). The quantum chemical search for novel materials and the issue of data processing: The InfoMol project. Journal of Computational Science, 15:65-73.

Abstract

In the search for novel materials, quantum chemical modeling and simulation has taken an important role. Molecular properties are computed on the basis of first-principles methods and screened against pre-defined criteria. Alternatively, the results of these computations are used as source data to enhance the predictions of data-centric models. Whichever modeling strategy is being applied, data-intense steps are involved in the process. One key bottleneck in this regard is the lack of availability of machine-readable output for virtually all quantum chemistry codes. The results of computations need to be extracted manually or using scripts and parsers, instead of directly being written out in machine-readable format to be imported into a database for archival, analysis and exchange. We present two solutions implemented in two selected examples, the TURBOMOLE and PSI4 program packages. Next to the standard output, both codes generate Extensible Markup Language (XML) output files, but in two different ways. The generation of machine-readable output in a structured format can easily be implemented, and, as long as the data can be transformed, the choice of data format is secondary. The concept is illustrated for two different use cases from method benchmarking and drug design. A third illustration addresses the definition of a data processing and exchange protocol for screening libraries of light-harvesting compounds.

Statistics

Citations

1 citation in Web of Science®
1 citation in Scopus®

Downloads

3 downloads since deposited on 10 Nov 2016
3 downloads since 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Department of Chemistry
Dewey Decimal Classification: 540 Chemistry
Uncontrolled Keywords: Quantum chemistry; Modeling and simulation; Materials design; Drug design; Data processing
Language: English
Date: 2016
Deposited On: 10 Nov 2016 09:38
Last Modified: 21 Jul 2017 08:51
Publisher: Elsevier
ISSN: 1877-7503
Publisher DOI: https://doi.org/10.1016/j.jocs.2015.10.003

Download

Content: Accepted Version
Language: English
Filetype: PDF - Registered users only until 10 October 2017
Size: 714kB
Embargo until: 2017-10-10
