FCC – An automated rule-based processing tool for life science data


Barkow-Oesterreicher, Simon; Tuerker, Can; Panse, Christian (2013). FCC – An automated rule-based processing tool for life science data. Source Code for Biology and Medicine, 8:3.

Abstract

BACKGROUND: Data processing in the bioinformatics field often involves the handling of diverse software programs in one workflow. The field lacks a set of standards for file formats, so files have to be processed in different ways to make them compatible with different analysis programs. The problem is that mass spectrometry vendors provide, at most, closed-source Windows libraries to programmatically access their proprietary binary formats. This prohibits the creation of an efficient and unified tool that fits all processing needs of the users. Therefore, researchers spend a significant amount of time using GUI-based conversion and processing programs. Besides the time needed for manual usage, such programs can also show long running times for processing, because most of them make use of only a single CPU. In particular, algorithms to enhance data quality, e.g. peak picking or deconvolution of spectra, add waiting time for the users.

RESULTS: To automate these processing tasks and let them run continuously without user interaction, we developed the FGCZ Converter Control (FCC) at the Functional Genomics Center Zurich (FGCZ) core facility. The FCC is a rule-based system for automated file processing that reduces the operation of diverse programs to a single configuration task. Using filtering rules for raw data files, the parameters for all tasks can be custom-tailored to the needs of every single researcher, and processing can run automatically and efficiently on any number of servers in parallel using all available CPU resources.

CONCLUSIONS: FCC has been used intensively at FGCZ for processing more than one hundred thousand mass spectrometry raw files so far. Since we know that many other research facilities have similar problems, we would like to report on our tool and the accompanying ideas for an efficient set-up for potential reuse.
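The rule-based idea in the abstract can be illustrated with a minimal Python sketch. It is not FCC's actual code or configuration format: the `Rule` class, the glob patterns, the `convert_tool` program name, and its options are all hypothetical and chosen only for illustration. Each rule pairs a file filter with converter settings, and every raw file is matched against the rules to produce the command that a worker would run.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical filtering rule: a path pattern plus converter settings."""
    pattern: str      # glob matched against the raw file path
    converter: str    # illustrative name of the conversion program
    options: tuple    # illustrative command-line options for that program


def match_rules(path, rules):
    """Return every rule whose pattern matches the given raw file path."""
    return [rule for rule in rules if fnmatchcase(path, rule.pattern)]


def build_command(path, rule):
    """Build the command line a worker would execute for one file."""
    return [rule.converter, *rule.options, path]


def plan_jobs(paths, rules, workers=4):
    """Match files to rules with a thread pool and collect the commands."""
    def plan_one(path):
        return [build_command(path, rule) for rule in match_rules(path, rules)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(plan_one, paths))  # map keeps input order
    # Flatten the per-file lists into one job list.
    return [cmd for cmds in per_file for cmd in cmds]


# Assumed example rules: one per researcher, each with its own parameters.
RULES = [
    Rule("*/alice/*.raw", "convert_tool", ("--format", "mgf")),
    Rule("*/bob/*.raw", "convert_tool", ("--format", "mzXML", "--centroid")),
]

FILES = ["/data/alice/run1.raw", "/data/bob/run2.raw", "/data/carol/run3.raw"]

for command in plan_jobs(FILES, RULES):
    print(" ".join(command))
```

The sketch only plans the commands and never executes them, so it has no side effects; in a real set-up each command would presumably be handed to a worker process on one of the servers. Files that match no rule, such as the third one here, produce no job.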

Statistics

Downloads

44 downloads since deposited on 22 Mar 2013
15 downloads since 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 04 Faculty of Medicine > Functional Genomics Center Zurich
Dewey Decimal Classification: 570 Life sciences; biology
Dewey Decimal Classification: 610 Medicine & health
Language: English
Date: 2013
Deposited On: 22 Mar 2013 12:11
Last Modified: 03 Aug 2017 19:46
Publisher: BioMed Central
ISSN: 1751-0473
Free access at: PubMed ID. An embargo period may apply.
Publisher DOI: https://doi.org/10.1186/1751-0473-8-3
PubMed ID: 23311610

Download

Download PDF  'FCC – An automated rule-based processing tool for life science data'.
Content: Published Version
Filetype: PDF
Size: 1MB
Licence: Creative Commons: Attribution 2.0 Generic (CC BY 2.0)