A workflow to increase the detection rate of proteins from unsequenced organisms in high-throughput proteomics experiments


Grossmann, J; Fischer, B; Baerenfaller, K; Owiti, J; Buhmann, J M; Gruissem, W; Baginsky, S (2007). A workflow to increase the detection rate of proteins from unsequenced organisms in high-throughput proteomics experiments. Proteomics, 7(23):4245-4254.

Abstract

We present and evaluate a strategy for the mass spectrometric identification of proteins from organisms for which no genome sequence information is available that incorporates cross-species information from sequenced organisms. The presented method combines spectrum quality scoring, de novo sequencing and error tolerant BLAST searches and is designed to decrease input data complexity. Spectral quality scoring reduces the number of investigated mass spectra without a loss of information. Stringent quality-based selection and the combination of different de novo sequencing methods substantially increase the catalog of significant peptide alignments. The de novo sequences passing a reliability filter are subsequently submitted to error tolerant BLAST searches and MS-BLAST hits are validated by a sampling technique. With the described workflow, we identified up to 20% more groups of homologous proteins in proteome analyses with organisms whose genome is not sequenced than by state-of-the-art database searches in an Arabidopsis thaliana database. We consider the novel data analysis workflow an excellent screening method to identify those proteins that evade detection in proteomics experiments as a result of database constraints.

Statistics

Citations

36 citations in Web of Science®
36 citations in Scopus®

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 04 Faculty of Medicine > Functional Genomics Center Zurich
    08 University Research Priority Programs > Systems Biology / Functional Genomics
Dewey Decimal Classification: 570 Life sciences; biology
    610 Medicine & health
Language: English
Date: 2007
Deposited On: 18 Jan 2010 09:48
Last Modified: 05 Apr 2016 13:36
Publisher: Wiley-Blackwell
ISSN: 1615-9853
Publisher DOI: https://doi.org/10.1002/pmic.200700474
PubMed ID: 18040981

Download

Full text not available from this repository.