Protein identification false discovery rates for very large proteomics data sets generated by tandem mass spectrometry


Reiter, L; Claassen, M; Schrimpf, S P; Jovanovic, M; Schmidt, A; Buhmann, J M; Hengartner, M O; Aebersold, R (2009). Protein identification false discovery rates for very large proteomics data sets generated by tandem mass spectrometry. Molecular & Cellular Proteomics, 8(11):2405-2417.

Abstract

Comprehensive characterization of a proteome is a fundamental goal in proteomics. To achieve saturation coverage of a proteome or specific subproteome via tandem mass spectrometric identification of tryptic protein sample digests, proteomics data sets are growing dramatically in size and heterogeneity. The trend toward very large integrated data sets poses so far unsolved challenges to control the uncertainty of protein identifications going beyond well established confidence measures for peptide-spectrum matches. We present MAYU, a novel strategy that reliably estimates false discovery rates for protein identifications in large scale data sets. We validated and applied MAYU using various large proteomics data sets. The data show that the size of the data set has an important and previously underestimated impact on the reliability of protein identifications. We particularly found that protein false discovery rates are significantly elevated compared with those of peptide-spectrum matches. The function provided by MAYU is critical to control the quality of proteome data repositories and thereby to enhance any study relying on these data sources. The MAYU software is available as standalone software and also integrated into the Trans-Proteomic Pipeline.

Citations

120 citations in Web of Science®
130 citations in Scopus®
Downloads

34 downloads since deposited on 09 Feb 2010
17 downloads since 12 months
Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Molecular Life Sciences
08 University Research Priority Programs > Systems Biology / Functional Genomics
Dewey Decimal Classification: 570 Life sciences; biology
Language: English
Date: November 2009
Deposited On: 09 Feb 2010 19:56
Last Modified: 05 Apr 2016 13:49
Publisher: American Society for Biochemistry and Molecular Biology
ISSN: 1535-9476
Additional Information: This research was originally published in: Reiter, L; Claassen, M; Schrimpf, S P; Jovanovic, M; Schmidt, A; Buhmann, J M; Hengartner, M O; Aebersold, R (2009). Protein identification false discovery rates for very large proteomics data sets generated by tandem mass spectrometry. Molecular & Cellular Proteomics, 8(11):2405-2417. © the American Society for Biochemistry and Molecular Biology.
Publisher DOI: 10.1074/mcp.M900317-MCP200
PubMed ID: 19608599
Permanent URL: http://doi.org/10.5167/uzh-28712

Download

Filetype: PDF - Registered users only
Size: 1MB

Content: Accepted Version
Filetype: PDF
Size: 754kB
