eggNOG v4.0: nested orthology inference across 3686 organisms


Powell, Sean; Forslund, Kristoffer; Szklarczyk, Damian; Trachana, Kalliopi; Roth, Alexander; Huerta-Cepas, Jaime; Gabaldón, Toni; Rattei, Thomas; Creevey, Chris; Kuhn, Michael; Jensen, Lars J; von Mering, Christian; Bork, Peer (2014). eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Research, 42(1):D231-D239.

Abstract

With the increasing availability of various 'omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download.

Statistics

Citations

166 citations in Web of Science®
184 citations in Scopus®

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Molecular Life Sciences; 08 University Research Priority Programs > Systems Biology / Functional Genomics; 08 University Research Priority Programs > Evolution in Action: From Genomes to Ecosystems
Dewey Decimal Classification: 570 Life sciences; biology
Language: English
Date: 2014
Deposited On: 18 Mar 2014 16:24
Last Modified: 08 Dec 2017 04:09
Publisher: Oxford University Press
ISSN: 0305-1048
Free access at: PubMed ID. An embargo period may apply.
Publisher DOI: https://doi.org/10.1093/nar/gkt1253
PubMed ID: 24297252
