Header

UZH-Logo

Maintenance Infos

eggNOG: automated construction and annotation of orthologous groups of genes


Jensen, L J; Julien, P; Kuhn, M; von Mering, C; Muller, J; Doerks, T; Bork, P (2008). eggNOG: automated construction and annotation of orthologous groups of genes. Nucleic Acids Research, 36(Databa):D250-D254.

Abstract

The identification of orthologous genes forms the basis for most comparative genomics studies. Existing approaches either lack functional annotation of the identified orthologous groups, hampering the interpretation of subsequent results, or are manually annotated and thus lag behind the rapid sequencing of new genomes. Here we present the eggNOG database ('evolutionary genealogy of genes: Non-supervised Orthologous Groups'), which contains orthologous groups constructed from Smith-Waterman alignments through identification of reciprocal best matches and triangular linkage clustering. Applying this procedure to 312 bacterial, 26 archaeal and 35 eukaryotic genomes yielded 43 582 course-grained orthologous groups of which 9724 are extended versions of those from the original COG/KOG database. We also constructed more fine-grained groups for selected subsets of organisms, such as the 19 914 mammalian orthologous groups. We automatically annotated our non-supervised orthologous groups with functional descriptions, which were derived by identifying common denominators for the genes based on their individual textual descriptions, annotated functional categories, and predicted protein domains. The orthologous groups in eggNOG contain 1 241 751 genes and provide at least a broad functional description for 77% of them. Users can query the resource for individual genes via a web interface or download the complete set of orthologous groups at http://eggnog.embl.de.

Abstract

The identification of orthologous genes forms the basis for most comparative genomics studies. Existing approaches either lack functional annotation of the identified orthologous groups, hampering the interpretation of subsequent results, or are manually annotated and thus lag behind the rapid sequencing of new genomes. Here we present the eggNOG database ('evolutionary genealogy of genes: Non-supervised Orthologous Groups'), which contains orthologous groups constructed from Smith-Waterman alignments through identification of reciprocal best matches and triangular linkage clustering. Applying this procedure to 312 bacterial, 26 archaeal and 35 eukaryotic genomes yielded 43 582 course-grained orthologous groups of which 9724 are extended versions of those from the original COG/KOG database. We also constructed more fine-grained groups for selected subsets of organisms, such as the 19 914 mammalian orthologous groups. We automatically annotated our non-supervised orthologous groups with functional descriptions, which were derived by identifying common denominators for the genes based on their individual textual descriptions, annotated functional categories, and predicted protein domains. The orthologous groups in eggNOG contain 1 241 751 genes and provide at least a broad functional description for 77% of them. Users can query the resource for individual genes via a web interface or download the complete set of orthologous groups at http://eggnog.embl.de.

Statistics

Citations

Dimensions.ai Metrics
346 citations in Web of Science®
374 citations in Scopus®
Google Scholar™

Altmetrics

Downloads

107 downloads since deposited on 07 Mar 2009
2 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:07 Faculty of Science > Institute of Molecular Life Sciences
08 Research Priority Programs > Systems Biology / Functional Genomics
Dewey Decimal Classification:570 Life sciences; biology
Scopus Subject Areas:Life Sciences > Genetics
Language:English
Date:January 2008
Deposited On:07 Mar 2009 20:41
Last Modified:02 Dec 2023 02:41
Publisher:Oxford University Press
ISSN:0305-1048
Additional Information:Full final text Oxford Journal
OA Status:Gold
Free access at:PubMed ID. An embargo period may apply.
Publisher DOI:https://doi.org/10.1093/nar/gkm796
PubMed ID:17942413
  • Content: Published Version
  • Language: English
  • Description: Nationallizenz 142-005