Greedy de novo motif discovery to construct motif repositories for bacterial proteomes


Khakzad, Hamed; Malmström, Johan; Malmström, Lars (2019). Greedy de novo motif discovery to construct motif repositories for bacterial proteomes. BMC Bioinformatics, 20(Suppl 4):141.

Abstract

BACKGROUND Bacterial surfaces are complex systems, constructed from membranes, peptidoglycan and, importantly, proteins. These proteins are critical regulators of how the bacterium interacts with and survives in its environment. A full catalog of the motifs in protein families and their relative conservation grade is a prerequisite to target the protein-protein interactions that bacterial surface proteins make with host proteins.
RESULTS In this paper, we propose a greedy approach to identify conserved motifs in large sequence families iteratively. Each iteration discovers a motif de novo and masks all occurrences of that motif. The remaining unmasked sequences are subjected to the next round of motif detection until no more significant motifs can be found. We demonstrate the utility of the method through the construction of a proteome-wide motif repository for Group A Streptococcus (GAS), a significant human pathogen. GAS produces numerous surface proteins that interact with over 100 human plasma proteins, helping the bacteria to evade the host immune response. We used the repository to find that proteins that are part of the bacterial surface have motif architectures that differ from those of intracellular proteins.
CONCLUSIONS We show that the M protein, a coiled-coil homodimer that extends over 500 Å from the cell wall, has a motif architecture that differs between various GAS strains. As the M protein is known to bind a variety of different plasma proteins, the results indicate that the different motif architectures are responsible for the quantitative differences in the plasma proteins that various strains bind. The speed and applicability of the method enable its application to all major human pathogens.
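
The discover-and-mask loop described under RESULTS can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the motif finder or the significance test, so the discovery step below is a toy stand-in that picks the exact k-mer shared by the most sequences, and "significant" is assumed to mean "present in at least `min_support` sequences". The names `find_best_kmer`, `greedy_motifs`, `MASK`, `k`, `min_support` and `max_rounds` are all hypothetical.

```python
from collections import defaultdict

MASK = "X"  # assumption: masked residues are overwritten with 'X'


def find_best_kmer(seqs, k, min_support):
    """Toy stand-in for de novo discovery: return the exact k-mer that
    occurs in the most sequences, or None if none reaches min_support."""
    support = defaultdict(set)
    for idx, seq in enumerate(seqs):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if MASK not in kmer:  # skip windows touching masked positions
                support[kmer].add(idx)
    if not support:
        return None
    # Highest support first; ties broken alphabetically for determinism.
    best = min(support, key=lambda m: (-len(support[m]), m))
    return best if len(support[best]) >= min_support else None


def greedy_motifs(seqs, k=4, min_support=2, max_rounds=50):
    """Repeat: discover one motif, mask every occurrence, search again.
    Stops when no remaining k-mer reaches min_support."""
    seqs = list(seqs)
    found = []
    for _ in range(max_rounds):
        motif = find_best_kmer(seqs, k, min_support)
        if motif is None:
            break
        found.append(motif)
        # Mask all occurrences so later rounds cannot rediscover the motif.
        seqs = [s.replace(motif, MASK * k) for s in seqs]
    return found, seqs


if __name__ == "__main__":
    # Made-up toy sequences, not real GAS proteins.
    motifs, masked = greedy_motifs(["LVEAKKGSPQ", "GSPQTLVEA", "MLVEAW"])
    print(motifs)
    print(masked)
```

On the three made-up sequences, the first round finds the k-mer shared by all three, the second finds the one shared by two, and the third stops because every remaining window overlaps a masked position. A real pipeline would replace `find_best_kmer` with a statistical motif finder and a proper significance threshold; the outer loop is the part that corresponds to the greedy strategy in the abstract.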

Statistics

Downloads

16 downloads since deposited on 04 Jun 2019
9 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute for Computational Science
Dewey Decimal Classification: 530 Physics
Scopus Subject Areas: Life Sciences > Structural Biology; Life Sciences > Biochemistry; Life Sciences > Molecular Biology; Physical Sciences > Computer Science Applications; Physical Sciences > Applied Mathematics
Language: English
Date: 18 April 2019
Deposited On: 04 Jun 2019 14:32
Last Modified: 15 Apr 2020 23:44
Publisher: BioMed Central
ISSN: 1471-2105
OA Status: Gold
Free access at: PubMed ID. An embargo period may apply.
Publisher DOI: https://doi.org/10.1186/s12859-019-2686-8
PubMed ID: 30999854

Download

Gold Open Access

Download PDF: 'Greedy de novo motif discovery to construct motif repositories for bacterial proteomes'
Content: Published Version
Language: English
Filetype: PDF
Size: 1MB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)