Collagens are a superfamily of glycoproteins found mainly in the extracellular matrix and are the most abundant proteins in the human body. They are characterized by a right-handed triple helix formed from three left-handed α-chains consisting of repeats of the motif G-X-Y, where (hydroxy)proline and (hydroxy)lysine are often found at positions X and Y. To yield a functional collagen, selected hydroxylysine residues must be further modified by the addition of either galactose or the disaccharide glucosylgalactose. This glycosylation of collagen takes place in the endoplasmic reticulum before the triple helix forms and is mediated by specific β(1-O)
galactosyl- and α(1-2) glucosyltransferase enzymes. The molecular nature of these glycosyltransferases has remained unknown to date. The present study describes the identification of collagen galactosyltransferase enzymes
using a strategy based on affinity chromatography and protein sequencing by mass spectrometry. Three structurally related candidate genes were cloned and expressed in Sf9 insect cells using the baculovirus system. Two of the
three candidate glycosyltransferases (GLT25D1 and GLT25D2) were confirmed to be active collagen galactosyltransferases. The collagen galactosyltransferase genes are differentially expressed in human tissues, suggesting that these enzymes may prefer different types of collagen or contribute to the varying extent of collagen glycosylation across tissues. This is supported by the selective preference of GLT25D1 and GLT25D2 for collagen type III and collagen type IV acceptors. GLT25D1 showed higher enzymatic activity than GLT25D2 on deglycosylated collagen types I to V.

Collagen glycosylation is conserved in animals, and collagen is also found in several prokaryotic genomes. Proteins sharing structural similarity with the collagen galactosyltransferases have been found in prokaryotes and even in viruses. The Acanthamoeba polyphaga mimivirus has been identified as a unique member of the nucleo-cytoplasmic large DNA virus family: it is clearly a virus but also shows features never seen before in viruses. Mimivirus is the largest known DNA virus, with a 1.2 Mbp linear dsDNA genome. It was reported that mimivirus encodes eight proteins with a collagen triple helix motif. These collagens are most probably found in the fibrils of the virus capsid. These fibrils, which are specific to mimivirus, cover the whole icosahedral virus capsid. Because mimivirus stains Gram-positive, the viral fibrils are presumed to be glycosylated. This suggests that the mimiviral collagens might be post-translationally modified by hydroxylation and subsequent glycosylation.
In this study, the protein L230 was identified as a mimiviral collagen glucosyltransferase that transfers glucose to the acceptor hydroxylysine in both animal and mimiviral collagens. This addition of glucose to the acceptor hydroxylysine in collagen has not been reported previously. Collagen glycosylation in mimivirus thus appears to differ from that in animal collagens. In conclusion, this work reports the identification and characterization of the
two human collagen galactosyltransferases GLT25D1 and GLT25D2 and the identification of the mimiviral collagen glucosyltransferase L230.