Glycome


This page is a placeholder, or under current development; it is here principally to establish the logical framework of the site. The material on this page is correct, but incomplete.


The machinery that modifies proteins post-translationally with branched oligosaccharides has given rise to a combinatorial code that plays important roles in localization, sorting, and cellular communication. Glycobiology studies the enzymes, reactions and products involved. Glycomics establishes a cross-sectional view of the entire complement of a cell's glycosylations, how they change over time, and which general functional roles can be associated with such change.



 

Introductory reading

Ranzinger et al. (2011) GlycomeDB--a unified database for carbohydrate structures. Nucleic Acids Res 39:D373-6. (pmid: 21045056)

GlycomeDB integrates the structural and taxonomic data of all major public carbohydrate databases, as well as carbohydrates contained in the Protein Data Bank, which renders the database currently the most comprehensive and unified resource for carbohydrate structures worldwide. GlycomeDB retains the links to the original databases and is updated at weekly intervals with the newest structures available from the source databases. The complete database can be downloaded freely or accessed through a Web-interface (www.glycome-db.org) that provides flexible and powerful search functionalities.


 

Contents

  • Representation of oligoglycoside sequence and topology
  • Databases
  • Workflow

   

Further reading and resources

Hashimoto et al. (2006) KEGG as a glycome informatics resource. Glycobiology 16:63R-70R. (pmid: 16014746)

Bioinformatics approaches to carbohydrate research have recently begun using large amounts of protein and carbohydrate data. In this field called glycome informatics, the foremost necessity is a comprehensive resource for genome-scale bioinformatics analysis of glycan data. Although the accumulation of experimental data may be useful as a reference of biological and biochemical information on carbohydrates, this is insufficient for bioinformatics analysis. Thus, we have developed a glycome informatics resource (http://www.genome.jp/kegg/glycan/) in KEGG (Kyoto Encyclopedia of Genes and Genomes), an integrated knowledge base of protein networks, genomic information, and chemical information. This review describes three noteworthy features: (1) GLYCAN, a database of carbohydrate structures; (2) glycan-related pathways; and (3) Composite Structure Map (CSM), a map illustrating all possible variations of carbohydrate structures within organisms. GLYCAN includes two useful tools: an intuitive drawing tool called KegDraw, and an efficient glycan search and alignment tool called KEGG Carbohydrate Matcher (KCaM). KEGG's glycan biosynthesis and metabolism pathways, integrating carbohydrate structures, proteins, and reactions, are also a pivotal resource. CSM is constructed as a bridge between carbohydrate functions and structures. CSM is able to display, for example, expression data of glycosyltransferases in a compact manner. In all the KEGG resources, various objects including KEGG pathways, chemical compounds, as well as carbohydrate structures are commonly represented as graphs, which are widely studied and utilized in the computer science field.

Herget et al. (2008) GlycoCT--a unifying sequence format for carbohydrates. Carbohydr Res 343:2162-71. (pmid: 18436199)

As part of the EUROCarbDB project (www.eurocarbdb.org) we have carefully analyzed the encoding capabilities of all existing carbohydrate sequence formats and the content of publically available structure databases. We have found that none of the existing structural encoding schemata are capable of coping with the full complexity to be expected for experimentally derived structural carbohydrate sequence data across all taxonomic sources. This gap motivated us to define an encoding scheme for complex carbohydrates, named GlycoCT, to overcome the current limitations. This new format is based on a connection table approach, instead of a linear encoding scheme, to describe the carbohydrate sequences, with a controlled vocabulary to name monosaccharides, adopting IUPAC rules to generate a consistent, machine-readable nomenclature. The format uses a block concept to describe frequently occurring special features of carbohydrate sequences like repeating units. It exists in two variants, a condensed form and a more verbose XML syntax. Sorting rules assure the uniqueness of the condensed form, thus making it suitable as a direct primary key for database applications, which rely on unique identifiers. GlycoCT encompasses the capabilities of the heterogeneous landscape of digital encoding schemata in glycomics and is thus a step forward on the way to a unified and broadly accepted sequence format in glycobioinformatics.
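
The connection-table idea in the abstract above can be illustrated with a minimal Python sketch. It is not GlycoCT itself: the `Residue` class, the functions `canonical_key` and `connection_table`, and the residue and linkage labels are all hypothetical, and the ordering rule is plain lexicographic sorting rather than the sorting rules defined by the format. It only shows why a deterministic ordering of a branched structure yields one unique table that could serve as a database key.

```python
from dataclasses import dataclass, field


@dataclass
class Residue:
    """One node of a branched oligosaccharide (labels here are illustrative)."""
    name: str
    # Each child is a (linkage label, Residue) pair, e.g. ("1-3", Residue("Man")).
    children: list = field(default_factory=list)


def canonical_key(res):
    """Return a string for the subtree at `res` that ignores input child order."""
    parts = sorted(f"({link}){canonical_key(child)}" for link, child in res.children)
    return res.name + "[" + ",".join(parts) + "]"


def connection_table(root):
    """Return (residues, linkages), numbered in a deterministic depth-first order.

    residues: list of names; residue i has ID i + 1.
    linkages: list of (parent ID, linkage label, child ID) tuples.
    """
    residues, linkages = [], []

    def visit(res, parent_id=None, link=None):
        residues.append(res.name)
        my_id = len(residues)
        if parent_id is not None:
            linkages.append((parent_id, link, my_id))
        # Sort children so that the numbering does not depend on input order.
        ordered = sorted(res.children, key=lambda lc: (lc[0], canonical_key(lc[1])))
        for child_link, child in ordered:
            visit(child, my_id, child_link)

    visit(root)
    return residues, linkages


# The same branched structure, entered with its two branches in opposite order.
a = Residue("Man", [("1-3", Residue("Man")), ("1-6", Residue("Man"))])
b = Residue("Man", [("1-6", Residue("Man")), ("1-3", Residue("Man"))])
print(connection_table(a) == connection_table(b))
```

Because both inputs produce the same table, either could be stored under a single identifier, which is the property the abstract attributes to the sorted condensed form.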

Lütteke & von der Lieth (2009) Data mining the PDB for glyco-related data. Methods Mol Biol 534:293-310. (pmid: 19277543)

The 3D structural data of glycoprotein or protein-carbohydrate complexes that are found in the Protein Data Bank (PDB) are an interesting data source for glycobiologists. Unfortunately, carbohydrate components are difficult to find with the means provided by the PDB. The GLYCOSCIENCES.de internet portal offers a variety of tools and databases to locate and analyze these structures. This chapter describes how to find PDB entries that feature a specific carbohydrate structure and how to locate carbohydrate residues in a 3D structure file and to check their consistency. In addition to this, methods to statistically analyze torsion angles and the abundance of amino acids both in the neighborhood of glycosylation sites and in the spatial vicinity of non-covalently bound carbohydrate chains are summarized.

von der Lieth et al. (2011) EUROCarbDB: An open-access platform for glycoinformatics. Glycobiology 21:493-502. (pmid: 21106561)

The EUROCarbDB project is a design study for a technical framework, which provides sophisticated, freely accessible, open-source informatics tools and databases to support glycobiology and glycomic research. EUROCarbDB is a relational database containing glycan structures, their biological context and, when available, primary and interpreted analytical data from high-performance liquid chromatography, mass spectrometry and nuclear magnetic resonance experiments. Database content can be accessed via a web-based user interface. The database is complemented by a suite of glycoinformatics tools, specifically designed to assist the elucidation and submission of glycan structure and experimental data when used in conjunction with contemporary carbohydrate research workflows. All software tools and source code are licensed under the terms of the Lesser General Public License, and publicly contributed structures and data are freely accessible. The public test version of the web interface to the EUROCarbDB can be found at http://www.ebi.ac.uk/eurocarb.