Genome annotation



This page is a placeholder or still under development; it is here principally to establish the logical framework of the site. The material on this page is correct but incomplete.


Summary ...




Genome browsers

Wang et al. (2013) A brief introduction to web-based genome browsers. Brief Bioinform 14:131-43. (pmid: 22764121)

Genome browser provides a graphical interface for users to browse, search, retrieve and analyze genomic sequence and annotation data. Web-based genome browsers can be classified into general genome browsers with multiple species and species-specific genome browsers. In this review, we attempt to give an overview for the main functions and features of web-based genome browsers, covering data visualization, retrieval, analysis and customization. To give a brief introduction to the multiple-species genome browser, we describe the user interface and main functions of the Ensembl and UCSC genome browsers using the human alpha-globin gene cluster as an example. We further use the MSU and the Rice-Map genome browsers to show some special features of species-specific genome browser, taking a rice transcription factor gene OsSPL14 as an example.

Dreszer et al. (2012) The UCSC Genome Browser database: extensions and updates 2011. Nucleic Acids Res 40:D918-23. (pmid: 22086951)

The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic data sets. In the past year, the local database has been updated with four new species assemblies, and we anticipate another four will be released by the end of 2011. Further, a large number of annotation tracks have been either added, updated by contributors, or remapped to the latest human reference genome. Among these are new phenotype and disease annotations, UCSC genes, and a major dbSNP update, which required new visualization methods. Growing beyond the local database, this year we have introduced 'track data hubs', which allow the Genome Browser to provide access to remotely located sets of annotations. This feature is designed to significantly extend the number and variety of annotation tracks that are publicly available for visualization and analysis from within our site. We have also introduced several usability features including track search and a context-sensitive menu of options available with a right-click anywhere on the Browser's image.

Fujita et al. (2011) The UCSC Genome Browser database: update 2011. Nucleic Acids Res 39:D876-82. (pmid: 20959295)

The University of California, Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online access to a database of genomic sequence and annotation data for a wide variety of organisms. The Browser also has many tools for visualizing, comparing and analyzing both publicly available and user-generated genomic data sets, aligning sequences and uploading user data. Among the features released this year are a gene search tool and annotation track drag-reorder functionality as well as support for BAM and BigWig/BigBed file formats. New display enhancements include overlay of multiple wiggle tracks through use of transparent coloring, options for displaying transformed wiggle data, a 'mean+whiskers' windowing function for display of wiggle data at high zoom levels, and more color schemes for microarray data. New data highlights include seven new genome assemblies, a Neandertal genome data portal, phenotype and disease association data, a human RNA editing track, and a zebrafish Conservation track. We also describe updates to existing tracks.

Schattner (2009) Genomics made easier: an introductory tutorial to genome datamining. Genomics 93:187-95. (pmid: 19041391)

Integrated genome databases--such as the UCSC, Ensembl and NCBI MapViewer databases--and their associated data querying and visualization interfaces (e.g. the genome browsers) have transformed the way that molecular biologists, geneticists and bioinformaticists analyze genomic data. Nevertheless, because of the complexity of these tools, many researchers take advantage of only a fraction of their capabilities. In this tutorial, using examples from medical genetics and alternative splicing, I describe some of the biological questions that can be addressed with these techniques. I also show why doing so typically is more effective than using alternative methods and indicate some of the resources available for learning more about the advanced capabilities of these powerful tools.

Zweig et al. (2008) UCSC genome browser tutorial. Genomics 92:75-84. (pmid: 18514479)

The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.
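
The custom tracks mentioned in the abstract above are commonly supplied as plain text in BED format: a "track" header line followed by tab-separated features with 0-based, half-open coordinates. The following is a minimal sketch of how such a file could be assembled in Python. The class BedFeature, the function custom_track, and all coordinates and names in the usage example are hypothetical and chosen only for illustration; they do not describe real annotations.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BedFeature:
    """One six-column BED record (hypothetical helper for illustration)."""
    chrom: str
    start: int          # 0-based, inclusive
    end: int            # 0-based, exclusive
    name: str
    score: int = 0      # BED scores range from 0 to 1000
    strand: str = "."   # "+", "-", or "." when unstranded

    def to_line(self) -> str:
        # Reject records that would not be valid BED intervals.
        if not 0 <= self.start < self.end:
            raise ValueError(f"invalid interval: {self.start}-{self.end}")
        if not 0 <= self.score <= 1000:
            raise ValueError(f"score out of range: {self.score}")
        fields = (self.chrom, self.start, self.end,
                  self.name, self.score, self.strand)
        return "\t".join(str(field) for field in fields)


def custom_track(name: str, description: str,
                 features: Iterable[BedFeature]) -> str:
    """Return the text of a BED custom track: header line plus features."""
    header = f'track name="{name}" description="{description}"'
    lines = [header] + [feature.to_line() for feature in features]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up regions; replace with your own experimental results.
    demo = [
        BedFeature("chr1", 1000, 2000, "exampleA", 500, "+"),
        BedFeature("chr1", 3000, 3500, "exampleB", 900, "-"),
    ]
    print(custom_track("myTrack", "Example regions", demo), end="")
```

The resulting text can be saved to a file or pasted into a browser's custom track form. Because BED intervals are half-open, the first feature above covers bases 1001 to 2000 in the 1-based coordinates shown in the browser display.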



Further reading and resources

Tewhey et al. (2011) The importance of phase information for human genomics. Nat Rev Genet 12:215-23. (pmid: 21301473)

Contemporary sequencing studies often ignore the diploid nature of the human genome because they do not routinely separate or 'phase' maternally and paternally derived sequence information. However, many findings - both from recent studies and in the more established medical genetics literature - indicate that relationships between human DNA sequence and phenotype, including disease, can be more fully understood with phase information. Thus, the existing technological impediments to obtaining phase information must be overcome if human genomics is to reach its full potential.
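
To illustrate why unphased data is ambiguous, the sketch below enumerates every haplotype pair that is consistent with a set of heterozygous genotypes. It is a toy example, not a phasing method: the function name possible_phasings and the alleles in the usage example are hypothetical. With n heterozygous sites there are 2^(n-1) distinct haplotype pairs, because swapping the two complete haplotypes yields the same diploid genome.

```python
from itertools import product
from typing import List, Sequence, Tuple


def possible_phasings(
    het_sites: Sequence[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """List all haplotype pairs consistent with unphased heterozygous sites.

    Each element of het_sites is the pair of alleles observed at one
    heterozygous position, in genomic order.
    """
    if not het_sites:
        raise ValueError("at least one heterozygous site is required")

    first, rest = het_sites[0], het_sites[1:]
    phasings = []
    # Fix the orientation of the first site so that mirror-image pairs
    # are not counted twice; every later site may be flipped or not.
    for flips in product((False, True), repeat=len(rest)):
        hap_a = [first[0]]
        hap_b = [first[1]]
        for (allele_1, allele_2), flip in zip(rest, flips):
            if flip:
                allele_1, allele_2 = allele_2, allele_1
            hap_a.append(allele_1)
            hap_b.append(allele_2)
        phasings.append(("".join(hap_a), "".join(hap_b)))
    return phasings


if __name__ == "__main__":
    # Two made-up heterozygous sites: A/G at the first, C/T at the second.
    for hap_a, hap_b in possible_phasings([("A", "G"), ("C", "T")]):
        print(hap_a, hap_b)
```

For the two sites in the usage example, the genotypes alone cannot distinguish the haplotype pair AC/GT from AT/GC; only phase information resolves which alleles lie on the same chromosome.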