BIN-GENOME-Genome Sequencing


Genome sequencing

(Sequencing technologies, highly parallel, single-molecule and single-cell)


 


Abstract:

A basic introduction to "Next Generation Sequencing" concepts and technologies.


Objectives:
This unit will ...

  • ... introduce methods and concepts of "Next Generation Sequencing" and genome assembly.

Outcomes:
After working through this unit you ...

  • ... are familiar with the basic methods and concepts of "Next Generation Sequencing" and genome assembly.

Deliverables:

  • Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
  • Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
  • Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.

Prerequisites:
    You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:

    • Biomolecules: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function.

    This unit builds on material covered in the following prerequisite units:

    Evaluation

    Evaluation: NA

    This unit is not evaluated for course marks.

    Contents

    Task:


    Further reading, links and resources

    For the newest published developments in this rapidly evolving field, see the Nature subjects collection on NGS (continuously updated). Some recent papers of interest to us:

    Kashima et al. (2020) Single-cell sequencing techniques from individual to multiomics analyses. Exp Mol Med. (pmid: 32929221)

    Here, we review single-cell sequencing techniques for individual and multiomics profiling in single cells. We mainly describe single-cell genomic, epigenomic, and transcriptomic methods, and examples of their applications. For the integration of multilayered data sets, such as the transcriptome data derived from single-cell RNA sequencing and chromatin accessibility data derived from single-cell ATAC-seq, there are several computational integration methods. We also describe single-cell experimental methods for the simultaneous measurement of two or more omics layers. We can achieve a detailed understanding of the basic molecular profiles and those associated with disease in each cell by utilizing a large number of single-cell sequencing techniques and the accumulated data sets.

    Kempfer & Pombo (2020) Methods for mapping 3D chromosome architecture. Nat Rev Genet 21:207-226. (pmid: 31848476)

    Determining how chromosomes are positioned and folded within the nucleus is critical to understanding the role of chromatin topology in gene regulation. Several methods are available for studying chromosome architecture, each with different strengths and limitations. Established imaging approaches and proximity ligation-based chromosome conformation capture (3C) techniques (such as DNA-FISH and Hi-C, respectively) have revealed the existence of chromosome territories, functional nuclear landmarks (such as splicing speckles and the nuclear lamina) and topologically associating domains. Improvements to these methods and the recent development of ligation-free approaches, including GAM, SPRITE and ChIA-Drop, are now helping to uncover new aspects of 3D genome topology that confirm the nucleus to be a complex, highly organized organelle.

    Ho et al. (2020) Structural variation in the sequencing era. Nat Rev Genet 21:171-189. (pmid: 31729472)

    Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.

    Stark et al. (2019) RNA sequencing: the teenage years. Nat Rev Genet 20:631-656. (pmid: 31341269)

    Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.

    Sedlazeck et al. (2018) Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nat Rev Genet 19:329-346. (pmid: 29599501)

    Several new genomics technologies have become available that offer long-read sequencing or long-range mapping with higher throughput and higher resolution analysis than ever before. These long-range technologies are rapidly advancing the field with improved reference genomes, more comprehensive variant identification and more complete views of transcriptomes and epigenomes. However, they also require new bioinformatics approaches to take full advantage of their unique characteristics while overcoming their complex errors and modalities. Here, we discuss several of the most important applications of the new technologies, focusing on both the currently available bioinformatics tools and opportunities for future research.

    Langmead & Nellore (2018) Cloud computing for genomic data analysis and collaboration. Nat Rev Genet 19:208-219. (pmid: 29379135)

    Next-generation sequencing has made major strides in the past decade. Studies based on large sequencing data sets are growing in number, and public archives for raw sequencing data have been doubling in size every 18 months. Leveraging these data requires researchers to use large-scale computational resources. Cloud computing, a model whereby users rent computers and storage from large data centres, is a solution that is gaining traction in genomics research. Here, we describe how cloud computing is used in genomics for research and large-scale collaborations, and argue that its elasticity, reproducibility and privacy features make it ideally suited for the large-scale reanalysis of publicly available archived data, including privacy-protected data.

    A very informative and influential blog on the newest developments in the field is "Bits of DNA", written by Lior Pachter (Caltech, previously Berkeley). Check it out, and revisit it from time to time.

    Notes


     


    About ...
     
    Author:

    Boris Steipe <boris.steipe@utoronto.ca>

    Created:

    2017-08-05

    Modified:

    2020-09-25

    Version:

    1.1

    Version history:

    • 1.1 2020 contents updates
    • 1.0 First live 2017
    • 0.1 First stub

    This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.