Difference between revisions of "BIN-GENOME-Genome Annotation"

Latest revision as of 01:28, 6 September 2021

Genome annotation

(Genome contents; ENCODE; Genome annotation methods.)

Abstract:

Introduction to genome annotation: the content of genomes - what to look for; identifying genes, and keeping up-to-date on methods.

Objectives:
This unit will ...

... introduce categories of genome contents, as defined e.g. through the ENCODE project, and discuss annotation methods.

Outcomes:
After working through this unit you ...

... are familiar with the contents of genomes, some methods to annotate protein genes, and sources for genomes;
... know how to get up-to-date information on genome annotation workflows.

Deliverables:

Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.

Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.

Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.

Prerequisites:
This unit builds on material covered in the following prerequisite units:

BIN-Genome-Sequencing (Genome sequencing)

Further reading, links and resources

General

Salzberg (2019) Next-generation genome annotation: we still struggle to get it right. Genome Biol 20:92. (pmid: 31097009)

Ejigu & Jung (2020) Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing. Biology (Basel) 9:. (pmid: 32962098)

ENCODE

The ENCODE project

Davis et al. (2018) The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res 46:D794-D801. (pmid: 29126249)

ENCODE Project Consortium (2011) A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9:e1001046. (pmid: 21526222)

Annotation example papers

Lok et al. (2017) De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis). G3 (Bethesda) 7:755-773. (pmid: 28087693)

Seo et al. (2016) De novo assembly and phasing of a Korean human genome. Nature 538:243-247. (pmid: 27706134)

Abstract:

Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation across population groups. Here we report the de novo assembly and haplotype phasing of the Korean individual AK1 (ref. 1) using single-molecule real-time sequencing, next-generation mapping, microfluidics-based linked reads, and bacterial artificial chromosome (BAC) sequencing approaches. Single-molecule sequencing coupled with next-generation mapping generated a highly contiguous assembly, with a contig N50 size of 17.9 Mb and a scaffold N50 size of 44.8 Mb, resolving 8 chromosomal arms into single scaffolds. The de novo assembly, along with local assemblies and spanning long reads, closes 105 and extends into 72 out of 190 euchromatic gaps in the reference genome, adding 1.03 Mb of previously intractable sequence. High concordance between the assembly and paired-end sequences from 62,758 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variants by direct comparison of the assembly with the human reference, identifying thousands of breakpoints that, to our knowledge, have not been reported before. Many of the insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly with short reads, long reads and linked reads from whole-genome sequencing and with short reads from 31,719 BAC clones, thereby achieving phased blocks with an N50 size of 11.6 Mb. Haplotigs assembled from single-molecule real-time reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable major histocompatibility complex region as well as demonstrating allele configuration in clinically relevant genes such as CYP2D6. This work presents the most contiguous diploid human genome assembly so far, with extensive investigation of unreported and Asian-specific structural variants, and high-quality haplotyping of clinically relevant alleles for precision medicine.
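
The abstract above reports contig, scaffold and phased-block sizes as N50 values. As a reminder of what that statistic measures, here is a minimal sketch in Python. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or from any assembly tool. The sketch uses the common definition: sort the sequences from longest to shortest, and report the length of the sequence at which the running total first reaches at least half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")

    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly: nine contigs, 54 units in total.
# 10 + 9 + 8 = 27 reaches half of 54, so the N50 is 8.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 8
```

A larger N50 means that more of the assembly sits in long pieces, which is why it is used as a measure of contiguity. It says nothing about whether those pieces are assembled correctly.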

Amemiya et al. (2013) The African coelacanth genome provides insights into tetrapod evolution. Nature 496:311-6. (pmid: 23598338)

Specific topics

Copy Number Variation

Zarrei et al. (2015) A copy number variation map of the human genome. Nat Rev Genet 16:172-83. (pmid: 25645873)

miRNA

Bracken et al. (2016) A network-biology perspective of microRNA function and dysfunction in cancer. Nat Rev Genet 17:719-732. (pmid: 27795564)

Epigenomics

Stricker et al. (2017) From profiles to function in epigenomics. Nat Rev Genet 18:51-66. (pmid: 27867193)

Notes

About ...

Author:

Boris Steipe <boris.steipe@utoronto.ca>

Created:

2017-08-05

Modified:

2020-09-25

Version:

1.1

Version history:

1.1 2020 Updates
1.0 First live version
0.1 First stub

This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.

@@ Line 1: / Line 1: @@
-<div id="BIO">
+<div id="ABC">
-  <div class="b1">
+<div style="padding:5px; border:1px solid #000000; background-color:#b3dbce; font-size:300%; font-weight:400; color: #000000; width:100%;">
 Genome annotation
-  </div>
+<div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#b3dbce; font-size:30%; font-weight:200; color: #000000; ">
+(Genome contents; ENCODE; Genome annotation methods.)
-  {{Vspace}}
+</div>
-<div class="keywords">
-<b>Keywords:</b>&nbsp;
-EnCode
 </div>
-{{Vspace}}
+{{Smallvspace}}
-__TOC__
-{{Vspace}}
-{{STUB}}
-{{Vspace}}
+<div style="padding:5px; border:1px solid #000000; background-color:#b3dbce33; font-size:85%;">
+<div style="font-size:118%;">
+<b>Abstract:</b><br />
+<section begin=abstract />
+Introduction to genome annotation: the content of genomes - what to look for; identifying genes, and keeping up-to-date on methods.
+<section end=abstract />
+</div>
+<!-- ============================  -->
+<hr>
+<table>
+<tr>
+<td style="padding:10px;">
+<b>Objectives:</b><br />
+This unit will ...
+* ... introduce categories of genome contents, as defined e.g. through the ENCODE project, and discuss annotation methods.
+</td>
+<td style="padding:10px;">
+<b>Outcomes:</b><br />
+After working through this unit you ...
+* ... are familiar with the contents of genomes, some methods to annotate protein genes, and sources for genomes;
+* ... know how to get up-to-date information on genome annotation workflows.
+</td>
+</tr>
+</table>
+<!-- ============================  -->
+<hr>
+<b>Deliverables:</b><br />
+<section begin=deliverables />
+<li><b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.</li>
+<li><b>Journal</b>: Document your progress in your [[FND-Journal|Course Journal]]. Some tasks may ask you to include specific items in your journal. Don't overlook these.</li>
+<li><b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|'''insights!''' page]].</li>
+<section end=deliverables />
+<!-- ============================  -->
+<hr>
+<section begin=prerequisites />
+<b>Prerequisites:</b><br />
+This unit builds on material covered in the following prerequisite units:<br />
+*[[BIN-Genome-Sequencing|BIN-Genome-Sequencing (Genome sequencing)]]
+<section end=prerequisites />
+<!-- ============================  -->
 </div>
-<div id="ABC-unit-framework">
-== Abstract ==
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "abstract" -->
-...
-{{Vspace}}
+{{Smallvspace}}
-== This unit ... ==
-=== Prerequisites ===
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "prerequisites" -->
-<!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" -->
-You need to complete the following units before beginning this one:
-*[[BIN-Genome-NGS_bioinformatics]]
-{{Vspace}}
+{{Smallvspace}}
-=== Objectives ===
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "objectives" -->
-...
-{{Vspace}}
+__TOC__
-=== Outcomes ===
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "outcomes" -->
-...
-{{Vspace}}
-=== Deliverables ===
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "deliverables" -->
-<!-- included from "ABC-unit_components.wtxt", section: "deliverables-time_management" -->
-*<b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
-<!-- included from "ABC-unit_components.wtxt", section: "deliverables-journal" -->
-*<b>Journal</b>: Document your progress in your [[FND-Journal|course journal]].
-<!-- included from "ABC-unit_components.wtxt", section: "deliverables-insights" -->
-*<b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|insights! page]].
 {{Vspace}}
@@ Line 70: / Line 65: @@
 === Evaluation ===
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "evaluation" -->
-<!-- included from "ABC-unit_components.wtxt", section: "eval-none" -->
 <b>Evaluation: NA</b><br />
-:This unit is not evaluated for course marks.
+<div style="margin-left: 2rem;">This unit is not evaluated for course marks.</div>
+== Contents ==
-{{Vspace}}
+{{Task|1=
+*Read the introductory notes on {{ABC-PDF|BIN-Genome-Annotation|the annotation of genome sequences}}.
+* Visit the [https://rast.nmpdr.org/ ''RAST'' server website] (more publications are linked from there). For many currently published genome-sequencing projects this is the standard of annotation.
+{{#pmid:18261238}}
+}}
-</div>
-<div id="BIO">
-== Contents ==
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "contents" -->
-...
-{{Vspace}}
 == Further reading, links and resources ==
-<!-- {{#pmid: 19957275}} -->
-<!-- {{WWW|WWW_GMOD}} -->
-<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
-{{Vspace}}
+;General:
+{{#pmid: 31097009}}
+{{#pmid: 32962098}}
+{{Smallvspace}}
-== Notes ==
+----
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "notes" -->
-<!-- included from "ABC-unit_components.wtxt", section: "notes" -->
-<references />
-{{Vspace}}
+;ENCODE:
+<div class="reference-box">[https://www.encodeproject.org/ '''The ENCODE project''']</div>
+{{#pmid: 29126249}}
+{{#pmid: 21526222}}
+{{Smallvspace}}
-</div>
+----
-<div id="ABC-unit-framework">
-== Self-evaluation ==
-<!-- included from "../components/BIN-Genome-Annotation.components.wtxt", section: "self-evaluation" -->
-<!--
-=== Question 1===
-Question ...
+;Annotation example papers
+{{#pmid: 28087693}}<!-- 2017 Beaver -->
+{{#pmid: 27706134}}<!-- 2016 Korean genome with phasing -->
+{{#pmid: 23598338}}<!-- 2013 Coelacanth -->
-<div class="toccolours mw-collapsible mw-collapsed" style="width:800px">
-Answer ...
-<div class="mw-collapsible-content">
-Answer ...
-</div>
+----
-  </div>
+;Specific topics
-  {{Vspace}}
+:;Copy Number Variation:
+:{{#pmid: 25645873}}
+{{Smallvspace}}
+:;miRNA:
+:{{#pmid: 27795564}}
+{{Smallvspace}}
+:;Epigenomics:
+:{{#pmid: 27867193}}
--->
+== Notes ==
+<references />
 {{Vspace}}
-{{Vspace}}
-<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_ask" -->
-----
-{{Vspace}}
-<b>If in doubt, ask!</b> If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
-----
-{{Vspace}}
 <div class="about">
@@ Line 151: / Line 128: @@
 :2017-08-05
 <b>Modified:</b><br />
-:2017-08-05
+:2020-09-25
 <b>Version:</b><br />
-:0.1
+:1.1
 <b>Version history:</b><br />
+*1.1 2020 Updates
+*1.0 First live version
 *0.1 First stub
 </div>
-[[Category:ABC-units]]
-<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_footer" -->
 {{CC-BY}}
+[[Category:ABC-units]]
+{{UNIT}}
+{{LIVE}}
 </div>
 <!-- [END] -->