ABC-INT-Genome annotation

From "A B C"
Revision as of 07:31, 20 November 2017 by Boris (talk | contribs)

Integration Unit: Genome annotation


 

Keywords:  Integrator unit: annotate sequences in a genome


 



 


 


Abstract

This page assesses the learning units for data management and sequence analysis of genomic sequence data.


 


This unit ...

Prerequisites

You need to complete the following units before beginning this one:


 


Deliverables

  • Integrator unit: Deliverables will be marked as detailed on this page.


 


Evaluation

This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, or 16 marks if the oral exam is chosen[1].

Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral exam).
 
Interview option
Identify a laboratory whose work includes genome annotation or re-annotation. Get in touch with the PI, a postdoc, or a senior graduate student in the laboratory and interview them in person or by email. Find out:
  • why this work is important;
  • how they approach it methodologically;
  • in particular, what features they are looking for, and what discoveries can be made by looking for these features (get very specific on that point, we are most interested in strategies for interpretation of data);
  • what they have recently learned;
  • what the major challenges, current discussions, or controversies are.
  • Write up your interview on a subpage of your User page of the Student Wiki;
  • Add information that may be required to understand the methodology;
  • Make sure that you have included important literature references.
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Genome_annotation]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
Literature research option
This option requires that a primary publication is available for the MYSPE genome sequence; if there is none, this option is not available.
  • Write a report on the annotation methodology that was used for the MYSPE genome. Note: this is not a review, but a report. Think of a "whitepaper", not a publication. Write for a specialist technical audience (imagine collaborators who want to use the same methods) and be specific enough to provide actionable information.
  • Write your report on a subpage of your User page of the Student Wiki;
  • Make sure that you have included all references and citations.
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Genome_annotation]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
Oral exam option
  • Work through the tasks described in the scenario. Remember to document your work in your journal.
  • Part of your task will involve writing an R script. Place that code in a subpage of your User page on the Student Wiki and link to it from your Journal. (Do not add an evaluation category tag to that code.)
  • Your work must be complete before 21:00 on the day of your exam.
  • Schedule an oral exam by editing the signup page on the Student Wiki. Enter the unit that you are signing up for, and your name. You must have signed up for an exam slot before 21:00 on the day before your exam.
 
Genome sequence analysis option
  • Start a subpage of your User page on the Student Wiki to document your analysis;
  • Work through the tasks described in the scenario, download sequence data and develop an analysis script as required. Keep your script generic, so that you could easily adapt it to analyze a different gene. Keep careful Journal notes of your activities with your analysis.
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Genome_annotation]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.


 



 

Scenario

 

You know that MYSPE has an Mbp1 orthologue. The key questions of functional genome annotation would be: does it work in the same way in MYSPE as in yeast? Does it have the same target genes? Is it regulated by orthologues of other yeast genes, implying the same feedback mechanisms and genetic regulatory circuits? Here we will try to answer just one part of these questions: is the binding motif for Mbp1 conserved? If it is, we could automate the task of finding genes that are potentially regulated by MBP1_MYSPE; if not, we would need to pursue a different strategy of binding site discovery.

Here is how we assess the conservation of the Mbp1 DNA binding motif in MYSPE, working from the orthologue of Cdc6, a pre-replicative complex component:

  • Find the MYSPE orthologue for yeast Cdc6.
  • Fetch 500 nucleotides of upstream genome sequence. (Demonstrate that this is the correct sequence by showing the first 10 translated Cdc6 codons with your sequence.)
  • The yeast Mbp1 canonical binding site is defined by the regular expression [AT]CGCG[AT].
  • Are there CGCG motifs present in your nucleotide sequence?
  • Identify them using a regular expression search. You may find the following code useful:
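patt <- "..CGCG.."            # CGCG core with two flanking characters on each side
m <- gregexpr(patt, mySeq)    # start positions of all non-overlapping matches in mySeq
regmatches(mySeq, m)[[1]]     # the matched substrings themselves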
  • Are there [AT]CGCG or CGCG[AT] motifs? What about [AT]CGCG[AT]?
  • Where are they located? Do they cluster? Are they arranged in a similar way to the yeast binding sites that you visited at UCSC?
  • Interpret your findings. Do they support or refute the idea that MBP1_MYSPE has the same DNA sequence binding specificity as MBP1_SACCE?
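
The motif questions above can be sketched in a few lines of R. This is only a minimal illustration, not a model solution. The sequence mySeq below is a made-up 100 nt stand-in with three planted sites (your real sequence would be 500 nt), the helper findSites() and the vector patterns are hypothetical names, and the code assumes that the last character of mySeq is the nucleotide immediately upstream of the start codon.

```r
# Toy stand-in for the upstream sequence (an assumption for illustration only;
# replace it with the sequence you actually fetched for MYSPE).
mySeq <- paste0(strrep("C", 20), "ACGCGT",   # site flanked on both sides
                strrep("C", 10), "TCGCGC",   # site flanked on the left only
                strrep("C", 30), "CCGCGA",   # site flanked on the right only
                strrep("C", 22))

# Hypothetical helper: start positions of all non-overlapping matches of patt,
# expressed relative to the start codon, so that -1 is the last upstream base.
findSites <- function(patt, s) {
  m <- gregexpr(patt, s)[[1]]
  if (m[1] == -1) return(integer(0))         # gregexpr() reports "no match" as -1
  as.integer(m) - nchar(s) - 1L
}

patterns <- c(core  = "CGCG",
              left  = "[AT]CGCG",
              right = "CGCG[AT]",
              full  = "[AT]CGCG[AT]")

sites <- lapply(patterns, findSites, s = mySeq)
sites                      # positions of each motif variant

# Distances between consecutive core motifs: small values suggest clustering.
spacing <- diff(sites$core)
spacing
```

Note that gregexpr() does not report overlapping matches, so a stretch like CGCGCG counts as a single CGCG site here. Whether a given spacing should count as a "cluster" is a matter of interpretation that you need to justify from the yeast binding sites you have seen.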


 


 


Further reading, links and resources

 


Notes

  1. Note: the oral exam will focus on the unit content but will also cover other material that leads up to it.


 


Self-evaluation

 



 




 

If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.



 

About ...
 
Author:

Boris Steipe <boris.steipe@utoronto.ca>

Created:

2017-08-05

Modified:

2017-11-19

Version:

1.0

Version history:

  • 1.0 First live version
  • 0.1 First stub

This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.