Difference between revisions of "ABC-INT-Phylogeny"

Line 28: Line 28:
 
== Abstract ==
 
== Abstract ==
 
<section begin=abstract />
 
<section begin=abstract />
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "abstract" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "abstract" -->
 
This page integrates material from the learning units for working with multiple sequence alignments, and building and analysing phylogenetic trees, in a task for evaluation.
 
This page integrates material from the learning units for working with multiple sequence alignments, and building and analysing phylogenetic trees, in a task for evaluation.
 
<section end=abstract />
 
<section end=abstract />
Line 37: Line 37:
 
== This unit ... ==
 
== This unit ... ==
 
=== Prerequisites ===
 
=== Prerequisites ===
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "prerequisites" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "prerequisites" -->
<!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" -->
+
<!-- included from "./data/ABC-unit_components.txt", section: "notes-prerequisites" -->
 
You need to complete the following units before beginning this one:
 
You need to complete the following units before beginning this one:
 
*[[BIN-PHYLO-Tree_analysis|BIN-PHYLO-Tree_analysis (Analysing Phylogenetic Trees)]]
 
*[[BIN-PHYLO-Tree_analysis|BIN-PHYLO-Tree_analysis (Analysing Phylogenetic Trees)]]
Line 46: Line 46:
  
 
=== Deliverables ===
 
=== Deliverables ===
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "deliverables" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "deliverables" -->
<!-- included from "ABC-unit_components.wtxt", section: "deliverables-integrator" -->
+
<!-- included from "./data/ABC-unit_components.txt", section: "deliverables-integrator" -->
 
*<b>Integrator unit</b>: Deliverables will be marked as detailed on this page.
 
*<b>Integrator unit</b>: Deliverables will be marked as detailed on this page.
  
Line 54: Line 54:
  
 
=== Evaluation ===
 
=== Evaluation ===
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "evaluation" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "evaluation" -->
 
This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, resp. 16 marks for the oral exam<ref>Note: the oral exam will focus on the unit content but will also cover other material that leads up to it</ref>.
 
This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, resp. 16 marks for the oral exam<ref>Note: the oral exam will focus on the unit content but will also cover other material that leads up to it</ref>.
 
:Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral exam).
 
:Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral exam).
Line 109: Line 109:
 
<div id="BIO">
 
<div id="BIO">
 
== Contents ==
 
== Contents ==
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "contents" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "contents" -->
  
  
Line 246: Line 246:
  
 
== Notes ==
 
== Notes ==
<!-- included from "../components/ABC-INT-Phylogeny.components.wtxt", section: "notes" -->
+
<!-- included from "./components/ABC-INT-Phylogeny.components.txt", section: "notes" -->
<!-- included from "ABC-unit_components.wtxt", section: "notes" -->
+
<!-- included from "./data/ABC-unit_components.txt", section: "notes" -->
 
<references />
 
<references />
  
Line 259: Line 259:
  
  
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_ask" -->
+
<!-- included from "./data/ABC-unit_components.txt", section: "ABC-unit_ask" -->
  
 
----
 
----
Line 288: Line 288:
 
</div>
 
</div>
 
[[Category:ABC-units]]
 
[[Category:ABC-units]]
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_footer" -->
+
<!-- included from "./data/ABC-unit_components.txt", section: "ABC-unit_footer" -->
  
 
{{CC-BY}}
 
{{CC-BY}}

Revision as of 01:02, 6 January 2018

Abstract

This page integrates material from the learning units for working with multiple sequence alignments, and building and analysing phylogenetic trees, in a task for evaluation.


 


This unit ...

Prerequisites

You need to complete the following units before beginning this one:


 


Deliverables

  • Integrator unit: Deliverables will be marked as detailed on this page.


 


Evaluation

This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, or 16 marks if the oral exam is chosen[1].

Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral exam).
 
Report option
  • Work through the tasks described in the scenario.
  • Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix;
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Phylogeny]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
Interview option
Identify a laboratory whose work includes constructing and evaluating phylogenetic trees. Get in touch with the PI, a postdoc or a senior graduate student in the laboratory and interview them in person or by email. Find out:
  • why this work is important;
  • how they approach it methodologically;
  • in particular, how they quantify the severity of an effect (get technical on that point);
  • what they have recently learned;
  • what the major challenges, current discussions, or controversies are.
  • Write up your interview on a subpage of your User page on the Student Wiki;
  • Add information that may be required to understand the methodology;
  • Make sure that you have included important literature references.
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Phylogeny]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
Literature research option
Navigate to the Phylogeny Literature Research Topics page on the Student Wiki.
  • Pick a topic and enter your name in the table to claim it.
  • Write a report on your research. Note: this is not a review, but a report. Think of a "whitepaper", not a publication. Write for a specialist technical audience and be specific enough to provide actionable information.
  • Write your report on a subpage of your User page on the Student Wiki;
  • Make sure that you have included all references and citations.
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Phylogeny]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
Oral exam option
  • Work through the tasks described in the scenario. Remember to document your work in your journal.
  • Part of your task will involve writing an R script. Place that code on a subpage of your User page on the Student Wiki and link to it from your Journal. (Do not add an evaluation category tag to that code.)
  • Your work must be complete before 21:00 on the day before your exam.
  • Schedule an oral exam by editing the signup page on the Student Wiki. Enter the unit that you are signing up for, and your name. You must have signed up for an exam slot before 21:00 on the day before your exam.
 
R code option
  • Work through the tasks described in the scenario and develop code as required.
  • Put your code on a subpage of your User page on the Student Wiki;
  • When you are done with everything, add the following category tag to the page:
[[Category:EVAL-INT-Phylogeny]]
Do not change your submission page after this tag has been added. The page will be marked and the category tag will be removed by the instructor.


 


Contents

For the Report Option ...

Choose one of the two tasks below:

 
Does masking improve the tree?
 

Task:

  1. Produce a phylogenetic tree from full-length Mbp1 orthologues for the reference species and MYSPE. Do not apply masking.
  2. Produce a second phylogenetic tree from full-length Mbp1 orthologues for the reference species and MYSPE. Apply masking to delete all columns that have more than 2/3 gap characters.
  3. Determine which tree is "more correct" by calculating tree distances to the species tree.
  4. Report your findings.
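
The masking rule in step 2 could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the alignment is assumed to have been converted to a character matrix already (one row per sequence, one column per alignment position, gaps coded as "-"), and the helper name maskGapColumns as well as the toy sequences are hypothetical, not part of the course code.

```r
# Hypothetical helper: drop alignment columns whose gap fraction exceeds a limit.
# "aln" is assumed to be a character matrix: one row per sequence,
# one column per alignment position, gaps coded as "-".
maskGapColumns <- function(aln, maxGapFraction = 2/3) {
  gapFraction <- colMeans(aln == "-")   # fraction of gap characters per column
  aln[ , gapFraction <= maxGapFraction, drop = FALSE]
}

# Toy alignment (made-up sequences, for illustration only)
seqs <- c(A = "MK-T-",
          B = "MR---",
          C = "MKAT-",
          D = "MK-TG")
aln <- do.call(rbind, strsplit(seqs, ""))

# Columns 3 and 5 have 3/4 gaps and are removed; column 4 (1/4 gaps) is kept.
masked <- maskGapColumns(aln)
maskedSeqs <- apply(masked, 1, paste, collapse = "")
print(maskedSeqs)
```

The threshold is a parameter, so the same sketch would apply to other cutoffs; columns whose gap fraction is exactly at the limit are kept, matching the wording "more than".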
 
Does adding characters improve the tree?
 

Task:

  1. Produce a phylogenetic tree from full-length Mbp1 orthologues for the reference species and MYSPE. Do not apply masking.
  2. Produce a second phylogenetic tree only from APSES domains of Mbp1 orthologues for the reference species and MYSPE. Again, do not apply masking.
  3. Determine which tree is "more correct" by calculating tree distances to the species tree.
  4. Report your findings.
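
One possible way to approach step 3 is to compute a topological distance between each candidate tree and the species tree. The sketch below uses dist.topo() from the ape package as a stand-in; whether this is the distance measure intended for the task is an assumption, and the five-taxon trees and variable names are made up for illustration.

```r
library(ape)

# Made-up five-taxon trees, for illustration only. In the task these would be
# the species tree and the two gene trees (the names here are hypothetical).
speciesTree <- read.tree(text = "((A,B),(C,D),E);")
treeOne     <- read.tree(text = "((A,B),(C,D),E);")   # same topology
treeTwo     <- read.tree(text = "((A,C),(B,D),E);")   # different groupings

# Topological distance (method "PH85") of each candidate to the species tree.
# The trees are assumed to contain exactly the same tip labels.
dOne <- as.numeric(dist.topo(speciesTree, treeOne, method = "PH85"))
dTwo <- as.numeric(dist.topo(speciesTree, treeTwo, method = "PH85"))

cat("distance treeOne:", dOne, "\n")
cat("distance treeTwo:", dTwo, "\n")
# The candidate with the smaller distance is topologically closer
# to the species tree.
```

A purely topological distance ignores branch lengths, so a report may want to state which measure was used and why.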


 

For the Oral Exam Option ...

Interpret the full APSES tree.

 

Task:

  • Produce a phylogenetic tree from APSES domains of all proteins in myDB. This includes all APSES domain proteins from MYSPE that you have found with PSI-BLAST. (Caution: the proml program may take several hours to compute this tree.) You will find this bit of code useful to get you started:
library(msa)

# Align all sequences in the database + KILA_ESCCO
mySeq <- myDB$protein$sequence
names(mySeq) <- myDB$protein$name
mySeq <- c(mySeq,
           "IDGEIIHLRAKDGYINATSMCRTAGKLLSDYTRLKTTQEFFDELSRDMGIPISELIQSFKGGRPENQGTWVHPDIAINLAQ")
names(mySeq)[length(mySeq)] <- "KILA_ESCCO"

mySeqMSA <- msaClustalOmega(AAStringSet(mySeq)) # too many sequences for MUSCLE


# get the sequence of the SACCE APSES domain
sel <- myDB$protein$name == "MBP1_SACCE"
proID <- myDB$protein$ID[sel]

sel <- myDB$feature$ID[myDB$feature$name == "APSES fold"]
fanID <- myDB$annotation$ID[myDB$annotation$proteinID == proID &
                            myDB$annotation$featureID == sel]
start <- myDB$annotation$start[fanID]
end   <- myDB$annotation$end[fanID]

SACCEapses <- substring(myDB$protein$sequence[proID], start, end)

# extract the APSES domains from the MSA
APSESmsa <- fetchMSAmotif(mySeqMSA, SACCEapses)

# Produce the phylogenetic tree ...
  • Interpret the tree with two objectives.
  • (A) how many APSES domain proteins did the last common ancestor (LCA) of all fungi have?
  • (B) what is the evolutionary history of the APSES domain proteins in MYSPE? Were genes lost? Did duplications occur?
  • Print your tree and annotate it. Bring it to the exam and be prepared to discuss your interpretation.
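
For printing and annotating, a sketch along the following lines may be a starting point. It assumes the ape package and assumes that KILA_ESCCO is meant to serve as the outgroup, which this page does not state explicitly. The toy tree, the tip names other than MBP1_SACCE and KILA_ESCCO, and the output file name are hypothetical.

```r
library(ape)

# Made-up tree for illustration; tip names other than MBP1_SACCE and
# KILA_ESCCO are hypothetical placeholders.
tree <- read.tree(
  text = "((MBP1_SACCE,MBP1_XXXXX),(SWI4_YYYYY,SWI4_XXXXX),KILA_ESCCO);")

# Root on KILA_ESCCO (assumption: it serves as the outgroup).
rooted <- root(tree, outgroup = "KILA_ESCCO", resolve.root = TRUE)

# Write the plot to a PDF file for printing. Node numbers make it easier
# to refer to specific duplication or loss events when annotating by hand.
outFile <- file.path(tempdir(), "apsesTree.pdf")
pdf(outFile, width = 8, height = 10)
plot(rooted, cex = 0.8)
nodelabels(cex = 0.6)   # label internal nodes with their node numbers
dev.off()
```

With a rooted tree, clades that contain one gene per species suggest orthologue groups, while repeated species within a clade hint at duplications; the interpretation itself remains part of the task.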


 

For the R-code option ...

Does adding species improve the tree?
 

Task:

  • Produce a MSA from APSES domains of all proteins in myDB.
library(msa)

# Align all sequences in the database + KILA_ESCCO
mySeq <- myDB$protein$sequence
names(mySeq) <- myDB$protein$name
mySeq <- c(mySeq,
           "IDGEIIHLRAKDGYINATSMCRTAGKLLSDYTRLKTTQEFFDELSRDMGIPISELIQSFKGGRPENQGTWVHPDIAINLAQ")
names(mySeq)[length(mySeq)] <- "KILA_ESCCO"

mySeqMSA <- msaClustalOmega(AAStringSet(mySeq)) # too many sequences for MUSCLE


# get the sequence of the SACCE APSES domain
sel <- myDB$protein$name == "MBP1_SACCE"
proID <- myDB$protein$ID[sel]

sel <- myDB$feature$ID[myDB$feature$name == "APSES fold"]
fanID <- myDB$annotation$ID[myDB$annotation$proteinID == proID &
                            myDB$annotation$featureID == sel]
start <- myDB$annotation$start[fanID]
end   <- myDB$annotation$end[fanID]

SACCEapses <- substring(myDB$protein$sequence[proID], start, end)

# extract the APSES domains from the MSA
APSESmsa <- fetchMSAmotif(mySeqMSA, SACCEapses)
  • Write an R-script that does the following:
    • pick ten random sequences plus the Mbp1 orthologues plus KILA_ESCCO
    • remove all other sequences from the alignment
    • mask all columns that have more than 80% gap characters
    • produce a phylogenetic tree from this input data
    • drop all tips from your tree that are not Mbp1 orthologues and not KILA_ESCCO. This code will be useful:
# assuming your new tree is called "allApsTree";
# drop.tip() is provided by the ape package
sel <- ! (allApsTree$tip.label %in% fungiTree$tip.label)
newTree <- drop.tip(allApsTree, allApsTree$tip.label[sel])
  • Is this tree more similar to fungiTree than apsTree was?
  • Submit your script, significant data, and the results.
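
The first step of the script (picking ten random sequences plus the Mbp1 orthologues plus KILA_ESCCO) might be sketched like this in base R. The sequence names are invented, and the assumption that Mbp1 orthologues can be recognized by a "MBP1_" name prefix is only for illustration; the actual names in myDB may need a different selection rule.

```r
# Hypothetical sequence names; in the task these would be the names of the
# sequences in the alignment. Names starting with "MBP1_" are assumed to be
# the Mbp1 orthologues.
allNames <- c(paste0("MBP1_", c("SACCE", "AAAAA", "BBBBB")),
              paste0("OTHER", 1:20),
              "KILA_ESCCO")

set.seed(112358)   # arbitrary seed, so that the random pick is reproducible

# Sequences that must always be kept
isKeep <- grepl("^MBP1_", allNames) | allNames == "KILA_ESCCO"

# Ten random sequences from the remainder, sampled without replacement
others <- allNames[!isKeep]
picked <- sample(others, 10)

selNames <- c(allNames[isKeep], picked)
print(selNames)
```

Setting a seed keeps the selection reproducible, which helps when the script and its results are submitted together; the resulting names could then be used to subset the alignment before masking.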


 


 


Further reading, links and resources

 


Notes

  1. Note: the oral exam will focus on the unit content but will also cover other material that leads up to it


 



 




 

If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.



 

About ...
 
Author:

Boris Steipe <boris.steipe@utoronto.ca>

Created:

2017-08-05

Modified:

2017-11-01

Version:

1.1

Version history:

  • 1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
  • 1.0 First live version
  • 0.1 First stub

This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.