Expected Preparations:

  [BIN-PHYLO-Tree_analysis]
 
  The units listed above are part of this course and contain important preparatory material.  

Keywords: Integrator unit: calculate and analyse a phylogenetic tree

Objectives:

Outcomes:


Deliverables:

Integrator unit: Deliverables can be submitted for course marks. See below for details.


Evaluation:

Material based on this Integrator Unit can be submitted for summative feedback (course marks). It will be marked out of a maximum of 18 marks for a regular submission, or 36 marks if you choose this unit for your Oral Test (see Note 1 at the end of this page).

For your report:

  1. Create a new document in your shared Google drive folder.
  2. Call your document ABC-INT-Phylogeny-<your name>-2022
  3. Work through the tasks described below.
  4. Document your work and your results. Write this at a technical level, like a lab report, and include all details that are needed to make your work reproducible. If you are submitting code, follow the additional instructions for R code.
  5. Include a (CC) license at the end of your document, as instructed at the beginning of the course.
  6. When you are done with everything, go to the Assignments page on Quercus and open the appropriate Integrator Unit submission category. Paste the URL of your report document into the form, and click on Submit Assignment. Your link can be submitted only once and not edited. Also: do not edit your document after it has been submitted.

If you choose this unit for your Oral Test option:

  1. Prepare your report as above.
  2. Be prepared to discuss your findings during the test.
  3. Make sure the report is submitted before your test date (cf. Oral Test instructions).

 

Contents

This page integrates material from the learning units for working with multiple sequence alignments, and building and analysing phylogenetic trees, in a task for evaluation.

 

Scenario and background

A full gene tree for Mbp1 homologues captures the history from the very first ancestral protein, through the time when a number of paralogues were present in the cenancestor of all fungi, to how these ancestral genes further split and were lost as the kingdom of fungi evolved. You have all the information and tools you need to make this history available when you build a full APSES domain tree.

 

Building a tree

  • Produce a phylogenetic tree from APSES domains of all proteins in myDB. This includes all APSES domain proteins from MYSPE that you have found with PSI-BLAST. You will find this bit of code useful to get you started:
library(msa)

# Align all sequences in the database + KILA_ESCCO
mySeq <- myDB$protein$sequence
names(mySeq) <- myDB$protein$name
mySeq <- c(mySeq,
           "IDGEIIHLRAKDGYINATSMCRTAGKLLSDYTRLKTTQEFFDELSRDMGIPISELIQSFKGGRPENQGTWVHPDIAINLAQ")
names(mySeq)[length(mySeq)] <- "KILA_ESCCO"

mySeqMSA <- msaClustalOmega(AAStringSet(mySeq)) # too many sequences for MUSCLE

# get the sequence of the SACCE APSES domain
sel <- myDB$protein$name == "MBP1_SACCE"
proID <- myDB$protein$ID[sel]

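# ID of the "APSES fold" feature, then ID of its annotation on MBP1_SACCE
ftrID <- myDB$feature$ID[myDB$feature$name == "APSES fold"]
fanID <- myDB$annotation$ID[myDB$annotation$proteinID == proID &
                            myDB$annotation$featureID == ftrID]

# select rows by matching IDs, so this works for character IDs too
start <- myDB$annotation$start[myDB$annotation$ID == fanID]
end   <- myDB$annotation$end[myDB$annotation$ID == fanID]

SACCEapses <- substring(myDB$protein$sequence[myDB$protein$ID == proID],
                        start, end)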

# extract the APSES domains from the MSA
APSESmsa <- fetchMSAmotif(mySeqMSA, SACCEapses)

# Now produce an ML phylogenetic tree ...
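
The final comment in the code above leaves the tree-building step open. One possible way to continue is sketched below. It rests on several assumptions that are not part of this page: the CRAN package phangorn is used (the page does not prescribe a package), the WAG substitution model is chosen only for illustration, and APSESmsa can be converted to a named character vector of aligned sequences (e.g. with as.character()). The toy alignment is made-up data that stands in for APSESmsa so that the sketch runs on its own.

```r
# Sketch only: assumes the CRAN package "phangorn" is installed.
library(phangorn)

# Hypothetical stand-in for APSESmsa: a named character vector of
# aligned, equal-length sequences (made-up data, not real APSES domains).
toyAli <- c(seqA = "MKVLAAGIVGLT",
            seqB = "MKVLSAGIVGLT",
            seqC = "MRVLSAGLVGIT",
            seqD = "MRILSAGLVGIS",
            seqE = "MRILTAGLAGIS")

# One row per sequence, one column per alignment position;
# row names become the tip labels of the tree.
aliMat <- do.call(rbind, strsplit(toyAli, ""))
aliDat <- phyDat(aliMat, type = "AA")

# Starting tree: neighbour-joining on maximum-likelihood distances
startTree <- NJ(dist.ml(aliDat, model = "WAG"))

# Initial likelihood, then optimise topology (NNI) and branch lengths
fit <- pml(startTree, data = aliDat, model = "WAG")
fit <- optim.pml(fit, model = "WAG", rearrangement = "NNI",
                 control = pml.control(trace = 0))

mlTree <- fit$tree   # object of class "phylo"
```

For the real data you would replace toyAli with your aligned APSES domains. The choice of substitution model and rearrangement strategy is yours to make and to justify in your report.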

 

Interpreting the tree

Interpret the tree with two objectives.

  1. How many APSES domain proteins did the last common ancestor (LCA, or MRCA) of all fungi have?
  2. What is the evolutionary history of the APSES domain proteins in MYSPE? Were genes lost? Did duplications occur?

Annotate your tree. Include the annotation in your report.
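
Before you can read duplications and losses off the tree, it usually helps to root it and to make the internal nodes addressable. The following is a minimal sketch, assuming the CRAN package ape and assuming that KILA_ESCCO is meant to serve as the outgroup (the page adds this sequence to the alignment but does not state its role). The Newick string and the MYSPE tip names are hypothetical placeholders for your own tree.

```r
# Sketch only: assumes the CRAN package "ape" is installed.
library(ape)

# Hypothetical toy tree; replace with your own ML tree.
toyTree <- read.tree(text = paste0(
  "((MBP1_SACCE:0.2,APSES1_MYSPE:0.3):0.1,",
  "(APSES2_MYSPE:0.4,APSES3_MYSPE:0.3):0.2,",
  "KILA_ESCCO:0.9);"))

# Root on the assumed outgroup and order the clades for readability
rooted <- root(toyTree, outgroup = "KILA_ESCCO", resolve.root = TRUE)
rooted <- ladderize(rooted)

# Plot with internal node numbers, which can be referenced
# when annotating duplication and loss events
plot(rooted, cex = 0.8)
nodelabels(cex = 0.6)
add.scale.bar()
```

The node numbers shown by nodelabels() give you stable handles for describing each inferred event in your report.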

 

Questions, comments

If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.

Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.

References

Page ID: ABC-INT-Phylogeny

Author:
Boris Steipe ( <boris.steipe@utoronto.ca> )
Created:
2017-08-05
Last modified:
2022-11-11
Version:
2.0
Version History:
–  2.0 The Phylip era is over
–  1.5 Update for 2022 format
–  1.4 Regenerated inadvertently deleted “Report option” instructions.
–  1.3 Edit policy update
–  1.2 2020 Updates
–  1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
–  1.0 First live version
–  0.1 First stub
Tagged with:
–  Integrator
–  Live
–  Has R code examples

 

[END]


  1. Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.