Expected Preparations:

  [BIN-SX]
Homology_modelling
 
  The units listed above are part of this course and contain important preparatory material.  

Keywords: Integrator unit: create a homology model and assess the role of sequence conservation

Objectives:

Outcomes:


Deliverables:

Integrator unit: Deliverables can be submitted for course marks. See below for details.


Evaluation:

Material based on this Integrator Unit can be submitted for summative feedback (course marks). It will be marked out of a maximum of 18 marks for a regular submission, or 36 marks if you choose this unit for your Oral Test (see note 1).

For your report:

  1. Create a new document in your shared Google drive folder.
  2. Call your document ABC-INT-Homology-modelling-<your name>-2022
  3. Work through the tasks described below.
  4. Document your work and your results. Write this at a technical level, like a lab report and include all details that are needed to make your work reproducible. Follow the additional instructions for R-code in case you are submitting code.
  5. You are asked to produce several ChimeraX images. Either write the required ChimeraX commands into a script, or use the CX() function to execute them. In any case: you should be able to recreate the image by running your script. No manual commands through the ChimeraX menu interface should be required. Document your script in an appendix to your report.
  6. Include a (CC) license at the end of your document, as instructed at the beginning of the course.
  7. When you are done with everything, go to the Assignments page on Quercus and open the appropriate Integrator Unit submission category. Paste the URL of your report document into the form, and click on Submit Assignment. Your link can be submitted only once and not edited. Also: do not edit your document after it has been submitted.
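
As a minimal sketch of the scripted approach in step 5, the following Python snippet writes a ChimeraX command file that can be opened in ChimeraX to recreate an image without any menu interaction. The function names, the file names, and the structure identifier 1sw6 (assumed here to be the Swi6 ankyrin domain entry; verify this before use) are illustrative assumptions, not part of the course material. The course's own CX() function in R would be an equally valid route.

```python
from pathlib import Path


def build_cxc(structure_id: str, image_name: str, width: int = 1200) -> list[str]:
    """Return ChimeraX commands that load one structure and save one image."""
    return [
        f"open {structure_id}",   # fetch or open the structure
        "hide atoms",             # start from a clean display
        "show cartoons",          # backbone cartoon representation
        "rainbow",                # colour from N- to C-terminus
        "set bgColor white",      # white background for reports
        f"save {image_name} width {width} supersample 3",
    ]


def write_cxc(commands: list[str], path: str) -> Path:
    """Write the commands to a .cxc file, one command per line."""
    out = Path(path)
    out.write_text("\n".join(commands) + "\n", encoding="utf-8")
    return out


if __name__ == "__main__":
    # "1sw6" is an assumed identifier; replace it with your own model file.
    script = write_cxc(build_cxc("1sw6", "swi6_ankyrin.png"), "make_figure.cxc")
    print(f"Wrote {script}")
```

Keeping the commands in a list makes it easy to paste the same lines into the appendix of the report.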

If you choose this unit for your Oral Test option:

  1. Prepare your report as above.
  2. Be prepared to discuss your findings during the test.
  3. Make sure the report is submitted before your test date (cf. Oral Test instructions).

 

Contents

This page integrates material from the learning units for working with multiple sequence alignments and structure data in a task for evaluation.

 

Scenario and background

You have collected the APSES domain proteins of MYSPE in your protein database and this is now a useful collection of sequences with a shared fold that have evolved separately - but under similar constraints - for hundreds of millions of years.

We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This would, for example, allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues. You have all the tools to try this at some point.
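
To illustrate what "conserved" can mean quantitatively, here is a small sketch that scores each alignment column by its Shannon entropy (0 bits means fully conserved, higher values mean more variable). The toy alignment and the function names are assumptions made for this example; the course may use a different conservation score. Running the same function separately on an orthologue set and on a paralogue set would let you compare the two profiles column by column.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0  # an all-gap column carries no residue information
    n = len(residues)
    counts = Counter(residues)
    return sum(-(k / n) * math.log2(k / n) for k in counts.values())


def conservation_profile(alignment: list[str]) -> list[float]:
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*alignment)]


if __name__ == "__main__":
    toy = ["ACDE", "ACDF", "ACEG", "A-EH"]  # hypothetical aligned sequences
    for i, h in enumerate(conservation_profile(toy), start=1):
        print(f"column {i}: {h:.2f} bits")
```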

For this assignment however we are going to look at conservation in the ankyrin domains. Their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues in this set - although it is not in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation since it dimerizes with Mbp1 in the MBF complex (as well as dimerizing with Swi4 in the SBF complex).

Ankyrin domains are among the most widely distributed protein interaction modules and they are found in a wide variety of functionally diverse proteins. One might think that it is an advantage to work with sequences for which much data is available, but in this case, the abundance of examples actually turns into a liability: database searches come up with so many hits that the biologically or functionally significant ones easily get drowned out in the noise.

There are two sources of structure we will compare. One is the “classic” homology modeling approach: take a known structure, change some amino acids, then find a reasonable model that shows how the sequence changes would be accommodated. This is the SwissModel approach.
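
The "change some amino acids" step can be made concrete with a short sketch: given a pairwise alignment of template and target, list the positions where a side chain has to be replaced, using template residue numbering. The alignment strings and the function name are invented for illustration, and this is only the bookkeeping part; a modelling server additionally rebuilds loops at indels and refines the geometry.

```python
def substitutions(template_aln: str, target_aln: str, first_residue: int = 1):
    """List (template_number, template_residue, target_residue) for
    aligned columns in which the two sequences differ.

    Columns with a gap in either sequence are skipped: they are indels,
    which need loop modelling rather than a simple residue replacement.
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")
    changes = []
    number = first_residue - 1
    for t, q in zip(template_aln, target_aln):
        if t != "-":
            number += 1          # advance template numbering on residues only
        if t == "-" or q == "-":
            continue
        if t != q:
            changes.append((number, t, q))
    return changes


if __name__ == "__main__":
    # Hypothetical five-column alignment, template first.
    for num, old, new in substitutions("MK-LV", "MRALI"):
        print(f"{old}{num} -> {new}")
```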

The other approach is less than two years old. In December 2020 a program made headlines as a contender for having solved the greatest challenge of computational biology: the protein folding problem. Indeed, this algorithm, AlphaFold, is considered the source of a “revolution” in structural biology by the most highly regarded researchers in the field. As a recent Nature news feature maps out, its impact can hardly be overstated (Callaway 2022). Do read that article. This approach is ab initio structure prediction.

 

Homology Model: Swiss-Model

 

Ab initio Model: AlphaFold

There are currently two easy ways to get AlphaFold models. For many sequences the models already exist and can be pulled from a large database of structures curated by the EBI. For sequences for which no model exists yet, you can run AlphaFold for free and get a prediction (although this may take a few hours). Both procedures are integrated with ChimeraX.

(If no AlphaFold structure exists at the EBI and you can’t produce a model, document what you did, what you expected to happen, what happened instead, and whether you think this problem can be solved. Then (and only then) use the Saccharomyces cerevisiae model instead.)
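
The two routes described above correspond to two ChimeraX commands: `alphafold fetch` for a model that already exists in the database, and `alphafold predict` for a new prediction run. The following sketch only assembles the command text so that it can be recorded in a script. The helper name is hypothetical, and the accession P39678 is used on the assumption that it identifies yeast Mbp1; substitute the accession or sequence of your own MYSPE protein.

```python
from typing import Optional


def alphafold_command(uniprot_accession: Optional[str] = None,
                      sequence: Optional[str] = None) -> str:
    """Return the ChimeraX command for obtaining an AlphaFold model.

    Prefers fetching an existing database model by UniProt accession;
    falls back to requesting a prediction for a raw sequence.
    """
    if uniprot_accession:
        return f"alphafold fetch {uniprot_accession.strip()}"
    if sequence:
        # Remove whitespace and line breaks from a pasted sequence.
        cleaned = "".join(sequence.split()).upper()
        return f"alphafold predict {cleaned}"
    raise ValueError("provide a UniProt accession or a sequence")


if __name__ == "__main__":
    # Assumed example accession; replace with your own.
    print(alphafold_command(uniprot_accession="P39678"))
    # Invented toy sequence, only to show the fallback route.
    print(alphafold_command(sequence="mkt ayi\nakq"))
```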

 

Compare

 

Analyze

 

Interpret

 

Questions, comments

If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.

Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.

References

Callaway, Ewen. 2022. “What’s Next for AlphaFold and the AI Protein-Folding Revolution.” Nature 604 (7905): 234–38. https://doi.org/10.1038/d41586-022-00997-5.

Page ID: ABC-INT-Homology_modelling

Author:
Boris Steipe ( <boris.steipe@utoronto.ca> )
Created:
2017-08-05
Last modified:
2022-11-03
Version:
1.4
Version History:
–  1.4 Update for 2022 format. Add AlphaFold.
–  1.3 Edit policy update
–  1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
–  1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
–  1.0 Live 2017
–  0.1 First stub
To Do:
–  Change modelling to Ankyrin-domain - color by conservation.
Tagged with:
–  Integrator
–  Live
–  Evaluated unit
–  Has R code examples

 

[END]


  1. Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.

  2. You will probably already have this sequence annotated to MBP1_MYSPE since this is annotated by similarity by SMART. However, we need a “real alignment” of the entire sequence this time.

  3. These will be too many sequences for the Muscle algorithm; use Clustal Omega instead.