Difference between revisions of "ABC-INT-Homology modelling"

From "A B C"
Line 36: Line 36:
  
  
{{HOLD}}
 
  
 
{{Smallvspace}}
 
{{Smallvspace}}
Line 65: Line 64:
 
;Publication Image option
 
;Publication Image option
 
* Work through the tasks described in the scenario.
 
* Work through the tasks described in the scenario.
* Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix;
+
* Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!, ChimeraX commands!) in an appendix;
 
* When you are done with everything, add the following category tag '''to the end of page''':
 
* When you are done with everything, add the following category tag '''to the end of page''':
 
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
 
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
Line 119: Line 118:
 
=== Scenario background ===
 
=== Scenario background ===
  
You have collected the APSES domain proteins of MYSPE in your protein database. This collection of proteins now contains orthologues and paralogues. It is reasonable to assume that orthologues conserve structure and function, whereas paralogues conserve structure, but change function - in particular, paralogous APSES domains would be expected to recognize different DNA binding sites.
+
You have collected the APSES domain proteins of MYSPE in your protein database and this is now a pretty nice collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences - and given a confident alignment we can arrive at a very reasonable structural  model. This, for example would allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues, but vary between paralogues - you have all the tools to try this at some point.
  
 +
For this assignment however we are going to look at conservation in the  ankyrin domains and their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues in this set- although it is '''not''' in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation since it dimerizes with Mbp1 in the MBF complex (as well as dimerizing with Swi4 in the SBF complex).
  
- Model Ankyrin domain
 
- Compute conservation scores
 
- color residues by conservation score
 
- Do conserved residues patch?
 
  
- Report
+
{{Smallvspace}}
  
- R-code
+
{{task|1=
  - quantify this by computing sasa
+
;Your common tasks for this scenario are as follows. Execute and document them.
  - color surface by conservation score
 
  - compute conservation as a function of sasa and intramolecular contacts
 
  - distinguish structurally and functionally conserved residues
 
  
 +
* Produce a multiple sequence alignment of the yeast Swi6 sequence from [http://www.rcsb.org/structure/1SW6 '''PDB 1SW6'''].<ref>You will probably already have this sequence annotated to MBP1_MYSPE since this is annotated by similarity by SMART. However we need a "real alignment" of the entire sequence this time.</ref> Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.<ref>These will be too many sequences for the Muscle algorithm, use CLUSTAL Omega instead.</ref>
 +
* Following the procedures of [[BIN-SX-Homology_modelling|the Homology Modelling unit]], prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.
 +
}}
  
 +
{{Smallvspace}}
  
 
+
{{task|1=
=== For the Report Option ...===
+
; For the report option...
{{task|
+
:Considering the alignment columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
# Produce two separate MSAs, one for the Mbp1 orthologues, and one for the other APSES domain containing sequences in <code>myDB</code>. Save the MSA for the APSES domain only as a multi FASTA file that can be read by Chimera.
 
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. (Superimpose your homology model on Chain A and delete Chain B).
 
# Create a copy of that model.
 
# Colour the original by conservation scores of the Mbp1 APSES domain MSA, and colour the copy with the conservation scores for the other APSES domains.
 
# Identify residues that appear functionally conserved - i.e. potentially contributing to DNA binding specificity; we expect them to be conserved in Mbp1 orthologues, but variable in the other sequences.
 
# Identify residues that are structurally conserved, i.e. conserved in all APSES domains.
 
# Illustrate your findings with stereo images and write a brief technical report. Make sure that your description is specific with respect to actually identifying sequence numbers and residue types.
 
 
}}
 
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
=== For the Oral Test Option ...===
 
 
{{task|1=
 
{{task|1=
 
+
; For the Oral Test option...
What can be learned from the two different binding modes of chains A and B of 4UX5, regarding the APSES domain of MBP1_MYSPE?
+
:Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.
 
 
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. Superimpose your homology model on 4UX5 Chain A.
 
# Create a copy of 4UX5 Chain B and superimpose that too on Chain A.
 
# Colour all protein and DNA chains by element. Then colour the C-atoms of your model and the two 4UX5 chains with distinct colours that can't be confused with N, O, S, and P atoms. This makes it possible to identify different chains, while still studying details of interactions. Select all residues that are within 5.0 Å of DNA and display them as sticks. Display the rest of the protein as ribbons or tubes. Display the water molecules too.
 
# Set the pivot to a residue at the protein DNA interface.
 
# Use '''Save Session''' to save the entire scene to file.
 
# Be prepared to reload the scene on your laptop during the oral test and explain how the findings in 4UX5 relate to your model.
 
 
 
 
}}
 
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
=== For the Publication Image Option ...===
+
{{task|1=
{{task|
+
; For the publication image option...
 
+
:Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel '''(a)''' contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel '''(b)''' shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
DNA binding interfaces are expected to comprise a number of positively charged amino acids, that might form salt-bridges with the phosphate backbone. Of course, your homology model did not take the DNA ligand into account.
 
 
 
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. (Superimpose your homology model on Chain A and delete Chain B).
 
# Create a publication quality image (wall-eyed stereo) with two panels: ('''A''') shows the conserved positively charged residues of MBP1_MYSPE that bind to DNA (labels!) in context of the bound DNA, ('''B''') shows the solvent excluded surface calculated separately for protein and DNA, colored by Coulombic surface coloring. Make the surface sufficiently transparent to show the underlying ribbon representations of the backbone, and the side-chains of the conserved positively charged residues. Your Figure ('''A''') is probably best done as a stick or sphere model but it's '''your''' figure so channel your creative talent for information design. Figure ('''B''') may combine (i) the protein and DNA backbones (in '''ribbon view'''), (ii) the sidechains of residues that your are discussing, distinctly coloured, (iii) a transparent surface of the protein, and (iv) a tranparent surface of the DNA. The goal is to demonstrate that the residue conservation of positively charged residues can be explained by their contribution to a surface that is electrostatically complementary to DNA. Make sure your figure does not include irrelevant items that obscure the message.
 
# Write an explicit, descriptive figure caption.
 
# Print your figure and figure caption as a PDF and upload to the Student Wiki.
 
# On your submission page, describe the steps that you went through to create the images and link to your PDF.
 
 
}}
 
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
=== For the R-code option ...===
+
{{task|1=
{{task|
+
; For the R-code option
  
How different are homology models based on 1BM8 and 4UX5? Where are the important differences?
+
:Referring to the documentation of the ChimeraX commands <tt>color byattribute...</tt>, and <tt>defattr ...</tt>, write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the <tt>remotecontrol rest ...</tt> command and appropriate scripted commands sent via the <tt>CX()</tt> function as described. Write your code so that you can switch a defined constant at the beginning that defines whether the commands are applied to your homology model, or to the original 1SW6 coordinates. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. To evaluate your code, I need to be able to reproduce the scene from your code.
  
# Produce two homology models for MBP1_MYSPE: one with the 1BM8 template, the other with the 4UX5 template (You already have one of the two).
+
<!-- compute sasa per residue and correlate with conservation -->
# Write an R script using bio3d to superimpose the two models, calculate the RMSD between each residue pair, writes the RMSD to each residues' B Factor field for both of the two models, and save the resulting PDB files.
 
# In Chimera, load the two models, superimpose them, and color them by the RMSD values you have computed.
 
# Save a stereo image.
 
# Submit your script and the image. Don't forget to comment your script.
 
  
 +
<!-- distinguish structurally and functionally conserved residues -->
 
}}
 
}}
  
 
{{Vspace}}
 
{{Vspace}}
  
== Self-evaluation ==
 
<!--
 
=== Question 1===
 
 
Question ...
 
 
<div class="toccolours mw-collapsible mw-collapsed" style="width:800px">
 
Answer ...
 
<div class="mw-collapsible-content">
 
Answer ...
 
 
</div>
 
  </div>
 
 
  {{Vspace}}
 
 
-->
 
== Further reading, links and resources ==
 
<!-- {{#pmid: 19957275}} -->
 
<!-- {{WWW|WWW_GMOD}} -->
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
 
 
== Notes ==
 
== Notes ==
 
<references />
 
<references />
Line 231: Line 181:
 
:2017-08-05
 
:2017-08-05
 
<b>Modified:</b><br />
 
<b>Modified:</b><br />
:2017-10-31
+
:2020-10-02
 
<b>Version:</b><br />
 
<b>Version:</b><br />
:1.1
+
:1.2
 
<b>Version history:</b><br />
 
<b>Version history:</b><br />
 +
*1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
 
*1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
 
*1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
 
*1.0 Live 2017
 
*1.0 Live 2017
Line 244: Line 195:
 
[[Category:ABC-units]]
 
[[Category:ABC-units]]
 
{{INTEGRATOR}}
 
{{INTEGRATOR}}
{{HOLD}}
+
{{LIVE}}
 
{{EVAL}}
 
{{EVAL}}
 
</div>
 
</div>
 
<!-- [END] -->
 
<!-- [END] -->

Revision as of 11:46, 2 October 2020

Integrator Unit: Homology Modelling

(Integrator unit: create a homology model and assess the role of sequence conservation)


 


Abstract:

This page integrates material from the learning units for working with multiple sequence alignments and structure data in a task for evaluation.


Deliverables:

  • Integrator unit: Deliverables can be submitted for course marks. See below for details.

  • Prerequisites:
    This unit builds on material covered in the following prerequisite units:


     



     



     


    Evaluation

    This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, or 16 marks for the oral test[1].
    Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral test).


     
    Report option
    • Work through the tasks described in the scenario.
    • Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix;
    • When you are done with everything, add the following category tag to the end of the page:
    [[Category:EVAL-INT-Homology_modelling]].

    Once the page has been saved with this tag, it is considered "submitted". Do not change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.


     
    Publication Image option
    • Work through the tasks described in the scenario.
    • Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!, ChimeraX commands!) in an appendix;
    • When you are done with everything, add the following category tag to the end of the page:
    [[Category:EVAL-INT-Homology_modelling]].

    Once the page has been saved with this tag, it is considered "submitted". Do not change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.


     
    Oral test option
    • Work through the tasks described in the scenario. Remember to document your work in your journal.
    • Your work must be complete before 21:00 on the day before your exam.
    • Schedule an oral test by editing the signup page on the Student Wiki. Enter the unit that you are signing up for, and your name. You must have signed-up for an exam slot before 21:00 on the day before your exam.
     
    R code option
    • Work through the tasks described in the scenario and develop code as required.
    • Put your code on a subpage of your User page on the Student Wiki;
    • When you are done with everything, add the following category tag to the end of the page:
    [[Category:EVAL-INT-Homology_modelling]].

    Once the page has been saved with this tag, it is considered "submitted". Do not change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.

     

    Scenario background

    You have collected the APSES domain proteins of MYSPE in your protein database, and this is now a good collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This would, for example, allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues; you have all the tools to try this at some point.

    For this assignment, however, we are going to look at conservation in the ankyrin domains, whose identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one homologue of the proteins in this set, although that homologue is not itself in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation, since it dimerizes with Mbp1 in the MBF complex (as well as with Swi4 in the SBF complex).


     

    Task:

    Your common tasks for this scenario are as follows. Execute and document them.
    • Produce a multiple sequence alignment that includes the yeast Swi6 sequence from PDB 1SW6.[2] Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.[3]
    • Following the procedures of the Homology Modelling unit, prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.


     

    Task:

    For the report option...
    Considering the alignment columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
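    The column-wise comparison described above can be sketched as follows. This is a minimal illustration in Python, not the course's R workflow: the scoring scheme (one minus normalized Shannon entropy), the cutoff of 0.8, the function names, and the toy alignment are all assumptions made for the example. Solvent exposure is not computed here; it has to be judged from the homology model.

```python
import math
from collections import Counter


def column_conservation(column):
    """Return (consensus residue, score in [0, 1]) for one alignment column.

    The score is 1 minus the Shannon entropy normalized by log2(20).
    Gap characters ('-') are ignored; an all-gap column scores 0.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return "-", 0.0
    counts = Counter(residues)
    n = len(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    score = max(0.0, 1.0 - entropy / math.log2(20))
    consensus = counts.most_common(1)[0][0]
    return consensus, score


def outliers(msa, target, cutoff=0.8):
    """List conserved columns where the target differs from the consensus.

    msa maps sequence names to equal-length aligned strings. The consensus
    is computed from all sequences except the target. Positions are
    reported in the numbering of the ungapped target sequence.
    """
    target_seq = msa[target]
    others = [s for name, s in msa.items() if name != target]
    hits = []
    pos = 0  # residue position in the ungapped target sequence
    for i, aa in enumerate(target_seq):
        if aa == "-":
            continue
        pos += 1
        consensus, score = column_conservation([s[i] for s in others])
        if score >= cutoff and aa != consensus:
            hits.append((pos, aa, consensus, round(score, 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment, not real APSES or ankyrin sequences.
    toy = {
        "MBP1_MYSPE": "AK-LDG",
        "seqA": "AKTLEG",
        "seqB": "AKTLEG",
        "seqC": "ARTLEG",
    }
    for hit in outliers(toy, "MBP1_MYSPE"):
        print(hit)
```

    Reporting positions in the ungapped numbering of the target keeps each hit directly comparable to residue numbers in the model, provided the model is numbered from the start of the aligned sequence.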


     

    Task:

    For the Oral Test option...
    Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.


     

    Task:

    For the publication image option...
    Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel (a) contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel (b) shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
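    One way to keep the two panels consistent is to derive the red residues in panel (a) from the same cutoff that is drawn in panel (b). The Python sketch below shows the idea under stated assumptions: the function name, the model id #1, the chain id A, and the example scores are hypothetical, and the generated strings use only the basic ChimeraX `color` and `show` commands.

```python
def highlight_commands(scores, cutoff, model="#1", chain="A"):
    """Return (conserved residue numbers, ChimeraX command strings).

    scores maps residue numbers to per-residue conservation scores.
    Residues with score >= cutoff are treated as highly conserved.
    The model and chain identifiers are assumptions for this example.
    """
    conserved = sorted(r for r, s in scores.items() if s >= cutoff)
    if not conserved:
        return conserved, []
    spec = f"{model}/{chain}:" + ",".join(str(r) for r in conserved)
    commands = [
        f"color {spec} red",   # colour the conserved residues red
        f"show {spec} atoms",  # display their side chains
    ]
    return conserved, commands


if __name__ == "__main__":
    # Hypothetical scores for four residues of a model.
    example = {10: 0.95, 11: 0.40, 12: 0.81, 13: 0.80}
    residues, cmds = highlight_commands(example, cutoff=0.8)
    print(residues)
    for c in cmds:
        print(c)
```

    Because the cutoff is a single argument, changing it updates both the list of residues for the image and the value to draw as the horizontal line in the plot.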


     

    Task:

    For the R-code option
    Referring to the documentation of the ChimeraX commands color byattribute..., and defattr ..., write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the remotecontrol rest ... command and appropriate scripted commands sent via the CX() function as described. Write your code so that you can switch a defined constant at the beginning that defines whether the commands are applied to your homology model, or to the original 1SW6 coordinates. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. To evaluate your code, I need to be able to reproduce the scene from your code.
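    As a rough illustration of the file to be written, the Python sketch below builds the text of an attribute assignment file: header lines naming the attribute, the match mode, and the recipient, followed by tab-indented data lines of the form specifier, tab, value. The attribute name conservation, the chain id A, the three-decimal formatting, and the function name are assumptions for the example; check the layout against the ChimeraX defattr documentation before relying on it. The task itself asks for R code, so this only shows the structure of the output.

```python
def defattr_text(scores, attribute="conservation", chain="A"):
    """Build the text of an attribute assignment file for residues.

    scores maps residue numbers to conservation values. The attribute
    name and chain id are assumptions made for this example.
    """
    lines = [
        "# per-residue conservation scores (illustrative)",
        f"attribute: {attribute}",
        "match mode: 1-to-1",
        "recipient: residues",
    ]
    for resnum in sorted(scores):
        # Each data line starts with a tab: <TAB>specifier<TAB>value
        lines.append(f"\t/{chain}:{resnum}\t{scores[resnum]:.3f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical scores; write the text to a file of your choice.
    print(defattr_text({1: 1.0, 2: 0.5}), end="")
```

    Keeping the file generation in one function makes it straightforward to call it for either the homology model or the 1SW6 coordinates, depending on the constant defined at the top of the script.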


     

    Notes

    1. Note: the oral test will focus on the unit content but will also cover other material that leads up to it
    2. You will probably already have this sequence annotated to MBP1_MYSPE, since SMART annotates it by similarity. However, we need a "real alignment" of the entire sequence this time.
    3. These will be too many sequences for the Muscle algorithm; use CLUSTAL Omega instead.


     


    About ...
     
    Author:

    Boris Steipe <boris.steipe@utoronto.ca>

    Created:

    2017-08-05

    Modified:

    2020-10-02

    Version:

    1.2

    Version history:

    • 1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
    • 1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
    • 1.0 Live 2017
    • 0.1 First stub

    This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.