Difference between revisions of "ABC-INT-Homology modelling"

From "A B C"
m
m
 
(3 intermediate revisions by the same user not shown)
Line 36: Line 36:
  
  
{{HOLD}}
 
  
 
{{Smallvspace}}
 
{{Smallvspace}}
Line 47: Line 46:
  
 
=== Evaluation ===
 
=== Evaluation ===
;This "Integrator Unit" should be submitted for evaluation for a maximum of 8 marks if one of the written deliverables is chosen, resp. 16 marks for the oral test<ref>Note: the oral test will focus on the unit content but will also cover other material that leads up to it</ref>.
+
This "Integrator Unit" should be submitted for evaluation for a maximum of 13 marks if one of the written deliverables is chosen, or 24 marks if you choose this for your oral test<ref>Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.</ref>.
:Please note the evaluation types that are available as options for this unit. Choose one evaluation type that you have not chosen for another Integrator Unit. (Each submitted Integrator Unit must be evaluated in a different way and one of your evaluations - but not your first one - must be an oral test).
+
:Please note the evaluation types that are available as options for this unit.
 +
:Be mindful of the [[ABC-Rubrics| '''Marking rubrics''']].
 +
:If this is submitted for your oral test, please read the [[BCH441 Oral Test instructions|Oral test instructions]] before you begin.
 +
:If your submission includes R code, please read the [[BCH441 Code submisson instructions|Code submission instructions]] before you begin.
 +
 
 +
Once you have chosen an option ...
 +
<ol>
 +
<li>Create a new page on the student Wiki as a subpage of your User Page.</li>
 +
<li>Put all of your writing to submit on this one page.</li>
 +
 
 +
<li>When you are done with everything, go to the [https://q.utoronto.ca/courses/180416/assignments Quercus '''Assignments''' page] and open the appropriate '''Integrator Unit''' assignment. Paste the URL of your Wiki page into the form, and click on '''Submit Assignment'''.</li>
 +
</ol>
 +
 
 +
Your link can be submitted only once and cannot be edited afterwards, but you may change your Wiki page at any time. Only the last version saved before the due date will be marked; all later edits will be silently ignored.
  
 
{{Smallvspace}}
 
{{Smallvspace}}
  
 
;Report option
 
;Report option
* Work through the tasks described in the scenario.
+
* Work through the tasks described in the scenario below.
* Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix;
+
* Document your results in a short technical report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in [[BCH441 Code submisson instructions|an appendix]] linked from your report;
* When you are done with everything, add the following category tag '''to the end of page''':
+
* When you are done, submit the link to your page via Quercus as described above.
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
 
 
 
Once the page has been saved with this tag, it is considered "submitted".
 
'''Do not''' change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
  
 
{{Smallvspace}}
 
{{Smallvspace}}
  
 
;Publication Image option
 
;Publication Image option
* Work through the tasks described in the scenario.
+
* Work through the tasks described in the scenario below.
* Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix;
+
* Produce the image as described and "publish" it on a subpage of your User page on the Student Wiki. Describe your methods (R-code! ChimeraX commands!) in [[BCH441 Code submisson instructions|an appendix]] linked from your report;
* When you are done with everything, add the following category tag '''to the end of page''':
+
* When you are done, submit the link to your page via Quercus as described above.
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
 
 
 
Once the page has been saved with this tag, it is considered "submitted".
 
'''Do not''' change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
  
 
<!--
 
<!--
Line 101: Line 105:
  
 
;Oral test option
 
;Oral test option
* Work through the tasks described in the scenario. Remember to document your work in your journal.
+
* Work through the tasks described below. Remember to document your work in your journal, but there is no need to format this specially as a report.
* Your work must be complete before 21:00 on the day before your exam.
+
* Part of your task will involve writing an R script; refer to the [[BCH441 Code submisson instructions|Code submission instructions]] and link to your page from your Journal.
* Schedule an oral test by editing the [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/Signup-BCH441_Oral_tests '''signup page on the Student Wiki''']. Enter the unit that you are signing up for, and your name. You must have signed-up for an exam slot before 21:00 on the day before your exam.
+
* Note that the work must be completed [[BCH441 Oral Test instructions| '''before''' your actual test date.]]
 +
 
 
{{Smallvspace}}
 
{{Smallvspace}}
 
;R code option
 
;R code option
* Work through the tasks described in the scenario and develop code as required.
+
* Work through the tasks described in the scenario below and develop code as required.
* Put your code on a subpage of your User page on the Student Wiki;
+
* Put your code and other documentation on a subpage of your User page on the Student Wiki;
* When you are done with everything, add the following category tag '''to the end of page''':
+
* When you are done, submit the link to your page via Quercus as described above.
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
+
 
  
Once the page has been saved with this tag, it is considered "submitted".
 
'''Do not''' change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 
 
== Contents ==
 
== Contents ==
  
Line 119: Line 122:
 
=== Scenario background ===
 
=== Scenario background ===
  
You have collected the APSES domain proteins of MYSPE in your protein database. This collection of proteins now contains orthologues and paralogues. It is reasonable to assume that orthologues conserve structure and function, whereas paralogues conserve structure, but change function - in particular, paralogous APSES domains would be expected to recognize different DNA binding sites.
+
You have collected the APSES domain proteins of MYSPE in your protein database, and this is now a pretty nice collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This, for example, would allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues. You have all the tools to try this at some point.
  
 +
For this assignment, however, we are going to look at conservation in the ankyrin domains; their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues in this set, although it is '''not''' in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation since it dimerizes with Mbp1 in the MBF complex (as well as dimerizing with Swi4 in the SBF complex).
  
- Model Ankyrin domain
 
- Compute conservation scores
 
- color residues by conservation score
 
- Do conserved residues patch?
 
  
- Report
+
{{Smallvspace}}
  
- R-code
+
{{task|1=
  - quantify this by computing sasa
+
;Your common tasks for this scenario are as follows. Execute them and document.
  - color surface by conservation score
 
  - compute conservation as a function of sasa and intramolecular contacts
 
  - distinguish structurally and functionally conserved residues
 
  
 +
* Produce a multiple sequence alignment that includes the yeast Swi6 sequence from [http://www.rcsb.org/structure/1SW6 '''PDB 1SW6'''].<ref>You will probably already have this sequence annotated to MBP1_MYSPE since this is annotated by similarity by SMART. However, we need a "real alignment" of the entire sequence this time.</ref> Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.<ref>These will be too many sequences for the Muscle algorithm; use CLUSTAL Omega instead.</ref>
 +
* Following the procedures of [[BIN-SX-Homology_modelling|the Homology Modelling unit]], prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.
 +
}}
  
 +
{{Smallvspace}}
  
 
+
{{task|1=
=== For the Report Option ...===
+
; For the report option...
{{task|
+
:Considering the columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
# Produce two separate MSAs, one for the Mbp1 orthologues, and one for the other APSES domain containing sequences in <code>myDB</code>. Save the MSA for the APSES domain only as a multi FASTA file that can be read by Chimera.
 
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. (Superimpose your homology model on Chain A and delete Chain B).
 
# Create a copy of that model.
 
# Colour the original by conservation scores of the Mbp1 APSES domain MSA, and colour the copy with the conservation scores for the other APSES domains.
 
# Identify residues that appear functionally conserved - i.e. potentially contributing to DNA binding specificity; we expect them to be conserved in Mbp1 orthologues, but variable in the other sequences.
 
# Identify residues that are structurally conserved, i.e. conserved in all APSES domains.
 
# Illustrate your findings with stereo images and write a brief technical report. Make sure that your description is specific with respect to actually identifying sequence numbers and residue types.
 
 
}}
 
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
=== For the Oral Test Option ...===
 
 
{{task|1=
 
{{task|1=
 +
; For the Oral Test option...
 +
:Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.
 +
}}
  
What can be learned from the two different binding modes of chains A and B of 4UX5, regarding the APSES domain of MBP1_MYSPE?
+
{{Smallvspace}}
 
 
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. Superimpose your homology model on 4UX5 Chain A.
 
# Create a copy of 4UX5 Chain B and superimpose that too on Chain A.
 
# Colour all protein and DNA chains by element. Then colour the C-atoms of your model and the two 4UX5 chains with distinct colours that can't be confused with N, O, S, and P atoms. This makes it possible to identify different chains, while still studying details of interactions. Select all residues that are within 5.0 Å of DNA and display them as sticks. Display the rest of the protein as ribbons or tubes. Display the water molecules too.
 
# Set the pivot to a residue at the protein DNA interface.
 
# Use '''Save Session''' to save the entire scene to file.
 
# Be prepared to reload the scene on your laptop during the oral test and explain how the findings in 4UX5 relate to your model.
 
  
 +
{{task|1=
 +
; For the publication image option...
 +
:Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel '''(a)''' contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel '''(b)''' shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
 
}}
 
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
=== For the Publication Image Option ...===
+
{{task|1=
{{task|
+
; For the R-code option
  
DNA binding interfaces are expected to comprise a number of positively charged amino acids, that might form salt-bridges with the phosphate backbone. Of course, your homology model did not take the DNA ligand into account.
+
:Referring to the User Documentation of the ChimeraX commands <tt>color byattribute ...</tt> and <tt>defattr ...</tt>, write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the <tt>remotecontrol rest ...</tt> command and appropriate scripted commands sent via the <tt>CX()</tt> function as demonstrated in the <tt>RPR-ChimeraX-remote.R</tt> script. Structure your code so that a single constant, defined at the beginning, switches whether the commands are applied to your homology model or to the original 1SW6 coordinate set. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. Briefly interpret the results.
  
# In Chimera, create a model of your MBP1_MYSPE APSES domain bound to DNA, based on the <code>4UX5</code> structure. (Superimpose your homology model on Chain A and delete Chain B).
+
:To evaluate your code, I need to be able to reproduce the scene from your code.
# Create a publication quality image (wall-eyed stereo) with two panels: ('''A''') shows the conserved positively charged residues of MBP1_MYSPE that bind to DNA (labels!) in context of the bound DNA, ('''B''') shows the solvent excluded surface calculated separately for protein and DNA, colored by Coulombic surface coloring. Make the surface sufficiently transparent to show the underlying ribbon representations of the backbone, and the side-chains of the conserved positively charged residues. Your Figure ('''A''') is probably best done as a stick or sphere model but it's '''your''' figure so channel your creative talent for information design. Figure ('''B''') may combine (i) the protein and DNA backbones (in '''ribbon view'''), (ii) the sidechains of residues that you are discussing, distinctly coloured, (iii) a transparent surface of the protein, and (iv) a transparent surface of the DNA. The goal is to demonstrate that the residue conservation of positively charged residues can be explained by their contribution to a surface that is electrostatically complementary to DNA. Make sure your figure does not include irrelevant items that obscure the message.
 
# Write an explicit, descriptive figure caption.
 
# Print your figure and figure caption as a PDF and upload to the Student Wiki.
 
# On your submission page, describe the steps that you went through to create the images and link to your PDF.
 
}}
 
  
{{Vspace}}
+
:Document your work clearly, and discuss the process of computing attributes in R and mapping them to 3D models with ChimeraX. Can the process be significantly improved?
 
 
=== For the R-code option ...===
 
{{task|
 
 
 
How different are homology models based on 1BM8 and 4UX5? Where are the important differences?
 
  
# Produce two homology models for MBP1_MYSPE: one with the 1BM8 template, the other with the 4UX5 template (You already have one of the two).
+
<!-- compute sasa per residue and correlate with conservation -->
# Write an R script using bio3d to superimpose the two models, calculate the RMSD between each residue pair, write the RMSD to each residue's B Factor field for both of the two models, and save the resulting PDB files.
 
# In Chimera, load the two models, superimpose them, and color them by the RMSD values you have computed.
 
# Save a stereo image.
 
# Submit your script and the image. Don't forget to comment your script.
 
  
 +
<!-- distinguish structurally and functionally conserved residues -->
 
}}
 
}}
  
 
{{Vspace}}
 
{{Vspace}}
  
== Self-evaluation ==
 
<!--
 
=== Question 1===
 
 
Question ...
 
 
<div class="toccolours mw-collapsible mw-collapsed" style="width:800px">
 
Answer ...
 
<div class="mw-collapsible-content">
 
Answer ...
 
 
</div>
 
  </div>
 
 
  {{Vspace}}
 
 
-->
 
== Further reading, links and resources ==
 
<!-- {{#pmid: 19957275}} -->
 
<!-- {{WWW|WWW_GMOD}} -->
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
 
 
== Notes ==
 
== Notes ==
 
<references />
 
<references />
Line 231: Line 189:
 
:2017-08-05
 
:2017-08-05
 
<b>Modified:</b><br />
 
<b>Modified:</b><br />
:2017-10-31
+
:2020-10-07
 
<b>Version:</b><br />
 
<b>Version:</b><br />
:1.1
+
:1.3
 
<b>Version history:</b><br />
 
<b>Version history:</b><br />
 +
*1.3 Edit policy update
 +
*1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
 
*1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
 
*1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
 
*1.0 Live 2017
 
*1.0 Live 2017
Line 244: Line 204:
 
[[Category:ABC-units]]
 
[[Category:ABC-units]]
 
{{INTEGRATOR}}
 
{{INTEGRATOR}}
{{HOLD}}
+
{{LIVE}}
 
{{EVAL}}
 
{{EVAL}}
 
</div>
 
</div>
 
<!-- [END] -->
 
<!-- [END] -->

Latest revision as of 05:37, 7 October 2020

Integrator Unit: Homology Modelling

(Integrator unit: create a homology model and assess the role of sequence conservation)


 


Abstract:

This page integrates material from the learning units for working with multiple sequence alignments and structure data in a task for evaluation.


Deliverables:

  • Integrator unit: Deliverables can be submitted for course marks. See below for details.

  • Prerequisites:
    This unit builds on material covered in the following prerequisite units:


     



     



     


    Evaluation

    This "Integrator Unit" should be submitted for evaluation for a maximum of 13 marks if one of the written deliverables is chosen, or 24 marks if you choose this for your oral test[1].

    Please note the evaluation types that are available as options for this unit.
    Be mindful of the Marking rubrics.
    If this is submitted for your oral test, please read the Oral test instructions before you begin.
    If your submission includes R code, please read the Code submission instructions before you begin.

    Once you have chosen an option ...

    1. Create a new page on the student Wiki as a subpage of your User Page.
    2. Put all of your writing to submit on this one page.
    3. When you are done with everything, go to the Quercus Assignments page and open the appropriate Integrator Unit assignment. Paste the URL of your Wiki page into the form, and click on Submit Assignment.

    Your link can be submitted only once and cannot be edited afterwards, but you may change your Wiki page at any time. Only the last version saved before the due date will be marked; all later edits will be silently ignored.


     
    Report option
    • Work through the tasks described in the scenario below.
    • Document your results in a short technical report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix linked from your report;
    • When you are done, submit the link to your page via Quercus as described above.


     
    Publication Image option
    • Work through the tasks described in the scenario below.
    • Produce the image as described and "publish" it on a subpage of your User page on the Student Wiki. Describe your methods (R-code! ChimeraX commands!) in an appendix linked from your report;
    • When you are done, submit the link to your page via Quercus as described above.


     
    Oral test option
    • Work through the tasks described below. Remember to document your work in your journal, but there is no need to format this specially as a report.
    • Part of your task will involve writing an R script; refer to the Code submission instructions and link to your page from your Journal.
    • Note that the work must be completed before your actual test date.


     
    R code option
    • Work through the tasks described in the scenario below and develop code as required.
    • Put your code and other documentation on a subpage of your User page on the Student Wiki;
    • When you are done, submit the link to your page via Quercus as described above.


    Contents

     

    Scenario background

    You have collected the APSES domain proteins of MYSPE in your protein database, and this is now a pretty nice collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This, for example, would allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues. You have all the tools to try this at some point.

    For this assignment, however, we are going to look at conservation in the ankyrin domains; their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues in this set, although it is not in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation since it dimerizes with Mbp1 in the MBF complex (as well as dimerizing with Swi4 in the SBF complex).


     

    Task:

    Your common tasks for this scenario are as follows. Execute them and document.
    • Produce a multiple sequence alignment that includes the yeast Swi6 sequence from PDB 1SW6.[2] Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.[3]
    • Following the procedures of the Homology Modelling unit, prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.


     

    Task:

    For the report option...
    Considering the columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
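
    One way to put numbers on "highly conserved" is to score each alignment column and map the scores onto the residue numbering of one reference sequence. The sketch below is only an illustration of that idea: the course itself works in R, the entropy-based score is one of several possible conservation measures and not necessarily the one used in the learning units, and the function names, the toy alignment and the starting residue number are all hypothetical.

```python
import math
from collections import Counter


def column_conservation(column, gap="-"):
    """Score one alignment column as 1 minus its normalized Shannon entropy.

    Gaps are ignored, so a column that is mostly gaps can still score high;
    treat such columns with caution (assumption of this sketch).
    """
    residues = [c.upper() for c in column if c != gap]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    max_entropy = math.log2(20)  # 20 standard amino acids
    return 1.0 - entropy / max_entropy


def conservation_by_residue(aligned, ref_index=0, first_resnum=1, gap="-"):
    """Map column scores onto the residue numbers of one reference sequence.

    aligned      : list of equal-length aligned sequences (strings)
    ref_index    : which sequence provides the residue numbering
    first_resnum : residue number of the first ungapped reference position
    Returns {residue_number: score}; columns where the reference has a gap
    are skipped.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = {}
    resnum = first_resnum
    for i in range(length):
        if aligned[ref_index][i] == gap:
            continue
        scores[resnum] = column_conservation([s[i] for s in aligned], gap)
        resnum += 1
    return scores


# Toy alignment (hypothetical sequences, reference is the first row)
toy = ["MKV-A", "MKIGA", "MRV-A"]
scores = conservation_by_residue(toy)
for resnum, score in scores.items():
    print(resnum, round(score, 3))
```

    Scores close to 1 mark invariant columns. Whether a conserved residue is also solvent exposed still has to be judged on the homology model itself.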


     

    Task:

    For the Oral Test option...
    Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.


     

    Task:

    For the publication image option...
    Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel (a) contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel (b) shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
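
    The cutoff in panel (b) and the residues highlighted in panel (a) should come from the same rule. As a minimal sketch, assuming per-residue scores are already available as a mapping from residue number to score, the selection could look like this. The scores, the cutoff of 0.9 and the chain ID "A" are made-up values for illustration, and the resulting residue specifier should be checked against the ChimeraX User Documentation before use.

```python
def highly_conserved(scores, cutoff):
    """Return the sorted residue numbers whose score is at or above the cutoff."""
    return sorted(r for r, s in scores.items() if s >= cutoff)


def residue_spec(chain_id, resnums):
    """Build a comma-separated residue specifier such as '/A:10,12'.

    Returns an empty string when no residue passes the cutoff.
    """
    if not resnums:
        return ""
    return "/{}:{}".format(chain_id, ",".join(str(r) for r in resnums))


# Hypothetical scores keyed by residue number
example_scores = {10: 0.95, 11: 0.40, 12: 0.90, 13: 0.89}
selected = highly_conserved(example_scores, cutoff=0.9)
print(selected)
print(residue_spec("A", selected))
```

    Keeping the cutoff in one place means the horizontal line in the plot and the red residues in the stereo image cannot drift apart.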


     

    Task:

    For the R-code option
    Referring to the User Documentation of the ChimeraX commands color byattribute ... and defattr ..., write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the remotecontrol rest ... command and appropriate scripted commands sent via the CX() function as demonstrated in the RPR-ChimeraX-remote.R script. Structure your code so that a single constant, defined at the beginning, switches whether the commands are applied to your homology model or to the original 1SW6 coordinate set. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. Briefly interpret the results.
    To evaluate your code, I need to be able to reproduce the scene from your code.
    Document your work clearly, and discuss the process of computing attributes in R and mapping them to 3D models with ChimeraX. Can the process be significantly improved?
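
    To make the attribute assignment file concrete, here is a minimal sketch of how its text could be generated. It assumes the layout described in the ChimeraX User Documentation for defattr: header lines naming the attribute and the recipient, followed by tab-indented lines that each hold a specifier and a value separated by a tab. The attribute name "conservation", the chain ID "A" and the scores are hypothetical, and the sketch is in Python only for illustration; the submission itself is expected in R.

```python
def defattr_text(scores, attribute="conservation", chain_id="A"):
    """Return the text of an attribute assignment file for per-residue scores.

    scores : {residue_number: value}
    The header uses only the 'attribute' and 'recipient' lines; other
    optional header lines are left to the ChimeraX documentation.
    """
    lines = [
        "attribute: {}".format(attribute),
        "recipient: residues",
    ]
    for resnum in sorted(scores):
        # Each data line: TAB, specifier, TAB, value
        lines.append("\t/{}:{}\t{:.3f}".format(chain_id, resnum, scores[resnum]))
    return "\n".join(lines) + "\n"


# Hypothetical scores; in practice these come from the MSA
text = defattr_text({5: 0.5, 3: 1.0})
print(text)
```

    The text can then be written to a file, read with defattr, and used by color byattribute. The exact command options are best taken from the ChimeraX documentation rather than from this sketch.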


     

    Notes

    1. Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.
    2. You will probably already have this sequence annotated to MBP1_MYSPE since this is annotated by similarity by SMART. However, we need a "real alignment" of the entire sequence this time.
    3. These will be too many sequences for the Muscle algorithm; use CLUSTAL Omega instead.


     


    About ...
     
    Author:

    Boris Steipe <boris.steipe@utoronto.ca>

    Created:

    2017-08-05

    Modified:

    2020-10-07

    Version:

    1.3

    Version history:

    • 1.3 Edit policy update
    • 1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
    • 1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
    • 1.0 Live 2017
    • 0.1 First stub

    This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.