Difference between revisions of "ABC-INT-Homology modelling"

From "A B C"
 
(27 intermediate revisions by the same user not shown)
Line 1: Line 1:
<div id="BIO">
+
<div id="ABC">
  <div class="b1">
+
<div style="padding:5px; border:4px solid #000000; background-color:#e19fa7; font-size:300%; font-weight:400; color: #000000; width:100%;">
Integration Unit: Homology Modelling
+
Integrator Unit: Homology Modelling
  </div>
+
<div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#e19fa7; font-size:30%; font-weight:200; color: #000000; ">
 
+
(Integrator unit: create a homology model and assess the role of sequence conservation)
  {{Vspace}}
+
</div>
 
 
<div class="keywords">
 
<b>Keywords:</b>&nbsp;
 
Integration unit: create a homology model and assess the role of sequence conservation
 
 
</div>
 
</div>
  
{{Vspace}}
+
{{Smallvspace}}
 
 
 
 
__TOC__
 
 
 
{{Vspace}}
 
 
 
 
 
{{STUB}}
 
 
 
{{Vspace}}
 
  
  
</div>
+
<div style="padding:5px; border:1px solid #000000; background-color:#e19fa733; font-size:85%;">
<div id="ABC-unit-framework">
+
<div style="font-size:118%;">
== Abstract ==
+
<b>Abstract:</b><br />
 
<section begin=abstract />
 
<section begin=abstract />
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "abstract" -->
+
This page integrates material from the learning units for working with multiple sequence alignments and structure data in a task for evaluation.
This page assesses the learning units for working with multiple sequence alignments and structure data.
 
 
 
*Use an Mbp1 orthologue (RBM)
 
*Find all orthologues in YFO and reference species
 
*Homology model
 
*Calculate conservation scores
 
*Map, visualize
 
*Interpret
 
 
<section end=abstract />
 
<section end=abstract />
 +
</div>
 +
<!-- ============================  -->
 +
<hr>
 +
<b>Deliverables:</b><br />
 +
<section begin=deliverables />
 +
<li><b>Integrator unit</b>: Deliverables can be submitted for course marks. See below for details.</li>
 +
<section end=deliverables />
 +
<!-- ============================  -->
 +
<hr>
 +
<section begin=prerequisites />
 +
<b>Prerequisites:</b><br />
 +
This unit builds on material covered in the following prerequisite units:<br />
 +
*[[BIN-SX-Homology_modelling|BIN-SX-Homology_modelling (Homology Modeling)]]
 +
<section end=prerequisites />
 +
<!-- ============================  -->
 +
</div>
  
{{Vspace}}
+
{{Smallvspace}}
 
 
  
== This unit ... ==
 
=== Prerequisites ===
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "prerequisites" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" -->
 
You need to complete the following units before beginning this one:
 
*[[BIN-SX-Homology_modeling]]
 
  
{{Vspace}}
 
  
 +
{{Smallvspace}}
  
=== Objectives ===
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "objectives" -->
 
...
 
  
{{Vspace}}
+
__TOC__
 
 
 
 
=== Outcomes ===
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "outcomes" -->
 
...
 
 
 
{{Vspace}}
 
 
 
 
 
=== Deliverables ===
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "deliverables" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "deliverables-milestone" -->
 
*<b>No separate deliverables</b>: This unit collects other units and has no deliverables on its own.
 
  
 
{{Vspace}}
 
{{Vspace}}
Line 75: Line 46:
  
 
=== Evaluation ===
 
=== Evaluation ===
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "evaluation" -->
+
This "Integrator Unit" should be submitted for evaluation for a maximum of 13 marks if one of the written deliverables is chosen, or 24 marks if you choose this for your oral test<ref>Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.</ref>.
<!-- included from "ABC-unit_components.wtxt", section: "eval-INT-TBD" -->
+
:Please note the evaluation types that are available as options for this unit.
<b>Evaluation: Integrated Unit</b><br />
+
:Be mindful of the [[ABC-Rubrics| '''Marking rubrics''']].
:This unit should be submitted for evaluation for a maximum of 10 marks. Details TBD.
+
:If this is submitted for your oral test, please read the [[BCH441 Oral Test instructions|Oral test instructions]] before you begin.
 +
:If your submission includes R code, please read the [[BCH441 Code submisson instructions|Code submission instructions]] before you begin.
  
{{Vspace}}
+
Once you have chosen an option ...
 +
<ol>
 +
<li>Create a new page on the student Wiki as a subpage of your User Page.</li>
 +
<li>Put all of your writing to submit on this one page.</li>
  
 +
<li>When you are done with everything, go to the [https://q.utoronto.ca/courses/180416/assignments Quercus '''Assignments''' page] and open the appropriate '''Integrator Unit''' assignment. Paste the URL of your Wiki page into the form, and click on '''Submit Assignment'''.</li>
 +
</ol>
  
</div>
+
Your link can be submitted only once and cannot be edited afterwards. You may, however, change your Wiki page at any time: only the last version before the due date will be marked, and all later edits will be silently ignored.
<div id="BIO">
 
== Contents ==
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "contents" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "milestone" -->
 
This is a "milestone unit". Its purpose is merely to collect a number of preparatory units into a single, common prerequisite. It has no contents of its own; you are expected to be familiar and competent with all preparatory material at this point.
 
  
 +
{{Smallvspace}}
  
=== The DNA binding site ===
+
;Report option
 +
* Work through the tasks described in the scenario below.
 +
* Document your results in a short technical report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in [[BCH441 Code submisson instructions|an appendix]] linked from your report;
 +
* When you are done, submit the link to your page via Quercus as described above.
  
 +
{{Smallvspace}}
  
Now, that you know how YFO Mbp1 aligns with yeast Mbp1, you can evaluate functional conservation in these homologous proteins. You probably already downloaded the two Biochemistry papers by Taylor et al. (2000) and by Deleeuw et al. (2008) that we encountered in Assignment 2. These discuss the residues involved in DNA binding<ref>([http://www.ncbi.nlm.nih.gov/pubmed/10747782 Taylor ''et al.'' (2000) ''Biochemistry'' '''39''': 3943-3954] and [http://www.ncbi.nlm.nih.gov/pubmed/18491920 Deleeuw ''et al.'' (2008) Biochemistry. '''47''':6378-6385])</ref>. In particular the residues between 50-74 have been proposed to comprise the DNA recognition domain.
+
;Publication Image option
 +
* Work through the tasks described in the scenario below.
 +
* Produce the image as described and "publish" it on a subpage of your User page on the Student Wiki. Describe your methods (R-code! ChimeraX commands!) in [[BCH441 Code submisson instructions|an appendix]] linked from your report;
 +
* When you are done, submit the link to your page via Quercus as described above.
  
{{task|
+
<!--
# Using the APSES domain alignment you have just constructed, find the YFO Mbp1 residues that correspond to the range 50-74 in yeast.
+
{{Smallvspace}}
# Note whether the sequences are especially highly conserved in this region.
+
;Interview option
# Using Chimera, look at the region. Use the sequence window '''to make sure''' that the sequence numbering between the paper and the PDB file are the same (they are often not identical!). Then select the residues - the proposed recognition domain - and color them differently for emphasis. Study this in stereo to get a sense of the spatial relationships. Check where the conserved residues are.
+
: Identify a laboratory whose work has recently included producing and interpreting a homology model. Get in touch with the PI, a postdoc or senior graduate student in the laboratory and interview them in person or by eMail. Find out
# A good representation is '''stick''' - but other representations that include sidechains will also serve well.
+
* why this work is important;
# Calculate a solvent accessible surface of the protein in a separate representation and make it transparent.
+
* how they approach it methodologically;
# You could  combine three representations: (1) the backbone (in '''ribbon view'''), (2) the sidechains of residues that presumably contact DNA, distinctly colored, and (3) a transparent surface of the entire protein. This image should show whether residues annotated as DNA binding form a contiguous binding interface.
+
* in particular, how they interpret the model and what the model tells them that a sequence alignment alone would not have;
}}
+
* what they have recently learned.
 +
* write up your interview on a subpage of your User page of the Student Wiki;
 +
* add information that may be required to understand the context;
 +
* make sure that you included important literature references.
 +
* If this is well done and interesting, parts of this may be used to augment the learning unit. Make sure your interviewee is aware of what the interview is for, and has given her or his consent.
 +
* Make sure contact information for your interviewee is included on your submission page.
 +
* Add a CC-BY tag to your submission.
 +
* When you are done with everything, add the following category tag '''to the end of page''':
 +
::<code><nowiki>[[Category:EVAL-INT-Homology_modelling]]</nowiki></code>.
  
 +
Once the page has been saved with this tag, it is considered "submitted".
 +
'''Do not''' change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.
 +
-->
 +
<!--
 +
{{Smallvspace}}
 +
;Literature research option
 +
: ...
 +
-->
  
DNA binding interfaces are expected to comprise a number of positively charged amino acids, that might form salt-bridges with the phosphate backbone.
+
{{Smallvspace}}
 
 
 
 
{{task|
 
*Study and consider whether this is the case here and which residues might be included.
 
}}
 
 
 
  
 +
;Oral test option
 +
* Work through the tasks described below. Remember to document your work in your journal, but there is no need to format this specially as a report.
 +
* Part of your task will involve writing an R script; refer to the [[BCH441 Code submisson instructions|Code submission instructions]] and link to your page from your Journal.
 +
* Note that the work must be completed [[BCH441 Oral Test instructions| '''before''' your actual test date.]]
  
 +
{{Smallvspace}}
 +
;R code option
 +
* Work through the tasks described in the scenario below and develop code as required.
 +
* Put your code and other documentation on a subpage of your User page on the Student Wiki;
 +
* When you are done, submit the link to your page via Quercus as described above.
  
  
 +
== Contents ==
  
 
{{Vspace}}
 
{{Vspace}}
  
 +
=== Scenario background ===
  
== Further reading, links and resources ==
+
You have collected the APSES domain proteins of MYSPE in your protein database, and this is now a pretty nice collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This would, for example, allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues; you have all the tools to try this at some point.
<!-- {{#pmid: 19957275}} -->
 
<!-- {{WWW|WWW_GMOD}} -->
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
 
  
{{Vspace}}
+
For this assignment, however, we are going to look at conservation in the ankyrin domains; their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues, although it is '''not''' in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation, since it dimerizes with Mbp1 in the MBF complex (as well as with Swi4 in the SBF complex).
  
  
== Notes ==
+
{{Smallvspace}}
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "notes" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "notes" -->
 
<references />
 
  
{{Vspace}}
+
{{task|1=
 +
;Your common tasks for this scenario are as follows. Execute them and document your work.
  
 +
* Produce a multiple sequence alignment that includes the yeast Swi6 sequence from [http://www.rcsb.org/structure/1SW6 '''PDB 1SW6'''].<ref>You will probably already have this sequence annotated to MBP1_MYSPE, since SMART annotates it by similarity. However, we need a "real alignment" of the entire sequence this time.</ref> Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.<ref>These will be too many sequences for the Muscle algorithm; use CLUSTAL Omega instead.</ref>
 +
* Following the procedures of [[BIN-SX-Homology_modelling|the Homology Modelling unit]], prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.
 +
}}
  
</div>
+
{{Smallvspace}}
<div id="ABC-unit-framework">
 
== Self-evaluation ==
 
<!-- included from "../components/ABC-INT-Homology_modelling.components.wtxt", section: "self-evaluation" -->
 
<!--
 
=== Question 1===
 
  
Question ...
+
{{task|1=
 +
; For the report option...
 +
:Considering the columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
 +
}}
  
<div class="toccolours mw-collapsible mw-collapsed" style="width:800px">
+
{{Smallvspace}}
Answer ...
 
<div class="mw-collapsible-content">
 
Answer ...
 
  
</div>
+
{{task|1=
  </div>
+
; For the Oral Test option...
 +
:Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.
 +
}}
  
  {{Vspace}}
+
{{Smallvspace}}
  
-->
+
{{task|1=
 +
; For the publication image option...
 +
:Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel '''(a)''' contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel '''(b)''' shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
 +
}}
  
{{Vspace}}
+
{{Smallvspace}}
  
 +
{{task|1=
 +
; For the R-code option
  
 +
:Referring to the User Documentation of the ChimeraX commands <tt>color byattribute ...</tt> and <tt>defattr ...</tt>, write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the <tt>remotecontrol rest ...</tt> command and appropriate scripted commands sent via the <tt>CX()</tt> function as demonstrated in the <tt>RPR-ChimeraX-remote.R</tt> script. Structure your code so that a single constant, defined at the beginning, determines whether the commands are applied to your homology model or to the original 1SW6 coordinate set. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. Briefly interpret the results.
  
{{Vspace}}
+
:To evaluate your code, I need to be able to reproduce the scene from your code.
  
 +
:Document your work clearly, and discuss the process of computing attributes in R and mapping them to 3D models with ChimeraX. Can the process be significantly improved?
  
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_ask" -->
+
<!-- compute sasa per residue and correlate with conservation -->
  
----
+
<!-- distinguish structurally and functionally conserved residues -->
 +
}}
  
 
{{Vspace}}
 
{{Vspace}}
  
<b>If in doubt, ask!</b> If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
+
== Notes ==
 +
<references />
  
----
+
{{Vspace}}
  
{{Vspace}}
 
  
 
<div class="about">
 
<div class="about">
Line 185: Line 189:
 
:2017-08-05
 
:2017-08-05
 
<b>Modified:</b><br />
 
<b>Modified:</b><br />
:2017-08-09
+
:2020-10-07
 
<b>Version:</b><br />
 
<b>Version:</b><br />
:0.1
+
:1.3
 
<b>Version history:</b><br />
 
<b>Version history:</b><br />
 +
*1.3 Edit policy update
 +
*1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
 +
*1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
 +
*1.0 Live 2017
 
*0.1 First stub
 
*0.1 First stub
 
</div>
 
</div>
[[Category:ABC-units]]
 
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_footer" -->
 
  
 
{{CC-BY}}
 
{{CC-BY}}
  
 +
[[Category:ABC-units]]
 +
{{INTEGRATOR}}
 +
{{LIVE}}
 +
{{EVAL}}
 
</div>
 
</div>
 
<!-- [END] -->
 
<!-- [END] -->

Latest revision as of 05:37, 7 October 2020

Integrator Unit: Homology Modelling

(Integrator unit: create a homology model and assess the role of sequence conservation)


 


Abstract:

This page integrates material from the learning units for working with multiple sequence alignments and structure data in a task for evaluation.


Deliverables:

  • Integrator unit: Deliverables can be submitted for course marks. See below for details.

Prerequisites:

    This unit builds on material covered in the following prerequisite units:

      • BIN-SX-Homology_modelling (Homology Modeling)


    Evaluation

    This "Integrator Unit" should be submitted for evaluation for a maximum of 13 marks if one of the written deliverables is chosen, or 24 marks if you choose this for your oral test[1].

    Please note the evaluation types that are available as options for this unit.
    Be mindful of the Marking rubrics.
    If this is submitted for your oral test, please read the Oral test instructions before you begin.
    If your submission includes R code, please read the Code submission instructions before you begin.

    Once you have chosen an option ...

    1. Create a new page on the student Wiki as a subpage of your User Page.
    2. Put all of your writing to submit on this one page.
    3. When you are done with everything, go to the Quercus Assignments page and open the appropriate Integrator Unit assignment. Paste the URL of your Wiki page into the form, and click on Submit Assignment.

    Your link can be submitted only once and cannot be edited afterwards. You may, however, change your Wiki page at any time: only the last version before the due date will be marked, and all later edits will be silently ignored.


     
    Report option
    • Work through the tasks described in the scenario below.
    • Document your results in a short technical report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix linked from your report;
    • When you are done, submit the link to your page via Quercus as described above.


     
    Publication Image option
    • Work through the tasks described in the scenario below.
    • Produce the image as described and "publish" it on a subpage of your User page on the Student Wiki. Describe your methods (R-code! ChimeraX commands!) in an appendix linked from your report;
    • When you are done, submit the link to your page via Quercus as described above.


     
    Oral test option
    • Work through the tasks described below. Remember to document your work in your journal, but there is no need to format this specially as a report.
    • Part of your task will involve writing an R script; refer to the Code submission instructions and link to your page from your Journal.
    • Note that the work must be completed before your actual test date.


     
    R code option
    • Work through the tasks described in the scenario below and develop code as required.
    • Put your code and other documentation on a subpage of your User page on the Student Wiki;
    • When you are done, submit the link to your page via Quercus as described above.


    Contents

     

    Scenario background

    You have collected the APSES domain proteins of MYSPE in your protein database, and this is now a pretty nice collection of widely distributed sequences with a shared fold. We can be very confident about our APSES domain alignments, since there are hardly any indels in these sequences, and given a confident alignment we can arrive at a very reasonable structural model. This would, for example, allow us to look at residues in the APSES recognition domain that are conserved among known Mbp1 orthologues but vary between paralogues; you have all the tools to try this at some point.

    For this assignment, however, we are going to look at conservation in the ankyrin domains; their identification and alignment is a bigger challenge. Interestingly, an ankyrin domain structure is known for one of the homologues, although it is not in our set of sequences. This is the structure of yeast Swi6, a homologue of Mbp1 that has a non-functional APSES domain; it too is involved in cell-cycle regulation, since it dimerizes with Mbp1 in the MBF complex (as well as with Swi4 in the SBF complex).


     

    Task:

    Your common tasks for this scenario are as follows. Execute them and document your work.
    • Produce a multiple sequence alignment that includes the yeast Swi6 sequence from PDB 1SW6.[2] Make sure you include all MBP1 homologues from the database, not just the Mbp1 orthologues.[3]
    • Following the procedures of the Homology Modelling unit, prepare a homology model of the MBP1_MYSPE ankyrin domains based on the 1SW6 structure.
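
Both the modelling step and the later mapping of scores onto a structure depend on knowing which alignment column corresponds to which residue number. The following is a minimal sketch of that bookkeeping, written in Python purely for illustration (the unit itself expects R). The function name and the `first_residue` offset are assumptions made for this example and are not part of the course material.

```python
def column_to_residue_map(aligned_seq, first_residue=1):
    """Map 1-based alignment columns to residue numbers of one aligned sequence.

    Gap characters ("-") consume a column but no residue number.
    first_residue is the number of the first residue present in the
    sequence (an assumption: it must match the numbering of your
    coordinate file, which you should verify yourself).
    """
    mapping = {}
    resnum = first_residue
    for col, char in enumerate(aligned_seq, start=1):
        if char != "-":
            mapping[col] = resnum
            resnum += 1
    return mapping


# Toy example: a gapped fragment whose first residue is numbered 5.
print(column_to_residue_map("MK--LA", first_residue=5))
```

Keeping this mapping explicit makes it easier to check that sequence numbering in the alignment, the model, and the 1SW6 template agree before any scores are transferred.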


     

    Task:

    For the report option...
    Considering the columns of your multiple sequence alignment, discuss and document with reference to your homology model whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.)
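
One possible way to approach the conservation part of this task is sketched below in Python, for illustration only (the unit expects R, and the prerequisite units may define conservation scores differently). The score used here, one minus the Shannon entropy of a column normalized by log2(20), is an assumption chosen for simplicity; the function names, the toy alignment, and the cutoff of 0.8 are likewise hypothetical. Solvent exposure is not computed here and would have to come from the model.

```python
import math
from collections import Counter


def column_conservation(column):
    """Return 1 - (Shannon entropy / log2(20)) for one column; gaps are ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return 1.0 - entropy / math.log2(20)


def conserved_outliers(msa, target, cutoff=0.8):
    """Find columns where the other sequences are conserved but the target differs.

    msa maps sequence names to aligned sequences of equal length.
    Returns tuples (alignment column, target residue, consensus residue, score).
    Note that the first element is an alignment column, not a residue number.
    """
    others = [seq for name, seq in msa.items() if name != target]
    hits = []
    for i, res in enumerate(msa[target]):
        column = [seq[i] for seq in others]
        residues = [c for c in column if c != "-"]
        if res == "-" or not residues:
            continue
        score = column_conservation(column)
        consensus = Counter(residues).most_common(1)[0][0]
        if score >= cutoff and res != consensus:
            hits.append((i + 1, res, consensus, round(score, 3)))
    return hits


# Toy alignment (invented sequences, for illustration only).
msa = {
    "MBP1_MYSPE": "MKRLA",
    "SEQ_A": "MKDLA",
    "SEQ_B": "MKDLG",
    "SEQ_C": "MKDL-",
}
print(conserved_outliers(msa, "MBP1_MYSPE"))
```

The score is computed from the other sequences only, so that a deviating residue in MBP1_MYSPE does not lower the conservation value of the column it is being compared against.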


     

    Task:

    For the Oral Test option...
    Prepare to discuss during the test, with specific reference to your homology model, whether the model has solvent exposed residues that are highly conserved. Take particular note whether there are any such residues where MBP1_MYSPE differs from the consensus of the other, aligned sequences. (Such outliers could point to functionally significant residues.) Make sure the data upon which you base your conclusions is available in summary on a page of the Student Wiki before your test.


     

    Task:

    For the publication image option...
    Create a two-panel publication-quality image to illustrate whether conserved residues of the model form a putative interaction surface. Panel (a) contains a stereo image of your modelled domain with a transparent surface enclosing a cartoon representation of your model. Highly conserved residues are colored red, and their side-chains are shown. Panel (b) shows the plot of conservation scores (per residue, not rolling averages) with a horizontal line that indicates your cutoff of what you have considered "highly conserved". Include informative figure captions. Don't forget to describe your methods and submit your code and commands in an appendix.
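
The step from a score cutoff to a highlighted set of residues could be sketched as follows. This is an illustrative Python sketch, not the expected R solution; the function name, the model id `#1`, the example scores, and the cutoff are assumptions. The generated strings use the ChimeraX `color` and `show` commands with a residue-number specifier; check them against the ChimeraX User Documentation before relying on them.

```python
def chimerax_highlight_commands(scores, cutoff, model="#1"):
    """Build ChimeraX command strings that highlight highly conserved residues.

    scores maps residue numbers to conservation scores. Residues with a
    score at or above the cutoff are colored red and their atoms are shown.
    The model id is hypothetical and depends on your ChimeraX session.
    """
    conserved = sorted(r for r, s in scores.items() if s >= cutoff)
    if not conserved:
        return []
    spec = f"{model}:{','.join(str(r) for r in conserved)}"
    return [f"color {spec} red", f"show {spec} atoms"]


# Invented scores for three residues, for illustration only.
example_scores = {10: 0.95, 11: 0.40, 12: 0.88}
for command in chimerax_highlight_commands(example_scores, cutoff=0.8):
    print(command)
```

Using the same cutoff value for both the command generation and the horizontal line in panel (b) keeps the two panels consistent.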


     

    Task:

    For the R-code option
    Referring to the User Documentation of the ChimeraX commands color byattribute ... and defattr ..., write the MSA conservation scores for your model to a text file in the format of a ChimeraX attribute assignment file (cf. the examples linked from the ChimeraX User Documentation). Read the attribute file into ChimeraX using the remotecontrol rest ... command and appropriate scripted commands sent via the CX() function as demonstrated in the RPR-ChimeraX-remote.R script. Structure your code so that a single constant, defined at the beginning, determines whether the commands are applied to your homology model or to the original 1SW6 coordinate set. Produce a stereo scene that shows your model as a cartoon image enclosed by a transparent surface, where both the model and the surface are colored by conservation values. Briefly interpret the results.
    To evaluate your code, I need to be able to reproduce the scene from your code.
    Document your work clearly, and discuss the process of computing attributes in R and mapping them to 3D models with ChimeraX. Can the process be significantly improved?
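
To make the shape of this pipeline concrete, here is a minimal Python sketch; it is not the expected R solution and does not reproduce the CX() function of the course script. Everything named here is an assumption for illustration: the model ids `#1` and `#2`, the attribute name `conservation`, the port number, and the exact argument order of the `color byattribute` command, all of which should be verified against the ChimeraX User Documentation. The sketch only builds the attribute-file text and a request URL; it neither writes a file nor contacts ChimeraX.

```python
from urllib.parse import quote

# Single switch at the top: True targets the homology model, False the template.
USE_HOMOLOGY_MODEL = True
TARGET = "#1" if USE_HOMOLOGY_MODEL else "#2"  # hypothetical ChimeraX model ids


def defattr_text(scores, attribute="conservation"):
    """Format residue scores as text in the ChimeraX attribute-file layout.

    Header lines name the attribute and the recipient; each data line is
    tab, residue specifier, tab, value.
    """
    lines = [f"attribute: {attribute}", "recipient: residues"]
    for resnum in sorted(scores):
        lines.append(f"\t:{resnum}\t{scores[resnum]:.3f}")
    return "\n".join(lines) + "\n"


def rest_url(command, port=61803):
    """Build the URL for sending one command to a ChimeraX REST listener.

    Assumes the listener was started in ChimeraX with
    'remotecontrol rest start port <port>'; the default port is an assumption.
    """
    return f"http://127.0.0.1:{port}/run?command={quote(command)}"


# Invented scores, for illustration only.
scores = {12: 0.5, 10: 1.0}
print(defattr_text(scores), end="")
print(rest_url(f"color byattribute conservation {TARGET}"))
```

Because the target model is chosen in one place, the same commands can be re-sent to the 1SW6 coordinates by flipping the constant, provided the residue numbering of the scores matches the structure being colored.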


     

    Notes

    1. Note: the oral test is cumulative. It will focus on the content of this unit but will also cover other material that leads up to it.
    2. You will probably already have this sequence annotated to MBP1_MYSPE, since SMART annotates it by similarity. However, we need a "real alignment" of the entire sequence this time.
    3. These will be too many sequences for the Muscle algorithm; use CLUSTAL Omega instead.


     


    About ...
     
    Author:

    Boris Steipe <boris.steipe@utoronto.ca>

    Created:

    2017-08-05

    Modified:

    2020-10-07

    Version:

    1.3

    Version history:

    • 1.3 Edit policy update
    • 1.2 2020 updates. Full rewrite of tasks and evaluation: model ankyrin domains and focus on conservation scores.
    • 1.1 Corrected posted marks, which were not consistent with the description in the syllabus.
    • 1.0 Live 2017
    • 0.1 First stub

    This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.