Difference between revisions of "BIN-PHYLO-Conservation scores"
Revision as of 04:27, 31 August 2017
Calculating Conservation Scores from Phylogenetic Trees
Keywords: Quantifying conservation scores in trees
This unit is under development. There is some content here, but it is incomplete and/or may change significantly: links may lead nowhere, the content is likely to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.
Abstract
...
This unit ...
Prerequisites
You need to complete the following units before beginning this one:
Objectives
...
Outcomes
...
Deliverables
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your course journal.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation
Evaluation: NA
- This unit is not evaluated for course marks.
Contents
Coloring a 3D model by conservation
With the superimposed coordinates, you can begin to get a sense of whether either or both binding modes could be appropriate for a protein-DNA complex in your Mbp1 orthologue. But these are geometrical criteria only: the protein in your species may be flexible enough to adopt a different conformation in a complex, one that differs again from your model. A more powerful way to analyze such hypothetical complexes is to look at conservation patterns. With VMD, you can import a sequence alignment into the MultiSeq extension and color residues by conservation. The protocol below assumes that:
- You have prealigned the reference Mbp1 proteins with your species' Mbp1 orthologue;
- You have saved the alignment in CLUSTAL format.
You can use Jalview or any MSA server to do so. You can even do this by hand: there should be few if any indels, and the correct alignment is easy to see.
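Before importing, it can help to confirm that the saved file really is a well-formed CLUSTAL alignment. The following is a minimal sketch, not part of the original protocol: the function name `read_clustal`, the sequence name `Mbp1_TOYSP` and the toy sequences are hypothetical, and the parser assumes the common layout (a header line starting with `CLUSTAL`, then blocks of `name sequence` lines).

```python
import io


def read_clustal(handle):
    """Parse CLUSTAL-formatted text into a dict of {name: aligned sequence}."""
    lines = handle.read().splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("first line must start with 'CLUSTAL'")
    seqs = {}
    for line in lines[1:]:
        # Skip blank lines and conservation-symbol lines (they start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, chunk = parts[0], parts[1]  # a trailing residue count, if any, is ignored
        seqs[name] = seqs.get(name, "") + chunk
    if len({len(s) for s in seqs.values()}) > 1:
        raise ValueError("aligned sequences differ in length")
    return seqs


# Toy data for illustration only; these are NOT real Mbp1 sequences.
EXAMPLE = """CLUSTAL W (1.83) multiple sequence alignment

Mbp1_SACCE      SIMKAG-DV
Mbp1_TOYSP      SIMRAGKDV
                ***:** **
"""

alignment = read_clustal(io.StringIO(EXAMPLE))
for name, seq in alignment.items():
    print(name, seq)
```

To check a real file, open it with `open("myAlignment.aln")` (a hypothetical file name) and pass the handle to `read_clustal`. A `ValueError` suggests the file should be re-exported before you try to load it into MultiSeq.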
Task:
- Load the Mbp1 APSES alignment into MultiSeq.
- (A) In the MultiSeq window, navigate to File → Import Data..., choose "From Files", and browse to the location of the alignment you have saved. The file navigation window offers options for which file types to enable: choose to enable ALN files (these are CLUSTAL-formatted multiple sequence alignments).
- (B) Open the alignment file and click Ok to import the data; it will take a short while to load. If the data can't be loaded, the file may have the wrong extension: .aln is required.
- (C) Find the Mbp1_SACCE sequence in the list, click on it, and drag it to the top of the Sequences list with your mouse (the list is not static; you can re-order the sequences in any way you like).
You will see that the 1MB1 sequence and the APSES domain sequence do not match: at the N-terminus the sequence that corresponds to the PDB structure has extra residues, and in the middle the APSES sequences may have gaps inserted.
- Bring the 1MB1 sequence in register with the APSES alignment.
- (A) MultiSeq supports typical text-editor selection mechanisms. Clicking on a residue selects it, clicking on a row selects the whole sequence. Dragging with the mouse selects several residues, shift-clicking selects ranges, and option-clicking toggles the selection on or off for individual residues. Using the mouse and/or the shift key as required, select the entire first column of the sequences you have imported.
- (B) Select Edit → Enable Editing... → Gaps only to allow changing indels.
- (C) Pressing the spacebar once should insert a gap character before the selected column in all sequences. Insert as many gaps as you need to align the beginning of the sequences with the corresponding residues of 1MB1: S I M ...
- (D) Now insert as many gaps as you need into the structure sequence to align it completely with the Mbp1_SACCE APSES domain sequence. Simply select residues in the sequence and use the space bar to insert gaps. (Note: I have noticed a bug that sometimes prevents slider or keyboard input to the MultiSeq window; it fails to regain focus after operations in a different window. I don't know whether this is a Mac-related problem or a more general bug in MultiSeq. When this happens I quit VMD and restore the session from a saved state. It is a bit annoying but not mission-critical.)
- (E) When you are done, it may be prudent to save the state of your alignment. Use File → Save Session...
- Color by similarity
- (A) Use the View → Coloring → Sequence similarity → BLOSUM30 option to color the residues in the alignment and structure. This shows you where conserved and variable residues are located and allows you to analyze their structural context.
- (B) You can adjust the color scale in the usual way by navigating to VMD main → Graphics → Colors..., choosing the Color Scale tab and adjusting the scale midpoint.
- (C) Navigate to the Representations window and apply the color scheme to your tube-and-sidechain representation: double-click on the NewCartoon representation to hide it and use User coloring of your Tube and Licorice representations to apply the sequence similarity color gradient that MultiSeq has calculated.
- Once you have colored the residues of your model by conservation, create another informative stereo image and paste it into your assignment.
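The register and coloring steps above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the helper names `shift_right` and `column_identity`, the sequence names and all sequences are hypothetical toy data. The score is a simple per-column identity fraction, not the BLOSUM30 similarity that MultiSeq computes.

```python
from collections import Counter

GAP = "-"


def shift_right(alignment, n_gaps):
    """Insert n_gaps gap characters before the first column of every sequence."""
    return {name: GAP * n_gaps + seq for name, seq in alignment.items()}


def column_identity(alignment):
    """Per column: fraction of sequences that share the most common residue.

    Gaps never count as matches; an all-gap column scores 0.0.
    """
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences differ in length")
    scores = []
    for column in zip(*seqs):
        residues = [r for r in column if r != GAP]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Toy data for illustration only; these are NOT real Mbp1 or 1MB1 sequences.
toy_alignment = {"seqA": "SIMKAG", "seqB": "SIMRAG", "seqC": "SLM-AG"}
toy_structure = "GSSIMKAG"  # hypothetical: two extra N-terminal residues

# The offset is the number of gaps needed to bring "SIM" into register.
offset = toy_structure.index("SIM")
shifted = shift_right(toy_alignment, offset)
scores = column_identity(shifted)

print(toy_structure)
for name, seq in shifted.items():
    print(seq, name)
print(" ".join(f"{s:.2f}" for s in scores))
```

High values mark columns where most sequences agree. A substitution-matrix score such as BLOSUM30 would additionally reward conservative replacements (for example K versus R), which this identity measure treats as mismatches.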
Further reading, links and resources
Notes
Self-evaluation
If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2017-08-05
Version:
- 0.1
Version history:
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.