Difference between revisions of "BIN-ALI-Alignment"

From "A B C"
Jump to navigation Jump to search
m (Created page with "<div id="BIO"> <div class="b1"> Sequence alignment concepts </div> {{Vspace}} <div class="keywords"> <b>Keywords:</b>  What is an “alignment”? </div> {{Vsp...")
 
m
Line 19: Line 19:
  
  
{{STUB}}
+
{{DEV}}
  
 
{{Vspace}}
 
{{Vspace}}
Line 83: Line 83:
 
== Contents ==
 
== Contents ==
 
<!-- included from "../components/BIN-ALI-Alignment.components.wtxt", section: "contents" -->
 
<!-- included from "../components/BIN-ALI-Alignment.components.wtxt", section: "contents" -->
...
+
<div class="quote-box">
 +
{{Vspace}}
 +
 
 +
;Take care of things, and they will take care of you.
 +
:''Shunryu Suzuki''
 +
</div>
 +
 
 +
 
 +
{{Vspace}}
 +
 
 +
==Introduction==
 +
 
 +
{{Vspace}}
 +
 
 +
<div class="colmask doublepage">
 +
  <div class="colleft">
 +
    <div class="col1">
 +
      <!-- Column 1 start -->
 +
Sequence alignment is a '''very''' large, and important topic.
 +
 
 +
One of the foundations of bioinformatics is the empirical observation that related sequences conserve structure, and often function. Much of what we know about a protein's physiological function is based on the '''conservation''' of that function as the species evolves. Indeed, conservation is a defining aspect of what can rightly be said to be a protein's "function" in the first place. Conservation - or its opposite: ''variation'' - is a consequence of '''selection under constraints''': protein sequences change as a consequence of DNA mutations, this changes the protein's structure, this in turn changes functions and that has multiple effects on a species' reproductive fitness. Detrimental variants may be removed. Variation that is tolerated is largely neutral and therefore found only in positions that are neither structurally nor functionally critical. Conservation patterns can thus provide evidence for many different questions: structural conservation among proteins with similar 3D-structures, functional conservation among homologues with comparable roles, or amino acid propensities as predictors for protein engineering and design tasks.
 +
 
 +
    <!-- Column 1 end -->
 +
    </div>
 +
    <div class="col2">
 +
      <!-- Column 2 start -->
 +
 
 +
We assess conservation by comparing sequences between related proteins. This is the basis on which we can make inferences from well-studied model organisms for species that have not been studied as deeply. The foundation is to measure protein sequence similarity. If two sequences are much more similar than we could expect from chance, we hypothesize that their similarity comes from shared ancestry plus conservation. The measurement of sequence similarity however requires sequence alignment<ref>This is not strictly true in all cases: some algorithms measure similarity through an alignment-free approach, for example by comparing structural features, or domain annotations. These methods are less sensitive, but important when sequences are so highly diverged that no meaningful sequence alignment can be produced.</ref>.
 +
 
 +
A carefully done sequence alignment is a cornerstone for the annotation of the essential properties a gene or protein. It can already tell us a lot about which proteins we expect to have similar functions in different species.
 +
 
 +
 
 +
 
 +
      <!-- Column 2 end -->
 +
    </div>
 +
  </div>
 +
</div>
 +
 
 +
{{Vspace}}
 +
 
 +
 
 +
{{Task|1=
 +
*Read the introductory notes on {{ABC-PDF|ALI-Alignment|Sequence Alignment}}.
 +
}}
 +
 
 +
 
 +
 
  
 
{{Vspace}}
 
{{Vspace}}

Revision as of 04:26, 31 August 2017

Sequence alignment concepts


 

Keywords:  What is an “alignment”?


 



 


Caution!

This unit is under development. There is some contents here but it is incomplete and/or may change significantly: links may lead to nowhere, the contents is likely going to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.


 


Abstract

...


 


This unit ...

Prerequisites

You need to complete the following units before beginning this one:


 


Objectives

...


 


Outcomes

...


 


Deliverables

  • Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
  • Journal: Document your progress in your course journal.
  • Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.


 


Evaluation

Evaluation: NA

This unit is not evaluated for course marks.


 


Contents

 
Take care of things, and they will take care of you.
Shunryu Suzuki


 

Introduction

 

Sequence alignment is a very large, and important topic.

One of the foundations of bioinformatics is the empirical observation that related sequences conserve structure, and often function. Much of what we know about a protein's physiological function is based on the conservation of that function as the species evolves. Indeed, conservation is a defining aspect of what can rightly be said to be a protein's "function" in the first place. Conservation - or its opposite: variation - is a consequence of selection under constraints: protein sequences change as a consequence of DNA mutations, this changes the protein's structure, this in turn changes functions and that has multiple effects on a species' reproductive fitness. Detrimental variants may be removed. Variation that is tolerated is largely neutral and therefore found only in positions that are neither structurally nor functionally critical. Conservation patterns can thus provide evidence for many different questions: structural conservation among proteins with similar 3D-structures, functional conservation among homologues with comparable roles, or amino acid propensities as predictors for protein engineering and design tasks.

We assess conservation by comparing sequences between related proteins. This is the basis on which we can make inferences from well-studied model organisms for species that have not been studied as deeply. The foundation is to measure protein sequence similarity. If two sequences are much more similar than we could expect from chance, we hypothesize that their similarity comes from shared ancestry plus conservation. The measurement of sequence similarity however requires sequence alignment[1].

A carefully done sequence alignment is a cornerstone for the annotation of the essential properties a gene or protein. It can already tell us a lot about which proteins we expect to have similar functions in different species.



 


Task:



 


Further reading, links and resources

 


Notes

  1. This is not strictly true in all cases: some algorithms measure similarity through an alignment-free approach, for example by comparing structural features, or domain annotations. These methods are less sensitive, but important when sequences are so highly diverged that no meaningful sequence alignment can be produced.


 


Self-evaluation

 



 




 

If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.



 

About ...
 
Author:

Boris Steipe <boris.steipe@utoronto.ca>

Created:

2017-08-05

Modified:

2017-08-07

Version:

0.1

Version history:

  • 0.1 First stub

CreativeCommonsBy.png This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.