Difference between revisions of "Lecture 07"
(2 intermediate revisions by the same user not shown)
Line 45: | Line 45:
  *[http://www.ebi.ac.uk/t-coffee/ EBI '''T-Coffee''' Web server]<br>
  *[http://www.ebi.ac.uk/muscle/ EBI '''MUSCLE Web server''']<br>
− *[http://probcons.stanford.edu Stanford '''PROBCONS server'']<br>
+ *[http://probcons.stanford.edu Stanford '''PROBCONS server''']<br>
+ *[http://sparks.informatics.iupui.edu/Softwares-Services_files/spem.htm Indiana '''SPEM server]<br>
  *[http://cbcsrv.watson.ibm.com/Tmsa.html MUSCA, based on the Teiresias pattern discovery algorithm]<br>
  *[http://hmmer.janelia.org/ HMMER, a profile hidden Markov model tool]<br>
−
  *[http://bips.u-strasbg.fr/fr/Products/Databases/BAliBASE/ BAliBASE], [http://bips.u-strasbg.fr/fr/Products/Databases/BAliBASE2/ BAliBASE 2.0] and [http://www-bio3d-igbmc.u-strasbg.fr/~julie/balibase/index.html BAliBASE 3.0]<br>
−
  *[http://www.ebi.ac.uk/help/formats_frame.html EBI help page on formats]<br>
  *[http://www.jalview.org/ '''Jalview''' home page]<br>
− *[http://www.ch.embnet.org/software/BOX_form.html Embnet BOXSHADE server]<br>
+ *[http://www.ch.embnet.org/software/BOX_form.html Embnet '''BOXSHADE''' server]<br>
+ *[http://en.wikipedia.org/wiki/Multiple_sequence_alignment '''Wikipedia''' page on Multiple Sequence Alignment]<br>
Line 162: | Line 162:
  ======Slide 021======
  [[Image:07_slide021.jpg|frame|none|Lecture 07, Slide 021<br>
− One of the best algorithms that aligns sequences without additional database information. Run it on the web via the [http://probcons.stanford.edu Stanford '''PROBCONS server'']
+ One of the best algorithms that aligns sequences without additional database information. Run it on the web via the [http://probcons.stanford.edu Stanford '''PROBCONS server'''], or download the code and install locally.
  ]]
+
  ======Slide 022======
  [[Image:07_slide022.jpg|frame|none|Lecture 07, Slide 022<br>
Line 241: | Line 242:
  ======Slide 039======
  [[Image:07_slide039.jpg|frame|none|Lecture 07, Slide 039<br>
− Three common formats exist for MSA results. A '''CLUSTAL''' formatted alignment is the format in most common use. Take care when formatting input files to ensure the '''first 10 characters in your input file are unique''' and contain '''no special characters'''! I have seen programs break on blanks, hyphens and
+ Three common formats exist for MSA results. A '''CLUSTAL''' formatted alignment is the format in most common use. Take care when formatting input files to ensure the '''first 10 characters in your input file are unique''' and contain '''no special characters'''! I have seen programs break on blanks, hyphens and | (pipe). The latter is especially annoying, since the | character is used in NCBI FASTA files to separate the database identifier from the accession number. (More information at the [http://www.ebi.ac.uk/help/formats_frame.html EBI help page on formats].)
  ]]
+
  ======Slide 040======
  [[Image:07_slide040.jpg|frame|none|Lecture 07, Slide 040<br>
Latest revision as of 18:57, 7 October 2007
Multiple Sequence Alignment
Objectives for this part of the course
- Understand that MSA is an unsolved, difficult problem with different "best" solutions for different purposes.
- Be familiar with different biological heuristics that distinguish a "good" alignment from a "poor" alignment.
- Understand the importance of benchmarks for assessing the performance of computational tools.
- Be aware of how different biological priorities have resulted in different algorithmic strategies and some of the available tools that represent them.
- Be aware that the most frequently used and referenced tool - CLUSTAL - is no longer state-of-the-art and know which modern tools are much better.
- Be able to survey recent developments with confidence and choose an appropriate algorithm.
- Be able to perform and interpret MSAs in practice: know how to prepare input, which formats to use, and what common output formats look like.
- Understand strategies to prepare input and improve alignments, based on the requirement of columnwise homology.
- Know about strategies and tools for manual editing of alignments.
Links summary
- Dallas PROMALS Web server
- EBI CLUSTAL Web server
- EBI T-Coffee Web server
- EBI MUSCLE Web server
- Stanford PROBCONS server
- Indiana SPEM server
- MUSCA, based on the Teiresias pattern discovery algorithm
- HMMER, a profile hidden Markov model tool
- BAliBASE, BAliBASE 2.0 and BAliBASE 3.0
- EBI help page on formats
- Jalview home page
- Embnet BOXSHADE server
- Wikipedia page on Multiple Sequence Alignment
Exercises
- Read Cedric Notredame's MSA review (2007)
- Read Edgar and Batzoglou's MSA review (2005)
- More exercises will be covered in Assignment 3.
Lecture slides
Uses and Problems
Slide 004
Slide 005
Slide 006
Slide 007
Right, wrong, good and poor
Slide 009
Slide 010
Slide 011
MSA in practice
Slide 013
Slide 014
Slide 015
Slide 016
Slide 017
Slide 019
Slide 020
Slide 021
Slide 022
Slide 023
Slide 024
Slide 025
Slide 026
Slide 027
Slide 028
Slide 029
Slide 030
Slide 031
Slide 032
Slide 033
Slide 034
Slide 035
Editing and printing
Slide 037
Slide 038
Slide 039
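The caption for Slide 039 advises that the first 10 characters of each sequence name in an input file be unique and free of special characters such as blanks, hyphens and the pipe. The following is a minimal Python sketch of one way to generate such names before running an alignment. The function name `safe_ids`, the use of `_` as the replacement character, and the numeric-suffix scheme for resolving collisions are illustrative assumptions and not part of the lecture material.

```python
import re


def safe_ids(headers, width=10):
    """Return one identifier per FASTA header, in input order.

    Each identifier is at most `width` characters long, contains only
    letters, digits and underscores, and is unique within the returned list.
    (Hypothetical helper; the naming scheme is an assumption.)
    """
    used = set()
    result = []
    for header in headers:
        # Drop the leading '>' and replace every blank, hyphen, pipe or
        # other non-alphanumeric character with an underscore.
        base = re.sub(r"[^A-Za-z0-9]", "_", header.lstrip(">").strip())
        candidate = base[:width]
        n = 1
        # If the truncated name is empty or already taken, overwrite its
        # tail with a running number until it becomes unique.
        while not candidate or candidate in used:
            suffix = str(n)
            candidate = base[: width - len(suffix)] + suffix
            n += 1
        used.add(candidate)
        result.append(candidate)
    return result


headers = [
    ">gi|12345|ref|NP_000001.1| protein A",
    ">gi|12345|ref|NP_000002.1| protein B",
    ">Mbp1 yeast",
]
for original, new_id in zip(headers, safe_ids(headers)):
    print(new_id, "<-", original)
```

Keeping the printed mapping alongside the alignment makes it possible to restore the original, more informative names once the MSA program has finished.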
Slide 040
Slide 041
Slide 042
Slide 043
Slide 044
Slide 045
Slide 046
Slide 047
Slide 048
Slide 049
Slide 050