Difference between revisions of "Lecture 07"
Line 242:
======Slide 039======
[[Image:07_slide039.jpg|frame|none|Lecture 07, Slide 039<br>
− Three common formats exist for MSA results. A '''CLUSTAL''' formatted alignment is the format in most common use. Take care when formatting input files to ensure the '''first 10 characters in your input file are unique''' and contain '''no special characters'''! I have seen programs break on blanks, hyphens and
+ Three common formats exist for MSA results. A '''CLUSTAL''' formatted alignment is the format in most common use. Take care when formatting input files to ensure the '''first 10 characters in your input file are unique''' and contain '''no special characters'''! I have seen programs break on blanks, hyphens and | (pipe). The latter is especially annoying, since the | character is used in NCBI FASTA files to separate the database identifier from the accession number. (More information at the [http://www.ebi.ac.uk/help/formats_frame.html EBI help page on formats].)
]]
+
======Slide 040======
[[Image:07_slide040.jpg|frame|none|Lecture 07, Slide 040<br>
Latest revision as of 18:57, 7 October 2007
Multiple Sequence Alignment
Objectives for this part of the course
- Understand that MSA is an unsolved, difficult problem with different "best" solutions for different purposes.
- Be familiar with different biological heuristics that distinguish a "good" alignment from a "poor" alignment.
- Understand the importance of benchmarks for assessing the performance of computational tools.
- Be aware of how different biological priorities have resulted in different algorithmic strategies and some of the available tools that represent them.
- Be aware that the most frequently used and referenced tool, CLUSTAL, is no longer state-of-the-art, and know which modern tools perform much better.
- Be able to confidently survey recent developments and choose an appropriate algorithm.
- Be able to perform and interpret MSAs in practice: know how to prepare input, which formats to use, and what common output formats look like.
- Understand strategies to prepare input and improve alignments, based on the requirement of columnwise homology.
- Know about strategies and tools for manual editing of alignments.
Links summary
- Dallas PROMALS Web server
- EBI CLUSTAL Web server
- EBI T-Coffee Web server
- EBI MUSCLE Web server
- Stanford PROBCONS server
- Indiana SPEM server
- MUSCA, based on the Teiresias pattern discovery algorithm
- HMMER, a profile hidden Markov model tool
- BAliBASE, BAliBASE 2.0 and BAliBASE 3.0
- EBI help page on formats
- Jalview home page
- Embnet BOXSHADE server
- Wikipedia page on Multiple Sequence Alignment
Exercises
- Read Cedric Notredame's MSA review (2007)
- Read Edgar and Batzoglou's MSA review (2005)
- More exercises will be covered in Assignment 3.
Lecture slides
Uses and Problems
Slide 004
Slide 005
Slide 006
Slide 007
Right, wrong, good and poor
Slide 009
Slide 010
Slide 011
MSA in practice
Slide 013
Slide 014
Slide 015
Slide 016
Slide 017
Slide 019
Slide 020
Slide 021
Slide 022
Slide 023
Slide 024
Slide 025
Slide 026
Slide 027
Slide 028
Slide 029
Slide 030
Slide 031
Slide 032
Slide 033
Slide 034
Slide 035
Editing and printing
Slide 037
Slide 038
Slide 039
Slide 040
Slide 041
Slide 042
Slide 043
Slide 044
Slide 045
Slide 046
Slide 047
Slide 048
Slide 049
Slide 050
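The Slide 039 caption warns that sequence names should be unique within their first 10 characters and free of special characters such as blanks, hyphens and the | (pipe) used in NCBI FASTA headers. One possible way to prepare input accordingly is sketched below in Python. The function name `safe_names`, the choice of underscore as the replacement character and the numeric-suffix scheme for resolving collisions are all assumptions made for illustration; they are not part of the lecture or of any particular alignment program.

```python
import re


def safe_names(headers, width=10):
    """Return one short name per FASTA header (hypothetical helper).

    Each name has at most `width` characters, contains only letters,
    digits and underscores, and is unique within the returned list.
    """
    seen = set()
    names = []
    for header in headers:
        # Drop the leading ">" and replace blanks, hyphens, pipes and
        # every other special character with an underscore.
        cleaned = re.sub(r"[^A-Za-z0-9]", "_", header.lstrip(">").strip())
        base = cleaned[:width] or "seq"
        name = base
        counter = 1
        # On a collision, overwrite the tail of the name with a counter
        # so that the result still fits into `width` characters.
        while name in seen:
            suffix = str(counter)
            name = base[:width - len(suffix)] + suffix
            counter += 1
        seen.add(name)
        names.append(name)
    return names


if __name__ == "__main__":
    example = [
        ">gi|12345|ref|NP_000001.1| protein A",
        ">gi|12345|ref|NP_000002.1| protein B",
        ">my-seq 1",
    ]
    for old, new in zip(example, safe_names(example)):
        # Keep this mapping so the original identifiers can be restored
        # after the alignment has been computed.
        print(new, old, sep="\t")
```

The first two example headers are identical in their first 10 characters, so the second one receives a numeric suffix. Printing the mapping alongside the new names makes it possible to trace each aligned sequence back to its database identifier and accession number.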