Difference between revisions of "Tools for the bioinformatics lab"

(Created page with "<div id="APB"> <div class="b1"> Title ... </div> {{fix}} Summary ... __TOC__ ==EMBOSS== ===EMBOSS installation=== ;Outdated (written 2006). ===Download=== :1. naviga...")
 
m
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
<div id="APB">
 
<div id="APB">
 
<div class="b1">
 
<div class="b1">
Title ...
+
Tools for the bioinformatics lab
 
</div>
 
</div>
  
Line 114: Line 114:
 
  #---------------------------------------
 
  #---------------------------------------
 
   
 
   
 
 
;That should be all
 
:as usual: e-mail me in case things do not work as expected.
 
  
  
Line 129: Line 125:
  
  
 +
==Phylip==
 +
 +
===Phylip installation===
 +
;Outdated (written 2006).
 +
 +
;Download
 +
:1. navigate to the [http://evolution.genetics.washington.edu/phylip/getme.html download section of the PHYLIP homepage].
 +
:2. read the instructions ... depending on your platform, there may be an easier way than installing from source. Nevertheless, since compiling from source is the most general route, that is what I will do here.
 +
:3. Download the compressed archive.
 +
:4. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
 +
gunzip ''phylip-3.65.tar.gz''
 +
tar -xvf ''phylip-3.65.tar''
 +
cd ''phylip3.65/src''
 +
 +
;Compile
 +
PHYLIP uses graphical routines in some of its programs. These have to be linked against X-terminal libraries. The Makefile should know where to find them on your system. On my Mac I need to type <tt>make&nbsp;-f&nbsp;Makefile.osx&nbsp;install</tt>; on your Linux boxes, it should simply work in the standard way:
 +
Type:
 +
make install
 +
On my system the whole package compiles with almost no nag. Bravo Joe Felsenstein, for understanding the benefit of writing plain, robust, portable code.
 +
 +
The executables are put into the directory <tt>''distribution''/exe</tt> and, as usual, they have to be moved to a directory on your <tt>PATH</tt>, your <tt>PATH</tt> has to be changed to include that directory, or (my preferred way) they are copied to <tt>/usr/local/bin</tt>.
 +
 +
cd ..
 +
ls -l exe
 +
sudo cp exe/* /usr/local/bin
 +
 +
;Test
 +
...
 +
 +
;That should be all
  
  
 
==Clustal==
 
==Clustal==
  
 +
This is provided for reference, but you '''must''' know that CLUSTAL is an inferior algorithm for protein MSA.
  
 
===Clustal installation===
 
===Clustal installation===
Line 259: Line 286:
  
 
;That should be all
 
;That should be all
:as usual: e-mail me in case things do not work as expected.
 
  
  
Line 267: Line 293:
  
  
 +
&nbsp;
 +
==T-Coffee==
  
==GBrowse==
 
  
===GBrowse:viewing annotations===
+
===T-Coffee installation===
 +
;Outdated (written 2006).
  
Viewing annotations in GBrowse is actually quite straightforward.
+
;Download
 +
:1. navigate to the [http://www.igs.cnrs-mrs.fr/~cnotred/Projects_home_page/t_coffee_home_page.html T-Coffee project homepage]. Find the link to the latest Unix or Windows version (3.79 as of this writing).
 +
:2. Download this compressed archive.
 +
:3. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
 +
gunzip ''T-COFFEE_distribution_Version_3.79.tar.gz''
 +
tar -xvf ''T-COFFEE_distribution_Version_3.79.tar''
 +
cd ''T-COFFEE_distribution_Version_3.79''
  
If you study the section on third-party annotations in the [http://localhost/gbrowse/tutorial/tutorial.html GBrowse tutorial], you will notice that you can load GFF files from a remote server. So all you actually need to do is write a cgi-script that uploads a GFF formatted record. Try the following: put the following file into your <tt>/usr/local/apache/cgi-bin</tt> directory, call it <tt>'''annotest'''</tt>:
+
;Compile
 +
T-Coffee provides an installation file for its standard installation. Type
 +
install
 +
On my system this compiles with a spate of warnings about incompatible implicit declaration of a built-in function, but the tests appear to run OK. The executable <tt>'''t_coffee'''</tt> was generated and moved to the directory <tt>bin/</tt>. Remove the object files (they are no longer needed after being compiled and linked into the executable). Type
 +
cd t_coffee_source
 +
make clean
 +
cd ..
  
<span style="color:#006699">#!/usr/bin/perl -w</span>
+
;Install
<span style="color:#770000">use</span> strict;
+
To be able to run T-Coffee from the command line, it needs to be in a directory on your <tt>PATH</tt>. This can be done either by putting the program into a directory on the path, or by modifying the <tt>PATH</tt> appropriately. My preferred way to do this is to keep the executables in <tt>/usr/local/bin</tt>. First I copy the executable to the directory in which I keep my locally installed programs (''type'' <tt>echo $PATH</tt> ''if you are not sure that this directory exists and is on your path on your own machine'').
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">Content-type: text/plain\n</span><span style="color:#AA6666">"</span>;  <span style="font-style:italic;color:#6666AA"># MIME header</span>
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">\n</span><span style="color:#AA6666">"</span>;                        <span style="font-style:italic;color:#6666AA"># Blank line: payload begins here</span>
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">ctgA  example  motif  1  15000  .  +  .  Motif mxy ; Note \"this is a test\"</span><span style="color:#AA6666">"</span>;
 
 
exit;
 
  
Note the special MIME type <tt>text/plain</tt>!
+
sudo cp bin/t_coffee /usr/local/bin
  
Now first execute this by typing into your browser:
+
;Test
 
+
As a final test, type the following (this assumes you are still in the main directory of the distribution):
http://localhost/cgi-bin/annotest
 
 
 
Then access the [http://localhost/cgi-bin/gbrowse/volvox/ GBrowse tutorial volvox example] and type the same URL into the URL field for "Add remote annotations"...
 
 
 
This shows the principle. Of course, to do something useful, we would like to send some parameters with the request. Type the following script and save it as <tt>/usr/local/apache/cgi-bin/annotate</tt>. Set the right ownership (<tt>sudo chown root annotate</tt>) and permissions (<tt>sudo chmod 755 annotate</tt>).
 
  
 +
t_coffee test/test.pep -in fast_pair -outfile=test.aln -outorder=input>/dev/null
 +
cat test.aln
  
<span style="color:#006699">#!/usr/bin/perl -w</span>
+
should produce the following output
<span style="font-style:italic;color:#6666AA"># reads input from CGI in the form</span>
+
  CLUSTAL FORMAT for T-COFFEE Version_3.79, CPU=0.16 sec, SCORE=24, Nseq=5, Len=87
  <span style="font-style:italic;color:#6666AA"># http://localhost/cgi-bin/annotate?id=ctgA;start=1020;end=1250;accession=1XYZ:20..250</span>
 
<span style="font-style:italic;color:#6666AA"># returns an annotation in GFF format;</span>
 
<span style="font-style:italic;color:#6666AA"># cf. http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml</span>
 
 
<span style="color:#770000">use</span> strict;
 
<span style="color:#770000">use</span> CGI;
 
 
<span style="color:#770000">my</span> $input = CGI-&gt;new();
 
 
<span style="color:#770000">my</span> $acc_ID = <span style="color:#AA6666">'</span><span style="color:#4444CC">XXX</span><span style="color:#AA6666">'</span>;
 
<span style="color:#770000">my</span> $acc_start = <span style="color:#AA6666">'</span><span style="color:#4444CC">000</span><span style="color:#AA6666">'</span>;
 
<span style="color:#770000">my</span> $acc_end = <span style="color:#AA6666">'</span><span style="color:#4444CC">000</span><span style="color:#AA6666">'</span>;
 
 
<span style="color:#770000">my</span> $accession=$input-&gt;param(<span style="color:#AA6666">'</span><span style="color:#4444CC">accession</span><span style="color:#AA6666">'</span>);
 
<span style="color:#770000">if</span> ($accession =~ <span style="color:#AA6666">m/</span><span style="color:#4444CC">^([^:]+):(\d+)\.\.(\d+)$</span><span style="color:#AA6666">/</span>) {
 
    $acc_ID = $1;
 
    $acc_start = $2;
 
    $acc_end = $3;
 
}
 
 
<span style="color:#770000">my</span> $seqid = $input-&gt;param(<span style="color:#AA6666">'</span><span style="color:#4444CC">id</span><span style="color:#AA6666">'</span>);
 
<span style="color:#770000">my</span> $source = <span style="color:#AA6666">"</span><span style="color:#4444CC">Annotbot</span><span style="color:#AA6666">"</span>;
 
<span style="color:#770000">my</span> $type = <span style="color:#AA6666">"</span><span style="color:#4444CC">region</span><span style="color:#AA6666">"</span>;        <span style="font-style:italic;color:#6666AA"># cf. SOFA ontology</span>
 
                            <span style="font-style:italic;color:#6666AA"># http://cvs.sourceforge.net/viewcvs.py/song/ontology/sofa.ontology</span>
 
<span style="color:#770000">my</span> $start = $input-&gt;param(<span style="color:#AA6666">'</span><span style="color:#4444CC">start</span><span style="color:#AA6666">'</span>);
 
<span style="color:#770000">my</span> $end = $input-&gt;param(<span style="color:#AA6666">'</span><span style="color:#4444CC">end</span><span style="color:#AA6666">'</span>);
 
<span style="color:#770000">my</span> $score = <span style="color:#0055CC">0.0</span>;
 
<span style="color:#770000">my</span> $strand = <span style="color:#AA6666">'</span><span style="color:#4444CC">-</span><span style="color:#AA6666">'</span>;
 
<span style="color:#770000">my</span> $phase = <span style="color:#0055CC">0</span>;
 
<span style="color:#770000">my</span> @attributes;
 
 
   
 
   
  $attributes[<span style="color:#0055CC">0</span>]= <span style="color:#AA6666">"</span><span style="color:#4444CC">$type \"Test annotation\";</span><span style="color:#AA6666">"</span>;
+
  1aboA          ---nlf-valydfvasgdntlsitkgeklrvlg-----------ynhnge--wceaqtkn
  $attributes[<span style="color:#0055CC">1</span>]= <span style="color:#AA6666">"</span><span style="color:#4444CC">Note \"Accession No. $acc_ID from $acc_start to $acc_end\";</span><span style="color:#AA6666">"</span>;
+
1ycsB          --kgvi-yalwdyepqnddelpmkegdcmtiih-----------rededeiewwwarlnd
 +
1pht            ---gyqyralydykkereedidlhlgdiltvnkgslvalgfsdgqearpeeigwlngyne
 +
1ihvA          -nfrvy-y--------rdsrdpvwkgpakllwk-----------gegavv-----iqdns
 +
  1vie            drvrkk-s--------gaawqgqivgwyctnlt-----------pegyaves-----eah
 +
                                          *                  :            
 
   
 
   
  print<span style="color:#AA6666">"</span><span style="color:#4444CC">Content-type: text/plain\n</span><span style="color:#AA6666">"</span>;  <span style="font-style:italic;color:#6666AA"># MIME header</span>
+
  1aboA          gq-----------gwvpsnyitpvn--
  print <span style="color:#AA6666">"</span><span style="color:#4444CC">\n</span><span style="color:#AA6666">"</span>;                        <span style="font-style:italic;color:#6666AA"># Blank line: payload begins here</span>
+
  1ycsB          ke-----------gyvprnllglyp--
   
+
  1pht            ttgergdfpgtyveyigrkkisp----
  print <span style="color:#AA6666">"</span><span style="color:#4444CC">$seqid\t</span><span style="color:#AA6666">"</span>;
+
  1ihvA          di-----------kvvprrkakiird-
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$source\t</span><span style="color:#AA6666">"</span>;
+
  1vie            pg-----------svqiypvaalerin
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$type\t</span><span style="color:#AA6666">"</span>;
+
                                           
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$start\t</span><span style="color:#AA6666">"</span>;
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$end\t</span><span style="color:#AA6666">"</span>;
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$score\t</span><span style="color:#AA6666">"</span>;
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$strand\t</span><span style="color:#AA6666">"</span>;
 
print <span style="color:#AA6666">"</span><span style="color:#4444CC">$phase\t</span><span style="color:#AA6666">"</span>;
 
<span style="color:#770000">foreach</span> <span style="color:#770000">my</span> $att (@attributes) {
 
    print $att;
 
}
 
 
exit;
 
 
 
Then try this out by typing the following into your browser
 
 
 
http://localhost/cgi-bin/annotate?id=ctgA;start=1020;end=1250;accession=1XYZ:20..250
 
 
 
... and finally paste this into the "remote annotations" field of the Volvox example database. Then try changing some of the parameters.
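The record that the script prints is a single GFF line: nine fields separated by tabs, in the order seqid, source, type, start, end, score, strand, phase, attributes. As a rough sketch of that layout, independent of the CGI machinery, the same line can be assembled in the shell. The helper name gff_line is made up for this example, and the values are simply the ones used in the URL above.

```shell
# gff_line: print its nine arguments as one tab-separated GFF record.
# Argument order: seqid source type start end score strand phase attributes
gff_line() {
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$@"
}

# Example values taken from the request URL used in the text.
line=$(gff_line ctgA Annotbot region 1020 1250 0.0 - 0 \
    'Note "Accession No. 1XYZ from 20 to 250";')

# Show the record, then count its tab-separated fields as a sanity check.
printf '%s\n' "$line"
printf '%s\n' "$line" | awk -F '\t' '{ print NF }'
```

Spaces inside the attributes column are harmless, because only tabs separate the columns.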
 
 
 
 
 
===Gbrowse installation===
 
(Outdated: written 2006)
 
 
 
Refer to http://www.gmod.org/ to ensure the installation instructions are current.
 
 
 
;Download
 
:1. navigate to the [http://prdownloads.sourceforge.net/gmod GMOD download pages on sourceforge]
 
:2. Find the most recent version of the Generic-Genome-Browser (1.64 as of this writing). Download this compressed archive.
 
:3. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
 
gunzip ''Generic-Genome-Browser-1.64.tar.gz''
 
  tar -xvf ''Generic-Genome-Browser-1.64.tar''
 
cd ''Generic-Genome-Browser-1.64''
 
 
 
;Compile
 
 
 
Before you continue, read through the entire page of installation information. There is information on how to install into non-default directories and how to install without requiring root access, and this may be useful for your specific situation. If you decide to go the default way, it is simply a question of typing:
 
perl Makefile.PL
 
make
 
sudo make install
 
make clean
 
 
 
;Test
 
The installation instructions page discusses a quick test run with data that is supplied in the installation. Point your browser to http://localhost/cgi-bin/gbrowse (of course your Apache server has to be running for this to work).
 
 
 
More instructions and a more detailed tutorial are found at http://localhost/gbrowse/tutorial/tutorial.html .
 
  
 +
;That should be all
  
  
Line 398: Line 361:
 
-->
 
-->
 
&nbsp;
 
&nbsp;
 +
 
==Further reading and resources==
 
==Further reading and resources==
 
<!-- {{#pmid:21627854}} -->
 
<!-- {{#pmid:21627854}} -->

Latest revision as of 16:51, 18 September 2012

Tools for the bioinformatics lab


The contents of this page have recently been imported from an older version of this Wiki. This page may contain outdated information, information that is irrelevant for this Wiki, information that needs to be differently structured, outdated syntax, and/or broken links. Use with caution!


Summary ...



EMBOSS

EMBOSS installation

Outdated (written 2006).

Download

1. navigate to the EMBOSS download page on sourceforge and read the information on the latest download there. As of this writing, the latest major release is version 3.0.
2. Download this compressed archive.
3. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
gunzip EMBOSS-3.0.0.tar.gz
tar -xvf EMBOSS-3.0.0.tar
rm EMBOSS-3.0.0.tar
cd EMBOSS-3.0.0
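
Where tar understands the -z option (GNU tar and BSD tar both do), the gunzip and tar steps can be combined into one. The following is only a sketch: so that it can be tried without downloading anything, it first builds a small stand-in archive called demo-1.0.tar.gz in a scratch directory. That name is made up and has nothing to do with EMBOSS.

```shell
# Work in a scratch directory so nothing in the download directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a stand-in archive (hypothetical name; substitute the real download).
mkdir demo-1.0
echo "read me" > demo-1.0/README
tar -czf demo-1.0.tar.gz demo-1.0
rm -r demo-1.0

# List the contents first, then decompress and unpack in a single step.
tar -tzf demo-1.0.tar.gz
tar -xzf demo-1.0.tar.gz
cd demo-1.0 || exit 1
ls
```

With the real download this would read tar -xzf EMBOSS-3.0.0.tar.gz. The compressed archive is kept, so there is no intermediate .tar file to remove afterwards.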

Before you begin, it may be a good idea to browse through some of the files that have been downloaded to get oriented; these include:

INSTALL
KNOWN_BUGS  (this is an empty file in this release)
README

Compile

EMBOSS requires a number of system-specific options to be set, so its makefile has to be generated first, by running the program configure (1). Type:

configure

Then type

make

Compilation will run for some time. Then type

sudo make install

and finally

make clean
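
If you would rather run the whole sequence unattended, it helps to stop at the first step that fails, so that the install step is never attempted after a broken build. A minimal sketch follows; run_steps is a helper name invented for this example and is not part of EMBOSS.

```shell
# run_steps: run each argument as a shell command, in order,
# and stop at the first one that returns a non-zero exit status.
run_steps() {
    for step in "$@"; do
        echo "+ $step"
        sh -c "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}

# Harmless demonstration: the third command is never reached.
run_steps "echo configuring" "false" "echo installing" || echo "stopped early"
```

In the EMBOSS directory the call would be run_steps "./configure" "make" "sudo make install" "make clean".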

Test

First see whether installation was successful in principle. Typing

ls /usr/local/share/EMBOSS/data/

should list some of the data resources that have been installed and where they are located. Now open a new shell and type

tfm needle

You should see man-like help pages for EMBOSS commands. In fact, tfm is itself an EMBOSS command: it runs a program that formats and displays help files. If the above works, it tells you two things: (i) that EMBOSS programs have been compiled and installed, and (ii) that the installation is on your PATH.
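
The second point can also be checked directly, without starting any EMBOSS program. The sketch below uses the shell built-in command -v; the function name on_path is made up for this example. It only reports what it finds, so it is safe to run on a machine where EMBOSS is not installed.

```shell
# on_path: succeed if the named command can be found via PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Report on two of the EMBOSS programs used in this section.
for prog in tfm needle; do
    if on_path "$prog"; then
        echo "$prog: found at $(command -v "$prog")"
    else
        echo "$prog: not found on PATH"
    fi
done
```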

Next, try a simple pairwise alignment. Create two sequence files (2):

HBA.fa
>HBA_HUMAN
VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKL
LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
HBB.fa
>HBB_HUMAN
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST
PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH

and type the following command (note: I am using the multiline command character "\" here to wrap the command, but it could also be typed all on one line):

needle -asequence HBA.fa -bsequence HBB.fa \
-gapopen 10.0 -gapextend 0.5 -datafile EBLOSUM62 \
-outfile test.ali

Then typing

cat test.ali

should give you the following output:

########################################
# Program: needle
# Rundate: Sun Mar 02 2006 14:32:03
# Align_format: srspair
# Report_file: test2.ali
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: HBA_HUMAN
# 2: HBB_HUMAN
# Matrix: EBLOSUM62
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 148
# Identity:      63/148 (42.6%)
# Similarity:    88/148 (59.5%)
# Gaps:           9/148 ( 6.1%)
# Score: 290.5
# 
#
#=======================================

HBA_HUMAN          1 -VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DL     48
                      .|:|.:|:.|.|.||||  :..|.|.|||.|:.:.:|.|:.:|..| ||
HBB_HUMAN          1 VHLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDL     48

HBA_HUMAN         49 S-----HGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRV     93
                     |     .|:.:||.|||||..|.::.:||:|::....:.||:||..||.|
HBB_HUMAN         49 STPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHV     98

HBA_HUMAN         94 DPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR    141
                     ||.||:||.:.|:..||.|...||||.|.|:..|.:|.|:..|..||.
HBB_HUMAN         99 DPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH    146


#---------------------------------------
#---------------------------------------
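
To compare several runs (for example with different gap penalties), it is handy to pull the summary percentages out of the report header rather than reading them off by eye. The following is a sketch only: needle_stat is a made-up helper name, and the stand-in report contains just the summary lines copied from the output above.

```shell
# needle_stat NAME FILE: print the percentage from a "# NAME: n/m (p%)" header line.
needle_stat() {
    sed -n "s/^# $1:.*([[:space:]]*\([0-9.]*\)%).*/\1/p" "$2"
}

# Stand-in report holding only the summary lines shown above.
report=$(mktemp)
cat > "$report" <<'EOF'
# Length: 148
# Identity:      63/148 (42.6%)
# Similarity:    88/148 (59.5%)
# Gaps:           9/148 ( 6.1%)
# Score: 290.5
EOF

needle_stat Identity "$report"
needle_stat Gaps "$report"
```

On a real run the second argument would be the report file itself, e.g. needle_stat Identity test.ali.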



Notes

(1) In these notes, I assume the "." directory is on your PATH - if it is not, you may have to prepend "./" to commands to tell the operating system the executable file for the command is in your current working directory.

(2) My favorite quick and dirty way to create a text file (e.g. called file.txt) based on something I can copy and paste, is to type

cat > file.txt

then I simply paste the contents and close the file with <ctrl>d. Ask me if you don't understand how this works.
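
When the same file has to be created more than once, or from within a script, a here-document does the same job without any interactive pasting. A sketch, using the HBA_HUMAN sequence from above and a scratch directory (the variable name workdir is arbitrary):

```shell
# Create the file in a scratch directory; everything up to the line EOF is written verbatim.
workdir=$(mktemp -d)
cat > "$workdir/HBA.fa" <<'EOF'
>HBA_HUMAN
VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKL
LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
EOF

# Quick sanity checks: number of header lines, then the file itself.
grep -c '^>' "$workdir/HBA.fa"
cat "$workdir/HBA.fa"
```

The quotes around the first EOF prevent the shell from expanding anything inside the pasted text.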


Phylip

Phylip installation

Outdated (written 2006).
Download
1. navigate to the download section of the PHYLIP homepage.
2. read the instructions ... depending on your platform, there may be an easier way than installing from source. Nevertheless, since compiling from source is the most general route, that is what I will do here.
3. Download the compressed archive.
4. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
gunzip phylip-3.65.tar.gz
tar -xvf phylip-3.65.tar
cd phylip3.65/src
Compile

PHYLIP uses graphical routines in some of its programs. These have to be linked against X-terminal libraries. The Makefile should know where to find them on your system. On my Mac I need to type make -f Makefile.osx install; on your Linux boxes, it should simply work in the standard way. Type:

make install

On my system the whole package compiles with almost no nag. Bravo Joe Felsenstein, for understanding the benefit of writing plain, robust, portable code.

The executables are put into the directory distribution/exe and, as usual, they have to be moved to a directory on your PATH, your PATH has to be changed to include that directory, or (my preferred way) they are copied to /usr/local/bin.

cd ..
ls -l exe
sudo cp exe/* /usr/local/bin
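
If you have no root access, the alternative mentioned above (changing your PATH) works just as well. The sketch below shows the idea with a scratch directory and a dummy executable called demo_tool; both are invented for the demonstration and stand in for the PHYLIP exe directory and a directory such as $HOME/bin.

```shell
# Scratch layout: exe/ plays the role of the PHYLIP exe directory, bin/ of a personal bin directory.
demo=$(mktemp -d)
mkdir -p "$demo/exe" "$demo/bin"
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$demo/exe/demo_tool"
chmod 755 "$demo/exe/demo_tool"

# Copy the executables; no sudo is needed for a directory you own.
cp "$demo"/exe/* "$demo/bin/"

# Prepend the directory to PATH, but only if it is not already listed.
case ":$PATH:" in
    *":$demo/bin:"*) ;;
    *) PATH="$demo/bin:$PATH"; export PATH ;;
esac

# The shell should now resolve the command to the copied file.
command -v demo_tool
```

To make such a change permanent, the PATH assignment would go into your shell startup file.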
Test

...

That should be all


Clustal

This is provided for reference, but you must know that CLUSTAL is an inferior algorithm for protein MSA.

Clustal installation

Outdated (written 2006).


Download
1. navigate to the CLUSTAL homepage at the EBI.
2. at the top of the page, there are icons for Mac and Linux installations (Windows too). Clicking on the Apple folder downloads the latest precompiled Mac OS X version (clustalw1.82.mac-osx.tar.gz as of this writing). Don't do this even if you are on a Mac! (1). Clicking on the folder with the penguin icon takes you to an ftp directory which also contains sources for parallel architecture machines. clustalw1.83.UNIX.tar.gz appears to be the current UNIX version as of this writing. Download this compressed archive.
3. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
gunzip clustalw1.83.UNIX.tar.gz
tar -xvf clustalw1.83.UNIX.tar
cd clustalw1.83
Compile

Type:

make

On my system this compiles with one warning about a redefined symbol, which does not appear to be of any consequence. The executable clustalw is generated. The makefile is not really up to standard since it has no provisions for make test or make install, so we will run our own very simple test and installation. Remove the object files (they are no longer needed after being compiled and linked into the executable). Type

make clean

Since you no longer require the C sources either, and could download them from the server again at any time if you did, you may also type

rm *.c *.h

to clean up the directory.


Test

The directory contains a test input by the name globin.pep. These are globin sequences in the dated PIR format. I have transformed them into Fasta format below. Copy the following sequences and save them in a file by the name globin.mfa.

>HBB_HUMAN
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST
PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH

>HBB_HORSE
VQLSGEEKAAVLALWDKVNEEEVGGEALGRLLVVYPWTQRFFDSFGDLSN
PGAVMGNPKVKAHGKKVLHSFGEGVHHLDNLKGTFAALSELHCDKLHVDP
ENFRLLGNVLVVVLARHFGKDFTPELQASYQKVVAGVANALAHKYH

>HBA_HUMAN
VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKL
LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR

>HBA_HORSE
VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLSH
GSAQVKAHGKKVGDALTLAVGHLDDLPGALSNLSDLHAHKLRVDPVNFKL
LSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR

>MYG_PHYCA
VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLK
TEAEMKASEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIP
IKYLEFISEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELG
YQG

>GLB5_PETMA
PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQE
FFPKFKGLTTADQLKKSADVRWHAERIINAVNDAVASMDDTEKMSMKLRD
LSGKHAKSFQVDPQYFKVLAAVIADTVAAGDAGFEKLMSMICILLRSAY

>LGB2_LUPLU
GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGT
SEVPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSK
GVADAHFPVVKEAILKTIKEVVGAKWSEELNSAWTIAYDELAIVIKKEMN
DAA
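
For the record, the conversion from PIR to Fasta format mentioned above does not have to be done by hand. The sketch below assumes the usual PIR layout: a header line of the form >P1;NAME, one free-text description line, and sequence lines whose last one ends in "*". The function name pir_to_fasta and the description text are invented for this example; the sequence is HBB_HUMAN from above.

```shell
# pir_to_fasta FILE: print a Fasta version of a PIR-formatted file.
pir_to_fasta() {
    awk '
        /^>/ { sub(/^>[A-Z0-9][A-Z0-9];/, ">"); print; skip = 1; next }  # header: drop the two-character type code
        skip { skip = 0; next }                                          # description line: drop it
        { gsub(/[* ]/, ""); if (length($0)) print }                      # sequence: strip terminator and blanks
    ' "$1"
}

# Stand-in input with a single entry (the description line is a placeholder).
pir=$(mktemp)
cat > "$pir" <<'EOF'
>P1;HBB_HUMAN
placeholder description line
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST
PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH*
EOF

pir_to_fasta "$pir"
```

With the file from the distribution this would be pir_to_fasta globin.pep > globin.mfa, provided globin.pep follows the layout assumed here.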

Now type

clustalw -options

to verify that the program runs in principle (this will print a list of the commandline options). Then run the following command:

clustalw -infile=globin.mfa

This should have created the two files globin.aln and globin.dnd with the following contents:

globin.aln
CLUSTAL W (1.83) multiple sequence alignment


HBB_HUMAN       --------VHLTPEEKSAVTALWGKVN--VDEVGGEALGRLLVVYPWTQRFFESFGDLST
HBB_HORSE       --------VQLSGEEKAAVLALWDKVN--EEEVGGEALGRLLVVYPWTQRFFDSFGDLSN
HBA_HUMAN       ---------VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLS-
HBA_HORSE       ---------VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHF-DLS-
GLB5_PETMA      PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
MYG_PHYCA       ---------VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKT
LGB2_LUPLU      --------GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSE
                          *:  :   :   *  .           :  .:   * :   *  :   . 

HBB_HUMAN       PDAVMGNPKVKAHGKKVLGAFSDGLAHLDN-----LKGTFATLSELHCDKLHVDPENFRL
HBB_HORSE       PGAVMGNPKVKAHGKKVLHSFGEGVHHLDN-----LKGTFAALSELHCDKLHVDPENFRL
HBA_HUMAN       ----HGSAQVKGHGKKVADALTNAVAHVDD-----MPNALSALSDLHAHKLRVDPVNFKL
HBA_HORSE       ----HGSAQVKAHGKKVGDALTLAVGHLDD-----LPGALSNLSDLHAHKLRVDPVNFKL
GLB5_PETMA      ADQLKKSADVRWHAERIINAVNDAVASMDDT--EKMSMKLRDLSGKHAKSFQVDPQYFKV
MYG_PHYCA       EAEMKASEDLKKHGVTVLTALGAILKKKGH-----HEAELKPLAQSHATKHKIPIKYLEF
LGB2_LUPLU      VP--QNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKG-VADAHFPV
                      . .:: *.  :   .                  :  *.  *  .  :    : .

HBB_HUMAN       LGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH------
HBB_HORSE       LGNVLVVVLARHFGKDFTPELQASYQKVVAGVANALAHKYH------
HBA_HUMAN       LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR------
HBA_HORSE       LSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR------
GLB5_PETMA      LAAVIADTVAAG---------DAGFEKLMSMICILLRSAY-------
MYG_PHYCA       ISEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG
LGB2_LUPLU      VKEAILKTIKEVVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA---
                :   :  .:            ...       .   :         
globin.dnd
(
(
(
(
HBB_HUMAN:0.08080,
HBB_HORSE:0.08359)
:0.23578,
(
HBA_HUMAN:0.06516,
HBA_HORSE:0.05541)
:0.19444)
:0.07579,
GLB5_PETMA:0.37023)
:0.02699,
MYG_PHYCA:0.37220,
LGB2_LUPLU:0.47094);
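
A quick way to confirm that every input sequence made it into the guide tree is to list the leaf names in the .dnd file. A sketch, assuming a grep that supports the -o option (GNU and BSD grep do); newick_leaves is a made-up helper name, and the stand-in file holds the tree shown above.

```shell
# newick_leaves FILE: print the leaf names of a Newick tree, one per line.
# A leaf is recognised as a run of name characters directly followed by ":".
newick_leaves() {
    tr -d '\n' < "$1" | grep -oE '[A-Za-z0-9_]+:' | tr -d ':'
}

# Stand-in copy of globin.dnd as shown above.
tree=$(mktemp)
cat > "$tree" <<'EOF'
(
(
(
(
HBB_HUMAN:0.08080,
HBB_HORSE:0.08359)
:0.23578,
(
HBA_HUMAN:0.06516,
HBA_HORSE:0.05541)
:0.19444)
:0.07579,
GLB5_PETMA:0.37023)
:0.02699,
MYG_PHYCA:0.37220,
LGB2_LUPLU:0.47094);
EOF

newick_leaves "$tree"
newick_leaves "$tree" | wc -l
```

Branch lengths of internal nodes follow a closing parenthesis rather than a name, so they are not picked up.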

Install

To be able to run clustal from the command line, it needs to be in a directory on your PATH. This can be done either by putting the program into a directory on the path, or by modifying the PATH appropriately. My preferred way to do this is to keep the executables in /usr/local/bin. First I copy the executable to the directory in which I keep my locally installed programs (type echo $PATH if you are not sure that this directory exists and is on your path on your own machine).

sudo cp clustalw /usr/local/bin

Finally I copy the help file into the same directory, so clustalw can find it:

sudo cp clustalw_help /usr/local/bin
That should be all



Notes
(1) The Mac OS X archives contain two compiled binaries and an unintelligible readme.html. The binaries appear to run but lack any help files or documentation. This is useless pseudosupport: the only thing you save yourself is the trivial task of compiling the executables, whereas when you compile from source you at least get the complete kit and everything is nicely in its place.


 

T-Coffee

T-Coffee installation

Outdated (written 2006).
Download
1. navigate to the T-Coffee project homepage. Find the link to the latest Unix or Windows version (3.79 as of this writing).
2. Download this compressed archive.
3. open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):
gunzip T-COFFEE_distribution_Version_3.79.tar.gz
tar -xvf T-COFFEE_distribution_Version_3.79.tar
cd T-COFFEE_distribution_Version_3.79
Compile

T-Coffee provides an installation file for its standard installation. Type

install

On my system this compiles with a spate of warnings about incompatible implicit declaration of a built-in function, but the tests appear to run OK. The executable t_coffee was generated and moved to the directory bin/. Remove the object files (they are no longer needed after being compiled and linked into the executable). Type

cd t_coffee_source
make clean
cd .. 
Install

To be able to run T-Coffee from the command line, it needs to be in a directory on your PATH. This can be done either by putting the program into a directory on the path, or by modifying the PATH appropriately. My preferred way to do this is to keep the executables in /usr/local/bin. First I copy the executable to the directory in which I keep my locally installed programs (type echo $PATH if you are not sure that this directory exists and is on your path on your own machine).

sudo cp bin/t_coffee /usr/local/bin
Test

As a final test, type the following (this assumes you are still in the main directory of the distribution):

t_coffee test/test.pep -in fast_pair -outfile=test.aln -outorder=input>/dev/null
cat test.aln

should produce the following output

CLUSTAL FORMAT for T-COFFEE Version_3.79, CPU=0.16 sec, SCORE=24, Nseq=5, Len=87 

1aboA           ---nlf-valydfvasgdntlsitkgeklrvlg-----------ynhnge--wceaqtkn
1ycsB           --kgvi-yalwdyepqnddelpmkegdcmtiih-----------rededeiewwwarlnd
1pht            ---gyqyralydykkereedidlhlgdiltvnkgslvalgfsdgqearpeeigwlngyne
1ihvA           -nfrvy-y--------rdsrdpvwkgpakllwk-----------gegavv-----iqdns
1vie            drvrkk-s--------gaawqgqivgwyctnlt-----------pegyaves-----eah
                                         *                   :              

1aboA           gq-----------gwvpsnyitpvn--
1ycsB           ke-----------gyvprnllglyp--
1pht            ttgergdfpgtyveyigrkkisp----
1ihvA           di-----------kvvprrkakiird-
1vie            pg-----------svqiypvaalerin
                                           
That should be all
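
The CPU time in the header will differ from machine to machine, so a literal comparison with the output above will not match exactly. It is enough to check the sequence count and the alignment length. A sketch, with a made-up helper name aln_field and the header line copied from above:

```shell
# aln_field KEY HEADER: print the integer value of KEY=value from a T-Coffee header line.
aln_field() {
    printf '%s\n' "$2" | sed -n "s/.*[ ,]$1=\([0-9][0-9]*\).*/\1/p"
}

# Header line as shown above; on a real run use: aln_header=$(head -n 1 test.aln)
aln_header='CLUSTAL FORMAT for T-COFFEE Version_3.79, CPU=0.16 sec, SCORE=24, Nseq=5, Len=87'

echo "sequences: $(aln_field Nseq "$aln_header")"
echo "columns:   $(aln_field Len "$aln_header")"
```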


   

Further reading and resources