Difference between revisions of "BioPerl"

Latest revision as of 13:01, 16 September 2012

BioPerl

The contents of this page have recently been imported from an older version of this Wiki. This page may contain outdated information, information that is irrelevant for this Wiki, information that needs to be differently structured, outdated syntax, and/or broken links. Use with caution!

Summary ...

Introductory reading

Stajich (2007) An Introduction to BioPerl. Methods Mol Biol 406:535-48. (pmid: 18287711)

The BioPerl toolkit provides a library of hundreds of routines for processing sequence, annotation, alignment, and sequence analysis reports. It often serves as a bridge between different computational biology applications assisting the user to construct analysis pipelines. This chapter illustrates how BioPerl facilitates tasks such as writing scripts summarizing information from BLAST reports or extracting key annotation details from a GenBank sequence record.

Installation

Installing BioPerl is reasonably straightforward: all that needs to be done is to ensure that the right files (Perl modules) end up in the directories where a normal Perl installation can access them.

See the notes on the Bioperl Wiki (getting Bioperl) and the installation notes for Unix or for Windows. Note that on Windows you are on your own: we will only cover Unix in the course.

1: download the appropriate release from the Bioperl site (as of this writing this should be at least 1.5.1, but may be a later version) into your home directory or another directory to which you have write access. (For me this might be /home/steipe/downloads). You might want to create a downloads directory to store original downloads for a while if you don't have one.

2: ensure you have the right permissions to work in the libraries where Perl modules are to be found (i.e. directories on Perl's module search path, which Perl stores in the array @INC). It is best if you have root access; otherwise you need a personal installation (see the INSTALL notes). On my system (Mac OS X), Perl itself and its core modules are installed in /System/Library/Perl/, and additional Perl modules that I install from CPAN or elsewhere are installed below the directory /Library/Perl; both these directories are on the Perl path. I'll use the latter directory name for these instructions, but you can of course substitute another if necessary.

3: uncompress and un-archive the distribution

$ gunzip current_core_unstable.tar.gz

$ tar -xvf current_core_unstable.tar

$ rm current_core_unstable.tar

4: this should have created the directory bioperl-1.5.1/. Now prepare the modules. Creating the makefile will tell you which dependencies are not installed and which parts of Bioperl this will affect. Don't worry about this for now; we'll install more things later in the course.

$ cd bioperl-1.5.1

$ perl Makefile.PL

$ make

$ make test

5: the final command should have rattled off a large number of unit tests. Those that can't function because of missing dependencies should have been skipped and the others should mostly have passed. Finally put all modules into their right place on Perl's path. (Note that this command will probably fail if not issued under sudo i.e. with root-user privileges.)

$ sudo make install

6: Now you can navigate back to your home directory and test that the right version of BioPerl is being found and used:

$ cd ~

$ perl -MBio::Perl -le "print Bio::Perl->VERSION;"

If this prints 1.5 instead of complaining about not finding the module, you're all good to go. If you encounter any problems with the installation, please add to the discussion page here!
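The permission check in step 2 and a non-root alternative to step 5 can be sketched in shell. This is only a sketch under assumptions, not the procedure from the INSTALL notes: it assumes that the release's Makefile.PL is a standard ExtUtils::MakeMaker script that accepts INSTALL_BASE, and the directory $HOME/perl5 as well as the helper names dir_status and add_to_perl5lib are hypothetical choices.

```shell
#!/bin/sh
# Sketch: check write access to Perl's module directories and prepare a
# per-user install location. BIOPERL_BASE and both helper names are
# hypothetical choices, not part of BioPerl.
BIOPERL_BASE="${BIOPERL_BASE:-$HOME/perl5}"

# Label a directory as writable, read-only, or missing.
dir_status() {
    if [ ! -d "$1" ]; then
        echo "missing   $1"
    elif [ -w "$1" ]; then
        echo "writable  $1"
    else
        echo "read-only $1"
    fi
}

# Prepend a directory to PERL5LIB unless it is already listed there.
add_to_perl5lib() {
    case ":${PERL5LIB:-}:" in
        *":$1:"*) ;;                                  # already present
        *) PERL5LIB="$1${PERL5LIB:+:$PERL5LIB}" ;;
    esac
    export PERL5LIB
}

# Show every directory on Perl's search path (@INC) with its status.
perl -e 'print "$_\n" for @INC' | while IFS= read -r dir; do
    dir_status "$dir"
done

# With INSTALL_BASE=<dir>, MakeMaker installs modules under <dir>/lib/perl5,
# so that directory has to be on Perl's search path.
add_to_perl5lib "$BIOPERL_BASE/lib/perl5"

# Then, inside the unpacked distribution directory (no sudo needed):
#   perl Makefile.PL INSTALL_BASE="$BIOPERL_BASE"
#   make && make test && make install
```

Setting PERL5LIB this way only affects the current shell session; putting the export into a shell startup file would make it permanent.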

Installing Bundle::BioPerl

A Bundle is a collection of modules that somehow belong together. Bundle::BioPerl contains most (if not all) of the dependencies that a standard installation of BioPerl uses. See the BioPerl Install notes for details but actually all you need to do is type:

sudo perl -MCPAN -e "install Bundle::BioPerl"

Note that this installation will not be entirely successful on the first run, because some dependencies are currently not obtained from CPAN and need to be compiled separately. expat appears to be one of these, and GD won't install before libgd has been installed. If this happens, look at the logs and figure out what appears to be missing, then search for it either via CPAN or via Google. Download and install it, following the Perl installation notes or the Unix installation notes. Then simply run the CPAN installation again. You may have to iterate a few times, but CPAN will keep track of which modules have been successfully installed and skip these.
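Between iterations, a quick check of which dependencies Perl can already load may save some time reading logs. A minimal sketch: the helper name check_modules is hypothetical, and XML::Parser is named here on the assumption that it is the CPAN module that builds against expat (the text above only mentions expat itself).

```shell
#!/bin/sh
# Report which of the given Perl modules can be loaded by the perl on PATH.
# check_modules is a hypothetical helper, not part of BioPerl or CPAN.
check_modules() {
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero if the module cannot be loaded.
        if perl -M"$mod" -e 1 2>/dev/null; then
            echo "ok       $mod"
        else
            echo "MISSING  $mod"
        fi
    done
}

# The two dependencies singled out above (module names are assumptions).
check_modules XML::Parser GD
```

Any module reported as MISSING is a candidate for the manual download-and-compile step before the CPAN installation is run again.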

Installing the run modules

These modules contain installers and interfaces to important tools such as T-Coffee, EMBOSS applications, Phylogeny ...

1: download the latest stable run modules archive via the Bioperl Wiki Download page.

2: download and install minimally the following supported programs:

CLUSTAL
the EMBOSS package
T-Coffee
PHYLIP
we will add local BLAST at a later time ...

3: read and follow the BioPerl run installation notes:

4: test successful installation: (TBC).
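Until a proper test is documented for step 4, one rough check is whether the external programs from step 2 can be found on the PATH. This is a sketch only: the helper name find_programs is hypothetical, and the executable names clustalw, t_coffee, protpars (from PHYLIP) and embossversion (from EMBOSS) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Report where each program is found on PATH, or that it is missing.
# find_programs is a hypothetical helper; it does not test the BioPerl
# run modules themselves, only that the external tools are reachable.
find_programs() {
    for prog in "$@"; do
        if path=$(command -v "$prog" 2>/dev/null); then
            echo "found    $prog ($path)"
        else
            echo "missing  $prog"
        fi
    done
}

# Executable names are assumptions; adjust them to your installation.
find_programs clustalw t_coffee protpars embossversion
```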

BioPerl ext: the "C extension" modules

Download

Navigate to the Bioperl Wiki Download page and download the latest stable ext modules archive (if you haven't done so already).

Open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):

gunzip current_ext_stable.tar.gz
tar -xvf current_ext_stable.tar
cd bioperl-ext-1.4
cat README

As you see in the README file, the ext library currently contains two modules: Bio::Ext::Align and Bio::SeqIO::staden::read. The former is a generally useful tool, but the latter is only needed to read a particular format of sequence trace files. We will skip the latter and install only the alignment modules, by navigating down to their subdirectory and running only the Makefile.PL that is there:

Prepare, test and install

cd Bio/Ext/Align
perl Makefile.PL
make
make test
sudo make install

That should generate a successful sequence-alignment test and then move the modules into their right place on the Perl path. Done.

BioPerl run

Download

Navigate to the Bioperl Wiki Download page and download the latest stable run modules archive (if you haven't done so already).

Open a terminal session, navigate to your download directory and type the usual (remember to use the tab key for filename completion :-):

gunzip current_run_stable.tar.gz
tar -xvf current_run_stable.tar
cd bioperl-run-1.4

Prepare

Since this is a Perl distribution, we type the usual

perl Makefile.PL

On my system this generates the following output:

[...]
External Module Algorithm::Diff, Compute intelligent differences between two files, is not installed on this computer.
The TribeMCL module in bioperl-run needs it for generating consensus protein family descriptions
Warning: There are some external packages and perl modules, listed above, which bioperl-run uses.
This only effects the functionality which is listed above: the rest of bioperl-run will work fine.

Thus I install the missing module(s) from CPAN ...

cd ..
sudo perl -MCPAN -e 'install Algorithm::Diff'

This went through without problems; back to bioperl ...

cd bioperl-run-1.4
perl Makefile.PL

This goes without warnings now, so ...

make

Test

Most of the supported programs have not been installed by us, so there is no point in testing them. But we can list the available tests and execute only those that interest us.

make show_tests
make test_Clustalw

This runs OK. However, the following tests pass only mostly correctly. (The BioPerl community is not overly concerned about distributing code that does not rigorously pass all tests. It seems to me that in most of these cases the problems are with the tests, not with the code.)

make test_ProtPars     # tests the protein parsimony module in the PHYLIP package
make test_TCoffee
make test_EMBOSS

Despite the reported errors, most tests concerning EMBOSS, the PHYLIP programs and T-Coffee should pass.

Install

Finally type:

sudo make install

That should move the modules into their right place on the Perl path and create man files for the installed components. Done.

Programming BioPerl

Please see the following tutorials on the Web:

BioPerl Wiki - Howto:Beginners
The Beginners' instructions in the BioPerl Wiki cover the basic use of Bio::Seq and Bio::SeqIO with first steps of BLAST (however, we haven't installed BLAST yet).

BioPerl Tutorial
The excellent and comprehensive work of many BioPerl authors.

A BioPerl course
A comprehensive course at the Institut Pasteur. Contains structured chapters covering the essential aspects of BioPerl, sample data and example code, as well as references to the BioPerl tutorial.

BioPerl tutorials

Further reading and resources

The BioPerl project Wiki
Introduction to BioPerl (at the project Wiki)

@@ Line 18: / Line 18: @@
 <li><span class="toctext">[[BioPerl exercise restriction]]</span></li>
 <li><span class="toctext">[[BioPerl exercise signal cleavage]]</span></li>
+<li><span class="toctext">[[BioPerl example FASTA|BioPerl example: retrieving sequence data]]</span></li>
 </ul>
 </td></tr></table>
