BioPerl exercise restriction

Title...

The contents of this page has recently been imported from an older version of this Wiki. This page may contain outdated information, information that is irrelevant for this Wiki, information that needs to be differently structured, outdated syntax, and/or broken links. Use with caution!

Summary ...

Introductory reading

Introduction

This is an exercise in BioPerl fundamentals: create a sequence object and do something with it.

Task

Write a small program that finds (and prints) the XbaI restriction site in the following nucleotide sequence:

AGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCG
GATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGAATTTCTAGATAACGAGG
GCAAAAAATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGT

References

Hints

Create a variable that holds the sequence
Create a sequence object using that variable
Create a restriction object with the enzyme you want to use
Use the restriction object's cut_seq() method on the sequence object to create a list of fragments
Print the fragments

Solution(s)

Minimal solution: coding sequence and enzyme name as literals, just exercising the creation and use of sequence and restriction objects:

#!/usr/bin/perl -w

use strict;

use Bio::Seq;
use Bio::Tools::RestrictionEnzyme;

my $enzyme = "XbaI";

my $sequence  = "AGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCG";
   $sequence .= "GATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGAATTTCTAGATAACGAGG";
   $sequence .= "GCAAAAAATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGT";

# Creation of sequence object
my $seq_obj = Bio::Seq->new(-seq => $sequence, -alphabet => 'dna');

# Creation of Restriction Enzyme Object
my $rest_obj = new Bio::Tools::RestrictionEnzyme(-NAME => "XbaI");

# Print fragments
foreach ($rest_obj->cut_seq($seq_obj)) {
      print "$_\n\n";
   }

exit ();

Compact usable solution: read enzyme name from commandline, pass sequence via STDIN, a bit of sanity handling of input sequence

#!/usr/bin/perl -w
# reads a restriction enzyme from commandline and a sequence from
# STDIN. Sequence can be raw nucleotides or fasta
# usage example  restrict.pl XbaI < test.fa
use strict;

use Bio::Seq;
use Bio::Tools::RestrictionEnzyme;

my $enzyme = shift(@ARGV);

my $sequence = "";
# read the sequence
while (my $string = <STDIN>) {
    if ($string !~ m/^>/) {  # if the string does not begin with a fasta header character ...
        chomp($string);
        $string =~ s/\s+|[^A-Za-z]//g;   # delete all whitespace characters
                                         # or all characters not in the range A-Z,a-z
        $sequence .= $string;            # concatenate
    }
}

# Creation of sequence object
my $seq_obj = Bio::Seq->new(-seq => $sequence, -alphabet => 'dna');

# Creation of Restriction Enzyme Object
my $rest_obj = new Bio::Tools::RestrictionEnzyme(-NAME => $enzyme);

# Get a list of fragments
my @fragments = $rest_obj->cut_seq($seq_obj);

# Outputs enzyme name, restriction site and the fragments generated from cleavage
print "Enzyme: $enzyme \n";
print "Recognition Site: ", $rest_obj->site, "\n\n";
print "List of Fragments: \n";

foreach my $temp (@fragments) {
      print "$temp\n\n";
   }

exit ();

Interactive solution: asks for enzyme and filename, reads sequence from file using Bio::SeqIO (contributed by Jamie)

#!/usr/bin/perl -w
use strict;

use Bio::Seq;
use Bio::SeqIO;
use Bio::Tools::RestrictionEnzyme;

# Introduction and obtaining filename containing sequence/ restriction site
print "Welcome to the Restriction Analysis Tool.\n";
print "Enter Sequence Filename: ";
my $sequencefile = <STDIN>;
chomp($sequencefile);

print "Enter Query Enzyme: ";
my $enz = <STDIN>;
chomp($enz);

# Creation of input object and opening of sequence file
my $seqio_obj = Bio::SeqIO->new(-file => $sequencefile, -format => "fasta");
my $seq_obj = $seqio_obj->next_seq;

# Printing out (verifying) input sequence
print "\nInput Sequence:\n";
print $seq_obj->seq;
print "\n\n";

# Create a new Restriction Enzyme Object
my $rest = new Bio::Tools::RestrictionEnzyme(-NAME => $enz);

# Get a list of fragments resulting when a sequence is cut by the indicated enzyme
my @fragments = $rest->cut_seq($seq_obj);

# Outputs enzyme name, restriction site and the fragments generated from cleavage
print "Enzyme: $enz \n";
print "Recognition Site: ", $rest->site, "\n\n";
print "List of Fragments: \n";

foreach my $temp (@fragments) {
      print "$temp\n\n";
   }

exit ();

BioPerl exercise restriction

Contents

Introductory reading

Introduction

Task

References

Hints

Solution(s)

Further reading and resources

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Sections

Tools