BioPerl exercise restriction
Jump to navigation
Jump to search
BioPerl exercise: restriction site
The contents of this page has recently been imported from an older version of this Wiki. This page may contain outdated information, information that is irrelevant for this Wiki, information that needs to be differently structured, outdated syntax, and/or broken links. Use with caution!
Summary ...
Contents
Introductory reading
Introduction
This is an exercise in BioPerl fundamentals: create a sequence object and do something with it.
Task
Write a small program that finds (and prints) the XbaI restriction site in the following nucleotide sequence:
AGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCG GATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGAATTTCTAGATAACGAGG GCAAAAAATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGT
References
Hints
- Create a variable that holds the sequence
- Create a sequence object using that variable
- Create a restriction object with the enzyme you want to use
- Use the restriction object's cut_seq() method on the sequence object to create a list of fragments
- Print the fragments
Solution(s)
Minimal solution: coding sequence and enzyme name as literals, just exercising the creation and use of sequence and restriction objects:
#!/usr/bin/perl -w use strict; use Bio::Seq; use Bio::Tools::RestrictionEnzyme; my $enzyme = "XbaI"; my $sequence = "AGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCG"; $sequence .= "GATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGAATTTCTAGATAACGAGG"; $sequence .= "GCAAAAAATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGT"; # Creation of sequence object my $seq_obj = Bio::Seq->new(-seq => $sequence, -alphabet => 'dna'); # Creation of Restriction Enzyme Object my $rest_obj = new Bio::Tools::RestrictionEnzyme(-NAME => "XbaI"); # Print fragments foreach ($rest_obj->cut_seq($seq_obj)) { print "$_\n\n"; } exit ();
Compact usable solution: read enzyme name from commandline, pass sequence via STDIN, a bit of sanity handling of input sequence
#!/usr/bin/perl -w # reads a restriction enzyme from commandline and a sequence from # STDIN. Sequence can be raw nucleotides or fasta # usage example restrict.pl XbaI < test.fa use strict; use Bio::Seq; use Bio::Tools::RestrictionEnzyme; my $enzyme = shift(@ARGV); my $sequence = ""; # read the sequence while (my $string = <STDIN>) { if ($string !~ m/^>/) { # if the string does not begin with a fasta header character ... chomp($string); $string =~ s/\s+|[^A-Za-z]//g; # delete all whitespace characters # or all characters not in the range A-Z,a-z $sequence .= $string; # concatenate } } # Creation of sequence object my $seq_obj = Bio::Seq->new(-seq => $sequence, -alphabet => 'dna'); # Creation of Restriction Enzyme Object my $rest_obj = new Bio::Tools::RestrictionEnzyme(-NAME => $enzyme); # Get a list of fragments my @fragments = $rest_obj->cut_seq($seq_obj); # Outputs enzyme name, restriction site and the fragments generated from cleavage print "Enzyme: $enzyme \n"; print "Recognition Site: ", $rest_obj->site, "\n\n"; print "List of Fragments: \n"; foreach my $temp (@fragments) { print "$temp\n\n"; } exit ();
Interactive solution: asks for enzyme and filename, reads sequence from file using Bio::SeqIO (contributed by Jamie)
#!/usr/bin/perl -w use strict; use Bio::Seq; use Bio::SeqIO; use Bio::Tools::RestrictionEnzyme; # Introduction and obtaining filename containing sequence/ restriction site print "Welcome to the Restriction Analysis Tool.\n"; print "Enter Sequence Filename: "; my $sequencefile = <STDIN>; chomp($sequencefile); print "Enter Query Enzyme: "; my $enz = <STDIN>; chomp($enz); # Creation of input object and opening of sequence file my $seqio_obj = Bio::SeqIO->new(-file => $sequencefile, -format => "fasta"); my $seq_obj = $seqio_obj->next_seq; # Printing out (verifying) input sequence print "\nInput Sequence:\n"; print $seq_obj->seq; print "\n\n"; # Create a new Restriction Enzyme Object my $rest = new Bio::Tools::RestrictionEnzyme(-NAME => $enz); # Get a list of fragments resulting when a sequence is cut by the indicated enzyme my @fragments = $rest->cut_seq($seq_obj); # Outputs enzyme name, restriction site and the fragments generated from cleavage print "Enzyme: $enz \n"; print "Recognition Site: ", $rest->site, "\n\n"; print "List of Fragments: \n"; foreach my $temp (@fragments) { print "$temp\n\n"; } exit ();
Further reading and resources