Perl basic programming
Perl: basic programming examples
The contents of this page have recently been imported from an older version of this Wiki. This page may contain outdated information, information that is irrelevant for this Wiki, information that needs to be structured differently, outdated syntax, and/or broken links. Use with caution!
Simple examples of Perl code.
Parts of this code were originally contributed by Sohrab Shah, Sanja Rogic, Wil Hsiao and others (let me know if you happen to read this and should be listed here). These parts were taken from the Canadian Bioinformatics Workshop bioinformatics course, where they were made available under a Creative Commons license.
Contents
- 1 Exercise 1 - First print statement
- 2 Exercise 2 - Numerical variables and operators
- 3 Exercise 3 - String variables and operators
- 4 Exercise 4 - Working with user input
- 5 Exercise 5 - Arrays
- 6 Exercise 6 - While and if statements
- 7 Exercise 7 - For and foreach loops
- 8 Exercise 8 - Regular expressions
- 9 Exercise 9 - Subroutines
- 10 Exercise 10 - Input and output files
- 11 Exercise 11 - System calls
- 12 Exercise 12 - Putting it all together
- 13 Further reading and resources
Exercise 1 - First print statement
#!/usr/bin/perl
use strict;
use warnings;
print "My first Perl program\n"; #also try this with single quotes
print "First line\nsecond line and there is a tab\there\n";
exit();
Notes:
- I always use strict; and use warnings;, even on the shortest programs. Mighty warts from tiny programs grow.
- I always end a program with exit(); even though it is not necessary. Why? It immediately tells me where the program ends and that I have copied it completely from wherever I got it.
Exercise 2 - Numerical variables and operators
#!/usr/bin/perl
use strict;
use warnings;
#assign values to variables $x and $y and print them out
my $x = 4;
my $y = 5.7;
print "x is $x and y is $y\n";
#example of arithmetic expression
my $z = $x + $y**2;
$x++;
print "x is $x and z is $z\n";
#evaluating arithmetic expression within print command
print "add 3 to $z: $z + 3\n"; #did it work?
print "add 3 to $z:", $z + 3,"\n";
exit();
Notes:
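- within "double-quoted strings", variables are interpolated, but expressions are not evaluated: $z + 3 prints the value of $z followed by the literal text " + 3".
- within 'single-quoted strings', variables are not interpolated either, and escape sequences such as \n are printed literally.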
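The two notes can be checked with a short sketch. The variable names ($double, $single, $computed) and the value 10 are chosen only for this illustration; the last assignment shows one common way to get the computed value into the output, namely evaluating the expression outside the string and concatenating the result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $z = 10;

#double quotes: $z is interpolated, but "+ 3" stays literal text
my $double = "add 3 to $z: $z + 3";

#single quotes: nothing is interpolated, $z stays literal text
my $single = 'add 3 to $z: $z + 3';

#evaluate the expression outside the string and concatenate the result
my $computed = "add 3 to $z: " . ($z + 3);

print "$double\n";     #add 3 to 10: 10 + 3
print "$single\n";     #add 3 to $z: $z + 3
print "$computed\n";   #add 3 to 10: 13
```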
Exercise 3 - String variables and operators
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Concatenate two given sequences,
#find the length of the new sequence and
#print out the second codon of the sequence
#assign strings to variables
my $DNA = "GATTACACAT";
my $polyA = "AAAA";
#concatenate two strings
my $modifiedDNA = $DNA . $polyA;
#calculate the length of $modifiedDNA and
#print out the value of the variable and its length
my $DNAlength = length($modifiedDNA);
print "Modified DNA: $modifiedDNA has length $DNAlength\n";
#extract the second codon in $modifiedDNA
#(substr counts from 0, so the second codon starts at offset 3)
my $codon = substr($modifiedDNA,3,3);
print "Second codon is $codon\n";
exit();
Exercise 4 - Working with user input
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Ask the user for her name and age and
#calculate her age in days
#get a string from the keyboard
print "Please enter your name\n";
my $name = <STDIN>;
chomp($name); #getting rid of the new line character
#prompt the user for his/her age
#get a number from the keyboard
print "$name please enter your age\n";
my $age = <STDIN>;
chomp($age);
#calculate age in days
my $age_in_days = $age*365;
print "You are approximately $age_in_days days old\n";
exit();
Exercise 5 - Arrays
#!/usr/bin/perl
use strict;
use warnings;
#initialize an array
my @bases = ("A","C","G","T");
#print two elements of the array
print $bases[0],$bases[2],"\n";
#print the whole array
print @bases,"\n"; #try with double quotes
#print the number of elements in the array
print scalar(@bases),"\n";
exit();
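The comment "try with double quotes" hints at a difference that is easy to miss. The sketch below compares the two forms; the variable names $as_list and $in_quotes are made up for the comparison, and join("", @bases) stands in for what print @bases produces with Perl's default settings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @bases = ("A","C","G","T");

#outside quotes, print receives a list and prints the elements back to back
my $as_list = join("", @bases);

#inside double quotes, the elements are joined with a space (the default value of $")
my $in_quotes = "@bases";

print "$as_list\n";     #ACGT
print "$in_quotes\n";   #A C G T

#a single element interpolates like a scalar
print "first base: $bases[0]\n";   #first base: A
```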
Exercise 6 - While and if statements
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Count the frequency of base G in a given DNA sequence
my $DNA = "GATTACACAT";
#initialize $countG and $currentPos
my $countG = 0;
my $currentPos = 0;
my $base;
#calculate the length of $DNA
my $DNAlength = length($DNA);
#for each letter in the sequence check if it is the base G
#if 'yes' increment $countG
while($currentPos < $DNAlength){
    $base = substr($DNA,$currentPos,1);
    if($base eq "G"){
        $countG++;
    }
    $currentPos++;
} #end of while loop
#print out the number of Gs
print "There are $countG G bases\n";
exit();
Exercise 7 - For and foreach loops
#!/usr/bin/perl
use strict;
use warnings;
my @array;
#initialize a 20-element array with numbers 0,...19
for (my $i=0;$i<20;$i++){
    $array[$i] = $i;
}
#print elements one-by-one using foreach
my $element;
foreach $element (@array){
    print "$element\n";
}
exit();
Notes:
- a more Perl-ish way to write the first for loop would be the following (although personally I prefer the first version, often called C-style, as being more explicit).
my $i;
for $i (0..19){
    $array[$i] = $i;
}
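Since the loop only copies the numbers 0 to 19 into the array, the range operator can also fill the array in a single assignment. The sketch below builds the array both ways so the results can be compared; the name @from_loop is introduced only for this comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

#assign the whole range at once
my @array = (0..19);

#the same array built with the range-based loop
my @from_loop;
for my $i (0..19){
    $from_loop[$i] = $i;
}

print scalar(@array), " elements, last one is $array[-1]\n";   #20 elements, last one is 19
```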
Exercise 8 - Regular expressions
#!/usr/bin/perl
use strict;
use warnings;
#TASK: For a given DNA sequence find its RNA transcript,
#find its reverse complement and check if
#the reverse complement contains a start codon
my $DNA = "GATTACACAT";
#transcribe DNA to RNA - T changes to U
my $RNA = $DNA;
$RNA =~ s/T/U/g;
print "RNA sequence is $RNA\n";
#find the reverse complement of $DNA using substitution operator
#first - reverse the sequence
my $rcDNA = reverse($DNA);
$rcDNA =~ s/T/A/g;
$rcDNA =~ s/A/T/g;
$rcDNA =~ s/G/C/g;
$rcDNA =~ s/C/G/g;
print "Reverse complement of $DNA is $rcDNA\n"; #did it work?
#find the reverse complement of $DNA using translation operator
#first - reverse the sequence
$rcDNA = reverse($DNA);
$rcDNA =~ tr/ACGT/TGCA/;
print "Reverse complement of $DNA is $rcDNA\n";
#look for a start codon in the reverse complement
if($rcDNA =~ /ATG/){
    print "Start codon found\n";
}
else{
    print "Start codon not found\n";
}
exit();
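The answer to "did it work?" is no. Each substitution runs over the output of the previous one, so bases that have already been complemented are changed back again. The sketch below puts the two approaches side by side; the helper names complement_sequential and complement_tr are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

#hypothetical helper: complement with four substitutions in a row
sub complement_sequential{
    my($seq) = @_;
    $seq =~ s/T/A/g;   #every T becomes A ...
    $seq =~ s/A/T/g;   #... and is turned back into T here, together with the original As
    $seq =~ s/G/C/g;   #the same happens to G and C
    $seq =~ s/C/G/g;
    return $seq;
}

#hypothetical helper: complement with a single translation pass
sub complement_tr{
    my($seq) = @_;
    $seq =~ tr/ACGT/TGCA/;   #each character is translated exactly once
    return $seq;
}

my $reversed = reverse("GATTACACAT");           #TACACATTAG
print complement_sequential($reversed), "\n";   #TTGTGTTTTG: only T and G are left
print complement_tr($reversed), "\n";           #ATGTGTAATC
```

The translation operator avoids the problem because it maps every character in one pass instead of rescanning the string once per base.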
Exercise 9 - Subroutines
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Make a subroutine that calculates the reverse
#complement of a DNA sequence and call it from the main program
#body of the main program with the function call
my $DNA = "GATTACACAT";
my $rcDNA = revcomp($DNA);
print "$rcDNA\n";
exit();
#definition of the function for reverse complement
sub revcomp{
    my($DNAin) = @_;
    #scalar assignment, so reverse() reverses the string;
    #with my($DNAout) it would reverse the one-element argument list instead
    my $DNAout = reverse($DNAin);
    $DNAout =~ tr/ACGT/TGCA/;
    return $DNAout;
}
Notes:
- Parameters are passed into a subroutine via the special array @_. Accordingly, parameters must be assigned in a so-called list context: note the parentheses around $DNAin! If you omit the parentheses, $DNAin is assigned in scalar context, i.e. it is assigned the number of elements in the array, which is 1 here (one string element). This is a very common newcomer's mistake.
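A minimal sketch of the difference follows. The subroutine names first_argument and argument_count are invented for this example. The last lines show that context also changes what the built-in reverse does, which matters when writing a reverse-complement routine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

#list context: $DNAin receives the first element of @_
sub first_argument{
    my($DNAin) = @_;
    return $DNAin;
}

#scalar context: $DNAin receives the number of elements in @_
sub argument_count{
    my $DNAin = @_;
    return $DNAin;
}

print first_argument("GATTACACAT"), "\n";   #GATTACACAT
print argument_count("GATTACACAT"), "\n";   #1

#context also changes the behaviour of reverse
my $scalar_result = reverse("GAT");   #scalar context: reverses the string, giving TAG
my($list_result) = reverse("GAT");    #list context: reverses the one-element list, giving GAT
print "$scalar_result $list_result\n";      #TAG GAT
```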
Exercise 10 - Input and output files
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Read DNA sequences from 'DNAseq' input file -
#there is one sequence per line
#For each sequence find the reverse complement and
#print it to 'DNAseqRC' output file
#open input and output files, stopping with a message if that fails
open(IN,"DNAseq") or die "Cannot open DNAseq: $!";
open(OUT,">DNAseqRC") or die "Cannot open DNAseqRC: $!";
#read the input file line-by-line
#for each line find the reverse complement
#print it in the output file
my $rcDNA;
while(<IN>){
    chomp;
    $rcDNA = revcomp($_);
    print OUT "$rcDNA\n";
}
#close input and output files
close(IN);
close(OUT);
exit();
#definition of the function for reverse complement
sub revcomp{
    my($DNAin) = @_;
    #scalar assignment, so reverse() reverses the string
    my $DNAout = reverse($DNAin);
    $DNAout =~ tr/ACGT/TGCA/;
    return $DNAout;
}
Exercise 11 - System calls
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Print a list of all Perl programs you wrote today.
#These files can be found in your current directory and
#they end with the file extension '.pl'
print "List of programs I wrote today:\n";
#system call for 'ls' function - the result goes into a string
my $listing = `ls`; #these are back quotes
#split the string to get individual files
my @files = split(/\n/,$listing);
#use foreach to step through the array
#if a file name ends with '.pl' print it out
my $file;
foreach $file (@files){
    if($file =~ /\.pl$/){ #regular expression: the '.' is escaped "\."
        print "$file\n";
    }
}
exit();
Exercise 12 - Putting it all together
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Find the reverse complement of a gene, its GC content
#and the GC content of its reverse complement.
#The gene is stored in a DNA.fasta file.
#body of the main program
#open input file, stopping with a message if that fails
open(IN,"DNA.fasta") or die "Cannot open DNA.fasta: $!";
#get a gene name (the first line of the fasta file)
my $name = <IN>;
chomp($name);
#concatenate all lines from fasta file in one string
my $DNA = "";
while(<IN>){ #input goes into $_
    chomp;
    $DNA = $DNA . $_;
}
close(IN);
#call functions to get the reverse complement and GC content
my $DNA_gc = gc_content($DNA);
my $DNArc = revcomp($DNA);
my $DNArc_gc = gc_content($DNArc);
#print out the results
print "$name has GC content: $DNA_gc\n";
print "reverse complement of $name has GC content: $DNArc_gc\n";
exit();
#definition of the function for reverse complement
sub revcomp{
    my($DNAin) = @_;
    my $DNAout = reverse($DNAin);
    $DNAout =~ tr/ACGT/TGCA/;
    return ($DNAout);
}
#definition of GC content function
sub gc_content{
    my($DNAin) = @_;
    my $count = 0;
    my $DNAlength = length($DNAin);
    #explode the DNA string passed to the function into an array
    my @bases = split(//,$DNAin);
    #step through the array and count the occurrences of G and C
    for (my $i=0;$i<$DNAlength;$i++){
        if ($bases[$i] =~ /[GC]/){
            $count++;
        }
    }
    #return the fraction of GC bases (a value between 0 and 1)
    return ($count/$DNAlength);
}
Further reading and resources