Here are programming exercises that focus on translating a concept into a working script. | Here are programming exercises that focus on translating a concept into a working script. | ||
− | [[ | + | [[Perl_programming_exercises_1| The preceding section]] covers what I have called syntax examples: simple tasks that ask you to write functioning, syntactically correct code. |
Line 32: | Line 32: | ||
==Tasks== | ==Tasks== | ||
− | === | + | ===Hello World=== |
− | |||
− | |||
| | ||
;Executable program | ;Executable program | ||
− | Write a Perl program <code>helloWorld.pl</code> that prints out "Hello World" (or whatever you fancy) to the terminal. Make your program executable (<code>chmod u+x helloWorld.pl</code>) so that you don't need to invoke the Perl interpreter explicitly from the command line (i.e. just "<code>$ helloWorld.pl</code>" should run it, you shouldn't need to type "<code>$ perl helloWorld.pl</code>"). | + | Write a Perl program <code>helloWorld.pl</code> that prints out "Hello World" (or whatever you fancy) to the terminal. Make your program executable (<code>chmod u+x helloWorld.pl</code>) so that you don't need to invoke the Perl interpreter explicitly from the command line (i.e. just "<code>$ ./helloWorld.pl</code>" should run it, you shouldn't need to type "<code>$ perl helloWorld.pl</code>"). |
− | |||
− | |||
− | |||
| | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
− | |||
Line 52: | Line 48: | ||
Simply use the print(); function. | Simply use the print(); function. | ||
− | |||
| | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
<div class="mw-collapsible-content"> | <div class="mw-collapsible-content"> | ||
+ | |||
| | ||
Line 68: | Line 66: | ||
</source> | </source> | ||
− | |||
− | |||
| | ||
Line 76: | Line 72: | ||
</div> | </div> | ||
</div> | </div> | ||
+ | |||
| | ||
− | === | + | ===cat()=== |
− | |||
− | |||
− | |||
− | |||
− | |||
+ | | ||
+ | ;Keyboard input | ||
+ | Write a Perl program cat.pl that prints to the terminal a single line that you type at the keyboard. | ||
− | ==== | + | |
− | + | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | |
− | + | Hints: | |
+ | <div class="mw-collapsible-content"> | ||
− | + | | |
− | ; | + | Use the diamond operator to read from STDIN, assign this to a variable, then print the contents of the variable. Just one statement, no loop is required. |
− | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
− | |||
− | |||
+ | | ||
+ | <source lang="perl"> | ||
+ | #!/usr/bin/perl | ||
+ | use warnings; | ||
+ | use strict; | ||
− | + | my $line; | |
− | ; | ||
− | |||
+ | $line = <STDIN>; | ||
− | + | print($line);   # $line still ends with the newline typed at the keyboard | |
− | ; | ||
− | |||
+ | exit(); | ||
+ | </source> | ||
− | |||
− | |||
− | |||
− | + | | |
− | + | </div> | |
− | + | </div> | |
− | + | </div> | |
+ | </div> | ||
− | ==== | + | |
− | + | ===lc()=== | |
− | |||
− | |||
− | |||
− | + | | |
+ | ;More keyboard input | ||
+ | Write a Perl program lc.pl that reads one or many lines from STDIN, converts them to lowercase and prints them to the terminal. Use this interactively, typing input (end by typing <ctrl>D), then use this by redirecting a textfile to your program, then "pipe" the output of the Unix "ls" command into your program. | ||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
− | |||
+ | | ||
+ | Use a while loop with the assignment of <STDIN> to a variable as its loop condition. This way the loop runs until STDIN reaches EOF (End of File). Use the Perl lc(); function to change case. Assign the return value to a variable and print it. | ||
− | ==== | + | |
− | + | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | |
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | + | | |
− | |||
<source lang="perl"> | <source lang="perl"> | ||
− | + | #!/usr/bin/perl | |
+ | use warnings; | ||
+ | use strict; | ||
+ | |||
+ | while (my $line = <STDIN>) { | ||
+ | $line = lc($line); | ||
+ | print($line);   # $line keeps its own trailing newline | ||
+ | } | ||
+ | exit(); | ||
</source> | </source> | ||
− | |||
− | + | | |
− | + | </div> | |
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
− | |||
− | |||
− | |||
− | |||
− | + | | |
+ | ===max=== | ||
− | |||
− | |||
− | |||
− | |||
− | + | | |
− | + | ;Condition | |
+ | Write a Perl program max.pl that prompts for and reads two numbers from STDIN, and outputs the larger of the two numbers to the terminal. Remember to consider the case that the numbers may be equal. | ||
− | + | | |
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
+ | | ||
+ | You need an | ||
<source lang="perl"> | <source lang="perl"> | ||
− | + | if (condition) { do ... } | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
</source> | </source> | ||
+ | construction to print one number or the other, depending on the result of the comparison. Remember the difference between numeric and alphanumeric comparisons! You should chomp(); your input variables to strip the trailing newline before you compare them as numbers and print them. | ||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
− | = | + | |
− | + | <source lang="perl"> | |
+ | #!/usr/bin/perl | ||
+ | use warnings; | ||
+ | use strict; | ||
− | + | print("Enter a number: "); # User inputs | |
+ | my $input1 = <STDIN>; | ||
+ | print("Enter a second number: "); | ||
+ | my $input2 = <STDIN>; | ||
− | + | chomp($input1); # Chomp off trailing newline characters | |
− | + | chomp($input2); | |
+ | if ($input1 > $input2) { | ||
+ | print("$input1 is larger than $input2.\n"); | ||
+ | } elsif ($input1 == $input2) { | ||
+ | print ("$input1 and $input2 are equal.\n"); | ||
+ | } else { | ||
+ | print ("$input2 is larger than $input1.\n"); | ||
+ | } | ||
− | + | exit(); | |
− | + | </source> | |
− | + | | |
− | + | </div> | |
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
− | === | + | |
− | + | ===max (with subroutine)=== | |
− | |||
− | + | | |
− | + | ;Subroutine | |
+ | Rewrite max.pl so the comparison is done in a subroutine: pass the two numbers as arguments into a subroutine and return the larger of the two. Such a program may be a useful framework for comparing two datasets with a non-trivial metric. Instead of simply picking the larger value, the subroutine could compare according to some sophisticated algorithm. | ||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
− | |||
− | |||
− | + | | |
− | + | Remember that Perl uses the default array "@_" to pass values into subroutines. You need to assign the contents of @_ to variables (or other arrays) in order to be able to use the values. The easiest way to do this is to assign the array to a list of variables, e.g. | |
− | |||
− | |||
− | |||
− | |||
− | |||
<source lang="perl"> | <source lang="perl"> | ||
− | + | my ($a) = @_;       # or ... | |
− | + | my ($a, $b) = @_; | |
− | |||
− | |||
− | my $ | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
</source> | </source> | ||
− | + | '''Note that the following will not work as expected !''' | |
− | |||
− | |||
− | |||
<source lang="perl"> | <source lang="perl"> | ||
− | + | my $a = @_; | |
− | + | </source> | |
− | |||
− | + | If you did this, you would be assigning an array "@" to a scalar "$". The problem is that this is legal: the compiler does not complain or warn. However, it does not assign the first value in the array; it assigns the number of elements the array holds! This is a fine case of a statement being syntactically correct but logically wrong. If in doubt whether you are doing the right thing, always print your values from within the subroutine, as a development test, to make sure they are what you expect them to be. | |
− | |||
− | |||
− | |||
− | |||
− | |||
+ | <!-- | ||
+ | // note to self: write examples explicitly! | ||
+ | You can return single scalars from subroutines, or you can return multiple values, as lists, as in the following example: | ||
− | + | subroutine returns... main program assigns ... | |
<source lang="perl"> | <source lang="perl"> | ||
− | + | return($a); | |
− | + | my $larger = max($in1, $in2); | |
− | |||
− | |||
− | |||
− | + | return($a, $b); | |
− | my $ | + | |
+ | my ($larger, $smaller) = max($in1, $in2); | ||
+ | </source> | ||
− | |||
− | |||
− | + | --> | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | + | | |
− | < | + | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> |
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | + | | |
− | |||
− | |||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 331: | Line 327: | ||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===anagram=== | ||
− | |||
+ | | ||
+ | ;Array | ||
+ | Write a Perl program anagram.pl that reads a string from STDIN and returns ten random permutations of this string. This will require a number of concepts and techniques of working with arrays - defining an array, assigning values to an array, or to individual fields of an array, using a variable as an index to an array in order to read from or write to specific fields, and more. First split your string into individual elements of an array. Use a subroutine that randomizes this array by looping over every position of the array, and swapping the contents of this position with a randomly chosen other position of the array, except itself. Write this in pseudocode first. The Perl functions you will need are split(); and rand();. | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
+ | You have to chomp(); your input in order not to shuffle the newline character (“return” character) into your randomized strings; otherwise you’ll end up with strangely shortened versions of your randomized string, split into two parts. To get the array size, use the index of the last array position plus one (remember that array positions are numbered starting at 0, not 1 ! ). To split a string into individual elements of an array, use split(//, $input); with no delimiter, i.e. with no other characters in between the slashes, not even a space. Assigning the result of split(); to an array puts every character of the string into its own array field. When randomizing the array, note that rand(); returns a random rational number, not an integer, so you may need to use int(); to truncate the result of rand(); and just return the integer part. Use variables to store values from the array before the swap, otherwise the original value stored in a given array position will be lost before it can be copied over to the new array position that you want to swap it to. Also note that all array positions should be switched, so you need to consider the case that your random integer is the same as the position of the original value. | ||
+ | |||
+ | When you are done, see what happens when you comment out the chomp(); function, for effect. | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 365: | Line 391: | ||
exit(); | exit(); | ||
− | |||
# ===== randomize () ============================= | # ===== randomize () ============================= | ||
Line 426: | Line 451: | ||
deserves some comment. $j starts as the index of the last element in the array. rand(n) returns a random floating-point number from the half-open interval [0, n), i.e. 0 ≤ number < n. Assume our array had four elements: rand(3+1) would return numbers from 0.000... to 3.999... Since int() does not round the number, but just truncates its decimals and returns its integer part, we get random integers from 0 to 3, each with uniform probability. That happens to be exactly the range of indices that can be used to randomly point somewhere into our array. | ||
− | ==== | + | |
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===anastring === | ||
+ | |||
+ | |||
+ | |||
+ | | ||
+ | ;Substring calisthenics | ||
+ | Copy anagram.pl to a file named anastring.pl and use the Perl length(); and substr(); functions to permute the string (in place!) instead of shuffling fields of an array. Make a point of programming this incrementally, step by step, writing output as you go along to make sure you are doing it right. Of course you could also shuffle using the split() and join() functions on a string ... but that would not be "in place". | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | | ||
+ | This is similar to anagram.pl but uses substr(); on the original string instead of shuffling fields of an array. Remember that substr(); can be used to extract defined substrings as well as to replace them. As with anagram.pl, use variables to store the characters that you want to swap, to prevent the original character from being lost when you overwrite one of the two positions in the string. Use int(); on the result of rand(); to get a random position in the string and think carefully about the range of numbers that this should produce. The range is obviously a function of the string-length - but does it start at 0 or 1 and does it extend to the length itself, or more or less ? Test whether the range you produce is correct. | ||
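As a small sketch of the substr(); technique described above (not the reference solution), the following swaps two characters of a string. The sub name swapChars and the sample word are made up for illustration; the four-argument form of substr(); replaces the selected substring.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: swaps the characters at positions $i and $j.
sub swapChars {
    my ($string, $i, $j) = @_;
    my $charI = substr($string, $i, 1);   # extract one character
    my $charJ = substr($string, $j, 1);   # keep both before overwriting
    substr($string, $i, 1, $charJ);       # four-argument form replaces
    substr($string, $j, 1, $charI);
    return($string);
}

my $word    = "abcd";
my $last    = length($word) - 1;          # valid positions are 0 .. length - 1
my $swapped = swapChars($word, 0, $last);
print("$swapped\n");
```

Both characters are stored in variables before either position is overwritten, which is the point made in the hint.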
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 460: | Line 515: | ||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===anastring (with commandline input)=== | ||
− | |||
+ | |||
+ | | ||
+ | ;Commandline arguments | ||
+ | Modify anastring.pl so you can pass the number of permutations to the program in the commandline ( via @ARGV ), make sure that the default is 1, if no argument is given. This tool could be part of a routine to generate random data to test statistical significance. | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
+ | The command-line arguments that you give to a Perl program are stored in the array variable named @ARGV. $ARGV[0] is the first argument, $ARGV[1] is the second, and so on. To check whether some variable is defined, use the function defined($someVariable); in an if statement. If no command line argument has been typed, $ARGV[0] will be undefined. | ||
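One possible way to express the default value, as a sketch only: the sub name permutationCount is an assumption made for this example, and the argument list is passed in explicitly so the logic can be tried with and without a value.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: returns the first argument, or 1 if none was given.
sub permutationCount {
    my (@args) = @_;
    if (defined($args[0])) {
        return($args[0]);
    }
    return(1);                 # default when no argument was typed
}

my $count = permutationCount(@ARGV);
print("Number of permutations: $count\n");
```

Run without arguments this should report 1; run as, say, `./anastring.pl 5` it should report 5.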
+ | |||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
Simply change: | Simply change: | ||
Line 486: | Line 571: | ||
− | + | | |
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===sort=== | ||
+ | |||
+ | |||
+ | |||
+ | | ||
+ | ;Sorting | ||
+ | Write a Perl program sort.pl that takes in strings (e.g. names) from STDIN, stores them in an array, sorts them in alphabetical order, using the Perl sort(); function and prints them out to the terminal. | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
+ | Declare a variable to use as an array index and initialize it with the value 0. Assign the entire input string to the current array position $array[$index], then increment the index variable so it points to the next available field. (The field of an array can hold integers, floats, strings, other arrays, hashes, references to arrays, ...) Sort the array using the Perl sort(); function. | ||
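The indexing scheme from the hint might look like the following sketch. A fixed list of names stands in for the lines read from STDIN, so the names and variable names here are illustrative assumptions only.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my @names;
my $index = 0;

# A fixed list stands in for lines read from STDIN in this sketch.
foreach my $line ("Watson\n", "Crick\n", "Franklin\n") {
    my $name = $line;          # copy, so that chomp() does not touch the list
    chomp($name);
    $names[$index] = $name;    # store at the current array position
    $index++;                  # point to the next available field
}

my @sorted = sort(@names);     # default sort() compares as strings
print(join("\n", @sorted), "\n");
```

Note that the default sort(); is case-sensitive, so uppercase letters sort before lowercase ones.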
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 514: | Line 630: | ||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===fastaParser=== | ||
− | |||
+ | | ||
+ | ;Hash (associative array) | ||
+ | Write a Perl program fastaParser.pl that takes in a FASTA file of any protein as input and outputs it as three-letter amino acid code separated by spaces to the terminal. Use a hash to store the mapping between the one-letter amino acid code and the three-letter amino acid code. | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
+ | To parse out the definition line of a FASTA file, use substr(); to get the first character of each line and test to see if it is ">". Read in each line of the FASTA file and store it as an array, character by character (as with anagram.pl). Loop over the contents of the array and retrieve the three-letter code for the amino acid, using a hash that maps one-letter amino acid codes to three-letter amino acid codes. | ||
+ | |||
+ | Hint about the hash: it’s similar in concept to the amino acid code hash that was used in one of the programs written in class… think about which way the amino acid code mapping was applied with that hash and try to apply the principles here. | ||
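To make the hash lookup concrete, here is a sketch with a deliberately incomplete mapping (only four amino acids) and a hypothetical sub translateLine; a real solution needs all twenty codes and reads its lines from a file or STDIN.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Partial mapping for illustration only; a full table needs all 20 amino acids.
my %threeLetter = (
    'A' => 'Ala',
    'G' => 'Gly',
    'K' => 'Lys',
    'M' => 'Met',
);

# Hypothetical helper: translates one sequence line, skipping definition lines.
sub translateLine {
    my ($line) = @_;
    chomp($line);
    if (substr($line, 0, 1) eq ">") {          # FASTA definition line
        return("");
    }
    my @codes;
    foreach my $residue (split(//, $line)) {   # one character at a time
        if (exists($threeLetter{$residue})) {
            push(@codes, $threeLetter{$residue});
        } else {
            push(@codes, "???");               # code not in the partial table
        }
    }
    return(join(" ", @codes));
}

print(translateLine("MKAG\n"), "\n");
```

Using exists(); before the lookup avoids warnings about undefined values when a character is not a key of the hash.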
+ | |||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 587: | Line 734: | ||
} # end sub | } # end sub | ||
− | <source | + | </source> |
+ | |||
+ | |||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | ===factorial=== | ||
+ | |||
+ | |||
+ | |||
+ | | ||
+ | ;Iteration and recursion | ||
+ | Write a Perl program factorial.pl that takes in a number as input and calculates the factorial of that number. Note that this can be done in (at least) two ways: the first way is to use a <code>for</code> loop in the body of the program... | ||
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | | ||
+ | Remember to think about all types of outcomes when designing your conditions in if/else statements: a negative factorial is undefined, and both 0! and 1! are equal to 1. Use die(); rather than exit(); to indicate that an unexpected input has been entered that the program cannot handle. Both cause the program to terminate, but die(); allows you to print an error message on program exit, e.g. die("Negative factorial is undefined.");. Use a for loop to multiply out the factorial of the input number, and use a variable to store the value of the factorial during intermediate steps in the calculation. | ||
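The loop described in the hint could be sketched as follows. Wrapping it in a sub named factorial is a choice made for this example (the exercise asks for the loop in the body of the program), so treat it as one possible layout.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: computes n! with a for loop.
sub factorial {
    my ($n) = @_;
    if ($n < 0) {
        die("Negative factorial is undefined.\n");
    }
    my $result = 1;                      # already correct for 0! and 1!
    for (my $i = 2; $i <= $n; $i++) {
        $result = $result * $i;          # intermediate product
    }
    return($result);
}

print("5! = ", factorial(5), "\n");
```

Starting $result at 1 means the two base cases need no separate branch: the loop body simply never runs for 0 and 1.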
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 622: | Line 798: | ||
} # end sub | } # end sub | ||
</source> | </source> | ||
+ | |||
+ | |||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | |||
+ | |||
+ | | ||
+ | |||
+ | ===factorial(recursive)=== | ||
+ | | ||
+ | (OPTIONAL) ...the second way is to use a subroutine '''recursively''' to yield the factorial of a number. Try programming it this way as well. | ||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#EEEEFF; padding:10px;"> | ||
+ | Hints: | ||
+ | <div class="mw-collapsible-content"> | ||
+ | |||
+ | |||
+ | | ||
+ | Recursion means a function calls itself. Such a subroutine or program needs defined “base cases”, for which the subroutine can return a value without having to call itself again (allowing the program or subroutine to terminate; otherwise it would just go deeper, and deeper...). The base cases for factRecurse.pl are exactly the same as for factorial.pl: a negative factorial should return an error, and both 0! and 1! should return 1. In place of the for loop used in factorial.pl, each recursion of the subroutine in factRecurse.pl performs one small step (the small step that would be performed with each iteration of the for loop) and then combines it with the result of the next subroutine call | ||
+ | (e.g. $resultOfSomeStep * subRoutine($currentCall - 1)). | ||
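A minimal sketch of the recursive pattern, assuming a sub named factRecurse (after the file name mentioned above); it is one way to arrange the base cases, not necessarily how the sample solution does it.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: computes n! by calling itself with n - 1.
sub factRecurse {
    my ($n) = @_;
    if ($n < 0) {
        die("Negative factorial is undefined.\n");
    }
    if ($n <= 1) {
        return(1);                        # base cases: 0! and 1!
    }
    return($n * factRecurse($n - 1));     # one small step, then recurse
}

print("6! = ", factRecurse(6), "\n");
```

Every call moves $n one step closer to the base case, which is what guarantees that the recursion terminates.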
+ | |||
+ | | ||
+ | <div class="mw-collapsible mw-collapsed" data-expandtext="Expand" data-collapsetext="Collapse" style="width:90%; border:solid 1px; background-color:#DDDDFF; padding:10px;"> | ||
+ | Code: | ||
+ | <div class="mw-collapsible-content"> | ||
− | |||
+ | | ||
<source lang="perl"> | <source lang="perl"> | ||
#!/usr/bin/perl | #!/usr/bin/perl | ||
Line 655: | Line 859: | ||
+ | | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | </div> | ||
+ | | ||
− | |||
− | |||
− | |||
<!-- | <!-- | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
==Notes== | ==Notes== | ||
<references /> | <references /> |
Latest revision as of 22:26, 25 October 2012
Perl programming exercises 2
Small programming exercises for an introduction to Perl.
Contents
- Basic Perl Programming - Programming exercises
Boris Steipe acknowledges contributions by Jennifer Tsai, Sanja Rogic and Sohrab Shah.
Here are programming exercises that focus on translating a concept into a working script.
The preceding section covers what I have called syntax examples: simple tasks that ask you to write functioning, syntactically correct code.
For each task you will find
- a description of the task your code should achieve;
- some hints how to go about solving it - which functions you might use or which strategy; and
- sample code for reference if you are stuck. Should you really need to look up the samples, carefully study the code, put it away and then write your own script from scratch, with different code and perhaps some variation in function. If you merely copy code, or read it with mild interest and move on, you will probably be wasting your time.
- Don't be satisfied until you understand what you are doing.
Tasks
Hello World
- Executable program
Write a Perl program helloWorld.pl that prints out "Hello World" (or whatever you fancy) to the terminal. Make your program executable (chmod u+x helloWorld.pl) so that you don't need to invoke the Perl interpreter explicitly from the command line (i.e. just "$ ./helloWorld.pl" should run it; you shouldn't need to type "$ perl helloWorld.pl").
Hints:
Simply use the print(); function.
Code:
#!/usr/bin/perl
use warnings;
use strict;
print("Hello World !\n");
exit();
cat()
- Keyboard input
Write a Perl program cat.pl that prints to the terminal a single line that you type at the keyboard.
Hints:
Use the diamond operator to read from STDIN, assign this to a variable, then print the contents of the variable. Just one statement, no loop is required.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $line;
$line = <STDIN>;
print($line);   # $line still ends with the newline typed at the keyboard
exit();
lc()
- More keyboard input
Write a Perl program lc.pl that reads one or many lines from STDIN, converts them to lowercase and prints them to the terminal. Use this interactively, typing input (end by typing <ctrl>D), then use this by redirecting a textfile to your program, then "pipe" the output of the Unix "ls" command into your program.
Hints:
Use a while loop with the assignment of <STDIN> to a variable as its loop condition. This way the loop runs until STDIN reaches EOF (End of File). Use the Perl lc(); function to change case. Assign the return value to a variable and print it.
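The loop condition can be tried out without typing anything, as in the following sketch. It reads from an in-memory filehandle instead of STDIN; the sub name lowercaseLines and the sample text are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: reads lines from any filehandle until EOF
# and returns them converted to lowercase.
sub lowercaseLines {
    my ($fh) = @_;
    my @result;
    while (my $line = <$fh>) {       # the assignment fails at EOF, ending the loop
        push(@result, lc($line));    # lc() returns a lowercase copy
    }
    return(@result);
}

# An in-memory filehandle stands in for STDIN in this sketch.
my $text = "Hello World\nPERL is Fun\n";
open(my $fh, '<', \$text) or die("Cannot open in-memory file: $!");
my @lower = lowercaseLines($fh);
close($fh);
print(@lower);
```

Passing \*STDIN instead of $fh to the sub would make it read from the keyboard, a redirected file or a pipe.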
Code:
#!/usr/bin/perl
use warnings;
use strict;
while (my $line = <STDIN>) {
    $line = lc($line);
    print($line);   # $line keeps its own trailing newline
}
exit();
max
- Condition
Write a Perl program max.pl that prompts for and reads two numbers from STDIN, and outputs the larger of the two numbers to the terminal. Remember to consider the case that the numbers may be equal.
Hints:
You need an
if (condition) { do ... }
construction to print one number or the other, depending on the result of the comparison. Remember the difference between numeric and alphanumeric comparisons! You should chomp(); your input variables to strip the trailing newline before you compare them as numbers and print them.
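The difference between the two kinds of comparison shows up as soon as the numbers have a different number of digits. The values below are chosen for illustration only.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $x = "10";
my $y = "9";

# Numeric comparison: 10 is greater than 9.
my $numeric = ($x > $y) ? "$x > $y is true" : "$x > $y is false";

# String comparison: "10" sorts before "9", because "1" comes before "9".
my $string = ($x gt $y) ? "$x gt $y is true" : "$x gt $y is false";

print("$numeric\n");
print("$string\n");
```

The operators >, <, == compare numbers, whereas gt, lt, eq compare strings character by character.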
Code:
#!/usr/bin/perl
use warnings;
use strict;
print("Enter a number: "); # User inputs
my $input1 = <STDIN>;
print("Enter a second number: ");
my $input2 = <STDIN>;
chomp($input1); # Chomp off trailing newline characters
chomp($input2);
if ($input1 > $input2) {
    print("$input1 is larger than $input2.\n");
} elsif ($input1 == $input2) {
    print ("$input1 and $input2 are equal.\n");
} else {
    print ("$input2 is larger than $input1.\n");
}
exit();
max (with subroutine)
- Subroutine
Rewrite max.pl so the comparison is done in a subroutine: pass the two numbers as arguments into a subroutine and return the larger of the two. Such a program may be a useful framework for comparing two datasets with a non-trivial metric. Instead of simply picking the larger value, the subroutine could compare according to some sophisticated algorithm.
Hints:
Remember that Perl uses the default array "@_" to pass values into subroutines. You need to assign the contents of @_ to variables (or other arrays) in order to be able to use the values. The easiest way to do this is to assign the array to a list of variables, e.g.
my ($a) = @_;       # or ...
my ($a, $b) = @_;
Note that the following will not work as expected !
my $a = @_;
If you did this, you would be assigning an array "@" to a scalar "$". The problem is that this is legal: the compiler does not complain or warn. However, it does not assign the first value in the array; it assigns the number of elements the array holds! This is a fine case of a statement being syntactically correct but logically wrong. If in doubt whether you are doing the right thing, always print your values from within the subroutine, as a development test, to make sure they are what you expect them to be.
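The two assignments can be compared side by side. The sub names firstArgument and argumentCount are invented for this sketch.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical subs that differ only in how they read @_.
sub firstArgument {
    my ($first) = @_;    # list assignment: $first gets the first value
    return($first);
}

sub argumentCount {
    my $count = @_;      # scalar assignment: $count gets the number of values
    return($count);
}

my $first = firstArgument(42, 17, 8);
my $count = argumentCount(42, 17, 8);
print("first: $first, count: $count\n");
```

The parentheses around the variable on the left-hand side are what turn the statement into a list assignment.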
Code:
#!/usr/bin/perl
use warnings;
use strict;
print("Enter a number: "); # User inputs
my $input1 = <STDIN>;
print("Enter a second number: ");
my $input2 = <STDIN>;
if ($input1 eq $input2) { # eq: string equal.
    print("Both inputs are equal.\n");
} else {
    my $larger = compare($input1, $input2); # compare arguments in subroutine
                                            # and return larger value
    print("$larger is larger.\n");
}
exit();
# =======================================================
# Subroutine "compare" returns the larger of two inputs
# or $a if they are equal.
sub compare {
    my ($a, $b) = @_; # Pass a list of variables into the subroutine
    chomp($a); # Chomp off trailing newline characters
    chomp($b); # for numeric comparison
    if ($a >= $b) { # numeric greater-or-equal
        return($a);
    } else {
        return($b);
    }
} # end subroutine
anagram
- Array
Write a Perl program anagram.pl that reads a string from STDIN and returns ten random permutations of this string. This will require a number of concepts and techniques of working with arrays - defining an array, assigning values to an array, or to individual fields of an array, using a variable as an index to an array in order to read from or write to specific fields, and more. First split your string into individual elements of an array. Use a subroutine that randomizes this array by looping over every position of the array, and swapping the contents of this position with a randomly chosen other position of the array, except itself. Write this in pseudocode first. The Perl functions you will need are split(); and rand();.
Hints:
You have to chomp(); your input in order not to shuffle the newline character (“return” character) into your randomized strings; otherwise you’ll end up with strangely shortened versions of your randomized string, split into two parts. To get the array size, use the index of the last array position plus one (remember that array positions are numbered starting at 0, not 1 ! ). To split a string into individual elements of an array, use split(//, $input); with no delimiter, i.e. with no other characters in between the slashes, not even a space. Assigning the result of split(); to an array puts every character of the string into its own array field. When randomizing the array, note that rand(); returns a random rational number, not an integer, so you may need to use int(); to truncate the result of rand(); and just return the integer part. Use variables to store values from the array before the swap, otherwise the original value stored in a given array position will be lost before it can be copied over to the new array position that you want to swap it to. Also note that all array positions should be switched, so you need to consider the case that your random integer is the same as the position of the original value.
When you are done, see what happens when you comment out the chomp(); function, for effect.
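The splitting and array-size hints above can be tried out in isolation before writing the full program. The following is only a minimal sketch: the hard-coded string "perl\n" stands in for a line read from STDIN, and the variable names are illustrative assumptions.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $input = "perl\n";            # assumption: stands in for a line from STDIN
chomp($input);                   # remove the trailing newline
my @chars = split(//, $input);   # one character per array element

my $lastIndex = $#chars;         # index of the last element
my $size      = scalar(@chars);  # number of elements (last index plus 1)

print("characters: @chars\n");   # prints "characters: p e r l"
print("last index: $lastIndex, size: $size\n");
```

Removing the chomp(); line makes the array one element longer, because the newline becomes an element of its own.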
Code:
#!/usr/bin/perl
use warnings;
use strict;
# Constants
my $COUNT = 10; # Number of times to call randomizing subroutine
# Declare variables
my $stringInput; # Initial input string
my @stringArray; # Input string array after splitting into
# an array that stores each character in one array element
print("Enter a string to randomize: "); # Accept user input
$stringInput = <STDIN>;
chomp($stringInput); # Remove newline character from input string
# Split string input into an array that stores each character as a
# separate element
@stringArray = split(//, $stringInput);
# Call printRandomized() $COUNT times in order to print ten random
# permutations of the input string
for (my $i = 0; $i < $COUNT; $i++) {
# Pass string array to subroutine "printRandomized"
printRandomized(@stringArray);
}
exit();
# ===== printRandomized () =======================
# Subroutine "printRandomized" shuffles a copy of the array passed
# to it and prints the result. It loops over the array from the last
# position down, and swaps the contents of each position with a
# randomly chosen position at or before it (possibly itself). This
# implements the so-called Fisher-Yates shuffle, an efficient in-place
# shuffle that gives equal weight to all N! permutations.
sub printRandomized {
my (@randArray) = @_;
# get array size: since arrays start from index 0, the size
# of the array is equal to the index of the last array element plus 1
# Perl provides three ways to get the array size:
# "$#Array" is the index of the last element.
# You can also assign the array to a scalar, you get its size
# as in "$N_fields = @Array;" This value is one more than "$#Array",
# since the index of the first element is 0, not 1.
# For clean code I prefer the third version: "$size = scalar(@Array);"
# since it is more explicit.
my $arraySize = scalar(@randArray);
# iterate through every element in the array, beginning at
# the last element and counting down
for (my $j = $arraySize - 1; $j > 0; $j--) {
# assign the contents of the first element to a temporary
# variable
my $arrayPos1 = $randArray[$j];
# get random array position less or equal to $j
my $randInt = int(rand($j + 1));
# assign the contents of the second element to a temporary
# variable
my $arrayPos2 = $randArray[$randInt];
# swap the contents of the two array positions
$randArray[$randInt] = $arrayPos1;
$randArray[$j] = $arrayPos2;
} # end for (iterating through elements in the array)
# print result of the randomization
for (my $k = 0; $k < $arraySize; $k++) {
print($randArray[$k]);
} # end for (printing randomized array)
print("\n");
} # end subroutine "printRandomized"
The construct
int(rand($j+1))
deserves some comment. $j starts as the index of the last element in the array. rand(n) returns a random floating-point number from the interval [0,n[, i.e. 0 ≤ number < n. Assume our array had four elements: rand(3+1) would return numbers from 0.000... to 3.999... Since int() does not round the number, but just truncates its decimals and returns its integer part, we get random integers from 0 to 3, each with uniform probability. That happens to be exactly the range of indices that can be used to randomly point somewhere into our array.
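To convince yourself of the range that int(rand(n)) produces, you can simply count what comes out. The following is a minimal sketch, not part of the exercise solution; the hash %seen and the number of draws (10000) are arbitrary choices made for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $n = 4;     # pretend the array has four elements (indices 0 to 3)
my %seen;      # counts how often each integer was drawn

for (my $i = 0; $i < 10000; $i++) {
    my $r = int(rand($n));   # 0 <= rand($n) < $n, truncated to an integer
    $seen{$r}++;
}

# Print the drawn integers in numerical order with their counts
foreach my $value (sort { $a <=> $b } keys(%seen)) {
    print("$value was drawn $seen{$value} times\n");
}
```

Only the integers 0, 1, 2 and 3 should appear, each roughly equally often; the value 4 is never drawn, because rand($n) is always smaller than $n.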
anastring
- Substring calisthenics
Copy anagram.pl to a file named anastring.pl and use the Perl length(); and substr(); functions to permute the string (in place!) instead of shuffling fields of an array. Make a point of programming this incrementally, step by step, writing output as you go along to make sure you are doing it right. Of course you could also shuffle using the split() and join() functions on a string ... but that would not be "in place".
Hints:
This is similar to anagram.pl but uses substr(); on the original string instead of shuffling fields of an array. Remember that substr(); can be used to extract defined substrings as well as to replace them. As with anagram.pl, use variables to store the characters that you want to swap, to prevent the original character from being lost when you overwrite one of the two positions in the string. Use int(); on the result of rand(); to get a random position in the string and think carefully about the range of numbers that this should produce. The range is obviously a function of the string-length - but does it start at 0 or 1 and does it extend to the length itself, or more or less ? Test whether the range you produce is correct.
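Before writing the shuffle, it may help to try the two uses of substr(); mentioned above on a fixed string. This is only a sketch with a hard-coded example string; it swaps the first and last character, using temporary variables so that neither character is lost.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $string = "ABCDE";
my $len    = length($string);               # 5; valid positions are 0 to 4

my $first = substr($string, 0, 1);          # extract the character at position 0
my $last  = substr($string, $len - 1, 1);   # extract the character at the last position

substr($string, 0, 1)        = $last;       # substr() on the left side replaces in place
substr($string, $len - 1, 1) = $first;

print("$string\n");                         # prints "EBCDA"
```

Note that the last valid position is length minus one, which is the answer to the question about the range of random positions.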
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $count = 10; # controls number of anagrams to print
my $string = <STDIN>; # retrieve one string
chomp($string); # if we would not chomp(), we would
# swap the linefeed into our anagrams
my $len = length($string);
my $pos; # a variable to store a random
# position in the string
for (my $i=0; $i < $count; $i++) { # for desired number of anagrams ...
for (my $j=0; $j < $len; $j++) { # for every character in string ...
$pos = int(rand($len)); # calculate random position in string
while ($len > 1 and $pos == $j) { # if this is the same integer as $j (and another position exists) ...
$pos = int(rand($len)); # ... try again
}
my $tmp = substr($string, $j, 1); # store character j
substr($string, $j, 1) = substr($string, $pos, 1); # swap pos to j
substr($string, $pos, 1) = $tmp; # swap tmp to pos
}
print($string, "\n"); # print the randomized string
}
exit ();
anastring (with commandline input)
- Commandline arguments
Modify anastring.pl so that you can pass the number of permutations to the program on the command line (via @ARGV). Make sure that the default is 1 if no argument is given. This tool could be part of a routine to generate random data to test statistical significance.
Hints:
The arguments that you give to a Perl program on the command line are stored in the array variable named @ARGV. $ARGV[0] is the first argument, $ARGV[1] is the second, and so on. To check whether some variable is defined, use the function defined($someVariable); in an if statement. If no command line argument has been typed, $ARGV[0] will be undefined.
Code:
Simply change:
my $count = 10;
to:
my $count = 1;
if (defined($ARGV[0]) ) { $count = $ARGV[0] };
Then use like (for example)
$ ./anastring.pl 100 < test.txt
assuming you have a file named test.txt with the contents you want to randomize, or
$ echo "acdefghiklmnpqrstvwy" | ./anastring.pl 100
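If you want to be more careful, you can also check that the argument really is a number before using it. The following sketch is one possible way to do this; the subroutine name getCount and the wording of the error message are assumptions made for illustration and are not part of the solution above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: returns the requested number of permutations,
# 1 if no argument was given, and dies on anything that is not a
# non-negative integer.
sub getCount {
    my (@args) = @_;
    my $count = 1;                           # default
    if (defined($args[0])) {
        if ($args[0] =~ /^[0-9]+$/) {        # digits only
            $count = $args[0];
        } else {
            die("Expected a non-negative integer, but got: $args[0]\n");
        }
    }
    return $count;
}

my $count = getCount(@ARGV);
print("Will print $count permutation(s).\n");
```

Called without arguments this reports 1 permutation; called with "abc" it terminates with the error message instead of silently treating the argument as a number.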
sort
- Sorting
Write a Perl program sort.pl that takes in strings (e.g. names) from STDIN, stores them in an array, sorts them in alphabetical order, using the Perl sort(); function and prints them out to the terminal.
Hints:
Declare a variable to use as an array index and initialize it with the value 0. Assign the entire input string to the current array position $array[$index], then increment the index variable so it points to the next available field. (The field of an array can hold integers, floats, strings, other arrays, hashes, references to arrays, ...) Sort the array using the Perl sort(); function.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $index = 0; # initialize a variable to use as index to array
my $currentInput; # stores input values from STDIN
my @arrayOfStrings; # array of strings in their original order
my @sortedArray; # array of strings sorted in alphabetical order
while ($currentInput = <STDIN>) { # Retrieve strings from STDIN
$arrayOfStrings[$index] = $currentInput; # Store in array
$index++; # increment index
}
@sortedArray = sort(@arrayOfStrings); # Sort array fields
for (my $i = 0; $i < $index; $i++) { # Print all array fields
print($sortedArray[$i]);
}
exit();
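Note that sort(); without further instructions compares strings by character code, so all uppercase letters sort before all lowercase letters. If your input mixes upper and lower case, you can pass a comparison block to sort();. The sketch below uses a hard-coded list that stands in for lines read from STDIN; the names in it are made up for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# assumption: sample lines standing in for input from STDIN
my @names = ("carol\n", "Bob\n", "alice\n", "Dave\n");

# Default sort: plain string comparison, uppercase before lowercase
my @defaultOrder = sort(@names);

# Case-insensitive sort: compare lowercased copies of the strings
my @caseInsensitive = sort { lc($a) cmp lc($b) } @names;

print("Default sort:\n", @defaultOrder);            # Bob, Dave, alice, carol
print("Case-insensitive sort:\n", @caseInsensitive); # alice, Bob, carol, Dave
```

Inside the block, $a and $b are the two elements that sort(); is currently comparing; they are special variables and do not need to be declared with my.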
fastaParser
- Hash (associative array)
Write a Perl program fastaParser.pl that takes in a FASTA file of any protein as input and outputs it as three-letter amino acid code separated by spaces to the terminal. Use a hash to store the mapping between the one-letter amino acid code and the three-letter amino acid code.
Hints:
To parse out the definition line of a FASTA file, use substr(); to get the first character of each line and test to see if it is ">". Read in each line of the FASTA file and store it as an array, character by character (as with anagram.pl). Loop over the contents of the array and retrieve the three-letter code for the amino acid, using a hash that maps one-letter amino acid codes to three-letter amino acid codes.
Hint about the hash: it’s similar in concept to the amino acid code hash that was used in one of the programs written in class… think about which way the amino acid code mapping was applied with that hash and try to apply the principles here.
Code:
#!/usr/bin/perl
use warnings;
use strict;
# Declare variables
my $line; # current line being read in from STDIN
my $char; # store first character in the line (to test for ">")
my @oneLetterLine; # split each line of FASTA file into individual letters
my %oneToThree; # hash that stores mappings from one-letter amino acid
# code to three-letter amino acid code
# Initialize the hash mapping
mapOneToThree();
while ($line = <STDIN>) { # Read input (FASTA format) line by line
$line = uc($line); # Translate to uppercase characters
$char = substr($line, 0, 1); # Extract first character
if ($char ne ">") { # only if it's not the title line ...
chomp($line); # chomp off the newline character
# store each character in the line as an element in an array
@oneLetterLine = split(//,$line);
# get the size of the array (since arrays start at index 0,
# the size of the array is the last array index plus 1)
my $arraySize = $#oneLetterLine + 1;
# print three-letter amino acid code mapping
for (my $i = 0; $i < $arraySize; $i++) {
print($oneToThree{$oneLetterLine[$i]}, " ");
} # end for
} # end if
} # end while
print("\n"); # Print final newline character
exit(); # Exit the program
# =================================================================
# Subroutine to generate the hash that maps one-letter amino acid
# code to three-letter amino acid code
sub mapOneToThree {
$oneToThree{'A'} = 'Ala';
$oneToThree{'C'} = 'Cys';
$oneToThree{'D'} = 'Asp';
$oneToThree{'E'} = 'Glu';
$oneToThree{'F'} = 'Phe';
$oneToThree{'G'} = 'Gly';
$oneToThree{'H'} = 'His';
$oneToThree{'I'} = 'Ile';
$oneToThree{'K'} = 'Lys';
$oneToThree{'L'} = 'Leu';
$oneToThree{'M'} = 'Met';
$oneToThree{'N'} = 'Asn';
$oneToThree{'P'} = 'Pro';
$oneToThree{'Q'} = 'Gln';
$oneToThree{'R'} = 'Arg';
$oneToThree{'S'} = 'Ser';
$oneToThree{'T'} = 'Thr';
$oneToThree{'V'} = 'Val';
$oneToThree{'W'} = 'Trp';
$oneToThree{'Y'} = 'Tyr';
} # end sub
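Real FASTA files may contain characters that are not among the twenty keys of the hash, for example "X" for an unknown residue. Looking up such a character returns an undefined value and triggers a warning. The sketch below shows one way to handle this with exists();. It uses an abbreviated hash and a hard-coded sequence, and the choice of "Xaa" as placeholder is an assumption made for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Abbreviated mapping, for illustration only
my %oneToThree = ('A' => 'Ala', 'G' => 'Gly', 'W' => 'Trp');

my $sequence = "GAXW";     # assumption: sample sequence line, already chomped
my @threeLetter;           # collects the translated residues

foreach my $residue (split(//, $sequence)) {
    if (exists($oneToThree{$residue})) {      # is this character a key of the hash?
        push(@threeLetter, $oneToThree{$residue});
    } else {
        push(@threeLetter, 'Xaa');            # placeholder for unmapped characters
    }
}

print(join(" ", @threeLetter), "\n");         # prints "Gly Ala Xaa Trp"
```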
factorial
- Iteration and recursion
Write a Perl program factorial.pl that takes in a number as input and calculates the factorial of that number. Note that this can be done in (at least) two ways: the first way is to use a for loop in the body of the program...
Hints:
Remember to think about all types of outcomes when designing your conditions in if/else statements: the factorial of a negative number is undefined, and both 0! and 1! are equal to 1. Use die(); rather than exit(); to indicate that an unexpected input has been entered that the program cannot handle. Both cause the program to terminate, but die(); lets you print an error message on program exit, e.g. die("Negative factorial is undefined.");. Use a for loop to multiply out the factorial of the input number, and use a variable to store the value of the factorial during intermediate steps in calculation.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $number = <STDIN>; # number is read in from STDIN
chomp($number); # remove newline
print(fact($number),"\n"); # print statement calls subroutine fact()
exit();
# ========================================================
sub fact {
my ($n) = @_; # receive argument via @_
my $factorial = 1;
if ($n < 0) {
die("panic: fact($n) negative factorial is undefined. ");
} elsif ($n == 0 or $n == 1) {
return 1;
} else {
for (my $i = 2; $i <= $n; $i++) {
$factorial = $factorial * $i;
}
} # end if
return $factorial;
} # end sub
factorial(recursive)
(OPTIONAL) ...the second way is to use a subroutine recursively to yield the factorial of a number. Try programming it this way as well.
Hints:
Recursion means a function calls itself. Such a subroutine or program needs defined “base cases”, for which the subroutine can return a value without having to call itself again (allowing the program or subroutine to terminate, otherwise it would just go deeper, and deeper...). The base cases for factRecurse are exactly the same as for factorial.pl: negative factorial should return an error, and both 0! and 1! should return 1. In place of the for loop used in factorial.pl, each recursion of the subroutine in factRecurse.pl performs one small step (the small step that would be performed with each iteration of the for loop) and then combines it with the result of the next subroutine call (e.g. $resultOfSomeStep * subRoutine($currentCall - 1)).
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $number = <STDIN>; # number is read in from STDIN
chomp($number); # remove newline
print(fact($number),"\n"); # print statement calls subroutine fact()
exit();
# ========================================================
sub fact {
my ($n) = @_;
if ($n < 0) {
die("panic: fact($n) negative factorial is undefined. ");
} elsif ($n == 0 or $n == 1) {
return(1);
} else {
return( $n * fact($n-1) ); # recursive: subroutine calls itself
}
} # end sub
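A simple way to test both versions is to run them side by side and compare the results. The sketch below does this for 0 to 10; the subroutine names factIter and factRecurse are chosen here only to keep the two versions apart, and the check for negative input is left out because only non-negative values are tested.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Compare the iterative and the recursive version for 0 .. 10
for (my $n = 0; $n <= 10; $n++) {
    my $iter = factIter($n);
    my $rec  = factRecurse($n);
    if ($iter != $rec) {
        die("Mismatch for $n: $iter (iterative) vs. $rec (recursive)\n");
    }
    print("$n! = $iter\n");
}

# Iterative version: multiply up from 2 to $n
sub factIter {
    my ($n) = @_;
    my $factorial = 1;
    for (my $i = 2; $i <= $n; $i++) {
        $factorial = $factorial * $i;
    }
    return $factorial;
}

# Recursive version: base cases 0 and 1, otherwise n * (n-1)!
sub factRecurse {
    my ($n) = @_;
    if ($n <= 1) {
        return 1;
    }
    return $n * factRecurse($n - 1);
}
```

The last line printed should be "10! = 3628800".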
Further reading and resources