|
|
Line 866: |
Line 866: |
| | | |
| | | |
− | ==Programming Exercises: Hints==
| |
| | | |
− |
| |
− | ====2. cat====
| |
− | Use the diamond operator to read from STDIN, assign this to a variable, then print the contents of the variable. Just one statement, no loop is required.
| |
− |
| |
− |
| |
− | ====3. lc====
| |
− | Use a while loop that tests the successful assignment of <STDIN> to a variable as its loop condition. This way the loop runs until STDIN reads EOF (End of File). Use the Perl lc(); function to change case. Assign the return value to a variable and print it.
| |
− |
| |
− |
| |
− | ====4. max====
| |
− | You need an
| |
− | <source lang="perl">
| |
− | if (condition) { do ... }
| |
− | </source>
| |
− | construction to print one number or the other, depending on the result of the comparison. Remember the difference between numeric and string comparisons! You have to chomp(); your input variables, to be able to compare them as numbers.
| |
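The difference between the two kinds of comparison can be seen in a minimal sketch like the following. The helper names largerNumeric and largerString are hypothetical and exist only for this illustration; the values are passed as strings, the way they would arrive from STDIN after chomp();.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: compares its two arguments as numbers (operator >).
sub largerNumeric {
    my ($p, $q) = @_;
    if ($p > $q) {
        return $p;
    }
    return $q;
}

# Hypothetical helper: compares its two arguments as strings (operator gt).
sub largerString {
    my ($p, $q) = @_;
    if ($p gt $q) {
        return $p;
    }
    return $q;
}

print(largerNumeric("10", "9"), "\n");   # prints 10: 10 is numerically larger
print(largerString("10", "9"), "\n");    # prints 9: "9" sorts after "1..."
</source>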
− |
| |
− |
| |
− | ====5. max (with subroutine)====
| |
− | Remember that Perl uses the default array "@_" to pass values into subroutines. You need to assign the contents of @_ to variables (or other arrays) in order to be able to use the values. The easiest way to do this is to assign the array to a list of variables, e.g.
| |
− |
| |
− | <source lang="perl">
| |
− | my ($a) = @_; or ...
| |
− | my ($a, $b) = @_;
| |
− | </source>
| |
− |
| |
− | '''Note that the following will not work as expected !'''
| |
− |
| |
− | <source lang="perl">
| |
− | my $a = @_;
| |
− | </source>
| |
− |
| |
− | If you did this, you would be assigning an array "@" to a scalar "$". The problem is that this is legal: the compiler does not complain or warn. But it does not assign the first value in the array; it assigns the number of elements the array contains! This is a fine case of a statement being syntactically correct but logically wrong. If in doubt whether you are doing the right thing, always print your values from within the subroutine, as a development test, to make sure they are what you expect them to be.
| |
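As a development test of the kind suggested above, a small sketch can print what each form of assignment actually stores. The subroutine names firstInList and countInScalar are hypothetical, chosen only for this illustration.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: list assignment, the parentheses matter.
sub firstInList {
    my ($first) = @_;    # $first receives the first argument
    return $first;
}

# Hypothetical helper: scalar assignment, no parentheses.
sub countInScalar {
    my $count = @_;      # $count receives the NUMBER of arguments
    return $count;
}

print(firstInList(7, 8, 9), "\n");     # prints 7
print(countInScalar(7, 8, 9), "\n");   # prints 3
</source>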
− |
| |
− | <!--
| |
− | // note to self: write examples explicitly!
| |
− |
| |
− | You can return single scalars from subroutines, or you can return multiple values, as lists, as in the following example:
| |
− |
| |
− | subroutine returns... main program assigns ...
| |
− |
| |
− | <source lang="perl">
| |
− | return($a);
| |
− | my $larger = max($in1, $in2);
| |
− |
| |
− |
| |
− | return($a, $b);
| |
− |
| |
− | my ($larger, $smaller) = max($in1, $in2);
| |
− | </source>
| |
− |
| |
− |
| |
− | -->
| |
− |
| |
− | ====6. anagram====
| |
− | You have to chomp(); your input in order not to shuffle the newline character (“return” character) into your randomized strings; otherwise you’ll end up with strangely shortened versions of your randomized string, split into two parts. To get the array size, use the index of the last array position plus one (remember that array positions are numbered starting at 0, not 1!). To split a string into individual elements of an array, use split(//, $input); with no delimiter, i.e. with no other characters in between the slashes, not even a space. Assigning the result of split(); to an array puts every character of the string into its own array field. When randomizing the array, note that rand(); returns a random fractional (floating-point) number, not an integer, so you need to use int(); to truncate the result of rand(); and keep just the integer part. Use variables to store values from the array before the swap, otherwise the original value stored in a given array position will be lost before it can be copied over to the new array position that you want to swap it to. Also note that all array positions should be switched, so you need to consider the case that your random integer is the same as the position of the original value.
| |
− |
| |
− | When you are done, see what happens when you comment out the chomp(); function, for effect.
| |
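The chomp-then-split step can be tried on its own before writing the shuffle. This is only a sketch: the helper name splitChars is hypothetical, and the input is hard-coded (including its newline) instead of being read from STDIN.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: remove a trailing newline, then split the string
# into an array with one character per element.
sub splitChars {
    my ($input) = @_;
    chomp($input);
    my @chars = split(//, $input);
    return @chars;
}

my @chars = splitChars("abc\n");     # stands in for a line read from STDIN
my $size  = $#chars + 1;             # last index plus one

print("$size characters: @chars\n"); # prints "3 characters: a b c"
</source>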
− |
| |
− |
| |
− | ====7. anastring====
| |
− | This is similar to anagram.pl but uses substr(); on the original string instead of shuffling fields of an array. Remember that substr(); can be used to extract defined substrings as well as to replace them. As with anagram.pl, use variables to store the characters that you want to swap, to prevent the original character from being lost when you overwrite one of the two positions in the string. Use int(); on the result of rand(); to get a random position in the string and think carefully about the range of numbers that this should produce. The range is obviously a function of the string-length - but does it start at 0 or 1 and does it extend to the length itself, or more or less ? Test whether the range you produce is correct.
| |
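The two uses of substr(); mentioned above, extracting and replacing, can be sketched as a single swap of two fixed positions. The helper name swapChars and the chosen positions are assumptions made for this illustration; the random choice of positions is left out on purpose.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: swap the characters at positions $i and $k.
sub swapChars {
    my ($string, $i, $k) = @_;
    my $tmp = substr($string, $i, 1);               # extract and remember character i
    substr($string, $i, 1) = substr($string, $k, 1); # overwrite position i
    substr($string, $k, 1) = $tmp;                   # overwrite position k
    return $string;
}

print(swapChars("abcdef", 0, 5), "\n");   # prints fbcdea
</source>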
− |
| |
− |
| |
− | ====8. anastring with commandline input====
| |
− | The command-line arguments that you give to a Perl program are stored in the array variable named @ARGV. $ARGV[0] is the first argument, $ARGV[1] is the second, and so on. To check whether some variable is defined, use the function defined($someVariable); in an if statement. If no command-line argument has been typed, $ARGV[0] will be undefined.
| |
− |
| |
− |
| |
− | ====9. sort====
| |
− | Declare a variable to use as an array index and initialize it with the value 0. Assign the entire input string to the current array position $array[$index], then increment the index variable so it points to the next available field. (The field of an array can hold integers, floats, strings, other arrays, hashes, references to arrays, ...) Sort the array using the Perl sort(); function.
| |
− |
| |
− |
| |
− | ====10. fastaParser====
| |
− | To parse out the definition line of a FASTA file, use substr(); to get the first character of each line and test to see if it is ">". Read in each line of the FASTA file and store it as an array, character by character (as with anagram.pl). Loop over the contents of the array and retrieve the three-letter code for the amino acid, using a hash that maps one-letter amino acid codes to three-letter amino acid codes.
| |
− |
| |
− | Hint about the hash: it’s similar in concept to the amino acid code hash that was used in one of the programs written in class… think about which way the amino acid code mapping was applied with that hash and try to apply the principles here.
| |
− |
| |
− |
| |
− | ====11. factorial====
| |
− | Remember to think about all types of outcomes when designing your conditions in if/else statements: a negative factorial is undefined, and both 0! and 1! are equal to 1. Use die(); rather than exit(); to indicate that an unexpected input has been entered that the program cannot handle. Both cause the program to terminate, but die(); allows you to print an error message on program exit, e.g. die("Negative factorial is undefined.");. Use a for loop to multiply out the factorial of the input number, and use a variable to store the value of the factorial during intermediate steps in the calculation.
| |
− |
| |
− |
| |
− | ====12. factRecurse====
| |
− | Recursion is when a subroutine or program calls itself. Such a subroutine or program needs defined “base cases”, for which the subroutine can return a value without having to call itself again (allowing the program or subroutine to terminate). The base cases for factRecurse are exactly the same as for factorial.pl – negative factorial should return an error, and both 0! and 1! should return 1. In place of the for loop used in factorial.pl, each recursion of the subroutine in factRecurse.pl performs one small step (the small step that would be performed with each iteration of the for loop) and then applies it to the next subroutine call.
| |
− | (e.g. $resultOfSomeStep + subRoutine($currentCall - 1)).
| |
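The pattern of a base case plus one small step per call can be sketched with a different function, so as not to give the factorial away. The subroutine sumTo, which adds up the integers from 0 to n, is a hypothetical example and not part of the exercises.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical example: sum of the integers 0 .. $n, computed recursively.
sub sumTo {
    my ($n) = @_;

    if ($n < 0) {
        die("sumTo($n) is not defined for negative input.\n");
    } elsif ($n == 0) {
        return 0;                       # base case: no further call needed
    } else {
        return $n + sumTo($n - 1);      # one small step, then recurse
    }
}

print(sumTo(4), "\n");   # prints 10 (4 + 3 + 2 + 1 + 0)
</source>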
− |
| |
− |
| |
− |
| |
− | ==Programming Exercises: Sample Solutions==
| |
− |
| |
− |
| |
− |
| |
− | ====2. cat====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | my $line;
| |
− |
| |
− | $line = <STDIN>;
| |
− |
| |
− | print($line); # $line still ends with its own newline
| |
− |
| |
− | exit();
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | ====3. lc====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | while (my $line = <STDIN>) {
| |
− | $line = lc($line);
| |
− | print($line); # $line still ends with its own newline
| |
− | }
| |
− | exit();
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− | ====4. max.pl (plain version)====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | print("Enter a number: "); # User inputs
| |
− | my $input1 = <STDIN>;
| |
− |
| |
− | print("Enter a second number: ");
| |
− | my $input2 = <STDIN>;
| |
− |
| |
− | chomp($input1); # Chomp off trailing newline characters
| |
− | chomp($input2);
| |
− |
| |
− | if ($input1 > $input2) {
| |
− | print("$input1 is larger than $input2.\n");
| |
− | } elsif ($input1 == $input2) {
| |
− | print ("$input1 and $input2 are equal.\n");
| |
− | } else {
| |
− | print ("$input2 is larger than $input1.\n");
| |
− | }
| |
− |
| |
− | exit();
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− | ====5. max.pl (subroutine version)====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | print("Enter a number: "); # User inputs
| |
− | my $input1 = <STDIN>;
| |
− |
| |
− | print("Enter a second number: ");
| |
− | my $input2 = <STDIN>;
| |
− |
| |
− | if ($input1 eq $input2) { # eq: string equal.
| |
− | print("Both inputs are equal.\n");
| |
− | } else {
| |
− | my $larger = compare($input1, $input2); # compare arguments in subroutine
| |
− | # and return larger value
| |
− | print("$larger is larger.\n");
| |
− | }
| |
− |
| |
− | exit();
| |
− |
| |
− | # =======================================================
| |
− | # Subroutine "compare" returns the larger of two inputs
| |
− | # or $a if they are equal.
| |
− | sub compare {
| |
− |
| |
− | my ($a, $b) = @_; # Pass a list of variables into the subroutine
| |
− |
| |
− | chomp($a); # Chomp off trailing newline characters
| |
− | chomp($b); # for numeric comparison
| |
− |
| |
− | if ($a >= $b) { # numeric greater-or-equal
| |
− | return($a);
| |
− | } else {
| |
− | return($b);
| |
− | }
| |
− |
| |
− | } # end subroutine
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | ====6. anagram.pl====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | # Constants
| |
− | my $COUNT = 10; # Number of times to call randomizing subroutine
| |
− |
| |
− | # Declare variables
| |
− | my $stringInput; # Initial input string
| |
− | my @stringArray; # Input string array after splitting into
| |
− | # an array that stores each character in one array element
| |
− |
| |
− | print("Enter a string to randomize: "); # Accept user input
| |
− | $stringInput = <STDIN>;
| |
− |
| |
− | chomp($stringInput); # Remove newline character from input string
| |
− |
| |
− | # Split string input into an array that stores each character as a
| |
− | # separate element
| |
− | @stringArray = split(//, $stringInput);
| |
− |
| |
− | # Call printRandomized $COUNT times in order to print ten random permutations
| |
− | # of the input string
| |
− | for (my $i = 0; $i < $COUNT; $i++) {
| |
− | # Pass string array to subroutine "printRandomized"
| |
− | printRandomized(@stringArray);
| |
− | }
| |
− |
| |
− | exit();
| |
− |
| |
− |
| |
− | # ===== printRandomized () =======================
| |
− | # Subroutine "printRandomized" loops over every position of the array
| |
− | # passed to it, and swaps the contents of this position with
| |
− | # a randomly chosen position at or before it. This implements the
| |
− | # so-called Fisher-Yates shuffle, an efficient in-place shuffle
| |
− | # that gives equal weight to all N! permutations.
| |
− | sub printRandomized {
| |
− | my (@randArray) = @_;
| |
− |
| |
− | # get array size: since arrays start from index 0, the size
| |
− | # of the array is equal to the index of the last array element plus 1
| |
− | # Perl provides three ways to get the array size:
| |
− | # "$#Array" is the index of the last element.
| |
− | # You can also assign the array to a scalar, you get its size
| |
− | # as in "$N_fields = @Array;" This value is one more than "$#Array",
| |
− | # since the index of the first element is 0, not 1.
| |
− | # For clean code I prefer the third version: "$size = scalar(@Array);"
| |
− | # since it is more explicit.
| |
− | my $arraySize = scalar(@randArray);
| |
− |
| |
− | # iterate through every element in the array, beginning at
| |
− | # the last element and counting down
| |
− | for (my $j = $arraySize - 1; $j > 0; $j--) {
| |
− | # assign the contents of the first element to a temporary
| |
− | # variable
| |
− | my $arrayPos1 = $randArray[$j];
| |
− |
| |
− | # get random array position less or equal to $j
| |
− | my $randInt = int(rand($j + 1));
| |
− |
| |
− | # assign the contents of the second element to a temporary
| |
− | # variable
| |
− | my $arrayPos2 = $randArray[$randInt];
| |
− |
| |
− | # swap the contents of the two array positions
| |
− | $randArray[$randInt] = $arrayPos1;
| |
− | $randArray[$j] = $arrayPos2;
| |
− | } # end for (iterating through elements in the array)
| |
− |
| |
− |
| |
− | # print result of the randomization
| |
− | for (my $k = 0; $k < $arraySize; $k++) {
| |
− | print($randArray[$k]);
| |
− | } # end for (printing randomized array)
| |
− | print("\n");
| |
− |
| |
− | } # end subroutine "printRandomized"
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− | The construct
| |
− |
| |
− | <source lang="perl">
| |
− | int(rand($j+1))
| |
− | </source>
| |
− |
| |
− | deserves some comment. $j starts as the index of the last element in the array. rand(n) returns a random floating-point number from the half-open interval [0, n), i.e. 0 ≤ number < n. Assume our array has four elements: rand(3+1) returns numbers from 0.000... to 3.999... Since int() does not round the number, but just truncates its fractional part and returns its integer part, we get random integers from 0 to 3, each with uniform probability. That happens to be exactly the range of indices that can be used to randomly point somewhere into our array.
| |
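One way to convince yourself of this range is to tally many draws. The following is only a sketch: the array size of 4 and the number of draws are arbitrary assumptions, and the exact counts differ from run to run.

<source lang="perl">
#!/usr/bin/perl
use warnings;
use strict;

my $n     = 4;           # assumed array size
my $draws = 10000;       # arbitrary number of trials
my @tally = (0) x $n;    # one counter per valid index 0 .. $n - 1

for (my $i = 0; $i < $draws; $i++) {
    my $r = int(rand($n));   # truncated, so always an integer 0 .. $n - 1
    $tally[$r]++;
}

# Each index should have been drawn roughly $draws / $n times.
for (my $k = 0; $k < $n; $k++) {
    print("$k: $tally[$k]\n");
}
</source>

If int(rand($n)) ever produced $n itself, the tally array would grow to five elements, which makes an off-by-one error easy to spot.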
− |
| |
− | ====7. anastring====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | my $count = 10; # controls number of anagrams to print
| |
− |
| |
− | my $string = <STDIN>; # retrieve one string
| |
− | chomp($string); # if we would not chomp(), we would
| |
− | # swap the linefeed into our anagrams
| |
− | my $len = length($string);
| |
− | my $pos; # a variable to store a random
| |
− | # position in the string
| |
− |
| |
− | for (my $i=0; $i < $count; $i++) { # for desired number of anagrams ...
| |
− | for (my $j=0; $j < $len; $j++) { # for every character in string ...
| |
− | $pos = int(rand($len)); # calculate random position in string
| |
− | while ($len > 1 and $pos == $j) { # if this is the same integer as $j (and another position exists) ...
| |
− | $pos = int(rand($len)); # ... try again
| |
− | }
| |
− |
| |
− | my $tmp = substr($string, $j, 1); # store character j
| |
− | substr($string, $j, 1) = substr($string, $pos, 1); # swap pos to j
| |
− | substr($string, $pos, 1) = $tmp; # swap tmp to pos
| |
− | }
| |
− | print($string, "\n"); # print the randomized string
| |
− | }
| |
− |
| |
− | exit ();
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− | ====8. anastring, with commandline argument====
| |
− |
| |
− | Simply change:
| |
− |
| |
− | <source lang="perl">
| |
− | my $count = 10;
| |
− | </source>
| |
− |
| |
− | to:
| |
− |
| |
− | <source lang="perl">
| |
− | my $count = 1;
| |
− | if (defined($ARGV[0]) ) { $count = $ARGV[0] };
| |
− | </source>
| |
− |
| |
− |
| |
− | Then use like (for example)
| |
− |
| |
− | $ anastring.pl 100 < test.txt
| |
− |
| |
− | assuming you have a file named test.txt with the contents you want to randomize, or
| |
− |
| |
− | $ echo "acdefghiklmnpqrstvwy" | anastring.pl 100
| |
− |
| |
− |
| |
− | ====9. sort.pl====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | my $index = 0; # initialize a variable to use as index to array
| |
− | my $currentInput; # stores input values from STDIN
| |
− | my @arrayOfStrings; # array of strings in their original order
| |
− | my @sortedArray; # array of strings sorted in alphabetical order
| |
− |
| |
− | while ($currentInput = <STDIN>) { # Retrieve strings from STDIN
| |
− | $arrayOfStrings[$index] = $currentInput; # Store in array
| |
− | $index++; # increment index
| |
− | }
| |
− |
| |
− | @sortedArray = sort(@arrayOfStrings); # Sort array fields
| |
− |
| |
− | for (my $i = 0; $i < $index; $i++) { # Print all array fields
| |
− | print($sortedArray[$i]);
| |
− | }
| |
− |
| |
− | exit();
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | ====10. fastaParser.pl====
| |
− |
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | # Declare variables
| |
− | my $line; # current line being read in from STDIN
| |
− | my $char; # store first character in the line (to test for ">")
| |
− | my @oneLetterLine; # split each line of FASTA file into individual letters
| |
− | my %oneToThree; # hash that stores mappings from one-letter amino acid
| |
− | # code to three-letter amino acid code
| |
− |
| |
− | # Initialize the hash mapping
| |
− | mapOneToThree();
| |
− |
| |
− |
| |
− | while ($line = <STDIN>) { # Read input (FASTA format) line by line
| |
− | $line = uc($line); # Translate to uppercase characters
| |
− | $char = substr($line, 0, 1); # Extract first character
| |
− |
| |
− | if ($char ne ">") { # only if it's not the title line ...
| |
− | chomp($line); # chomp off the newline character
| |
− |
| |
− | # store each character in the line as an element in an array
| |
− | @oneLetterLine = split(//,$line);
| |
− |
| |
− | # get the size of the array (since arrays start at index 0,
| |
− | # the size of the array is the last array index plus 1)
| |
− | my $arraySize = $#oneLetterLine + 1;
| |
− |
| |
− | # print three-letter amino acid code mapping
| |
− | for (my $i = 0; $i < $arraySize; $i++) {
| |
− | print($oneToThree{$oneLetterLine[$i]}, " ");
| |
− |
| |
− | } # end for
| |
− | } # end if
| |
− | } # end while
| |
− |
| |
− | print("\n"); # Print final newline character
| |
− |
| |
− | exit(); # Exit the program
| |
− |
| |
− | # =================================================================
| |
− | # Subroutine to generate the hash that maps one-letter amino acid
| |
− | # code to three-letter amino acid code
| |
− | sub mapOneToThree {
| |
− |
| |
− | $oneToThree{'A'} = 'Ala';
| |
− | $oneToThree{'C'} = 'Cys';
| |
− | $oneToThree{'D'} = 'Asp';
| |
− | $oneToThree{'E'} = 'Glu';
| |
− | $oneToThree{'F'} = 'Phe';
| |
− | $oneToThree{'G'} = 'Gly';
| |
− | $oneToThree{'H'} = 'His';
| |
− | $oneToThree{'I'} = 'Ile';
| |
− | $oneToThree{'K'} = 'Lys';
| |
− | $oneToThree{'L'} = 'Leu';
| |
− | $oneToThree{'M'} = 'Met';
| |
− | $oneToThree{'N'} = 'Asn';
| |
− | $oneToThree{'P'} = 'Pro';
| |
− | $oneToThree{'Q'} = 'Gln';
| |
− | $oneToThree{'R'} = 'Arg';
| |
− | $oneToThree{'S'} = 'Ser';
| |
− | $oneToThree{'T'} = 'Thr';
| |
− | $oneToThree{'V'} = 'Val';
| |
− | $oneToThree{'W'} = 'Trp';
| |
− | $oneToThree{'Y'} = 'Tyr';
| |
− |
| |
− | } # end sub
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− | ====11. factorial.pl====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | my $number = <STDIN>; # number is read in from STDIN
| |
− | chomp($number); # remove newline
| |
− | print(fact($number),"\n"); # print statement calls subroutine fact()
| |
− | exit();
| |
− |
| |
− | # ========================================================
| |
− | sub fact {
| |
− |
| |
− | my ($n) = @_; # receive argument via @_
| |
− | my $factorial = 1;
| |
− |
| |
− | if ($n < 0) {
| |
− | die("panic: fact($n) negative factorial is undefined. ");
| |
− | } elsif ($n == 0 or $n == 1) {
| |
− | return 1;
| |
− | } else {
| |
− | for (my $i = 2; $i <= $n; $i++) {
| |
− | $factorial = $factorial * $i;
| |
− | }
| |
− | } # end if
| |
− |
| |
− | return $factorial;
| |
− | } # end sub
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | ====12. factRecurse.pl====
| |
− |
| |
− | <source lang="perl">
| |
− | #!/usr/bin/perl
| |
− | use warnings;
| |
− | use strict;
| |
− |
| |
− | my $number = <STDIN>; # number is read in from STDIN
| |
− | chomp($number); # remove newline
| |
− | print(fact($number),"\n"); # print statement calls subroutine fact()
| |
− | exit();
| |
− |
| |
− | # ========================================================
| |
− | sub fact {
| |
− |
| |
− | my ($n) = @_;
| |
− |
| |
− | if ($n < 0) {
| |
− | die("panic: fact($n) negative factorial is undefined. ");
| |
− | } elsif ($n == 0 or $n == 1) {
| |
− | return(1);
| |
− | } else {
| |
− | return( $n * fact($n-1) ); # recursive: subroutine calls itself
| |
− | }
| |
− | } # end sub
| |
− | </source>
| |
− |
| |
− |
| |
− |
| |
− |
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | <!--
| |
− | ==Exercises==
| |
− | <section begin=exercises />
| |
− | <section end=exercises />
| |
− |
| |
− |
| |
− |
| |
| ==Notes== | | ==Notes== |
| <references /> | | <references /> |