RPR-Functions
R Functions
Keywords: Anatomy of a function; arguments, parameters and values; the concept of functional programming
Contents
This unit is under development. There is some content here, but it is incomplete and/or may change significantly: links may lead nowhere, the contents are likely to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.
Abstract
...
This unit ...
Prerequisites
You need to complete the following units before beginning this one:
Objectives
...
Outcomes
...
Deliverables
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation
Evaluation: NA
- This unit is not evaluated for course marks.
Contents
Functions
R is considered an (impure) functional programming language and thus the focus of R programs is on functions. The key advantage is that this encourages programming without side-effects, which makes it easier to reason about the correctness of programs. The arguments passed in a function call are bound to the function's parameters[1] for use inside the function, and a single result is returned[2]. The return value can either be assigned to a variable, or used directly as the argument of another function - intermediate assignment is not required.
Functions are either built-in (i.e. available in the basic R installation), loaded via specific packages (see above), or they can be easily defined by you (see below). In general a function is invoked through a name, followed by one or more arguments in parentheses, separated by commas. Whenever I refer to a function, I write the parentheses to identify it as such and not a constant or other keyword, e.g. log(). Here are some examples for you to try and play with:
cos(pi) #"pi" is a predefined constant.
sin(pi) # Note the rounding error. This number is not really different from zero.
sin(30 * pi/180) # Trigonometric functions use radians as their argument - this conversion calculates sin(30 degrees)
exp(1) # "e" is not predefined, but easy to calculate.
log(exp(1)) # functions can be arguments to functions - they are evaluated from the inside out.
log(10000) / log(10) # log() calculates natural logarithms; convert to any base by dividing by the log of the base. Here: log to base 10.
exp(complex(r=0, i=pi)) #Euler's identity
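The two ways of using a return value that were described above can be compared side by side. This is only a minimal sketch; the variable name x is chosen for illustration.

```r
# Variant 1: store the return value in a variable, then pass it on.
x <- exp(1)          # x now holds Euler's number e
log(x)               # natural logarithm of the stored value

# Variant 2: nest the calls; the inner function is evaluated first.
log(exp(1))

# Both variants compute the same thing.
identical(log(x), log(exp(1)))

# log() also accepts the base as a second argument ...
log(10000, base = 10)
# ... and log10() is a built-in shortcut for base 10.
log10(10000)
```

Nesting saves a variable name, while the intermediate assignment can make long expressions easier to read and to debug.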
There are several ways to populate the argument list for a function and R makes a reasonable guess what you want to do. Arguments can either be used in their predefined order, or assigned via an argument name. Let's look at the complex() function to illustrate this. Consider the specification of a complex number in Euler's identity above. The function complex() can work with a number of arguments that are given in the documentation (see: ?complex). These include length.out, real, imaginary, and some more. The length.out argument creates a vector with one or more complex numbers. If nothing else is specified, this will be a vector of complex zero(s). If there are two, or three arguments, they will be placed in the respective slots. However, since the arguments are named, we can also define which slot of the argument list they should populate. Consider the following to illustrate this:
complex(1)
complex(4)
complex(1, 2) # imaginary part missing: if it's missing it defaults to zero
complex(1, 2, 3) # one complex number
complex(4, 2, 3) # four complex numbers
complex(real = 0, imaginary = pi) # defining values via named parameters
complex(imaginary = pi, real = 0) # same thing - if names are used, order is not important
complex(re = 0, im = pi) # names can be abbreviated ...
complex(r = 0, i = pi) # ... to the shortest string that is unique among the named parameters.
# I strongly advise against this to keep your code readable for others.
complex(i = pi, 1, 0) # Think: what have I done here? Why does this work?
exp(complex(i = pi, 1, 0)) # (The complex number above is the same as in Euler's identity.)
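If you are unsure which names and which order a function expects, you can ask R directly. The following sketch uses args() and formals() on complex(); the variable names paramNames, a and b are only illustrative.

```r
# args() prints the parameter list of a function together with its defaults.
args(complex)

# names(formals()) returns just the parameter names, in their defined order.
paramNames <- names(formals(complex))
paramNames

# A call by position and a call by name that fill the same slots
# produce the same value.
a <- complex(1, 0, pi)                                    # by position
b <- complex(length.out = 1, real = 0, imaginary = pi)    # by name
identical(a, b)
```

Looking at the order of the parameter names is a good starting point for working out how R distributes unnamed arguments over the slots that are still free.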
Task:
A frequently used function is seq().
- Read the help page about seq().
- Use seq() to generate a sequence of integers from -5 to 3. Pass arguments in default order, don't use argument names.
- Use seq() to generate a sequence of numbers from -2 to 2 in intervals of 1/3. This time, use argument names.
- Use seq() to generate a sequence of 30 numbers between 1 and 100. Pass the arguments in the following order: length.out, to, from.
Writing your own functions
R is a "functional programming language" and most if not all serious work will involve writing your own functions. This is easy and gives you access to flexible, powerful and reusable solutions. You have to understand the "anatomy" of an R function however.
- Functions are assigned to function names. They are treated like any other R object and you can have vectors of functions, and functions that return functions etc.
- Data gets into the function via the function's parameters.
- Data is returned from a function via the return() statement[3]. One and only one object is returned. However the object can be a list, and thus contain values of arbitrary complexity. This is called the "value" of the function. Well-written functions have no side-effects like changing global variables.
# defining the function:
myFunction <- function(<myParameters>) {
    result <- <do something with my parameters>
    return(result)
}
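The points above can be sketched with a few small functions. All names used here (describe, myFunctions, makeMultiplier, triple) are hypothetical and only serve as illustration.

```r
# A function that returns several values bundled into one list.
describe <- function(v) {
    result <- list(n = length(v), mean = mean(v), range = range(v))
    return(result)
}
d <- describe(c(2, 4, 9))
d$n        # number of elements
d$mean     # arithmetic mean
d$range    # smallest and largest value

# Functions are objects: store two of them in a list and call each in turn.
myFunctions <- list(smallest = min, largest = max)
for (f in myFunctions) {
    print(f(c(2, 4, 9)))
}

# A function that returns a function.
makeMultiplier <- function(k) {
    multiplier <- function(x) {
        return(k * x)
    }
    return(multiplier)
}
triple <- makeMultiplier(3)
triple(5)
```

Returning a list is the usual way to hand back more than one value, since only a single object can be returned.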
Task:
Quick Exercise
This exercise is similar to the while loop exercise. The only difference is that the code is put into a function. Write a function rocketShip(n) so that you can start the countdown call from any number. For example, if rocketShip() counts down from 7, the output would be:
[1] "7" "6" "5" "4" "3" "2" "1" "0" "Blast Off!"
Solution:
rocketShip <- function(n) {
    start <- n
    call <- c(start)
    countdown <- start
    while (countdown > 0) {
        countdown <- countdown - 1
        call <- c(call, countdown)
    }
    call <- c(call, "Blast Off!")
    return(call)
}
rocketShip(7)
The scope of functions is local: this means all variables within a function are lost upon return, and global variables are not overwritten by a definition within a function. However variables that are defined outside the function are also available inside.
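A minimal sketch of these scoping rules follows. The names scopeDemo and localOnly are made up for this illustration.

```r
x <- 10                       # defined outside the function ("global")

scopeDemo <- function() {
    localOnly <- x + 1        # x is found outside the function and can be read
    x <- 99                   # this creates a new, local x; the global x is untouched
    return(c(x, localOnly))
}

scopeDemo()                   # 99 11
x                             # still 10
exists("localOnly")           # FALSE: the local variable was lost upon return
```

Although reading x from outside works, passing such values in as arguments keeps a function easier to test and to reuse.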
Here is a simple example: a function that takes a binomial species name as input and creates a five-letter code as output:
biCode <- function(s) {
    substr(s, 4, 5) <- substr(strsplit(s, "\\s+")[[1]][2], 1, 2)
    return(toupper(substr(s, 1, 5)))
}
biCode("Homo sapiens") # HOMSA
biCode("saccharomyces cerevisiae") # SACCE
We can use loops and control structures inside functions. For example the following creates a vector containing n Fibonacci numbers.
fibSeq <- function(n) {
    if (n < 1) { return( 0 ) }
    else if (n == 1) { return( 1 ) }
    else if (n == 2) { return( c(1, 1) ) }
    else {
        v <- numeric(n)
        v[1] <- 1
        v[2] <- 1
        for ( i in 3:n ) {
            v[i] <- v[i-2] + v[i-1]   # each element is the sum of the two before it
        }
        return( v )
    }
}
The function template looks like:
<name> <- function (<parameters>) {
<statements>
}
In this statement, the function is assigned to the name - any valid name in R. Once it is assigned, the function can be invoked with name(). The parameter list (the names we write into the parentheses following the function name) can be empty, or hold a list of variable names. If variable names are present, you need to supply the corresponding arguments when you execute the function. These variables are available inside the function, and can be used for computations. This is called "passing the variable into the function".
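To illustrate an empty and a non-empty parameter list, here is a small sketch. The function names sayHello and power are hypothetical. The default value for exponent is an addition that the template above does not show.

```r
# An empty parameter list: the function is called without arguments.
sayHello <- function() {
    return("Hello!")
}
sayHello()

# Two parameters; the second one has a default value.
power <- function(base, exponent = 2) {
    return(base^exponent)
}
power(3)                        # exponent falls back to its default: 9
power(3, 3)                     # 27
power(exponent = 3, base = 2)   # named arguments, any order: 8

# Leaving out an argument for a parameter without a default is an error
# as soon as the function tries to use it.
tryCatch(power(), error = function(e) conditionMessage(e))
```

Default values keep the common case short while still allowing the caller to override them.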
You have encountered a function to choose YFO names. In this function, your Student ID was the parameter. Here is another example to play with: a function that calculates how old you are. In days. This is neat - you can celebrate your 10,000th lifeday - or so.
Task:
Copy, explore and run ...
- Define the function ...
# A lifedays calculator function
myLifeDays <- function(date = NULL) { # give "date" a default value so we can test whether it has been set
    if (is.null(date)) {
        print("Enter your birthday as a string in \"YYYY-MM-DD\" format.")
        return(invisible(NULL)) # nothing to compute: return NULL without printing it
    }
    x <- strptime(date, "%Y-%m-%d") # convert string to time
    y <- format(Sys.time(), "%Y-%m-%d") # format "now" as a date string (drops the time of day)
    diff <- round(as.numeric(difftime(y, x, units = "days")))
    print(paste("This date was", diff, "days ago."))
}
- Use the function (example)
myLifeDays("1932-09-25") # Glenn Gould's birthday
Here is a good opportunity to play and practice programming: modify this function to accept a second argument. When a second argument is present (e.g. 10000) the function should print the calendar date on which the input date will be that number of days ago. Then you could use it to know when to celebrate your 10,000th lifeDay, or your 777th anniversary day or whatever.
Further reading, links and resources
Notes
- ↑ The terms parameter and argument have similar but distinct meanings. A parameter is an item that appears in the function definition, an argument is the actual value that is passed into the function.
- ↑ However a function may have side-effects, such as writing something to console, plotting graphics, saving data to a file, or changing the value of variables outside the function scope. Avoid the latter, it is fragile and poor practice.
- ↑ Actually, the return() statement is optional: if it is missing, the result of the last expression is returned. I consider it poor practice to omit return(), since this gives rise to error-prone code.
Self-evaluation
If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2017-08-05
Version:
- 0.1
Version history:
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.