RPR-Syntax basics
Basics of R syntax
Keywords: Simple commands and basic syntax, operators, variables, class, mode, and attributes
This unit is under development. There is some content here, but it is incomplete and/or may change significantly: links may lead nowhere, the content is likely to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.
Abstract
...
This unit ...
Prerequisites
You need to complete the following units before beginning this one:
Objectives
...
Outcomes
...
Deliverables
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your course journal.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation
Evaluation: NA
- This unit is not evaluated for course marks.
Contents
Simple commands
The R command line evaluates expressions. Expressions can contain constants, variables, operators and functions of the various datatypes that R recognizes.
Operators
The common arithmetic operators are recognized in the usual way.
Task:
Try the following operators on numbers:
5
5 + 3
5 + 1 / 2 # Think first: is this 3 or 5.5?
3 * 2 + 1
3 * (2 + 1)
2^3 # Exponentiation
8 ^ (1/3) # Cube root via exponentiation
7 %% 2 # Modulo operation (remainder of integer division)
7 %/% 2 # Integer division
# Logical operators return TRUE or FALSE
# Unary:
! TRUE
! FALSE
# Binary
1 == 2
1 != 2
1 < 2
1 > 2
1 > 1
1 >= 1
1 < 1
1 <= 1
# & AND
TRUE & TRUE
TRUE & FALSE
FALSE & FALSE
# | OR
TRUE | TRUE
TRUE | FALSE
FALSE | FALSE
# Predict what this will return
!(FALSE | (! FALSE))
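Two points from the operators above can be checked directly. The following is a minimal sketch: the variable names (quotient, remainder, recovered, negSquare, posSquare) are made up for illustration. It shows that integer division and modulo complement each other, and that exponentiation is evaluated before a leading minus sign.

```r
# Integer division and modulo are complementary:
# (a %/% b) * b + (a %% b) gives back a.
a <- 7
b <- 2
quotient  <- a %/% b                   # 3
remainder <- a %% b                    # 1
recovered <- quotient * b + remainder  # 7

# Exponentiation binds more tightly than the unary minus.
negSquare <- -2^2      # -4, evaluated as -(2^2)
posSquare <- (-2)^2    #  4, parentheses force the negation first
```

When in doubt about the order of evaluation, add parentheses: they cost nothing and make the intent explicit.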
Task:
Exercise
Given the expressions shown below, the value of lastNum is 9:
numbers <- c(16, 20, 3, 5, 9)
lastNum <- tail(numbers, 1)
Write R expressions:
1) To check whether lastNum is under 6 or greater than 10
2) To check whether lastNum is between 10 and 20, including 10 and excluding 20
3) To output TRUE if the condition from 1) holds when lastNum is replaced by lastNum plus 1, divided by 2 (Hint: use parentheses)
Solutions:
1) lastNum < 6 | lastNum > 10
2) lastNum >= 10 & lastNum < 20
3) ((lastNum + 1) / 2) < 6 | ((lastNum + 1) / 2) > 10
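To see what the solutions return for lastNum equal to 9, they can be evaluated step by step. This is only a sketch: the names check1, check2, check3 and halved are introduced here for illustration and are not part of the exercise.

```r
numbers <- c(16, 20, 3, 5, 9)
lastNum <- tail(numbers, 1)              # 9

check1 <- lastNum < 6 | lastNum > 10     # FALSE: 9 is neither under 6 nor over 10
check2 <- lastNum >= 10 & lastNum < 20   # FALSE: 9 is below 10

halved <- (lastNum + 1) / 2              # 5
check3 <- halved < 6 | halved > 10       # TRUE: 5 is under 6
```

Storing the intermediate value in halved avoids repeating the parenthesized expression and makes the third solution easier to read.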
Variables
To store the results of expressions and computations, you can assign them to variables. R creates a variable the first time you assign to it (i.e. space in memory is allocated to the variable and a value is stored in that space). Variable names are case sensitive. There are a small number of reserved names that you are not allowed to redefine, and a very small number of predefined constants, such as pi. However, these constants can be overwritten, so be careful: R will allow you to define pi <- 3, but casually redefining the foundations of mathematics may lead to unintended consequences. Read more about variable names at:
?make.names
?reserved
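The following sketch illustrates the caution about predefined constants. Assigning to pi creates a new variable that masks the built-in constant rather than changing it, so the original value can be recovered. It also shows how make.names() turns invalid strings into syntactically valid names. The variable names (maskedValue, trueValue, restoredValue, validNames) are made up for this example.

```r
pi <- 3                    # R allows this: a new variable now masks the constant
maskedValue <- pi          # 3
trueValue <- base::pi      # the built-in constant is still available in base
rm(pi)                     # remove the masking variable again
restoredValue <- pi        # back to 3.141593...

# make.names() converts arbitrary strings to syntactically valid names:
# invalid characters become ".", a leading digit gets an "X" prefix,
# and reserved words get a "." appended.
validNames <- make.names(c("my var", "2fast", "if"))
# "my.var" "X2fast" "if."
```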
To assign a value to a variable, use the assignment operator <-. This is the default way of assigning values in R. You could also use the = sign, but there are subtle differences (see: ?"<-"). There is a variant of the assignment operator, <<-, which is sometimes used inside functions. It assigns to a variable in an enclosing environment, typically the global one. This is possible, but not preferred, since it makes the function produce a side effect.
a <- 5
a
a + 3
b <- 8
b
a + b
a == b # not assignment: equality test
a != b # not equal
a < b # less than
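The difference between <- and <<- inside a function can be sketched as follows. The names counter, incrementLocal and incrementOuter are hypothetical and chosen only for this illustration.

```r
counter <- 0

incrementLocal <- function() {
  counter <- counter + 1    # creates a local variable; the outer counter is untouched
  counter
}

incrementOuter <- function() {
  counter <<- counter + 1   # changes counter in the enclosing environment: a side effect
  invisible(counter)
}

localResult <- incrementLocal()   # 1
afterLocal  <- counter            # still 0
incrementOuter()
afterOuter  <- counter            # now 1
```

The second function changes state outside itself without this being visible at the call site, which is why returning a value and assigning it with <- is usually the clearer design.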
Note that all of R's data types (as well as functions and other objects) can be assigned to variables.
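As a brief sketch of this point, the same assignment operator works for numbers, strings, logical values, vectors and functions alike. All variable names below are made up for illustration.

```r
aNumber   <- 42.5
aString   <- "hello"
aLogical  <- TRUE
aVector   <- c(1, 2, 3)
aFunction <- sqrt               # functions are objects too

rootOfNine <- aFunction(9)      # 3: the variable can be called like the original function

# class() reports what kind of object a variable currently holds
classes <- c(class(aNumber), class(aString), class(aLogical),
             class(aVector), class(aFunction))
# "numeric" "character" "logical" "numeric" "function"
```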
There are very few syntactic restrictions on variable names (discussed e.g. here), but this does not mean esoteric names are good. For the sake of your sanity, use names that express the meaning of the variable, and that are unique. Many R developers use dotted.variable.names, some people use the pothole_style, my personal preference is to write camelCaseNames. And while the single letters c f n s Q are syntactically valid variable names, they coincide with commands of the debugger's browser: while you are debugging, typing them will execute debugger commands rather than display variable values. Finally, try not to use variable names that coincide with parameter names in functions. Alas, you see this often in code, but such code can be hard to read because the distinction between the actual argument and the parameter name becomes obscured. It's just common sense really: don't call different things by the same name.
# I don't like...
col <- c("red", "grey")
hist(rnorm(200), col=col)
# I prefer instead something like...
rgStripes <- c("red", "grey")
hist(rnorm(200), col=rgStripes)
Further reading, links and resources
Notes
Self-evaluation
If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2017-08-05
Version:
- 0.1
Version history:
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.