Difference between revisions of "RPR-Syntax basics"
m |
m |
||
(7 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
− | <div id=" | + | <div id="ABC"> |
− | + | <div style="padding:5px; border:1px solid #000000; background-color:#b3dbce; font-size:300%; font-weight:400; color: #000000; width:100%;"> | |
Basics of R syntax | Basics of R syntax | ||
− | + | <div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#b3dbce; font-size:30%; font-weight:200; color: #000000; "> | |
− | + | (Simple commands and basic syntax, operators, variables, class, mode, and attributes) | |
− | + | </div> | |
− | Simple commands and basic syntax, operators, variables, class, mode, and attributes | ||
</div> | </div> | ||
− | {{ | + | {{Smallvspace}} |
− | < | + | <div style="padding:5px; border:1px solid #000000; background-color:#b3dbce33; font-size:85%;"> |
− | <div | + | <div style="font-size:118%;"> |
− | + | <b>Abstract:</b><br /> | |
<section begin=abstract /> | <section begin=abstract /> | ||
− | + | This unit discusses simple R commands and basic syntax, operators, and R objects (variables). | |
− | |||
<section end=abstract /> | <section end=abstract /> | ||
+ | </div> | ||
+ | <!-- ============================ --> | ||
+ | <hr> | ||
+ | <table> | ||
+ | <tr> | ||
+ | <td style="padding:10px;"> | ||
+ | <b>Objectives:</b><br /> | ||
+ | This unit will ... | ||
+ | * ... introduce basic operations of R syntax; | ||
+ | * ... provide examples for use of operators; | ||
+ | * ... discuss variable names. | ||
+ | </td> | ||
+ | <td style="padding:10px;"> | ||
+ | <b>Outcomes:</b><br /> | ||
+ | After working through this unit you ... | ||
+ | * ... can evaluate R expressions by typing them on the console; | ||
+ | * ... know how to write and debug complex R expressions that are deeply nested with parentheses; | ||
+ | * ... are able to avoid common issues when choosing variable names. | ||
+ | </td> | ||
+ | </tr> | ||
+ | </table> | ||
+ | <!-- ============================ --> | ||
+ | <hr> | ||
+ | <b>Deliverables:</b><br /> | ||
+ | <section begin=deliverables /> | ||
+ | <li><b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.</li> | ||
+ | <li><b>Journal</b>: Document your progress in your [[FND-Journal|Course Journal]]. Some tasks may ask you to include specific items in your journal. Don't overlook these.</li> | ||
+ | <li><b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|'''insights!''' page]].</li> | ||
+ | <section end=deliverables /> | ||
+ | <!-- ============================ --> | ||
+ | <hr> | ||
+ | <section begin=prerequisites /> | ||
+ | <b>Prerequisites:</b><br /> | ||
+ | This unit builds on material covered in the following prerequisite units:<br /> | ||
+ | *[[RPR-Console|RPR-Console (Console and scripts)]] | ||
+ | *[[RPR-Help|RPR-Help (Getting help for R)]] | ||
+ | *[[RPR-Installation|RPR-Installation (Installing R and RStudio)]] | ||
+ | *[[RPR-Setup|RPR-Setup (Setup R to work with it)]] | ||
+ | <section end=prerequisites /> | ||
+ | <!-- ============================ --> | ||
+ | </div> | ||
− | {{ | + | {{Smallvspace}} |
− | {{ | + | {{Smallvspace}} |
− | + | __TOC__ | |
{{Vspace}} | {{Vspace}} | ||
Line 75: | Line 71: | ||
=== Evaluation === | === Evaluation === | ||
− | |||
− | |||
<b>Evaluation: NA</b><br /> | <b>Evaluation: NA</b><br /> | ||
− | :This unit is not evaluated for course marks. | + | <div style="margin-left: 2rem;">This unit is not evaluated for course marks.</div> |
+ | == Contents == | ||
− | + | Unlike most other work in this course or workshop, the syntax units are not condensed into an R project. You should actually type the code and execute it in the console. This is quite important at the beginning, to develop your recognition of patterns in the code, and the "muscle memory" of typing it correctly. Take this seriously: it will pay off tremendously in the future. |
==Simple commands== | ==Simple commands== | ||
− | The '''R''' command line evaluates expressions. Expressions can contain constants, variables, operators and functions of the various datatypes that '''R''' recognizes. | + | The '''R''' command line evaluates '''expressions'''. Expressions can contain constants, variables, operators and functions of the various datatypes that '''R''' recognizes. |
{{Vspace}} | {{Vspace}} | ||
Line 102: | Line 92: | ||
{{task|1= | {{task|1= | ||
− | + | Open an RStudio session and try the following operators on numbers: | |
− | < | + | <pre> |
5 | 5 | ||
5 + 3 | 5 + 3 | ||
− | 5 + 1 / 2 # Think first is this 3 or 5.5 | + | 5 + 1 / 2 # Think first: is this 3 or 5.5 |
3 * 2 + 1 | 3 * 2 + 1 | ||
3 * (2 + 1) | 3 * (2 + 1) | ||
Line 117: | Line 107: | ||
# Logical operators return TRUE or FALSE | # Logical operators return TRUE or FALSE | ||
# Unary: | # Unary: | ||
− | ! TRUE | + | TRUE |
+ | FALSE | ||
+ | ! TRUE # read carefully: the "!" (meaning "not") is easily overlooked | ||
! FALSE | ! FALSE | ||
− | # Binary | + | # Binary operators |
1 == 2 | 1 == 2 | ||
Line 132: | Line 124: | ||
1 <= 1 | 1 <= 1 | ||
− | # & AND | + | # & (means AND) |
TRUE & TRUE | TRUE & TRUE | ||
TRUE & FALSE | TRUE & FALSE | ||
FALSE & FALSE | FALSE & FALSE | ||
− | # | OR | + | # | (means OR) |
TRUE | TRUE | TRUE | TRUE | ||
TRUE | FALSE | TRUE | FALSE | ||
Line 145: | Line 137: | ||
!(FALSE | (! FALSE)) | !(FALSE | (! FALSE)) | ||
− | </ | + | </pre> |
}} | }} | ||
Line 151: | Line 143: | ||
{{task|1= | {{task|1= | ||
− | + | ;Practice | |
+ | :Given the expression shown below, the value of lastNum is 9: | ||
− | + | <pre> | |
+ | numbers <- c(16, 20, 3, 5, 9) # the c() function collects elements into a vector | ||
+ | numbers | ||
− | + | lastNum <- tail(numbers, 1) # explain what this does | |
− | + | lastNum | |
− | lastNum <- tail(numbers, 1) | ||
− | </ | + | |
+ | # Note: expressions in parentheses: | ||
+ | # when we assign, e.g. ... | ||
+ | numbers <- sample(1:20, 5) | ||
+ | # ... we can get the value of the vector "numbers" with ... | ||
+ | print(numbers) | ||
+ | # ... or just ... | ||
+ | numbers | ||
+ | |||
+ | # But we can also put the entire expression in parentheses, and when it is | ||
+ | # evaluated, which results in the assignment, the value is also printed. | ||
+ | (numbers <- sample(1:20, 5)) | ||
+ | # so: when you see parentheses around an entire expression, remember that all | ||
+ | # the parentheses do is to perform some evaluation, and then print the | ||
+ | # resulting object. I use this idiom a lot for compactness in teaching code. | ||
+ | # In general, you usually don't need this in scripts that you develop, but for | ||
+ | # teaching I often need you to study the contents of a variable. | ||
+ | |||
+ | |||
+ | |||
+ | </pre> | ||
Write R expressions: | Write R expressions: | ||
− | 1) To check whether lastNum is | + | 1) To check whether lastNum is less than 6 or greater than 10 |
− | 2) To check whether lastNum is | + | 2) To check whether lastNum is in the interval [10, 20). (By the rules of mathematical notation this means 10 is included but 20 is not). |
− | 3) To output TRUE if lastNum | + | 3) To output TRUE if the following operation gives 2: |
+ | * take lastNum | ||
+ | * divide it by 7 | ||
+ | * subtract the integer part and the first digit after the decimal point (hint: multiply by 10, then integer division by 1 gives you ... what) | ||
+ | * multiply by 100 | ||
+ | * integer divide by 1 | ||
+ | * take the third root | ||
+ | (Hints: use lots of parentheses and compare the final result to 2. To debug, select parts of the code and execute separately. If the console gets stuck because it is expecting a closing parenthesis, and all you see is the "+" sign, simply press <escape> to abort evaluation.) | ||
− | + | <div class="toccolours mw-collapsible mw-collapsed" style="width:800px"> | |
− | < | + | Check your answers (but don't cheat) ... |
+ | <div class="mw-collapsible-content"> | ||
+ | <pre> | ||
1) lastNum < 6 | lastNum > 10 | 1) lastNum < 6 | lastNum > 10 | ||
2) lastNum >= 10 & lastNum < 20 | 2) lastNum >= 10 & lastNum < 20 | ||
− | 3) (( | + | 3) ((((9/7) - ((((9/7) * 10) %/% 1 )/10)) * 100) %/% 1 )^(1/3) == 2 |
+ | |||
+ | </pre> | ||
+ | |||
+ | </div> | ||
+ | </div> | ||
− | |||
}} | }} | ||
Line 181: | Line 208: | ||
===Variables=== | ===Variables=== | ||
+ | In order to store the results of expressions and computations, you can freely assign them to variables<ref>We call these "variables" because of the function they perform in our code; they actually '''are''' R "objects".</ref>. Variables are created by '''R''' whenever you first assign to them (''i.e.'' space in memory is allocated to the variable and a value is stored in that space). Variable names distinguish between upper case and lower case letters. There are a small number of reserved names that you are not allowed to redefine, and R syntax contains a very small number of predefined constants, such as <code>pi</code>. However, these constants can be overwritten. Be careful: '''R''' will ''allow'' you to define <code>pi <- 3</code>, but casually redefining the foundations of mathematics may lead to unintended consequences. Read more about variable names at: | ||
− | + | <pre> | |
− | |||
− | < | ||
?make.names | ?make.names | ||
?reserved | ?reserved | ||
− | </ | + | </pre> |
− | To assign a value to a constant, use the assignment operator {{c|<-}}. This is the default way of assigning values in '''R'''. You could also use the <code>=</code> sign, but there are subtle differences. (See: {{c|?"<-"}}). There is a variant of the assignment operator {{c|<<-}} which is sometimes used inside functions. It assigns to a global context. This is possible, but not preferred since it generates a side effect of a function. | + | To assign a value to a variable, use the assignment operator {{c|<-}}. This is the default way of assigning values in '''R'''. You could also use the <code>=</code> sign, but there are subtle differences (see {{c|?"<-"}}). There is a variant of the assignment operator, {{c|<<-}}, which is sometimes used inside functions. It assigns to a variable in an enclosing environment, and ultimately in the global environment if no such variable exists. This is possible, but not preferred, since it gives the function a side effect. Don't do this. Just forget that {{c|<<-}} even exists. |
− | < | + | <pre> |
a <- 5 | a <- 5 | ||
a | a | ||
Line 201: | Line 227: | ||
a != b # not equal | a != b # not equal | ||
a < b # less than | a < b # less than | ||
− | </ | + | </pre> |
Note that '''all''' of '''R''''s data types (as well as functions and other objects) can be assigned to variables. | Note that '''all''' of '''R''''s data types (as well as functions and other objects) can be assigned to variables. | ||
− | There are very few syntactic restrictions on variable names ([http://stackoverflow.com/questions/9195718/variable-name-restrictions-in-r discussed eg. here]) but this does not mean esoteric names are good. For the sake of your sanity, use names that express the meaning of the variable, and that are unique. Many '''R''' developers use {{c|dotted.variable.names}}, some people use the {{c|pothole_style}}, my personal preference is to write {{c| | + | There are very few syntactic restrictions on variable names ([http://stackoverflow.com/questions/9195718/variable-name-restrictions-in-r discussed e.g. here]), but this does not mean esoteric names are good. For the sake of your sanity, use names that express the meaning of the variable, and that are unique. Many '''R''' developers use loquacious {{c|dotted.variable.names}}; some people use the puttering {{c|pothole_style}}; my personal preference is to write noble names in {{c|camelCase}}. And while the single letters {{c|c f n s Q}} are syntactically valid variable names, they coincide with commands for the debugger browser and will execute debugger commands, rather than displaying variable values, when you are debugging. Finally, try not to use variable names that are the same as parameter names in functions. You see this often in code, but such code can be hard to read because the distinction between the actual argument and the parameter name becomes obscured. It's just common sense really: don't call different things by the same name. |
− | < | + | <pre> |
# I don't like... | # I don't like... | ||
− | col <- c(" | + | col <- c("#E5F2FF", "#F5F5F5") |
− | hist(rnorm( | + | hist(rnorm(2000), breaks = 25, col = col) |
# I prefer instead something like... | # I prefer instead something like... | ||
− | + | stripes <- c("#E5F2FF", "#F5F5F5") | |
− | hist(rnorm( | + | hist(rnorm(2000), breaks = 25, col = stripes) |
+ | </pre> | ||
{{Vspace}} | {{Vspace}} | ||
== Self-evaluation == | == Self-evaluation == | ||
− | |||
<!-- | <!-- | ||
=== Question 1=== | === Question 1=== | ||
Line 261: | Line 264: | ||
--> | --> | ||
+ | == Further reading, links and resources == | ||
+ | <!-- {{#pmid: 19957275}} --> | ||
+ | <!-- {{WWW|WWW_GMOD}} --> | ||
+ | <!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> --> | ||
+ | == Notes == | ||
+ | <references /> | ||
{{Vspace}} | {{Vspace}} | ||
<div class="about"> | <div class="about"> | ||
Line 289: | Line 282: | ||
:2017-08-05 | :2017-08-05 | ||
<b>Modified:</b><br /> | <b>Modified:</b><br /> | ||
− | : | + | :2018-05-04 |
<b>Version:</b><br /> | <b>Version:</b><br /> | ||
− | :0.1 | + | :1.0.1 |
<b>Version history:</b><br /> | <b>Version history:</b><br /> | ||
− | *0.1 | + | *1.0.1 Maintenance |
+ | *1.0 Completed to first live version | ||
+ | *0.1 Material collected from previous tutorial | ||
</div> | </div> | ||
− | |||
− | |||
{{CC-BY}} | {{CC-BY}} | ||
+ | [[Category:ABC-units]] | ||
+ | {{UNIT}} | ||
+ | {{LIVE}} | ||
</div> | </div> | ||
<!-- [END] --> | <!-- [END] --> |
Latest revision as of 09:29, 25 September 2020
Basics of R syntax
(Simple commands and basic syntax, operators, variables, class, mode, and attributes)
Abstract:
This unit discusses simple R commands and basic syntax, operators, and R objects (variables).
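Objectives:
This unit will ...
- ... introduce basic operations of R syntax;
- ... provide examples for use of operators;
- ... discuss variable names.
Outcomes:
After working through this unit you ...
- ... can evaluate R expressions by typing them on the console;
- ... know how to write and debug complex R expressions that are deeply nested with parentheses;
- ... are able to avoid common issues when choosing variable names.
Deliverables:
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Prerequisites:
This unit builds on material covered in the following prerequisite units:
- RPR-Console (Console and scripts)
- RPR-Help (Getting help for R)
- RPR-Installation (Installing R and RStudio)
- RPR-Setup (Setup R to work with it)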
Evaluation
Evaluation: NA
This unit is not evaluated for course marks.
Contents
Unlike most other work in this course or workshop, the syntax units are not condensed into an R project. You should actually type the code and execute it in the console. This is quite important at the beginning, to develop your recognition of patterns in the code, and the "muscle memory" of typing it correctly. Take this seriously: it will pay off tremendously in the future.
Simple commands
The R command line evaluates expressions. Expressions can contain constants, variables, operators and functions of the various datatypes that R recognizes.
Operators
The common arithmetic operators are recognized in the usual way.
Task:
Open an RStudio session and try the following operators on numbers:
<pre>
5
5 + 3
5 + 1 / 2   # Think first: is this 3 or 5.5
3 * 2 + 1
3 * (2 + 1)
2^3         # Exponentiation
8 ^ (1/3)   # Third root via exponentiation
7 %% 2      # Modulo operation (remainder of integer division)
7 %/% 2     # Integer division

# Logical operators return TRUE or FALSE
# Unary:
TRUE
FALSE
! TRUE      # read carefully: the "!" (meaning "not") is easily overlooked
! FALSE

# Binary operators
1 == 2
1 != 2
1 < 2
1 > 2
1 > 1
1 >= 1
1 < 1
1 <= 1

# & (means AND)
TRUE & TRUE
TRUE & FALSE
FALSE & FALSE

# | (means OR)
TRUE | TRUE
TRUE | FALSE
FALSE | FALSE

# Predict what this will return
!(FALSE | (! FALSE))
</pre>
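The "is this 3 or 5.5" question above is a question about operator precedence. As a small sketch (the expressions are chosen here for illustration and are not part of the original task), each pair below compares an expression as R parses it with a version in which parentheses force a different order:

```r
# Division binds more tightly than addition
5 + 1 / 2         # evaluated as 5 + (1 / 2)
(5 + 1) / 2       # parentheses force the addition first

# Exponentiation binds more tightly than the unary minus
-2^2              # evaluated as -(2^2)
(-2)^2

# %% and %/% bind more tightly than * and /
7 %% 2 * 3        # evaluated as (7 %% 2) * 3
7 %% (2 * 3)

# "!" binds more tightly than "&"
! TRUE & FALSE    # evaluated as (! TRUE) & FALSE
!(TRUE & FALSE)
```

The full precedence table is on the help page `?Syntax`. When in doubt, add parentheses: they cost nothing and make the intent explicit.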
Task:
- Practice
- Given the expression shown below, the value of lastNum is 9:
<pre>
numbers <- c(16, 20, 3, 5, 9)  # the c() function collects elements into a vector
numbers
lastNum <- tail(numbers, 1)    # explain what this does
lastNum

# Note: expressions in parentheses:
# when we assign, e.g. ...
numbers <- sample(1:20, 5)
# ... we can get the value of the vector "numbers" with ...
print(numbers)
# ... or just ...
numbers

# But we can also put the entire expression in parentheses, and when it is
# evaluated, which results in the assignment, the value is also printed.
(numbers <- sample(1:20, 5))
# So: when you see parentheses around an entire expression, remember that all
# the parentheses do is to perform some evaluation, and then print the
# resulting object. I use this idiom a lot for compactness in teaching code.
# In general, you usually don't need this in scripts that you develop, but for
# teaching I often need you to study the contents of a variable.
</pre>
Write R expressions:
1) To check whether lastNum is less than 6 or greater than 10
2) To check whether lastNum is in the interval [10, 20). (By the rules of mathematical notation this means 10 is included but 20 is not).
3) To output TRUE if the following operation gives 2:
- take lastNum
- divide it by 7
- subtract the integer part and the first digit after the decimal point (hint: multiply by 10, then integer division by 1 gives you ... what)
- multiply by 100
- integer divide by 1
- take the third root
(Hints: use lots of parentheses and compare the final result to 2. To debug, select parts of the code and execute separately. If the console gets stuck because it is expecting a closing parenthesis, and all you see is the "+" sign, simply press <escape> to abort evaluation.)
Check your answers (but don't cheat) ...
<pre>
1)  lastNum < 6 | lastNum > 10
2)  lastNum >= 10 & lastNum < 20
3)  ((((9/7) - ((((9/7) * 10) %/% 1 )/10)) * 100) %/% 1 )^(1/3) == 2
</pre>
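One way to debug a deeply nested expression like the third answer is to evaluate it step by step and look at each intermediate value. The following is only a sketch of that approach: the variable names (`quotient`, `truncated`, `remainder`, `scaled`, `root`) are made up for this illustration and do not appear in the original task.

```r
lastNum <- 9                                # the value given in the task

quotient  <- lastNum / 7                    # divide by 7
truncated <- ((quotient * 10) %/% 1) / 10   # integer part plus first decimal digit
remainder <- quotient - truncated           # what is left after subtracting them
scaled    <- (remainder * 100) %/% 1        # multiply by 100, then integer divide by 1
root      <- scaled^(1/3)                   # third root via exponentiation

root == 2                                   # exact comparison, as in the task
isTRUE(all.equal(root, 2))                  # comparison that tolerates rounding error
```

Typing each variable name on the console shows where a wrong value first appears. The last line is an optional extra: results of floating-point arithmetic are not always exact, and `all.equal()` compares numbers within a small tolerance.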
Variables
In order to store the results of expressions and computations, you can freely assign them to variables[1]. Variables are created by R whenever you first assign to them (i.e. space in memory is allocated to the variable and a value is stored in that space). Variable names distinguish between upper case and lower case letters. There are a small number of reserved names that you are not allowed to redefine, and R syntax contains a very small number of predefined constants, such as pi. However, these constants can be overwritten. Be careful: R will allow you to define pi <- 3, but casually redefining the foundations of mathematics may lead to unintended consequences. Read more about variable names at:
<pre>
?make.names
?reserved
</pre>
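The points above about case sensitivity, valid names, and overwriting constants can be tried out directly. This is a minimal sketch; the names `myValue` and `MyValue` and the example strings are invented for illustration.

```r
# Variable names are case sensitive: these are two different variables
myValue <- 1
MyValue <- 2
myValue == MyValue

# make.names() shows how R would turn arbitrary strings into valid names
make.names(c("my value", "2fast", "ok_name", "TRUE"))

# A predefined constant can be masked by a variable of the same name ...
pi <- 3
pi
# ... and removing that variable makes the original constant visible again
rm(pi)
pi
```

Note that `rm(pi)` only removes the variable you created in your workspace; the constant that comes with R is not affected.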
To assign a value to a variable, use the assignment operator <-. This is the default way of assigning values in R. You could also use the = sign, but there are subtle differences (see ?"<-"). There is a variant of the assignment operator, <<-, which is sometimes used inside functions. It assigns to a variable in an enclosing environment, and ultimately in the global environment if no such variable exists. This is possible, but not preferred, since it gives the function a side effect. Don't do this. Just forget that <<- even exists.
<pre>
a <- 5
a
a + 3
b <- 8
b
a + b
a == b   # not assignment: equality test
a != b   # not equal
a < b    # less than
</pre>
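One of the subtle differences between = and <- shows up inside a function call. The sketch below uses mean() only as a convenient example and assumes a fresh session in which no variable named x exists yet:

```r
# Inside a function call, "=" names an argument; no variable is created
mean(x = c(2, 4, 6))
exists("x", inherits = FALSE)   # FALSE, assuming x was not defined before

# Inside a function call, "<-" first assigns in your workspace,
# then passes the assigned value on to the function
mean(x <- c(2, 4, 6))
x
```

At the top level of a script, `a = 5` and `a <- 5` do the same thing; using <- everywhere avoids having to think about the difference.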
Note that all of R's data types (as well as functions and other objects) can be assigned to variables.
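As a brief illustration of that statement (all variable names here are invented for the example), numbers, strings, logical values, vectors and even functions are assigned in exactly the same way:

```r
count   <- 42                      # a number
label   <- "sample A"              # a character string
isValid <- TRUE                    # a logical value
heights <- c(1.62, 1.75, 1.80)     # a numeric vector
average <- mean                    # a function is an object too

average(heights)                   # calls mean() under its new name

class(count)
class(label)
class(isValid)
class(heights)
class(average)
```

The class() function reports what kind of object a variable currently refers to.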
There are very few syntactic restrictions on variable names (discussed e.g. at http://stackoverflow.com/questions/9195718/variable-name-restrictions-in-r), but this does not mean esoteric names are good. For the sake of your sanity, use names that express the meaning of the variable, and that are unique. Many R developers use loquacious dotted.variable.names; some people use the puttering pothole_style; my personal preference is to write noble names in camelCase. And while the single letters c f n s Q are syntactically valid variable names, they coincide with commands for the debugger browser and will execute debugger commands, rather than displaying variable values, when you are debugging. Finally, try not to use variable names that are the same as parameter names in functions. You see this often in code, but such code can be hard to read because the distinction between the actual argument and the parameter name becomes obscured. It's just common sense really: don't call different things by the same name.
<pre>
# I don't like...
col <- c("#E5F2FF", "#F5F5F5")
hist(rnorm(2000), breaks = 25, col = col)

# I prefer instead something like...
stripes <- c("#E5F2FF", "#F5F5F5")
hist(rnorm(2000), breaks = 25, col = stripes)
</pre>
Self-evaluation
Further reading, links and resources
Notes
- ↑ We call these "variables" because of the function they perform in our code; they actually are R "objects".
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2018-05-04
Version:
- 1.0.1
Version history:
- 1.0.1 Maintenance
- 1.0 Completed to first live version
- 0.1 Material collected from previous tutorial
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.