Expected Preparations:
Keywords: R projects; working with git version control via RStudio; the history mechanism and why not to use it; .Rprofile to customize startup behaviour; the working directory
Objectives:
Outcomes:
Deliverables:
Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don’t overlook these.
Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation: NA: This unit is not evaluated for course marks.
This unit discusses the setup of a working session with RStudio.
Your Course Folder should already exist.
Take note! When you write a Windows path in an R command, you have to use the “wrong” forward slash to separate directories and files. R will translate these “Unix-style” paths into Windows-style paths automatically when it negotiates with the operating system. The backslash, by contrast, is interpreted as an “escape” character that gives the character that follows it a special meaning.1
Folder name and path examples:
/Users/Pierette/Documents/BCB420
◁ Looking good on a Mac.
C:\Users\Pulcinella\Documents\CBW
◁ Looking good on a Windows computer.
"C:/Users/Pulcinella/Documents/CBW"
◁ Looking good inside R on a Windows computer (note the quotation marks!).
C:\Users\Pantalone\Documents\BCH1441 (2017)
◁ Wrong. No special characters please.
/Users/Brighella/Documents/UofT Stuffz/Courses/more/Comp Sys biol. course
◁ Wrong. Please read instructions more carefully.
C:\Users\Tartaglia\Documents\KUWTK\<Coursecode>
◁ I can’t even …
We will make extensive use of “projects” in class. Read more about projects in RStudio here.
We will also make extensive use of version control. In fact, we will now load a project via Git version control from its free, public repository on GitHub.
Task…
Read more about Version Control in RStudio here.
Make sure git is installed on your computer. Then do the following:
Enter https://github.com/hyginn/R_Exercise-BasicSetup as the Repository URL.
Type a <tab> character; the Project directory name field should then autofill to read R_Exercise-BasicSetup.
Once the project has loaded, the console asks you to type init() to begin.
Type init() into the console pane. An R script should load.
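Before you start the task, you may want to confirm from within R that a git executable can be found at all. The following is only a minimal sketch: it relies on the base R function Sys.which(), and the variable name gitPath is an arbitrary choice, not something the BasicSetup project defines.

```r
# Look for a git executable on the search path.
# Sys.which() returns the full path, or an empty string if nothing is found.
gitPath <- Sys.which("git")

if (nzchar(gitPath)) {
  message("git found at: ", gitPath)
} else {
  message("git was not found: install it before you clone the project.")
}
```

If git is not found even though you installed it, RStudio may simply not know where it is; the Version Control documentation linked above describes where to set its location.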
To locate a file on a computer, one has to specify the filename and the directory in which the file is stored; this is also called the path of the file. However, R uses a default “working directory”, which is assumed if no path is specified. This “working directory” for R is either the directory in which the R program has been installed, or some other directory that has been defined in a startup script or set explicitly with the command setwd() at any time. You can execute the command getwd() to list what the Working Directory is currently set to:
> getwd()
[1] "/Users/steipe/R"
In RStudio, the contents of the working directory are listed in the Files Pane (lower-right).
It is convenient to put all your R input and output files into a project-specific directory and then define this to be the “Working Directory”. Use the setwd() command for this. setwd() requires an argument that you type between the parentheses: a string with the directory path, or a variable containing such a string. Strings in R are delimited with " or ' characters. If the directory does not exist, an error will be reported, so make sure you have created the directory first. On Mac and Unix systems, the usual shorthand notation for relative paths can be used: ~ for the home directory, . for the current directory, .. for the parent of the current directory.
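The behaviour described above can be tried safely in a scratch directory. This is a minimal sketch, not part of the course project: the directory names demo_wd and no_such_dir are made up for illustration, and everything happens inside R's temporary directory so your own folders stay untouched.

```r
# Work in a scratch directory so nothing in your real folders changes.
oldWd   <- getwd()                           # remember where we started
scratch <- file.path(tempdir(), "demo_wd")   # hypothetical directory name
dir.create(scratch, showWarnings = FALSE)

setwd(scratch)                   # works: the directory exists
inScratch <- basename(getwd())   # last element of the current path

# A directory that does not exist makes setwd() report an error.
# tryCatch() intercepts it here so that the script can continue.
res <- tryCatch(setwd(file.path(scratch, "no_such_dir")),
                error = function(e) "error")

setwd("..")                      # relative path: up to the parent directory
setwd(oldWd)                     # return to where we started
```

Storing the old working directory first and restoring it at the end is a useful habit whenever a script changes directories.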
If you use a Windows system, you need to know that backslashes ("\") have a special meaning for R: they work as escape characters. For example, the string "\n" means newline, and "\t" means tab. Thus R gets confused when you put backslashes into string literals, such as Windows path names. R has a simple solution: you use forward slashes instead of backslashes when you specify paths, and R will translate them correctly when it talks to your operating system. Instead of C:\documents\projectfiles you write C:/documents/projectfiles. Also note that on Windows the ~ tilde is usually not a shorthand for the user’s home directory: it typically expands to the user’s Documents folder.
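A few lines at the console make the escape mechanism visible. This is a small sketch using only base R; the variable names p1 and p2 are arbitrary, and the path is the example path from the text, not a directory that needs to exist.

```r
cat("a\tb\n")      # \t is printed as a tab, \n as a line break
nchar("\n")        # backslash plus n counts as ONE character
nchar("\\")        # a doubled backslash is one literal backslash

p1 <- "C:/documents/projectfiles"      # forward slashes: the recommended form
p2 <- "C:\\documents\\projectfiles"    # escaped backslashes: also a valid string

# Replacing each literal backslash with a forward slash gives the same path.
identical(gsub("\\", "/", p2, fixed = TRUE), p1)
```

Doubling every backslash works, but it is easy to miss one; forward slashes avoid the problem entirely.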
My home directory…
> setwd("~") # Note: ~ is the "tilde" - the squiggly line - not the straight hyphen
> getwd()
[1] "/Users/steipe"
Relative path (home directory, up one level, then down into chen’s home directory):
> setwd("~/../chen")
> getwd()
[1] "/Users/chen"
Absolute path (specify the entire string):
> setwd("/Users/steipe/abc/R_samples")
> getwd()
[1] "/Users/steipe/abc/R_samples"
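If you are unsure what a shorthand such as ~ stands for on your own machine, you can ask R to expand it without changing the working directory. The sketch below uses the base R functions path.expand() and normalizePath(); the variable names are arbitrary and the output will differ from computer to computer.

```r
home <- path.expand("~")      # what the tilde stands for on this machine
home

# Resolve a relative path to an absolute one without calling setwd().
# mustWork = FALSE avoids an error if the path cannot be resolved.
parentOfHome <- normalizePath("~/..", mustWork = FALSE)
parentOfHome
```

Checking paths this way before you call setwd() helps you catch typos early.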
In RStudio you can use the Session ▸ Set Working Directory menu. This includes the useful option to set the current project directory as the working directory3.
Task…
Since you have gone through the script of the BasicSetup project, your working directory should be set to this project directory (I have configured the project to do this automatically.)
Use setwd("<your/path/and/directory/name>") to set the Working Directory to the Course Folder.
Check the result with getwd() and list.files().
The Working Directory functions can also be accessed through the Menu, under Misc.
Often, when working on a project, you would like to start off in your working directory right away when you start up R, instead of typing the setwd() command. This is easily done in a special R script that is executed automatically on startup4. The name of the script is .Rprofile and R expects to find it in the user’s home directory. You can edit this file with a simple text editor like TextEdit (Mac), Notepad (Windows) or Gedit (Linux), or, of course, by opening it in RStudio. Don’t forget that a code editor is also a text editor5.
Besides setting the working directory, other items that might go into such a file could be
For more details, use R’s help function:
> ?Startup
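To give an idea of what such a startup file can look like, here is a hypothetical sketch. None of these lines come from the course's own .Rprofile; the mirror URL and the message are examples only, and the setwd() line is commented out because the path is a placeholder.

```r
# Hypothetical contents of a .Rprofile file (all values are examples only).

# Set a default CRAN mirror so install.packages() does not ask every time.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Start in a particular directory (placeholder path, therefore commented out).
# setwd("<your/path/and/directory/name>")

# If a function named .First() exists, R runs it at the end of startup.
.First <- function() {
  message("Session started: ", date())
}
```

Keep such a file short: everything in it runs on every startup, and surprises there are hard to track down.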
Task…
Just for information:
This way you could change it and save the changes. However, don’t do that now but
During an R session, you might define a large number of R-objects: variables, data structures, functions etc., and you might load packages and scripts. All of this information is stored in the so-called “Workspace”. When you quit R you have the option to save the Workspace; it will then be restored in your next session. Now, you might think: how convenient - I can just stop R, and when I restart it, it will go into the same state as it was. But no. Restoring the Workspace from a previous state is actually a bad idea: if you load data or variables in a startup script, they may be overwritten with a corrupted version that you happened to save in the workspace when you last quit. This is very hard to troubleshoot. Essentially, when you save and reload your Workspace habitually, you have overlapping and potentially conflicting behaviour of your startup script and the Workspace restore.
What I recommend instead is the following:
save() the objects you need and later load() them explicitly (or better: use saveRDS() and readRDS() … why?). In fact, restoring the Workspace does the same thing, but you have less control over whether the versions of your objects are correct, and over which temporary variables are loaded as well.
List the current workspace contents: initially it only contains the init() function that was loaded from the .Rprofile script on startup.
> ls()
[1] "init"
>
Initialize three variables
> a <- 3
> b <- 4
> c <- sqrt(a^2 + b^2)
> ls()
[1] "a" "b" "c" "init"
>
Save one item in an .RData file.
> save(a, file = "tmp.RData")
Remove one item from the Workspace. (Note: the argument for rm() here is the variable name a, without quotation marks. rm() would also accept the string "a"; cf. ?rm.)
> rm(a)
> ls()
[1] "b" "c" "init"
>
Load what you previously saved.
> load("tmp.RData")
> ls()
[1] "a" "b" "c" "init"
Note: you can save() more than one item in an .RData file. When you then load() the file, all of the objects it contains are loaded. You don’t assign these objects: they are being restored.
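The difference between restoring and assigning becomes clearer when save()/load() is put side by side with saveRDS()/readRDS(). The following is a small sketch under assumed names: the object x and the two file names are made up, and the files are written to R's temporary directory rather than to your project.

```r
x  <- c(3, 4, 5)                         # an example object
f1 <- file.path(tempdir(), "x.RData")    # hypothetical file names
f2 <- file.path(tempdir(), "x.rds")

save(x, file = f1)       # stores the object together with its name "x"
saveRDS(x, file = f2)    # stores only the object, without a name

rm(x)
restored <- load(f1)     # recreates x; load() returns the restored names
y <- readRDS(f2)         # returns the object; you choose the name yourself

identical(x, y)          # both routes give back the same values
```

Because readRDS() hands the object back for you to assign, it cannot silently overwrite a variable that already exists in your workspace, which load() can.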
We can use the output of ls() as input to rm() to remove all items from the workspace (cf. ?rm for details).
> rm(list = ls())
> ls()
character(0)
>
The contents of the workspace are displayed in RStudio’s Environment Pane (top-right). You can see a little “broom” icon at the top that you can click to remove all items from the workspace.
If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.
Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.
[END]
For example "C:\Documents\new" would be interpreted as C:\Documents<linebreak>ew because "\n" is the linebreak character. Even though that’s actually the path name on Windows, in an R command you have to write "C:/Documents/new".↩︎
The Terminal app is in the Utilities sub-folder of your Applications folder.↩︎
Projects that I create for teaching are configured to use this option by default, thus once the project is loaded, the Working Directory should already be correctly set.↩︎
Actually, the first script that runs is Rprofile.site, which is found on Linux and Windows machines in the etc subdirectory of the R installation (on Windows typically C:\Program Files\R\R-{version}\etc). But not on Macs.↩︎
Operating systems commonly hide files whose name starts with a period “.” from normal directory listings. All files, however, are displayed in RStudio’s Files pane. Nevertheless, it is useful to know how to view such files by default. On Macs, you can configure the Finder to show you such “hidden files” by default. To do this: (i) Open a terminal window; (ii) Type: defaults write com.apple.Finder AppleShowAllFiles YES (iii) Restart the Finder by accessing Force quit (under the Apple menu), selecting the Finder and clicking Relaunch. (iv) If you ever want to revert this, just do the same thing but set the default to NO instead.↩︎