RPR-Setup

From "A B C"
Revision as of 05:32, 17 August 2017 by Boris

Set up R to work with it


 

Keywords:  R projects; working with git version control via RStudio; the history mechanism and why not to use it; .Rprofile to customize startup behaviour; the working directory


 



 


Caution!

This unit is under development. There is some content here but it is incomplete and/or may change significantly: links may lead to nowhere, the content is likely going to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.


 


Abstract

...


 

This unit ...

Prerequisites

You need to complete the following units before beginning this one:


 

Objectives

...


 

Outcomes

...


 

Deliverables

  • Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
  • Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.


 

Evaluation

Evaluation: NA

This unit is not evaluated for course marks.


 


Contents

"Projects"

We will make extensive use of "projects" in class. Read more about projects in RStudio here.


 

Git Version control

We will also make extensive use of version control. In fact, we will now load a project via Git version control from its free, public repository on GitHub.

Task:


 

Then do the following:

  • Open RStudio
  • Select File → New Project...
  • Click on Version Control
  • Click on Git
  • Enter https://github.com/hyginn/R_Exercise-BasicSetup as the Repository URL.
  • Type a <tab> character; the Project directory name field should then autofill to read R_Exercise-BasicSetup
  • Click on Browse... to find your project directory (the one that you have created above). Click Open.
  • Click Create Project; the project files should be downloaded and the console should prompt you to type init() to begin.
  • Type init() into the console pane.

An R script should load.

  • Explore the script and follow its instructions.


 

What could possibly go wrong?...


I get an error message "Git not found".
The simplest reason is that you may have had RStudio open while installing git. Just restart RStudio.
The executable for Git (the Git "program" - "git.exe" on Windows, "git" elsewhere) needs to be on your system's path, or correctly specified in RStudio's options. The correct "path" to Git will depend on your operating system, and how git was installed. To find where git is installed:
On Mac and Unix systems, open a Terminal window[1] and type which git. This will either print the path (Yay), or tell you that git is not found. The latter could have two reasons: either git has not been installed in the first place, or it has been installed in a non-standard location by whatever installation manager you have used. Ask Google to help you figure out how to solve your specific case.
On Windows you can find the location of the executable by searching "git.exe" in your "programs and files". Once it's been found, right click on it and select "Open file location" from the options. It might be in C:\Program Files\Git\cmd\git.exe but the exact location depends on your operating system.
Once you know the path to your git executable, open File → Preferences, click on the Git/SVN option, click on the Browse button, and find the correct folder. On Macs you may need to press <shift> <command> G to open the "Go to ..." dialogue, then type the top folder of the path (e.g. /usr) and click your way down to the folder where the program lives. On Windows, find the installation directory and select git.exe. Then click "ok".
Then try again to create the project and let us know what happened in case it still did not work.
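You can also ask R itself where git is. The following is a minimal sketch, assuming git is meant to be found on the search path of the R session; the variable name gitPath is only for illustration. Sys.which() typically returns an empty string for a program it cannot find.

```r
# Ask R where the git executable is on the search path.
# unname() drops the name attribute so only the path string remains.
gitPath <- unname(Sys.which("git"))

if (!is.na(gitPath) && nzchar(gitPath)) {
  message("git found at: ", gitPath)
} else {
  message("git is not on the search path; locate it manually as described above.")
}
```

If a path is printed, that is the location to enter in the Git/SVN options.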


 
I get an error message like "directory exists and is not empty".
A directory with the name of the project already exists in the location in which you are asking RStudio to create the project. Either delete the existing directory, or install the project into a different parent directory.
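To check for this situation before creating the project, something like the following sketch may help. The helper dirIsOccupied() is hypothetical (it is not part of R or RStudio), and the target path assumes that you want to create the project inside your current working directory.

```r
# Hypothetical helper: TRUE if 'path' exists and contains anything,
# including hidden files (all.files = TRUE) but not "." and ".." (no.. = TRUE).
dirIsOccupied <- function(path) {
  dir.exists(path) &&
    length(list.files(path, all.files = TRUE, no.. = TRUE)) > 0
}

# Assumed location: the project directory inside the current working directory
target <- file.path(getwd(), "R_Exercise-BasicSetup")
dirIsOccupied(target)
```

If this returns TRUE, delete or rename the existing directory, or choose a different parent directory.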


 
The git icon has disappeared.
I have seen this happen when somehow the path to git has changed.
(A) Make sure the correct path to git is set in your File → Preferences → Git/SVN.
(B) Open Tools → Project Options... → Git/SVN. Next to Version control system, git must be selected, not (None). If it is (None), change this to git. If that's not an option, the path is not correct. Go back to (A).
(C) You may then need to restart RStudio and reload your project via the Files → Recent projects... menu for the git icon and the version control options to reappear.



 

Working directory

To locate a file on a computer, one has to specify the filename and the directory in which the file is stored; this is also called the path of the file. The "working directory" for R is either the directory in which the R-program has been installed, or some other directory, as initialized by a startup script. You can execute the command getwd() to list what the "Working Directory" is currently set to:


> getwd()
[1] "/Users/steipe/R"

In RStudio, the contents of the working directory are listed in the Files Pane.

It is convenient to put all your R-input and output files into a project specific directory and then define this to be the "Working Directory". The R working directory is the directory that R uses when you don't specify a path. Think of it as the default directory. Use the setwd() command to set it. setwd() requires an argument that you type between the parentheses: a string with the directory path, or a variable containing such a string. Strings in R are delimited with " or ' characters. If the directory does not exist, an error will be reported, so make sure you have created the directory first. On Mac and Unix systems, the usual shorthand notation for relative paths can be used: ~ for the home directory, . for the current directory, .. for the parent of the current directory.

If you use a Windows system, you need to know that backslashes ("\") have a special meaning for R: they work as escape characters. For example the string "\n" means newline, and "\t" means tab. Thus R gets confused when you put backslashes into string literals, such as Windows path names. R has a simple solution: you use forward slashes instead of backslashes when you specify paths, and R will translate them correctly when it talks to your operating system. Instead of C:\documents\projectfiles you write C:/documents/projectfiles. Also note that on Windows the ~ tilde may not expand to the same directory as on Mac and Unix systems: R typically expands it to your Documents folder rather than to the user's home directory itself. You can check with path.expand("~").
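The following short sketch illustrates the escape-character behaviour and the forward-slash convention. The variable names are only for illustration, and the path is the example path from the text, not one that needs to exist on your computer.

```r
# "\t" is a single tab character, not a backslash followed by "t"
nchar("a\tb")                      # 3 characters: a, tab, b

# Forward slashes need no escaping
winPath <- "C:/documents/projectfiles"

# A literal backslash in a string has to be written as "\\"
escaped <- "C:\\documents\\projectfiles"

# Convert backslashes to forward slashes;
# fixed = TRUE makes gsub() treat "\\" as one literal backslash
converted <- gsub("\\", "/", escaped, fixed = TRUE)
identical(converted, winPath)      # TRUE

# Show what ~ expands to on this system
path.expand("~")
```

The gsub() conversion is handy when you copy a path from the Windows file explorer and want to paste it into an R script.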


My home directory...

> setwd("~") # Note: ~ is the "tilde" - the squiggly line - not the straight hyphen
> getwd()
[1] "/Users/steipe"

Relative path (home directory, up one level, then down into chen's home directory):

> setwd("~/../chen")
> getwd()
[1] "/Users/chen"

Absolute path (specify the entire string):

> setwd("/Users/steipe/abc/R_samples")
> getwd()
[1] "/Users/steipe/abc/R_samples"


In RStudio you can use the Session → Set Working Directory menu. This includes the useful option to set the current project directory as the working directory.
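When you change the working directory in a script, it is often useful to be able to return to where you started. The following is a minimal sketch that relies on the fact that setwd() invisibly returns the previous working directory. It uses tempdir() only because that directory is guaranteed to exist; the variable name oldWD is an illustrative choice.

```r
# setwd() invisibly returns the directory that was current before the change
oldWD <- setwd(tempdir())

getwd()          # now the session's temporary directory

# ... read or write files here ...

# Return to the original working directory
setwd(oldWD)
getwd()
```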


Task:
Since you have gone through the script of the BasicSetup project, your working directory should be set to this project directory (I have configured the project to do this automatically.)

  1. Figure out the path to its parent directory - i.e. the course- or workshop directory you created at the beginning.
  2. Use setwd("<your/path/and/directory/name>") to set the working directory to the course directory.
  3. Confirm that this has worked by typing getwd() and list.files().
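If you would rather not work out paths by hand, R can take a path apart and put it back together for you. The sketch below uses the example path from this page, which is an assumption and will differ on your computer; the variable names are only for illustration.

```r
# Example path from this page; replace it with getwd() for your own project
examplePath <- "/Users/steipe/abc/R_samples"

# dirname() returns the parent directory, basename() the last path component
courseDir <- dirname(examplePath)
courseDir                          # "/Users/steipe/abc"
basename(examplePath)              # "R_samples"

# file.path() joins components with forward slashes
file.path(courseDir, "R_samples")  # "/Users/steipe/abc/R_samples"
```

Applied to your own setup, dirname(getwd()) gives the parent of the current working directory, which you can then pass to setwd().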

The Working Directory functions can also be accessed through the Menu, under Misc.


 
  1. The Terminal app is in the Utilities sub-folder of your Applications folder.