CSB Assignment Week 4
Latest revision as of 18:44, 25 February 2016
Assignments for Week 4
Setting up your local environment; Data in R
< Assignment 3 | Assignment 5 > |
Note! This assignment is currently active. All significant changes will be announced on the mailing list.
Assigned material - concepts, exercises and reading - will be reflected in next week's evaluation and feedback session. Please remember to contribute to self-evaluation questions by Tuesday at noon.
Warm up
Sometimes it is easy to identify gender from names, especially if the name is taken from a religious tradition. Abraham comes to mind, or Eve. So you read a novel where one sunny day, in a schoolyard, Abraham is looking at Pat, and Pat is looking at Eve.
Can you (reasonably) know if a boy is looking at a girl? [I don't know... I mean I don't know the answer ...]
Seriously?
You're probably uncertain about Pat: Patrick? Patricia?
Hm. Do you need a hint? [Ok. A hint please...]
Does it matter? [Sorry. Of course it matters whether Pat is a girl or a boy - how could it not?]
If you think it matters, you're both right and wrong. You're right in the sense that it makes a difference. But you're wrong that it matters for the answer: if Pat is a girl, the boy Abraham looks at a girl. But if Pat is a boy, then he is looking at the girl Eve. So the answer is: yes, you can reasonably know that a boy is looking at a girl. Often, you just need to enumerate all possibilities...
Now: why "reasonably"? Well, I can't exclude that some parents named their daughter Abraham, or that they were confused about Eve ... but hey, it's just a puzzle.
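The "enumerate all possibilities" idea can be sketched in a few lines of R. This is only an illustration and not part of the original puzzle; the names `known`, `looks` and `boyLooksAtGirl` are made up for this example.

```r
# Known genders; Pat is the unknown.
known <- c(Abraham = "boy", Eve = "girl")

# Who is looking at whom: Abraham -> Pat, Pat -> Eve.
looks <- list(c("Abraham", "Pat"), c("Pat", "Eve"))

# For a given assumption about Pat: is some boy looking at some girl?
boyLooksAtGirl <- function(patIs) {
  gender <- c(known, Pat = patIs)
  for (pair in looks) {
    if (gender[[pair[1]]] == "boy" && gender[[pair[2]]] == "girl") {
      return(TRUE)
    }
  }
  FALSE
}

# Enumerate both cases for Pat.
results <- sapply(c("boy", "girl"), boyLooksAtGirl)
print(results)
```

Both cases come out `TRUE`, which is the point of the puzzle: the answer does not depend on the unknown.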
Collaborative Environment
The setup of folders and files for collaborative work needs some care. We want to be able to share data and code. But we don't want to mirror all our local experiments to everyone else on the team. We want version control for all files. But not for our very large, read-only datasets. And we want our environment to be predictable - for ourselves and others. Here is a common scheme for everyone to adopt[1].
Project folder and file layout. R has a "home directory", commonly abbreviated with the tilde character "~". A file called .Rprofile is executed whenever R starts up and is therefore useful to define global settings. Course-related material should go into a folder called BCB420 somewhere on your computer. This folder contains three folders: Ontoscope is the shared code repository that is mirrored on github and through which you publish the assets you develop. dev is your local development folder where you develop and experiment. Your intermediate stages of development will be in that folder. data contains large data-files that we don't want to put under version control.
Task:
Go through the following steps to set up this structure.
- Create Folders
- Create the following folders on your computer:
- BCB420 - this will contain all course-related files. And within that folder:
- dev - this will contain your local code edits and experiments
- data - this will contain large, stable assets that don't need to be changed.
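If you prefer to create the folders from within R rather than with your file manager, something like the following should work. It is a minimal sketch: `createCourseFolders` is a hypothetical helper written for this page, and the parent folder is an assumption that you need to adapt. The Ontoscope folder is not created here, because it appears when you clone the repository.

```r
# Hypothetical helper: create BCB420 with its dev and data subfolders
# inside the folder given by 'parent'.
createCourseFolders <- function(parent) {
  course <- file.path(parent, "BCB420")
  for (sub in c("dev", "data")) {
    # recursive = TRUE also creates BCB420 itself if it is missing
    dir.create(file.path(course, sub), recursive = TRUE, showWarnings = FALSE)
  }
  course
}

# Demonstration in a temporary folder; replace tempdir() with the
# location you actually want to use on your computer.
courseDir <- createCourseFolders(tempdir())
print(list.files(courseDir))
```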
- Create your .Rprofile
- Many common operating systems hide files whose name is prefixed with a period/dot, or make it awkward to create and edit them. Here's what you do instead.
- Open RStudio.
# First, use RStudio's Session -> Set Working Directory -> Choose Directory...
# option to find your new dev directory and set it as the
# "Working Directory". Then type:
getwd()
# ... to see the correct path to "dev".
# Here's how you find your home directory:
path.expand("~")
# You can use this to edit/create .Rprofile in the right place.
file.edit(file.path(path.expand("~"), ".Rprofile"))
# Once the file is opened,
# add the following three lines to your .Rprofile
DEVDIR <- "/path/to/your/BCB420/dev/"
setwd(DEVDIR)
# Yes, the third line is empty. Make it a habit to end the last line of your
# script with a carriage return. This matters sometimes. Also, please, please
# for the love of everything that is holy: don't actually type "/path/to/your.."
# I hope it's obvious that that is just a placeholder for the real path on
# _your_ computer which you have just previously printed to the console
# from where you can copy it.
# Save, and close .Rprofile.
# Exit, and restart RStudio.
# Confirm that your .Rprofile does what it should. Type:
DEVDIR
getwd()
# ... to demonstrate that the object exists, has the right contents, and that
# the setwd() command has correctly changed the working directory. If that doesn't
# work, contact me or the list to troubleshoot and fix.
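If R complains at startup because the path in your .Rprofile is wrong, a slightly more defensive variant can help with troubleshooting. The following is a sketch only, not a required part of the setup: `setDevDir` is a hypothetical helper, and the demonstration uses a temporary folder as a stand-in for your real dev folder.

```r
# Hypothetical helper: change to 'path' only if it exists,
# otherwise warn and leave the working directory alone.
setDevDir <- function(path) {
  if (!dir.exists(path)) {
    warning("DEVDIR does not exist: ", path)
    return(FALSE)
  }
  setwd(path)
  TRUE
}

# Demonstration: a temporary folder stands in for your dev folder.
DEVDIR <- tempdir()
oldWD <- getwd()
ok <- setDevDir(DEVDIR)
print(ok)
setwd(oldWD)   # restore the previous working directory
```

In a real .Rprofile you would set `DEVDIR` to the path you printed with `getwd()` and call `setDevDir(DEVDIR)` once.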
- Setup to use github
- Create a github account and install the github desktop client on your computer.
- Clone the Repository
- Go through the following steps to create your local copy of the shared assets.
- Navigate to the Ontoscope repository
- Find the button that looks like a computer monitor with an arrow. If you hover over it, it should explain that it is used to save the repository to your local computer. Click on that button. An "External Protocol Request" warning should appear. Click to Launch Application.
- Your github desktop application will open. Find and select your BCB420 folder to clone the repository into and keep the name. Click on Clone.
- Check that the folder has been created in the right spot and contains the same files you see in the github repository on the Web.
- Email me your github user name so I can add you as collaborator to the repository.
- Initialize your development folder
- Copy codeTemplate.R from the Ontoscope folder to your dev folder. Rename it to myCode.R to avoid confusion. You can edit and adapt this template for your own code. But you should also place your dev folder under version control, so that you can work effectively...
- In your github desktop client click the (+) button (Add a repository). Choose Ontoscope-dev as the Name (don't just call it "dev", you may be working on several projects in the future...). Then Choose... your dev folder and Create Repository.... This will now appear under the Other category in the side-bar: you have full version control over the folder, but it is not mirrored to github.
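The copy-and-rename step can also be done from the R console. This is a minimal sketch under the folder layout described above; `copyTemplate` is a hypothetical helper, and the demonstration builds a mock layout in a temporary folder so that it does not touch your real files.

```r
# Hypothetical helper; 'courseDir' is the path to your BCB420 folder.
copyTemplate <- function(courseDir) {
  from <- file.path(courseDir, "Ontoscope", "codeTemplate.R")
  to   <- file.path(courseDir, "dev", "myCode.R")
  # overwrite = FALSE: an existing myCode.R is left untouched
  file.copy(from, to, overwrite = FALSE)
}

# Demonstration with a mock layout in a temporary folder.
demoDir <- file.path(tempdir(), "BCB420-demo")
dir.create(file.path(demoDir, "Ontoscope"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demoDir, "dev"), recursive = TRUE, showWarnings = FALSE)
writeLines("# template", file.path(demoDir, "Ontoscope", "codeTemplate.R"))
copied <- copyTemplate(demoDir)
print(file.exists(file.path(demoDir, "dev", "myCode.R")))
```

Because `overwrite = FALSE`, running the helper a second time will not clobber edits you have already made to myCode.R.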
- Checking in
- Let's make sure this works. After I have added you as collaborator, add your name to the Readme document ...
- Open the github desktop client.
- Select the Ontoscope repository.
- Click the sync button to download the most recent version of all assets.
- Open your local copy of Readme.md in notepad, or RStudio.
- Add your name to the collaborator list and save your changed copy.
- Commit your change. Make sure you always add a commit message to your commits.
- sync again, to "push" your commit to github.
- Go to the Ontoscope repository to confirm your commit has arrived.
- If you have added your name before Tuesday's class session, you have earned yourself 3 marks for the quiz.
A little bit of (light) reading
Here are some useful observations on scientific data in the lab...
Goodman et al. (2014) Ten simple rules for the care and feeding of scientific data. PLoS Comput Biol 10:e1003542. (pmid: 24763340)
Do you agree? Are there useful tools that we should know about? After all, the article is over a year old and in this game that's a lot. Anything here that we should adopt? Software design needs clearly defined requirements. There are functional requirements and non-functional requirements such as the ones the article discusses. Are there others we need to act on and add to our task list?
Advancing your R skills
Task:
Reading and writing data is another of the truly essential R skills. This brief tutorial reviews the basics: text-files, csv tables, and .Rdata objects. Load the following tutorial with its associated file as an RStudio project from github.
- Open RStudio
- Select File → New Project ...
- Choose Version control → Git
- Enter the repository URL for the tutorial: https://github.com/hyginn/R_Exercise-Data
- Click on Create Project.
(In case the R script source-code does not appear in the left-hand pane, click on the file name R_Exercise-Data.R in the lower-right hand pane.)
- That is all.
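As a quick preview of the three kinds of data the tutorial covers, here is a minimal round-trip sketch using base R functions. It is not taken from the tutorial itself; the file names and the small `genes` table are made up for illustration, and everything is written to a temporary folder.

```r
tmp <- tempdir()

# Text file: write lines, then read them back.
txtFile <- file.path(tmp, "notes.txt")
writeLines(c("first line", "second line"), txtFile)
txt <- readLines(txtFile)

# csv table: write a data frame, then read it back.
csvFile <- file.path(tmp, "genes.csv")
genes <- data.frame(gene = c("geneA", "geneB"),
                    count = c(10L, 20L),
                    stringsAsFactors = FALSE)
write.csv(genes, csvFile, row.names = FALSE)
genes2 <- read.csv(csvFile, stringsAsFactors = FALSE)

# .RData: save an R object, remove it, and restore it under its own name.
rdaFile <- file.path(tmp, "genes.RData")
save(genes, file = rdaFile)
rm(genes)
load(rdaFile)

print(txt)
print(all.equal(genes, genes2))
```

Note the difference in behaviour: `read.csv()` returns a value that you assign yourself, whereas `load()` recreates the object under the name it was saved with.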
Footnotes and references
- ↑ You might be used to doing things differently, but for this project, do it this way. But if you think this can be improved, let's talk.
- Ask, if things don't work for you!
- If anything about the assignment is not clear to you, please ask on the mailing list. You can be certain that others will have had similar problems. Success comes from joining the conversation.
- Do consider how to ask your questions so that a meaningful answer is possible. The following two links:
- How to create a Minimal, Complete, and Verifiable example on stackoverflow and ...
- How to make a great R reproducible example
- ... are required reading.