Difference between revisions of "CSB Assignment Week 2"

From "A B C"
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
<div id="CSB">
 
<div id="CSB">
 
<div class="b1">
 
<div class="b1">
Assignments for Week 2
+
Assignments for Week 2<br/>
 +
<span style="font-size: 70%">Collaboration tools, initializing our project.</span>
 
</div>
 
</div>
  
 +
<table style="width:100%;"><tr>
 +
<td style="height:30px; vertical-align:middle; text-align:left; font-size:80%;">[[CSB_Assignment_Week_1|&lt;&nbsp;Assignment&nbsp;1]]</td>
 +
<td style="height:30px; vertical-align:middle; text-align:right; font-size:80%;">[[CSB_Assignment_Week_3|Assignment&nbsp;3&nbsp;&gt;]]</td>
 +
</tr></table>
  
{{Inactive}}
+
{{Active}}
  
Assigned material will be reflected on next week's quiz. Please remember to contribute to [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BCB420_Quiz_questions '''quiz questions'''] by Tuesday, 20:00.
+
Assigned material - concepts, exercises and reading - will be reflected in next week's evaluation and feedback session. Please remember to contribute to [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BCB420_self-evaluation_questions '''self-evaluation questions'''] by Tuesday at noon.
  
 +
{{Vspace}}
  
 
__TOC__
 
__TOC__
  
 
+
{{Vspace}}
;Special dates:
 
* Post your workflow sketch by Monday.
 
* Do the '''R''' tasks when they are announced.
 
* Post your quiz questions by Tuesday, 20:00.
 
* All other tasks are due by next week's class.
 
 
 
  
 
==Warm up==
 
==Warm up==
Line 25: Line 25:
 
You go to the Toronto Zoo. You see [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=370 giraffes], [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=619 ostriches] and a [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=525 green tree python]. Altogether they have 30 eyes and 44 legs.
 
You go to the Toronto Zoo. You see [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=370 giraffes], [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=619 ostriches] and a [http://www.torontozoo.com/ExploretheZoo/AnimalDetails.asp?pg=525 green tree python]. Altogether they have 30 eyes and 44 legs.
  
'''How many necks do these animals have?''' <span class="mw-customtoggle-01" style="background:#F2F2FF;padding:5px;font-size:80%;margin-left:50px">[''I&nbsp;don't&nbsp;know...'']</span>
+
'''How many necks does this group of animals have?''' <span class="mw-customtoggle-01" style="background:#F2F2FF;padding:5px;font-size:80%;margin-left:50px">[''I&nbsp;don't&nbsp;know...'']</span>
  
 
<div class="mw-collapsible-content" style="padding:10px;">
 
<div class="mw-collapsible-content" style="padding:10px;">
Line 31: Line 31:
 
<div class="mw-collapsible mw-collapsed" id="mw-customcollapsible-02" style="border:solid 1px #000000;padding:10px;background-color:#F2F2FF">
 
<div class="mw-collapsible mw-collapsed" id="mw-customcollapsible-02" style="border:solid 1px #000000;padding:10px;background-color:#F2F2FF">
 
Seriously?<br />
 
Seriously?<br />
This not so hard.<br />
+
This may be easier than you think.<br />
 
Maybe you are wondering [http://onlinelibrary.wiley.com/doi/10.1002/jmor.20037/full whether snakes have necks]? <small>(TLDR; It's complicated. <small>But: yes.</small>)</small><br />
 
Maybe you are wondering [http://onlinelibrary.wiley.com/doi/10.1002/jmor.20037/full whether snakes have necks]? <small>(TLDR; It's complicated. <small>But: yes.</small>)</small><br />
Or do you need a hint ... <span class="mw-customtoggle-02" style="background:#EAEAFF;padding:5px;font-size:80%;margin-left:50px">[''Ok.&nbsp;A&nbsp;hint&nbsp;please...'']</span>
+
Or do you need a hint? <span class="mw-customtoggle-02" style="background:#EAEAFF;padding:5px;font-size:80%;margin-left:50px">[''Ok.&nbsp;A&nbsp;hint&nbsp;please...'']</span>
  
 
<div class="mw-collapsible-content" style="padding:10px;">
 
<div class="mw-collapsible-content" style="padding:10px;">
Line 42: Line 42:
 
<div class="mw-collapsible-content" style="padding:5px;border:solid 1px #000000;background-color:#E2E2FF">
 
<div class="mw-collapsible-content" style="padding:5px;border:solid 1px #000000;background-color:#E2E2FF">
  
It's really quite simple. Thirty eyes are in fifteen heads. Fifteen heads attached to fifteen necks. Fifteeeen. No more. No less.<br />
+
It's really quite simple. Thirty eyes are in fifteen heads. Fifteen heads are attached to fifteen necks. Fifteeeen. No more. No less.<br />
<small>How many of each? You can't tell. Not less than two giraffes. Not more than twenty ostriches. And no more then ten giraffes. And one snake. But that wasn't the question.</small>
+
<small>How many of each? You could calculate this by substitution. Eight giraffes. Six ostriches. And one snake. But that wasn't the question.</small>
  
 
</div>
 
</div>
Line 55: Line 55:
 
<section end=warm-up />
 
<section end=warm-up />
  
==Towards systems discovery==
 
  
In class, we have discussed a number of data sources, the exemplar workflows of the papers you have posted, and some strategies to determine whether genes could be functionally interacting, or "collaborating" with each other. I have distilled the data sources and the strategies into tables that I have posted on the Student Wiki's project resource section.
+
==Software Development==
  
{{task|
+
{{task|1=
;Existing databases and strategies
+
Great ideas will come to nothing if we can't put them into code.
# Study the [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BCB420_2015_Data_Sources '''Data Sources'''] page on the student Wiki. Navigate to the linked databases. Browse around. Get a sense of what data is available and how it can be accessed.
 
# For one of the databases, fill in the data access information.
 
# Study the [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BCB420_2015_Systems_Discovery_Strategies '''System Discovery Strategies'''] page on the student Wiki.
 
##Think about the listed strategies.
 
##See if you can add information.
 
##See if you can add a strategy.
 
##See if you can add a comment.
 
  
&nbsp;
+
* For an introduction to concepts of developing software, carefully read the '''[[Software Development]]''' page on this Wiki.
;New workflows
+
* Work through the [[R_knitr|'''knitr and RMarkdown''']] tutorial.
I have put a [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BCB420_2015_Workflow_Collection '''Workflow Collection'''] page on the student Wiki.
+
* Get a github account if you don't have one yet.
  
# Create a "Project" subpage on your User page (follow the instructions from [[CSB_Assignment_Week_1#Student_Wiki|Assignment 1]]). On that page draft a workflow for '''data driven systems discovery''' using data/strategies of your choice. Keep this maximally brief (not more than three or four sentences). But be specific: make sure that the data you need is actually available, the algorithms are defined, and the computations are tractable. Discuss this on the list if you wish, or simply ask for feedback on your idea.
 
# '''Transclude''' your paragraph to the Workflow Collection (instructions are there).
 
 
}}
 
}}
  
 +
{{Vspace}}
  
&nbsp;
+
==Advancing your '''R''' skills==
 
 
 
 
&nbsp;
 
 
 
==Software Development==
 
 
 
* Habits (Projects, IDE and debugging, Version control)
 
* Collaboration (Wiki, Git, Etherpad)
 
* Development (TDD, Literate Development)
 
* Testing (Unit testing, Integration testing)
 
 
 
 
 
*Work through '''R Studio''' development tutorial (TBD)
 
*Work through '''github''' tutorial (TBD)
 
 
 
  
&nbsp;
+
{{Vspace}}
  
 +
{{#lst:Software_Development|RStudio_projects}}
  
&nbsp;
+
{{Vspace}}
  
==Pre-reading==
+
{{#lst:CSB_Assignment_Week_1|assignment_footer}}
In week 3, we will discuss various aspects of working with genome-scale data sets. For many experimental approaches, the ultimate outcome is a list of genes and the challenge is how to infer information from what such lists have in common:
 
{{#lst:CSB_Gene_lists|reading}}
 
  
  
 +
<table style="width:100%;"><tr>
 +
<td style="height:30px; vertical-align:middle; text-align:left; font-size:80%;">[[CSB_Assignment_Week_1|&lt;&nbsp;Assignment&nbsp;1]]</td>
 +
<td style="height:30px; vertical-align:middle; text-align:right; font-size:80%;">[[CSB_Assignment_Week_3|Assignment&nbsp;3&nbsp;&gt;]]</td>
 +
</tr></table>
  
 
[[Category:Computational_Systems_Biology]]
 
[[Category:Computational_Systems_Biology]]
  
 
</div>
 
</div>

Latest revision as of 16:45, 2 February 2016

Assignments for Week 2
Collaboration tools, initializing our project.

< Assignment 1 Assignment 3 >

Note! This assignment is currently active. All significant changes will be announced on the mailing list.

 
 

Assigned material - concepts, exercises and reading - will be reflected in next week's evaluation and feedback session. Please remember to contribute to self-evaluation questions by Tuesday at noon.


 


 

Warm up

You go to the Toronto Zoo. You see giraffes, ostriches and a green tree python. Altogether they have 30 eyes and 44 legs.

How many necks does this group of animals have? [I don't know...]

Seriously?
This may be easier than you think.
Maybe you are wondering whether snakes have necks? (TLDR; It's complicated. But: yes.)
Or do you need a hint? [Ok. A hint please...]

Maybe you are just confused by some irrelevant information. [No. I still don't get it...]

It's really quite simple. Thirty eyes are in fifteen heads. Fifteen heads are attached to fifteen necks. Fifteeeen. No more. No less.
How many of each? You could calculate this by substitution. Eight giraffes. Six ostriches. And one snake. But that wasn't the question.
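
The substitution mentioned above can also be checked numerically. The following is a minimal sketch, not part of the original puzzle: it assumes exactly one snake (which leaves 14 heads and all 44 legs for the giraffes and ostriches) and uses base R's solve() on the resulting two linear equations. The variable names are illustrative only.

```r
# Unknowns: g = number of giraffes, o = number of ostriches.
# Heads:  g + o   = 14   (15 heads in total, minus the one snake)
# Legs:  4g + 2o  = 44   (the snake contributes no legs)
A <- matrix(c(1, 1,
              4, 2), nrow = 2, byrow = TRUE)
b <- c(14, 44)

counts <- solve(A, b)                 # solves A %*% x = b
names(counts) <- c("giraffes", "ostriches")
print(round(counts))                  # should agree with the counts given above

necks <- sum(round(counts)) + 1       # one neck per head, snake included
print(necks)
```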


Software Development

Task:
Great ideas will come to nothing if we can't put them into code.

  • For an introduction to concepts of developing software, carefully read the Software Development page on this Wiki.
  • Work through the knitr and RMarkdown tutorial.
  • Get a github account if you don't have one yet.


 

Advancing your R skills

 


Task:

In this task you set up a github repository for an R project. You create the project on your local machine, write some code and customize it. Then you upload your local project to your github repository.


  • Navigate to your github account (create an account if you don't have one yet.)
  • Click on New Repository on your profile page in the "Your repositories" section
  • Give the repository a useful, stable name for a project - "sample_project" or "test_project" would be appropriate for this task.
  • Check Initialize this repository with a README
    • ... and click Create repository.


 
  • Open RStudio
  • Click on File → New Project ...
  • Select Version Control
  • Select Git
  • Paste the Repository URL of your Github project ...
  • Use the github repository name as your Project directory name, and use Browse... to select a good folder where the project folder can live on your computer.
  • Click Create Project.
  • Click File → New File → R Script
  • Write some code: for example, write a function that returns n random passwords of length l (where n and l are parameters). l is the number of syllables to use. A syllable is defined as a consonant or consonant cluster (onset) and a vowel or diphthong (nucleus).
  • Save the script file in your project directory. Make sure that it has the extension .R (RStudio should have added the extension by default.)
  • Set your working directory: use Session → Set Working Directory → To Project Directory
  • Use rm(list = ls()) to clear your workspace.
  • Open Tools → Project Options... and set the following options:
  • in the General pane:
    • Restore most recently opened project ... No
    • Restore previously open source documents ... Yes
    • Restore .RData ... No
    • Save workspace ... No
    • Always save history ... No

This ensures that you start the project with a clean slate and the same environment every time. If you do not do this, your code may come to depend on leftover state from earlier sessions, and the project may not be fully reproducible when it is shared with other people.


  • In the Code editing pane:
    • Check all options ON
    • Set Tab width to 3 spaces
    • Make sure Text-encoding is UTF-8

All other options are probably OK in their default state - packrat should be OFF (but you may turn it ON if you need it at some point and you understand what it does.)


  • Click OK.
  • Now open a new text file (not R script), add a single blank line and save this in your project directory under the name .Rprofile. R uses this file to customize a working session. Right now there is nothing in the file but you can always put something into the file later.
  • Close the .Rprofile pane so that only your R script is open. This window layout will be restored when you open your project, and you don't want .Rprofile to be opened every time you start the project.
  • Quit RStudio.
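
For the "Write some code" step above, the following is one possible sketch of the password function, not a reference solution. The onset and nucleus inventories and the function name randomPasswords are assumptions made up for illustration; the task itself only fixes the parameters n and l. Note that sample() uses R's ordinary random number generator, so this is fine as an exercise but should not be used to generate real credentials.

```r
# Hypothetical syllable inventories (not specified in the assignment).
onsets <- c("b", "d", "f", "g", "k", "l", "m", "n", "p", "r", "s", "t",
            "br", "kl", "st", "tr")          # consonants and consonant clusters
nuclei <- c("a", "e", "i", "o", "u",
            "ai", "ei", "ou")                # vowels and diphthongs

randomPasswords <- function(n, l) {
  # Returns a character vector of n passwords, each made of l syllables.
  stopifnot(n >= 1, l >= 1)
  vapply(seq_len(n), function(i) {
    # One syllable = one onset followed by one nucleus.
    syllables <- paste0(sample(onsets, l, replace = TRUE),
                        sample(nuclei, l, replace = TRUE))
    paste(syllables, collapse = "")
  }, character(1))
}

set.seed(112358)            # only to make a demonstration repeatable
print(randomPasswords(3, 4))
```

Keeping the inventories outside the function makes them easy to extend later; passing them in as arguments would be an equally reasonable design.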


  • Open the project folder and delete the .Rhistory file (there is probably one there from when the project was originally created.)
  • Now restart RStudio and click File → Open Project in a New Window.... Everything should be there:
    • Your workspace should be empty
    • Your working directory should be the project folder. Test this with getwd()
    • Your source file should be open to use and develop further.
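
The checks in the list above can be typed into the console one at a time, or collected as in the following sketch. It assumes the project folder is called "sample_project", as suggested earlier; substitute your own repository name. The variable name status is illustrative only.

```r
# Run this in the console right after reopening the project.
status <- c(
  workspaceEmpty = length(ls(envir = globalenv())) == 0,   # nothing restored from .RData
  inProjectDir   = basename(getwd()) == "sample_project",  # assumed project name
  rprofileExists = file.exists(".Rprofile")                # the file created earlier
)
print(status)

# 'status' itself now lives in the workspace; remove it with rm(status)
# if you want to return to a completely empty environment.
```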

One thing left to do: you haven't synchronized your project on Github yet.

  • Open the version control interface with Tools → Version Control → Commit ...
  • Click on each file's check-box in turn to "stage" it, write a meaningful commit message, and click on Commit. Close the log windows.
  • Finally: click on Push to upload all changes to your github repository.
  • In your browser, navigate back to your github repository and verify that all changes have arrived.


Your project setup is now complete. To review the ideas we covered here, look up the following two RStudio support pages:


 


 
That is all.


 

Footnotes and references

 



 


 
Ask, if things don't work for you!
If anything about the assignment is not clear to you, please ask on the mailing list. You can be certain that others will have had similar problems. Success comes from joining the conversation.
... are required reading.


 


