Difference between revisions of "RPR-OBJECTS-Data frames"

From "A B C"
Line 19: Line 19:
  
  
{{DEV}}
+
{{LIVE}}
  
 
{{Vspace}}
 
{{Vspace}}
Line 47: Line 47:
 
=== Objectives ===
 
=== Objectives ===
 
<!-- included from "../components/RPR-Objects-Dataframes.components.wtxt", section: "objectives" -->
 
<!-- included from "../components/RPR-Objects-Dataframes.components.wtxt", section: "objectives" -->
...
+
This unit will ...
 +
* ... introduce ;
 +
* ... discuss ;
 +
* ... teach ;
  
 
{{Vspace}}
 
{{Vspace}}
Line 54: Line 57:
 
=== Outcomes ===
 
=== Outcomes ===
 
<!-- included from "../components/RPR-Objects-Dataframes.components.wtxt", section: "outcomes" -->
 
<!-- included from "../components/RPR-Objects-Dataframes.components.wtxt", section: "outcomes" -->
...
+
After working through this unit you ...
 +
* ... have done;
 +
* ... know how ;
 +
* ... can ;
  
 
{{Vspace}}
 
{{Vspace}}
Line 178: Line 184:
 
:2017-08-05
 
:2017-08-05
 
<b>Modified:</b><br />
 
<b>Modified:</b><br />
:2017-08-05
+
:2017-09-10
 
<b>Version:</b><br />
 
<b>Version:</b><br />
:0.1
+
:1.0
 
<b>Version history:</b><br />
 
<b>Version history:</b><br />
*0.1 First stub
+
*1.0 Completed to first live version
 +
*0.1 Material collected from previous tutorial
 
</div>
 
</div>
 
[[Category:ABC-units]]
 
[[Category:ABC-units]]

Revision as of 15:57, 10 September 2017

Abstract

...


 


This unit ...

Prerequisites

You need to complete the following units before beginning this one:


 


Objectives

This unit will ...

  • ... introduce ;
  • ... discuss ;
  • ... teach ;


 


Outcomes

After working through this unit you ...

  • ... have done;
  • ... know how ;
  • ... can ;


 


Deliverables

  • Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
  • Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
  • Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.


 


Evaluation

Evaluation: NA

This unit is not evaluated for course marks.


 


Contents

Data frames

Data frames are one of the most important types of data object in bioinformatics: they emulate our mental model of data in a spreadsheet, with rows of records and named columns that can each hold a different type of data, and they can be used to implement data models.

Reading external data from an input file usually yields a data frame. The file below, plasmidData.tsv, is included with the BasicSetup project files; you can click on it in the Files Pane to open and inspect it.

Name	Size	Marker	Ori	Sites
pUC19	2686	Amp	ColE1	EcoRI, SacI, SmaI, BamHI, XbaI, PstI, HindIII
pBR322	4361	Amp, Tet	ColE1	EcoRI, ClaI, HindIII
pACYC184	4245	Tet, Cam	p15A	ClaI, HindIII

This data set uses tabs as column separators and has a header line. Similar files can be exported from Excel or other spreadsheet programs. Read it into a data frame as follows:

plasmidData <- read.table("plasmidData.tsv", sep="\t", header=TRUE, stringsAsFactors = FALSE)
plasmidData   # show what the data frame contains
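
If you do not have the file at hand, the following sketch reads the same four lines from a string instead and then inspects the result. It assumes only base R; the variable name plasmidText is made up for this illustration, and passing the data through the text argument of read.table() is a convenience for the example, not how the unit's project files are meant to be used.

```r
# Self-contained sketch: the same four lines as plasmidData.tsv, supplied as
# text so that no file is needed. "\t" is the tab that separates columns.
# (plasmidText is a hypothetical name used only in this example.)
plasmidText <- paste(
  "Name\tSize\tMarker\tOri\tSites",
  "pUC19\t2686\tAmp\tColE1\tEcoRI, SacI, SmaI, BamHI, XbaI, PstI, HindIII",
  "pBR322\t4361\tAmp, Tet\tColE1\tEcoRI, ClaI, HindIII",
  "pACYC184\t4245\tTet, Cam\tp15A\tClaI, HindIII",
  sep = "\n")

plasmidData <- read.table(text = plasmidText, sep = "\t", header = TRUE,
                          stringsAsFactors = FALSE)

dim(plasmidData)             # number of rows, number of columns
colnames(plasmidData)        # column names, taken from the header line
str(plasmidData)             # compact overview: one line per column
sapply(plasmidData, class)   # the class of each column
```

Note that the columns do not all have the same type: Size is read as a number, while the other columns are character strings.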

Note the argument stringsAsFactors = FALSE. If this is TRUE instead (which was the default in R versions before 4.0.0), R will convert all strings in the input to factors, and this may lead to problems. Make it a habit to turn this behaviour off; you can always turn a column of strings into a factor when you actually mean to have factors.
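
To see the difference, here is a minimal sketch that reads two columns of the plasmid table both ways. The variable names txt, asFactors and asStrings, and the value "unknown", are made up for this illustration; the problem shown is only one example of what can go wrong with unintended factors.

```r
# Two columns of the plasmid table, supplied as text to keep the example
# self-contained (txt, asFactors, asStrings are hypothetical names).
txt <- "Name\tOri\npUC19\tColE1\npBR322\tColE1\npACYC184\tp15A"

asFactors <- read.table(text = txt, sep = "\t", header = TRUE,
                        stringsAsFactors = TRUE)
asStrings <- read.table(text = txt, sep = "\t", header = TRUE,
                        stringsAsFactors = FALSE)

class(asFactors$Ori)   # "factor"
class(asStrings$Ori)   # "character"

# One kind of problem: a factor only accepts values that are among its
# levels. Uncommenting the next line gives a warning and stores NA.
# asFactors$Ori[1] <- "unknown"

# A character column accepts any new string ...
asStrings$Ori[1] <- "unknown"

# ... and can be converted once you actually mean to have a factor.
asStrings$Ori <- factor(asStrings$Ori)
levels(asStrings$Ori)
```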

You can view the contents of the data frame by clicking on the spreadsheet icon next to its name in the Environment Pane.


 



 


Further reading, links and resources

 


Notes


 


Self-evaluation

 



 




 

If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.



 

About ...
 
Author:

Boris Steipe <boris.steipe@utoronto.ca>

Created:

2017-08-05

Modified:

2017-09-10

Version:

1.0

Version history:

  • 1.0 Completed to first live version
  • 0.1 Material collected from previous tutorial

This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.