Literate Programming with R

Contents
RMarkdown
R Notebooks
Further Reading
Questions, comments
References

Expected Preparations:

	[RPR] Introduction
	The units listed above are part of this course and contain important preparatory material.

Keywords: Literate programming principles; R Markdown; R Notebooks

Objectives:

This unit will …

… introduce the philosophy behind “Literate Programming”;
… teach the practice with an example that uses knitr in the RStudio environment;
… point you to R notebooks.

Outcomes:

After working through this unit you …

… can produce your own “Literate Programs” with knitr or in an R notebook.

Deliverables:

Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.

Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don’t overlook these.

Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.

Evaluation:

NA: This unit is not evaluated for course marks.

Documentation of results using R markdown and R notebooks.

Literate programming is the idea that software is best described in a natural language, focussing on the logic of the program, i.e. the why of code, not the what. The goal is to ensure that model, code, and documentation become a single unit, and that all this information is stored in one and only one location. The product should be consistent between its described goals and its implementation, seamless in capturing the process from start (data input) to end (visualization, interpretation), and reversible (between analysis, design and implementation).

In literate programming, narrative and computer code are kept in the same file. This source document is typically written in Markdown or LaTeX syntax and includes the programming code as well as text annotations, tables, formulas etc. The supporting software can weave human-readable documentation from this, or tangle executable code. Literate programming with both Markdown and LaTeX is supported by RStudio, and this makes the RStudio interface a useful development environment for this paradigm. It is also easy to edit source files with a different editor and process them in base R with the Sweave() and Stangle() functions of the utils package, or with the knitr package. In our context here we will use RStudio because it conveniently integrates the functionality we need.


knitr is an R package for literate programming. It is integrated with RStudio.
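The weave and tangle steps can also be tried from the R console. The following is a minimal sketch, assuming the knitr package is installed; the file names and the toy chunk are invented for illustration. It uses knitr::knit() to weave a Markdown document and knitr::purl() to tangle the code into a plain R script.

```r
# A tiny literate source: narrative plus one R chunk.
# (File names and contents are hypothetical, for illustration only.)
fence <- strrep("`", 3)                  # three backticks delimit a chunk
src <- c("# The answer",
         "",
         "We multiply six by seven.",
         "",
         paste0(fence, "{r computeAnswer}"),
         "x <- 6 * 7",
         "x",
         fence)

workDir <- tempfile("literate")
dir.create(workDir)
rmdFile <- file.path(workDir, "answer.Rmd")
mdFile  <- file.path(workDir, "answer.md")
rFile   <- file.path(workDir, "answer.R")
writeLines(src, rmdFile)

# Weave: run the chunk and write human-readable Markdown.
knitr::knit(rmdFile, output = mdFile, envir = new.env(), quiet = TRUE)

# Tangle: extract only the executable code into an R script.
knitr::purl(rmdFile, output = rFile, documentation = 0, quiet = TRUE)

cat(readLines(mdFile), sep = "\n")       # narrative, code, and result
cat(readLines(rFile), sep = "\n")        # code only
```

Both outputs come from the same source file, which is the point of the approach: there is only one place to edit.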



RMarkdown
Markdown is an extremely simple and informal way of structuring documents that is useful if for some reason you feel HTML is too complicated. That’s really all it does: format documents in a simple way so they can be displayed as Web pages. For Markdown documentation, see here. The concept is quite similar to Wiki markup, but the syntax is (regrettably) different, and for a number of features there are (regrettably) several different ways to achieve the same results.

RMarkdown is an R package that is integrated with RStudio and allows integrating R code with Markdown documents. knitr can work with Markdown files, and this gives additional output options, such as PDF and MS Word documents.

Let’s give it a try: we’ll write and document an R function that will find us a random phobia to ponder on.

Task…



* Open your normal RStudio session.
* Select File ▸ New File ▸ RMarkdown…. When you do this the first time, RStudio will ask you whether you want to install/update a number of required packages. Click Yes.
* Enter “Random Phobia” as the Title and your name as the Author, select to create a Document, and check HTML as the default output option. RStudio will load some default text and markup into the script pane which we can edit.
* Choose Help ▸ Cheatsheets ▸ R Markdown Cheat Sheet and R Markdown Reference Guide to download two PDFs via your browser. Browse the contents to get an idea where you can clarify concepts as you go through this example.

Let’s introduce our plan:

* We’ll create a markdown document containing code and explanations.
* We’ll knit it into an HTML document and examine it.
* Then we’ll have a look at its structure, to learn how it works.

**Creating the Document**

* First, delete everything except the header block from your new Markdown document.
* There is a document in the data folder called data/RandomPhobiaPage.txt. Open that, copy its entire contents, and paste it under the header block.
* Save the document as myScripts/RandomPhobias.Rmd

**Knitting the Document to HTML**

* Right above the edit window, next to the search (magnifying glass) icon, there is an icon of a ball of wool with a knitting needle … click on **Knit**.
* The HTML document will be created and opened.
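The Knit button has a console equivalent in rmarkdown::render(). The sketch below assumes that the rmarkdown package and pandoc are available (pandoc is bundled with RStudio); it writes a placeholder file to a temporary directory rather than using the course's RandomPhobiaPage.txt, so the file contents here are invented.

```r
# Write a minimal R Markdown file to a temporary location.
# (The body text is a placeholder, not the course's sample page.)
rmdFile <- file.path(tempdir(), "RandomPhobias.Rmd")
writeLines(c("---",
             "title: \"Random Phobia\"",
             "output: html_document",
             "---",
             "",
             "Some *Markdown* text."),
           rmdFile)

# Rendering needs the rmarkdown package and pandoc.
canRender <- requireNamespace("rmarkdown", quietly = TRUE) &&
             rmarkdown::pandoc_available()

if (canRender) {
  # render() knits the chunks and converts the result to HTML;
  # it returns the path of the output file.
  htmlFile <- rmarkdown::render(rmdFile,
                                output_format = "html_document",
                                quiet = TRUE)
  message("Rendered: ", htmlFile)
} else {
  message("rmarkdown or pandoc not available; nothing rendered.")
}
```

Other values for output_format, such as "word_document" or "pdf_document", select the additional output options mentioned above; PDF output additionally requires a LaTeX installation.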

**Inspecting the Markdown source**

Note the following Markdown elements in the code:

* After the header block, three “backticks” delimit a functional block whose type is given in curly braces. In this case the block injects a local bit of CSS to create striped tables later on. This is entirely optional and can be deleted in your own code if you have no need for it.
* Then there is another code block. This one is crucial. The first letter in the curly braces is r, so this bit of code will be run as R code. include=FALSE ensures that neither the code nor its output appears in the rendered document, but the code is still run: here it sets a knitr:: option. This is what is known as a “code chunk”. It is delimited by three backticks (```) and has directives and options for the chunk in the first line. It is labelled as R code, and note that after the {r we have added an (optional) label for the chunk. That is useful, because we can rapidly navigate between chunks (click on the navigation menu at the bottom of the script pane), and we can refer to the labels to execute chunks that are coded later in the document at an earlier stage. This is an important idea of literate programming: the flow of the document should not be determined by the requirements of the code, but by the logic of the narrative. TL;DR: label your chunks. It’s useful.
  Other options can be added after a comma. For example, we can suppress printing of a chunk's code into the document, if we think it is not relevant for the document, by adding the option echo=FALSE¹.
* Then the text begins. It is written in Markdown syntax, which is a simple way to annotate text. Note the conventions to create headers and links etc.
* To execute a particular chunk, simply place the cursor into the chunk and select Chunks ▸ Run Current Chunk from the menu at the top of the script pane.
* More text describes how we screen-scrape tables from a Wikipedia page, and the code chunks run that code.
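To make the anatomy of a chunk header concrete, here is a small sketch that assembles a setup chunk and a labelled chunk as text, knits them with knitr::knit(), and prints the result. The labels (setup, showValue), the variable name, and the option that is set are invented for illustration; the chunks in the sample document may differ.

```r
fence <- strrep("`", 3)                        # three backticks delimit a chunk
src <- c(
  paste0(fence, "{r setup, include=FALSE}"),   # label "setup": runs, shows nothing
  "knitr::opts_chunk$set(comment = '#>')",     # an option for all later chunks
  "hidden <- 'set in the setup chunk'",
  fence,
  "",
  "Some narrative text.",
  "",
  paste0(fence, "{r showValue, echo=FALSE}"),  # label "showValue": code hidden
  "hidden",
  fence)

rmdFile <- tempfile(fileext = ".Rmd")
mdFile  <- tempfile(fileext = ".md")
writeLines(src, rmdFile)
knitr::knit(rmdFile, output = mdFile, envir = new.env(), quiet = TRUE)

out <- readLines(mdFile)
cat(out, sep = "\n")
```

The woven output contains the narrative and the value of hidden, but none of the code: the setup chunk is suppressed entirely by include=FALSE, and the second chunk shows only its result because of echo=FALSE.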

Some things to note about the screen-scraping chunk:

* Enclosing a piece of text in “backticks” (`Text...`) formats that text as “code”, typically in a fixed-width font.
* For this chunk we have set the option cache to TRUE. This is a very useful and well conceived mechanism that avoids recomputing code that takes a long time or should otherwise be limited. The results of a cached chunk of code are stored locally and retrieved when the file is woven. Only if anything within the chunk is changed (or cache is set to FALSE) is the chunk evaluated again. This prevents us from excessively pounding on Websites as we develop our script, and it can save a lot of time as our projects become large and the calculations become complex.
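The effect of cache=TRUE can be observed by knitting the same file twice. This is a sketch under assumptions: the chunk label slowStep, the one-second pause that stands in for a slow download, and the temporary cache location are all made up for the demonstration.

```r
fence <- strrep("`", 3)
src <- c(
  paste0(fence, "{r slowStep, cache=TRUE}"),
  "Sys.sleep(1)                     # stands in for a slow download",
  "stamp <- format(Sys.time(), '%H:%M:%OS3')",
  "stamp",
  fence)

workDir <- tempfile("cacheDemo")
dir.create(workDir)
rmdFile <- file.path(workDir, "cached.Rmd")
writeLines(src, rmdFile)

# Keep the cache in our temporary directory; restore the option afterwards.
oldPath <- knitr::opts_chunk$get("cache.path")
knitr::opts_chunk$set(cache.path = paste0(workDir, "/cache/"))

knitOnce <- function(outName) {
  outFile <- file.path(workDir, outName)
  elapsed <- system.time(
    knitr::knit(rmdFile, output = outFile, envir = new.env(), quiet = TRUE)
  )[["elapsed"]]
  list(lines = readLines(outFile), seconds = elapsed)
}

first  <- knitOnce("first.md")    # chunk is evaluated, result is stored
second <- knitOnce("second.md")   # chunk is unchanged, result comes from cache

knitr::opts_chunk$set(cache.path = oldPath)

identical(first$lines, second$lines)   # same time stamp in both outputs?
c(first = first$seconds, second = second$seconds)
```

If the time stamp is the same in both outputs, the second run did not evaluate the chunk again. Editing anything inside the chunk invalidates the cache and triggers a fresh evaluation.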

As the code continues, we have more of this mix of markup and R code. Two important options in the chunk header:

* echo=FALSE prevents the contents of the chunk from being printed. We don’t want this code in our output, we only want the result.
* results='asis' prevents the results from being marked up. The raw HTML is sent to the output document.
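The difference these two options make is easiest to see side by side. In this sketch (chunk labels and HTML snippets are invented), the first chunk uses the defaults and the second uses echo=FALSE together with results='asis'.

```r
fence <- strrep("`", 3)
src <- c(
  paste0(fence, "{r marked}"),                           # defaults
  "cat('<b>bold?</b>\\n')",
  fence,
  "",
  paste0(fence, "{r raw, echo=FALSE, results='asis'}"),  # no code, raw result
  "cat('<b>bold!</b>\\n')",
  fence)

rmdFile <- tempfile(fileext = ".Rmd")
mdFile  <- tempfile(fileext = ".md")
writeLines(src, rmdFile)
knitr::knit(rmdFile, output = mdFile, envir = new.env(), quiet = TRUE)

out <- readLines(mdFile)
cat(out, sep = "\n")
```

In the woven file the first chunk appears as code followed by a commented output block, so its HTML tags would be displayed literally. The second chunk contributes only the raw HTML line, which the browser then interprets as markup.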

But note the following: the code chunk that creates the table calls a function randRow(M) that we have not defined yet. In an R script this would not work. But in a knitr document we can reference a chunk of code anywhere in (and outside of) the document and thus define our function before the renderPhobiaTable chunk is executed. This is important for literate programming, where we don’t want to be constrained by the requirements of the code.
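One way to achieve this in knitr is the chunk option ref.label, which makes a chunk run the code of another labelled chunk. Whether the sample document uses exactly this mechanism is an assumption; the sketch below only illustrates the idea. The body of randRow() and the labels loadFunctions and useFunction are hypothetical.

```r
fence <- strrep("`", 3)
src <- c(
  # 1. An empty chunk that pulls in the code of the chunk labelled "randRow".
  paste0(fence, "{r loadFunctions, ref.label='randRow', include=FALSE}"),
  fence,
  "",
  # 2. The narrative can use randRow() although it is defined further down.
  paste0(fence, "{r useFunction}"),
  "M <- matrix(1:6, nrow = 3)",
  "nrow(randRow(M))",
  fence,
  "",
  # 3. The definition, placed where it suits the narrative.
  paste0(fence, "{r randRow}"),
  "randRow <- function(M) {",
  "  M[sample(nrow(M), 1), , drop = FALSE]   # one random row, kept as a matrix",
  "}",
  fence)

rmdFile <- tempfile(fileext = ".Rmd")
mdFile  <- tempfile(fileext = ".md")
writeLines(src, rmdFile)
knitr::knit(rmdFile, output = mdFile, envir = new.env(), quiet = TRUE)

out <- readLines(mdFile)
cat(out, sep = "\n")
```

The function is used in the second chunk and defined only in the third, yet the document knits without error because the first chunk has already executed the definition by reference.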

You now have a working Markdown page, and that should go a long way toward helping you write your own.


 



R Notebooks
R Notebooks take the concept into the RStudio editor itself, rather than constructing a Web page. On one hand, you become dependent on the RStudio editor, on the other hand, you directly edit and comment as you are developing. This is true “Literate Programming”.

Task…


Read about the concept here
and follow along with the exercise.


 




Further Reading


R Markdown Reference Guide (PDF @ RStudio)

Using rvest to Scrape an HTML Table (R Bloggers)

Thoughts on notebooks and literate programming (Tony Hirst, via R Bloggers)




Questions, comments
If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.

Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.


References






About this page …



Page ID: RPR-Literate_programming

Author: Boris Steipe (<boris.steipe@utoronto.ca>)

Created: 2017-09-17

Last modified: 2022-09-14

Version: 2.0

Version History:

–  2.0 Update after complete rewrite of sample .Rmd - don’t assemble the page piecewise
–  1.2 Change from require() to requireNamespace() and use <package>::<function>() idiom.
–  1.1 bugfix, comment on header tags in a table, add an eval question
–  1.0 First live version
–  0.1 First stub

Tagged with:

–  Unit
–  Live
–  Has further reading








 

[END]




¹ For a complete list of chunk options, see the documentation by knitr’s author, Xie Yihui.
