Rmarkdown

From 太極
Revision as of 13:27, 20 November 2019 by Brb (talk | contribs) (→‎Chunk options)

Markdown language

According to wikipedia:

Markdown is a lightweight markup language, originally created by John Gruber with substantial contributions from Aaron Swartz, allowing people “to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)”.

  • Markup is a general term for content formatting - such as HTML - but markdown is a library that generates HTML markup.
  • Convert mediawiki to markdown using online conversion tool from pandoc.

Github markdown Readme.md

How to nest code within a list using Markdown

https://meta.stackexchange.com/questions/3792/how-to-nest-code-within-a-list-using-markdown

Continuous publication

Open collaborative writing with Manubot Himmelstein et al 2019

Syntax

Comment: https://stackoverflow.com/questions/4823468/comments-in-markdown. The HTML comment method does not work for me. I need to try Shift+Cmd+C instead.

Comment out some chunks/part of Rmd file.

Table

Simple example

| Column 1       | Column 2     | Column 3     |
| :------------- | :----------: | -----------: |
|  Cell Contents | More Stuff   | And Again    |
| You Can Also   | Put Pipes In | Like this \| |

Rmarkdown

HTML5 slides examples

Software requirement

Slide #22 gives instructions to create

  • regular html file by using RStudio -> Knit HTML button
  • HTML5 slides by using pandoc from command line.

Files:

  • Rmd source: 009-slides.Rmd. Note that IE 8 was not supported by GitHub. For IE 9, be sure to turn off "Compatibility View".
  • markdown output: 009-slides.md
  • HTML output: 009-slides.html

We can create an Rmd source file in RStudio by File -> New -> R Markdown.

There are 4 ways to produce slides with pandoc

  • S5
  • DZSlides
  • Slidy
  • Slideous

Use the markdown file (md) and convert it with pandoc

pandoc -s -S -i -t dzslides --mathjax html5_slides.md -o html5_slides.html
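A minimal sketch of the same conversion for all four slide formats, built from R. The writer names (s5, dzslides, slidy, slideous) are pandoc's own; the output file names are my assumption. The -S flag is left out here because it was removed in pandoc 2.0.

```r
# Pandoc writer names for the four slide formats listed above
writers <- c("s5", "dzslides", "slidy", "slideous")

# Input file name taken from the command above
input <- "html5_slides.md"

# One command per writer: -s standalone, -i incremental lists
# (output names slides_<writer>.html are hypothetical)
commands <- sprintf("pandoc -s -i -t %s --mathjax %s -o slides_%s.html",
                    writers, input, writers)
print(commands)

# Run them only if pandoc is installed and the input file exists
if (nzchar(Sys.which("pandoc")) && file.exists(input)) {
  for (cmd in commands) system(cmd)
}
```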

If we are comfortable with HTML and CSS code, open the html file (generated by pandoc) and modify the CSS style at will.

Syntax

Mastering R presentations

YAML

Some examples

---
title: "My Title"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  pdf_document:
    toc: true
    number_sections: true
classoption: landscape    
---

---
title: "My Title"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output: latex_document
---
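The inline R expression in the date field can be tried in the console first. This is a small sketch with a fixed date (my choice) instead of Sys.time(); the month name depends on the locale, so no output is shown.

```r
# A fixed date is used instead of Sys.time() so the result is reproducible
fixed_date <- as.Date("2019-11-20")

# Same format string as in the YAML header: day, full month name, year
title_date <- format(fixed_date, "%d %B, %Y")
print(title_date)
```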

Chunk options

Some options:

  • echo=FALSE. Whether to include the R source code in the output file.
  • message=FALSE. Whether to preserve messages emitted by message() (similar to warning).
  • results='hide'. Hide results; this option only applies to normal R output such as print() or cat() (not warnings, messages or errors).
  • include. Whether to include the chunk output in the final output document.
  • warning. Whether to preserve warnings (produced by warning()) in the output, as when we run R code in a terminal. If FALSE, all warnings are printed in the console instead of the output document.
  • error. Whether to preserve errors (from stop()). By default in knitr, the evaluation does not stop on errors; if we want R to stop on errors, we need to set this option to FALSE.
  • comment. The prefix added to each line of R output ("##" by default); set comment="" to drop it. See Remove Hashes in R Output from R Markdown and Knitr.
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, cache = TRUE, warning = FALSE, 
                      message = FALSE, verbose = FALSE)
```

Figure caption

Figure size

Different ways to set figure size in RMarkdown

1. Define size in YAML header

---
title: "My Document"
output:
  html_document:
    fig_width: 6
    fig_height: 4
---

2. Global chunk

knitr::opts_chunk$set(fig.width=12, fig.height=8)

3. Chunk options

{r fig1, fig.height = 3, fig.width = 5}

{r fig3, fig.width = 5, fig.asp = .62}

{r fig4, out.width = '40%'}
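For the fig.asp variant, knitr derives the height from fig.width * fig.asp. A quick check of the numbers used in chunk fig3 above:

```r
fig_width <- 5     # inches, as in chunk fig3
fig_asp <- 0.62    # aspect ratio = height / width

# Height that knitr would use when fig.asp is given
fig_height <- fig_width * fig_asp
print(fig_height)
```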

TinyTex for pdf output

https://github.com/yihui/tinytex

On NIH/biowulf, there is no 'pdflatex' program. So a pdf file cannot be generated.

I installed tinytex. At the end, many LaTeX executables (pdflatex, bibtex, ...) are installed under the ~/bin directory.

> install.packages("tinytex")
trying URL 'https://yihui.name/gh/tinytex/tools/install-unx.sh'
Content type 'text/plain; charset=utf-8' length 616 bytes
==================================================
...
tlmgr: package log updated: /spin1/home/linux/USERNAME/.TinyTeX/texmf-var/web2c/tlmgr.log
TinyTeX installed to /spin1/home/linux/USERNAME/.TinyTeX
You may have to restart your system after installing TinyTeX to make sure ~/bin appears in your PATH variable (https://github.com/yihui/tinytex/issues/16).

Built-in examples from rmarkdown

# This is done on my ODroid xu4 running Ubuntu Mate 15.10 (Wily)
# I used sudo apt-get install pandoc in shell
# and install.packages("rmarkdown") in R 3.2.3

library(rmarkdown)
rmarkdown::render("~/R/armv7l-unknown-linux-gnueabihf-library/3.2/rmarkdown/rmarkdown/templates/html_vignette/skeleton/skeleton.Rmd")
# the output <skeleton.html> is located under the same dir as <skeleton.Rmd>

Note that the images in the html file are embedded as Base64 data.

Templates

Knit button

  • It calls rmarkdown::render()
  • R Markdown = knitr + Pandoc
  • rmarkdown::render() = knitr::knit() + a system() call to pandoc
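A rough sketch of the two steps done by hand, assuming a throwaway demo.Rmd (hypothetical file and content) in a temp directory. The real rmarkdown::render() does much more (output formats, templates, dependencies); this only shows the knit-then-pandoc idea, and each step is skipped if knitr or pandoc is missing.

```r
# Write a tiny Rmd file to a temp directory
work_dir <- tempfile("knit_demo_")
dir.create(work_dir)
rmd_file <- file.path(work_dir, "demo.Rmd")
md_file <- file.path(work_dir, "demo.md")
html_file <- file.path(work_dir, "demo.html")

fence <- strrep("`", 3)  # three backticks, built here to keep this listing simple
writeLines(c("# Demo", "", paste0(fence, "{r}"), "1 + 1", fence), rmd_file)

# Step 1: knitr runs the R chunks and writes plain markdown
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit(rmd_file, output = md_file, quiet = TRUE)
}

# Step 2: pandoc converts the markdown to the final format
if (file.exists(md_file) && nzchar(Sys.which("pandoc"))) {
  system2("pandoc", c(shQuote(md_file), "-s", "-o", shQuote(html_file)))
}
```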

Pandoc's Markdown

Originally, Pandoc's Markdown was aimed at HTML output.

Extensions

  • YAML metadata
  • Latex Math
  • syntax highlight
  • embed raw HTML/Latex (raw HTML only works for HTML output and raw Latex only for Latex/pdf output)
  • tables
  • footnotes
  • citations

Types of output documents

  • Latex/pdf, HTML, Word
  • beamer, ioslides, Slidy, reveal.js
  • Ebooks
  • ...

Some examples:

pandoc test.md -o test.html
pandoc test.md -s --mathjax -o test.html
pandoc test.md -o test.docx
pandoc test.md -o test.pdf
pandoc test.md --latex-engine=xelatex -o test.pdf
pandoc test.md -o test.epub

Check out ?rmarkdown::pandoc_convert.

When you click the Knit button in RStudio, you will see the actual command that is executed.

Global options

Suppose I want to create simple markdown-only documentation without executing any code. Instead of adding eval = FALSE to each code chunk, I can insert the following between the YAML header and the content. Even bash chunks will not be executed.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
```

Examples/gallery

Some examples of creating papers (with references) based on knitr can be found on the Papers and reports section of the knitr website.

Read the docs Sphinx theme and journal article formats

http://blog.rstudio.org/2016/03/21/r-markdown-custom-formats/

rmarkdown news

Useful tricks when including images in Rmarkdown documents

http://blog.revolutionanalytics.com/2017/06/rmarkdown-tricks.html

tables

kable

Quicker knitr kables in RStudio notebook

xtable

The package assumes the document type is html or pdf. Other types like doc do not work.

xtableList() can create a list of tables; see xtable List of Tables Gallery.

Below is an Rmarkdown example that would generate a pdf file with desired tables. Pay attention to various options here because the default options won't work.

---
title: "xtable in rmarkdown"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message=FALSE)
```

```{r , results='asis'}
require(xtable)
data(mtcars)
mtcars <- mtcars[, 1:6]
mtcarsList <- split(mtcars, f = mtcars$cyl)
### Reduce the size of the list elements
mtcarsList[[1]] <- mtcarsList[[1]][1,]
mtcarsList[[2]] <- mtcarsList[[2]][1:2,]
mtcarsList[[3]] <- mtcarsList[[3]][1:3,]
attr(mtcarsList, "subheadings") <- paste0("Number of cylinders = ",
names(mtcarsList))
attr(mtcarsList, "message") <- c("Line 1 of Message",
                                 "Line 2 of Message")
xList <- xtableList(mtcarsList)
print.xtableList(xList, comment=FALSE)
```

The following example gives alternating colors to rows. Note the header-includes and tables options in the YAML section.

---
title: "testxtable"
header-includes:
   - \usepackage{colortbl}
output: pdf_document
tables: true
---

```{r cars, results='asis'}
library(xtable)
mydf <- data.frame(id = 1:10, var1 = rnorm(10), var2 = runif(10))
rws <- seq(1, (nrow(mydf)-1), by = 2)
col <- rep("\\rowcolor[gray]{0.95}", length(rws))
print(xtable(mydf), booktabs = TRUE,
      add.to.row = list(pos = as.list(rws), command = col))
```

Tips

print(xt, size="\\fontsize{9pt}{10pt}\\selectfont") 

kableExtra

https://cran.r-project.org/web/packages/kableExtra/index.html

Gmisc: create Table 1 used in medical articles

https://cran.r-project.org/web/packages/Gmisc/index.html

tableone

https://cran.r-project.org/web/packages/tableone/

printr

https://cran.r-project.org/web/packages/printr/index.html

DT

https://cran.r-project.org/web/packages/DT/index.html

sparkTable

https://cran.r-project.org/web/packages/sparkTable/

sparkTable: Generating Graphical Tables for Websites and Documents with R. It depends on the Rglpk package which requires the glpk library. However, sparkTable is not maintained anymore.

skimr (for useful and tidy summary statistics) is newer and provides a histogram next to each variable.

How to align table and plot in rmarkdown html_document

https://stackoverflow.com/a/54359010

RMarkdown Template that Manages Academic Affiliations


Converting Rmarkdown to F1000Research LaTeX Format

BiocWorkflowTools package and paper

Internal links

If my section header is written as "## my section Header", then I can link to it by using "[linked phrase](#my-section-header)".

Note here

  • Use one number sign (#) even if it is a subsection
  • Use a hyphen in place of each space character
  • Use lower case even if the header contains capital letters

Another, easier way is to use heading identifiers as described in the pandoc manual. In the header use "# section 1 {#s1}", and in the TOC or a paragraph use "[my section name](#s1)".
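The three rules for automatic identifiers can be sketched as a small R helper. header_to_anchor() is a hypothetical name of mine, and this is a simplified, ASCII-only approximation of what pandoc does, not its actual algorithm.

```r
# Simplified approximation of pandoc's automatic heading identifiers
header_to_anchor <- function(header) {
  text <- sub("^#+\\s*", "", header)       # drop the leading number signs
  text <- tolower(trimws(text))            # lower case only
  text <- gsub("[^a-z0-9_. -]", "", text)  # keep letters, digits, _ . - and spaces
  text <- gsub("\\s+", "-", text)          # spaces become hyphens
  text <- sub("^[^a-z]+", "", text)        # identifier starts at the first letter
  paste0("#", text)                        # a single # regardless of heading level
}

header_to_anchor("## my section Header")
# → "#my-section-header"
```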

Hyperlink color in pdf

Add below to the yaml for PDF documents. See R Markdown: The Definitive Guide.

urlcolor: blue 

Colored text

Blue text. See How to apply color in Markdown?

icons for rmarkdown

https://ropensci.org/technotes/2018/05/15/icon/

Reproducible data analysis

Interactive document: Shiny

See R Markdown Cheat Sheet.

When I follow the directions and add the code to the end of this Rmd file, I see

  • I can't run "Build" anymore. An error comes up: Error in numericInput("n", "How many cars?", 5) : could not find function "numericInput".
  • After I click "Run Document", the Rmd file is displayed in either RStudio or a regular browser using R's built-in web server (http://127.0.0.1:YYYY/XXX.Rmd).

Automatic document production with R

https://itsalocke.com/improving-automatic-document-production-with-r/

Documents with logos, watermarks, and corporate styles

http://ellisp.github.io/blog/2017/09/09/rmarkdown

rticles and pinp for articles

Tips

Cache

  • If my markdown file is called abc.Rmd, then two cache directories (abc_cache & abc_files) will be created by default. See rmarkdown’s site generator.
    • *_files: figures.
    • *_cache: *.RData, *.rdb, __packages.
    • When knitting failed due to my error, I will rm these two directories, fix my code and knit again.
  • Cache not work
  • Examples
    > system.time(rmarkdown::render("~/Downloads/tmp.Rmd")) # first time
    ...
    Output created: tmp.html
       user  system elapsed 
      3.123   0.108   5.426 
    # It will create two directories: tmp_files, tmp_cache
    
    > system.time(rmarkdown::render("~/Downloads/tmp.Rmd")) # Second time
    ...
    Output created: tmp.html
       user  system elapsed 
      0.239   0.019   0.317

warning

If we enable cache, be careful about the consequences of modifying just one chunk. In the following example, if we modify only chunk1, the cached result of chunk2 is not updated.

```{r chunk1}
x <- 1
```

```{r chunk2, echo=FALSE}
print(x)
```
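One possible remedy, as a sketch: knitr has an autodep option that tries to detect dependencies between cached chunks from the global variables they use (alternatively, dependson = "chunk1" can be set on chunk2). I have not checked that it covers every case, so treat this as an assumption to test on your own document.

```r
# Chunk options asking knitr to track dependencies between cached chunks
cache_opts <- list(cache = TRUE, autodep = TRUE)

# Apply them globally, e.g. from the setup chunk (skipped if knitr is missing)
if (requireNamespace("knitr", quietly = TRUE)) {
  do.call(knitr::opts_chunk$set, cache_opts)
}
```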

interfere with RStudio

If I use RStudio to knit an Rmd file that has cache = TRUE, RStudio remembers this option. For example, if I then open another Rmd file and run rmarkdown::render("XXX.Rmd"), it creates an XXX_cache folder even if the new Rmd file has cache = FALSE. Solution: knit the new file in a terminal.

read knitr/Rmd cache

See read knitr/Rmd cache in interactive session?. Use the lazyLoad() function without specifying any file extension.

lazyLoad("unnamed-chunk-1_3c3ad57469a118cd9c584c8d941d2c09")

Note:

  1. lazyLoad("~/Path/unnamed-chunk-1_3c3ad57469a118cd9c584c8d941d2c09") will give an error cannot open 'xxx.rdb:' No such file or directory. It does not recognize the symbol "~/".
  2. It is better to use a chunk name without "-" (or some special characters like ".") in order to find the file using the chunk name. Then I can use grep("myChunkName_.+\\.rdb", ., value=T) %>% sub("\\..*$", "", .) to find the chunk's filename. Also when we use pipe to run lazyLoad(), it does not work unless we specify an argument; see Pipe in magrittr package is not working for function load().
list.files('myProject/fileName_cache/latex', full.names = T) %>% 
  grep("myChunkName_.+\\.rdb", ., value=T) %>% 
  sub("\\..*$", "", .) %>% lazyLoad(envir = globalenv())
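For the "~/" problem in note 1, one workaround is to expand the path before calling lazyLoad(). A minimal sketch, reusing the hypothetical path from the note; the load is skipped when the cache files do not exist.

```r
# Hypothetical cache entry, copied from note 1 above (no file extension)
cache_entry <- "~/Path/unnamed-chunk-1_3c3ad57469a118cd9c584c8d941d2c09"

# path.expand() replaces a leading "~" with the home directory
full_path <- path.expand(cache_entry)

# lazyLoad() needs both <name>.rdb and <name>.rdx, so check before loading
if (all(file.exists(paste0(full_path, c(".rdb", ".rdx"))))) {
  lazyLoad(full_path, envir = globalenv())
} else {
  message("No cache found at ", full_path)
}
```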

RStudio

RStudio is the best editor.

Markdown has two drawbacks: 1. it does not support TOC natively. 2. RStudio cannot show headers in the editor.

Therefore, use rmarkdown format instead of markdown.

Writing a R book and self-publishing it in Amazon

Create professional reports from R scripts, with custom styles

How to create professional reports from R scripts, with custom styles

Publish R results

5 amazing free tools that can help with publishing R results and blogging

Scheduling R Markdown Reports via Email

http://www.analyticsforfun.com/2016/01/scheduling-r-markdown-reports-via-email.html

Create presentation file (beamer)

  1. Create an Rmd file first in RStudio by File -> R markdown. Select Presentation, then choose PDF (Beamer) as the output format.
  2. Edit the template created by RStudio.
  3. Click 'Knit pdf' button (Ctrl+Shift+k) to create/display the pdf file.

An example of Rmd is

---
title: "My Example"
author: You Know Me
date: Dec 32, 2014
output: beamer_presentation
---

## R Markdown

This is an R Markdown presentation. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. 
For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any 
embedded R code chunks within the document.

## Slide with Bullets

- Bullet 1
- Bullet 2
- Bullet 3. Mean is $\frac{1}{n} \sum_{i=1}^n x_i$.
$$ 
\mu = \frac{1}{n} \sum_{i=1}^n x_i
$$

## New slide

![picture of BDGE](/home/brb/Pictures/BDGEFinished.png)

## Slide with R Code and Output

```{r}
summary(cars)
```

## Slide with Plot

```{r, echo=FALSE}
plot(cars)
```

R notebook vs R markdown in RStudio

Difference between R MarkDown and R NoteBook

There is no coding difference. The difference is in the rendering. The file extension is the same.

R notebook

  • It adds html_notebook in the output option in the header.
  • You can then preview the rendering quickly without having to knit it (the preview does not execute any of your R code chunks). If you manually 'Run' the chunks, the results show up in the preview.
  • It also refreshes the preview every time you save.
  • However, in that preview you don't have the code output (no figures, no tables, ...) for chunks that have not been run.
  • You can mix several output options in your header so that you can keep the preview and keep your knit options for export.
  • R Notebook does everything R Markdown does, and more.

Table creating packages

stargazer: Produces LaTeX code, HTML/CSS code and ASCII text for well-formatted tables that hold regression analysis results from several models side-by-side, as well as summary statistics

Graphics Output in LaTeX Format

Landscape output

https://stackoverflow.com/questions/25849814/rstudio-rmarkdown-both-portrait-and-landscape-layout-in-a-single-pdf

Break a long document

First World Problems: Very long RMarkdown documents

Request an early exit

https://stackoverflow.com/a/33711413

Bibliographies

Bibliographies in RStudio Markdown are difficult – here’s how to make it easy

bookdown.org

The website is full of open-source books written with R markdown.

It is easy to download the book. Check the download icon on top (Toggle Sidebar, Search, Font settings, Edit, Download, Info).

To build the book, either use the "Build" button on the top right panel or use the command line

bookdown::render_book("index.Rmd", "bookdown::gitbook")
bookdown::render_book("index.Rmd", "bookdown::pdf_book")

For example, to build the r4ds book (website), use

# git clone https://github.com/hadley/r4ds.git
# Open the project in RStudio
# Using 'Build Book' button will give an error
#   Error in rmarkdown::render_site() : No site generator found
devtools::install_github("hadley/r4ds")
bookdown::render_book("index.Rmd", "bookdown::gitbook")

The generated folder _book is 25MB vs 51MB if we use webhttrack.

The bookdown website is easy to navigate using left/right arrow keys.

Figures in a bookdown project are handled differently from regular R markdown files. The full file path does not work. The "~/" or "../" symbol does not work. Symbolic links to directories or files do not work. The only way that works is to create a subdirectory under the directory containing the bookdown index.Rmd file.

Also, the way of including figures is different in R markdown and bookdown. In bookdown, we should include knitr::include_graphics("images/myfile.png") in an R block. Recall that in an R markdown file, we use ![](figures/myfile.png).

Alternatively we can use webhttrack to download the whole website/book without re-building the book in R.

TexLive

TexLive can be installed in 2 ways

  • sudo apt install texlive. It includes the tlmgr utility for package management.
  • Official website

texlive-latex-extra

https://packages.debian.org/sid/texlive-latex-extra

For example, framed and titling packages are included.

tlmgr - TeX Live package manager

https://www.tug.org/texlive/tlmgr.html

Examples

  • Tidy Text Mining with R by Julia Silge and David Robinson (one of many books hosted on BOOKDOWN website).
  • Build 'R for Data Science by Garrett Grolemund, Hadley Wickham' 2019-0808

Create a website using R Markdown

R Markdown Websites, Files, R Markdown: The Definitive Guide -> rmarkdown’s site generator

  1. Create 3 files: _site.yml, index.Rmd and about.Rmd
  2. Execute the command R -q -e "rmarkdown::render_site()" from within the directory containing your files to build _site, a directory of files ready to deploy as a standalone static website. In this simple example, it will generate index.html, about.html and a new directory site_libs. Note that we are not supposed to manually edit any html files.
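The two steps above can be sketched from R as follows. The site name, navbar entries and page text are placeholders of mine; the files are written to a temp directory, and the build is skipped if rmarkdown or pandoc is not available.

```r
# Step 1: create the three files in a scratch directory
site_dir <- tempfile("site_demo_")
dir.create(site_dir)

writeLines(c(
  'name: "my-website"',          # placeholder site name
  "navbar:",
  '  title: "My Website"',
  "  left:",
  '    - text: "Home"',
  "      href: index.html",
  '    - text: "About"',
  "      href: about.html"
), file.path(site_dir, "_site.yml"))

writeLines(c("---", 'title: "My Website"', "---", "", "Hello, website!"),
           file.path(site_dir, "index.Rmd"))
writeLines(c("---", 'title: "About"', "---", "", "About this site."),
           file.path(site_dir, "about.Rmd"))

# Step 2: build the site (output goes to the _site subdirectory)
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render_site(site_dir, quiet = TRUE)
}
print(list.files(site_dir))
```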

If we host these files on our server using Apache, we need to make sure the owner of the directory and files is www-data.

  • # move/copy the content in the folder _site to /var/www/mySite
  • cd /var/www/mySite # make sure the permission and owner of the directory are OK
  • sudo chown www-data:www-data *.html # any explanation
  • sudo chown -R www-data:www-data site_libs/ # this directory is generated by R
  • sudo nano /etc/apache2/sites-enabled/mySite.conf # see Apache
  • # sudo a2ensite mySite.conf
  • sudo service apache2 reload

Examples

pkgdown: create a website for your package

Blogdown

Posterdown

posterdown: Use RMarkdown to generate PDF Conference Posters via HTML or LaTeX

Latex tools

Mathpix Snip Take a screenshot of math and paste the LaTeX into your editor