R: Difference between revisions

Revision as of 21:17, 9 March 2014

Install Rtools for Windows users

See http://goo.gl/gYh6C for a step-by-step instruction with screenshot.

My preferred way is not to check the option of setting PATH environment. But I manually add the followings to the PATH environment (based on Rtools v3.0)

c:\Rtools\bin;
c:\Rtools\gcc-4.6.3\bin;
C:\Program Files\R\R-2.15.2\bin\i386;

We can make our life easy by creating a file <Rcommand.bat> with the content (also useful if you have C:\cygwin\bin in your PATH although cygwin setup will not do it automatically for you.)

PS. I put <Rcommand.bat> under C:\Program Files\R folder. I create a shortcut called 'Rcmd' on desktop. I enter C:\Windows\System32\cmd.exe /K "Rcommand.bat" in the Target entry and "C:\Program Files\R" in Start in entry.

@echo off
set PATH=C:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin;%PATH%
set PATH=C:\Program Files\R\R-2.15.2\bin\i386;%PATH%
set PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
set PKG_CPPFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
echo Setting environment for using R
cmd

So we can open the Command Prompt anywhere and run <Rcommand.bat> to get all environment variables ready! On Windows Vista, 7 and 8, we need to run it as administrator. OR we can change the security of the property so the current user can have an executive right.

Windows Toolset

Note that R on Windows supports Mingw-w64 (not Mingw which is a separate project). See here for the issue of developing a Qt application that links against R using Rcpp. And http://qt-project.org/wiki/MinGW is the wiki for compiling Qt using MinGW and MinGW-w64.

Build R from its source

Reference: http://cran.r-project.org/doc/manuals/R-admin.html#Installing-R-under-Windows

Download tcl file from http://www.stats.ox.ac.uk/pub/Rtools/R_Tcl_8-5-8.zip Unzip and put 'Tcl' into R_HOME folder.

"Open a command prompt as Administrator"

Below it is supposed we have created a directory C:\R and the source tar ball is placed under this directory.

set PATH=c:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin;%PATH%

cd C:\R
tar xzvf R-3.0.2.tar.gz

cd R-3.0.2\src\gnuwin32

set TMPDIR=C:/tmp

make all recommended

This builds R (3.0.2) and all recommended packages.

See also Create_a_standalone_Rmath_library below about how to create and use a standalone Rmath library in your own C/C++/Fortran program. For example, if you want to know the 95-th percentile of a T distribution or generate a bunch of random variables, you don't need to search internet to find a library; you can just use Rmath library.

Compile and install an R package

cd C:\Documents and Settings\brb
wget http://www.bioconductor.org/packages/2.11/bioc/src/contrib/affxparser_1.30.2.tar.gz
C:\progra~1\r\r-2.15.2\bin\R CMD INSTALL --build affxparser_1.30.2.tar.gz

Helpful - check Chapter 6 of R Installation and Administration

Check/Upload to CRAN

http://win-builder.r-project.org/

64 bit toolchain

See January 2010 email https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html and R-Admin manual.

From R 2.11.0 there is 64 bit Windows binary for R.

Install R using binary package on Linux OS

Ubuntu/Debian

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
sudo nano /etc/apt/sources.list
# deb http://cran.fhcrc.org/bin/linux/ubuntu precise/
sudo apt-get update
sudo apt-get install r-base

Redhat el6

It should be pretty easy to install via the EPEL: http://fedoraproject.org/wiki/EPEL

Just follow the instructions to enable the EPEL and then from the CLI as root:

yum install R

or via sudo:

sudo yum install R

Install R from source (ix86, x86_64 and arm platforms, Linux system)

Debian system (focus on arm architecture with notes from x86 system)

Simplest configuration

On my debian system in Pogoplug (armv5), Raspberry Pi (armv6) OR Beaglebone Black & Udoo(armv7), I can compile R. See R's admin manual. If the OS needs x11, I just need to install 2 required packages.

install gfortran: apt-get install build-essential gfortran (gfortran is not part of build-essential)
install readline library: apt-get install libreadline5-dev (pogoplug), apt-get install libreadline6-dev (raspberry pi/BBB)

Note: if I need X11, I should install

libX11 and libX11-devel, libXt, libXt-devel (for fedora)
libx11-dev (for debian) or xorg-dev (for pogoplug/raspberry pi/BBB)

and optional

texinfo (to fix 'WARNING: you cannot build info or HTML versions of the R manuals')

Note that it is also safe to install required tools via (please run nano /etc/apt/sources.list to include the repository of your favorite R mirror and also run sudo apt-get update first)

sudo apt-get build-dep r-base

The above command will install R dependence like jdk, tcl, tex, etc. The apt-get build-dep gave a more complete list than apt-get install r-base-dev for some reasons.

[Arm architecture] I also run apt-get install readline-common. I don't know if this is necessary. If x11 is not needed or not available (eg Pogoplug), I can add --with-x=no option in ./configure command. If R will be called from other applications such as Rserve, I can add --enable-R-shilb option in ./configure command. Check out ./configure --help to get a complete list of all options.

After running

wget http://cran.r-project.org/src/base/R-2/R-2.15.2.tar.gz
tar xzvf R-2.15.2.tar.gz
cd R-2.15.2
./configure

I got

R is now configured for armv5tel-unknown-linux-gnueabi

  Source directory:          .
  Installation directory:    /usr/local

  C compiler:                gcc -std=gnu99  -g -O2
  Fortran 77 compiler:       gfortran  -g -O2

  C++ compiler:              g++  -g -O2
  Fortran 90/95 compiler:    gfortran -g -O2
  Obj-C compiler:

  Interfaces supported:
  External libraries:        readline
  Additional capabilities:   NLS
  Options enabled:           shared R library, shared BLAS, R profiling

  Recommended packages:      yes

configure: WARNING: you cannot build info or HTML versions of the R manuals
configure: WARNING: you cannot build PDF versions of the R manuals
configure: WARNING: you cannot build PDF versions of vignettes and help pages
configure: WARNING: I could not determine a browser
configure: WARNING: I could not determine a PDF viewer

PS 1. On my raspberry pi machine, it shows R is now configured for armv6l-unknown-linux-gnueabihf and on Beaglebone black it shows R is now configured for armv7l-unknown-linux-gnueabihf.

PS 2. On my x86 system, it shows

R is now configured for x86_64-unknown-linux-gnu

  Source directory:          .
  Installation directory:    /usr/local

  C compiler:                gcc -std=gnu99  -g -O2
  Fortran 77 compiler:       gfortran  -g -O2

  C++ compiler:              g++  -g -O2
  Fortran 90/95 compiler:    gfortran -g -O2
  Obj-C compiler:

  Interfaces supported:      X11, tcltk
  External libraries:        readline, lzma
  Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo
  Options enabled:           shared R library, shared BLAS, R profiling, Java

  Recommended packages:      yes

[arm] However, make gave errors for recommanded packages like KernSmooth, MASS, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival. The error stems from gcc: SHLIB_LIBADD: No such file or directory. Note that I can get this error message even I try install.packages("MASS", type="source"). A suggested fix is here; adding perl = TRUE in sub() call for two lines in src/library/tools/R/install.R file. However, I got another error shared object 'MASS.so' not found. See also http://ftp.debian.org/debian/pool/main/r/r-base/. One possibility is to build R without recommended packages like ./configure --without-recommended.

make[1]: Entering directory `/mnt/usb/R-2.15.2/src/library/Recommended'
make[2]: Entering directory `/mnt/usb/R-2.15.2/src/library/Recommended'
begin installing recommended package MASS
* installing *source* package 'MASS' ...
** libs
make[3]: Entering directory `/tmp/Rtmp4caBfg/R.INSTALL1d1244924c77/MASS/src'
gcc -std=gnu99 -I/mnt/usb/R-2.15.2/include -DNDEBUG  -I/usr/local/include    -fpic  -g -O2  -c MASS.c -o MASS.o
gcc -std=gnu99 -I/mnt/usb/R-2.15.2/include -DNDEBUG  -I/usr/local/include    -fpic  -g -O2  -c lqs.c -o lqs.o
gcc -std=gnu99 -shared -L/usr/local/lib -o MASSSHLIB_EXT MASS.o lqs.o SHLIB_LIBADD -L/mnt/usb/R-2.15.2/lib -lR
gcc: SHLIB_LIBADD: No such file or directory
make[3]: *** [MASSSHLIB_EXT] Error 1
make[3]: Leaving directory `/tmp/Rtmp4caBfg/R.INSTALL1d1244924c77/MASS/src'
ERROR: compilation failed for package 'MASS'
* removing '/mnt/usb/R-2.15.2/library/MASS'
make[2]: *** [MASS.ts] Error 1
make[2]: Leaving directory `/mnt/usb/R-2.15.2/src/library/Recommended'
make[1]: *** [recommended-packages] Error 2
make[1]: Leaving directory `/mnt/usb/R-2.15.2/src/library/Recommended'
make: *** [stamp-recommended] Error 2
root@debian:/mnt/usb/R-2.15.2#
root@debian:/mnt/usb/R-2.15.2# bin/R

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: armv5tel-unknown-linux-gnueabi (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(MASS)
Error in library(MASS) : there is no package called 'MASS'
> library()
Packages in library '/mnt/usb/R-2.15.2/library':

base                    The R Base Package
compiler                The R Compiler Package
datasets                The R Datasets Package
grDevices               The R Graphics Devices and Support for Colours
                        and Fonts
graphics                The R Graphics Package
grid                    The Grid Graphics Package
methods                 Formal Methods and Classes
parallel                Support for Parallel computation in R
splines                 Regression Spline Functions and Classes
stats                   The R Stats Package
stats4                  Statistical Functions using S4 Classes
tcltk                   Tcl/Tk Interface
tools                   Tools for Package Development
utils                   The R Utils Package
> Sys.info()["machine"]
   machine
"armv5tel"
> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 170369  4.6     350000  9.4   350000  9.4
Vcells 163228  1.3     905753  7.0   784148  6.0

See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679180

PS 3. The complete log of building R from source is in here File:Build R log.txt

Full configuration

  Interfaces supported:      X11, tcltk
  External libraries:        readline
  Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo
  Options enabled:           shared R library, shared BLAS, R profiling, Java

Update: R 3.0.1 on Beaglebone Black (armv7a) + Ubuntu 13.04

See the page here.

Install all dependencies for building R

This is a comprehensive list. This list is even larger than r-base-dev.

root@debian:/mnt/usb/R-2.15.2# apt-get build-dep r-base
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  libreadline5-dev
The following NEW packages will be installed:
  bison ca-certificates ca-certificates-java debhelper defoma ed file fontconfig gettext
  gettext-base html2text intltool-debian java-common libaccess-bridge-java
  libaccess-bridge-java-jni libasound2 libasyncns0 libatk1.0-0 libaudit0 libavahi-client3
  libavahi-common-data libavahi-common3 libblas-dev libblas3gf libbz2-dev libcairo2
  libcairo2-dev libcroco3 libcups2 libdatrie1 libdbus-1-3 libexpat1-dev libflac8
  libfontconfig1-dev libfontenc1 libfreetype6-dev libgif4 libglib2.0-dev libgtk2.0-0
  libgtk2.0-common libice-dev libjpeg62-dev libkpathsea5 liblapack-dev liblapack3gf libnewt0.52
  libnspr4-0d libnss3-1d libogg0 libopenjpeg2 libpango1.0-0 libpango1.0-common libpango1.0-dev
  libpcre3-dev libpcrecpp0 libpixman-1-0 libpixman-1-dev libpng12-dev libpoppler5 libpulse0
  libreadline-dev libreadline6-dev libsm-dev libsndfile1 libthai-data libthai0 libtiff4-dev
  libtiffxx0c2 libunistring0 libvorbis0a libvorbisenc2 libxaw7 libxcb-render-util0
  libxcb-render-util0-dev libxcb-render0 libxcb-render0-dev libxcomposite1 libxcursor1
  libxdamage1 libxext-dev libxfixes3 libxfont1 libxft-dev libxi6 libxinerama1 libxkbfile1
  libxmu6 libxmuu1 libxpm4 libxrandr2 libxrender-dev libxss-dev libxt-dev libxtst6 luatex m4
  openjdk-6-jdk openjdk-6-jre openjdk-6-jre-headless openjdk-6-jre-lib openssl pkg-config
  po-debconf preview-latex-style shared-mime-info tcl8.5-dev tex-common texi2html texinfo
  texlive-base texlive-binaries texlive-common texlive-doc-base texlive-extra-utils
  texlive-fonts-recommended texlive-generic-recommended texlive-latex-base texlive-latex-extra
  texlive-latex-recommended texlive-pictures tk8.5-dev tzdata-java whiptail x11-xkb-utils
  x11proto-render-dev x11proto-scrnsaver-dev x11proto-xext-dev xauth xdg-utils xfonts-base
  xfonts-encodings xfonts-utils xkb-data xserver-common xvfb zlib1g-dev
0 upgraded, 136 newly installed, 1 to remove and 0 not upgraded.
Need to get 139 MB of archives.
After this operation, 410 MB of additional disk space will be used.
Do you want to continue [Y/n]?

bin/R (shell script) and bin/exec/R (binary executable) on Linux OS

bin/R is just a shell script to launch bin/exec/R program. So if we try to run the following program

# test.R
cat("-- reading arguments\n", sep = "");
cmd_args = commandArgs();
for (arg in cmd_args) cat("  ", arg, "\n", sep="");

from command line like

brb@brb-P45T-A:~/Downloads$ ~/R-3.0.1/bin/R --slave --no-save --no-restore --no-environ --silent --args arg1=abc < test.R
-- reading arguments
  /home/brb/R-3.0.1/bin/exec/R
  --slave
  --no-save
  --no-restore
  --no-environ
  --silent
  --args
  arg1=abc

we can see R actually call bin/exec/R program.

CentOS 6.x

Install build-essential (make, gcc, gdb, ...).

su
yum groupinstall "Development Tools"
yum install kernel-devel kernel-headers

Install readline and X11 (probably not necessary if we use ./configure --with-x=no)

yum install readline-devel 
yum install libX11 libX11-devel libXt libXt-devel

Install libpng (already there) and libpng-devel library. This is for web application purpose because png (and possibly svg) is a standard and preferred graphics format. If we want to output different graphics formats, we have to follow the guide in R-admin manual to install extra graphics libraries in Linux.

yum install libpng-devel
rpm -qa | grep "libpng"
# make sure both libpng and libpng-devel exist.

Install Java. One possibility is to download from Oracle. We want to download jdk-7u45-linux-x64.rpm and jre-7u45-linux-x64.rpm (assume 64-bit OS).

rpm -Uvh jdk-7u45-linux-x64.rpm
rpm -Uvh jre-7u45-linux-x64.rpm
# Check
java -version

Now we are ready to build R by using "./configure" and then "make" commands.

We can make R accessible from any directory by either run "make install" command or creating an R_HOME environment variable and export it to PATH environment variable, such as

export R_HOME="path to R"
export PATH=$PATH:$R_HOME/bin

Web Applications

Create HTML5 web and slides

http://www.gastonsanchez.com/depot/knitr-slides. The HTML5 slides work on my IE 8 too.

HTML5 slides examples

Software requirement

Rstudio
knitr, XML, RCurl (See omegahat for installation on Ubuntu)
pandoc package This is a command line tool. I am testing it on Windows 7.

Slide #22 gives an instruction to create

regular html file by using RStudio -> Knit HTML button
HTML5 slides by using pandoc from command line.

Files:

Rcmd source: 009-slides.Rmd Note that IE 8 was not supported by github. For IE 9, be sure to turn off "Compatibility View".
markdown output: 009-slides.md
HTML output: 009-slides.html

We can create Rcmd source in Rstudio by File -> New -> R Markdown.

There are 4 ways to produce slides with pandoc

S5
DZSlides
Slidy
Slideous

Use the markdown file (md) and convert it with pandoc

pandoc -s -S -i -t dzslides --mathjax html5_slides.md -o html5_slides.html

If we are comfortable with HTML and CSS code, open the html file (generated by pandoc) and modify the CSS style at will.

Markdown language

According to wikipedia:

Markdown is a lightweight markup language, originally created by John Gruber with substantial contributions from Aaron Swartz, allowing people “to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)”.

Markup is a general term for content formatting - such as HTML - but markdown is a library that generates HTML markup.

Nice summary from stackoverflow.com and more complete list from github.

An example https://gist.github.com/jeromyanglim/2716336

basics and syntax

Convert mediawiki to markdown using online conversion tool from pandoc.

Cheat sheet.

Cloud-enabled HTML5 markdown editor

live demo

Example from hosted in github

R markdown file and use it in RStudio. Customizing Chunk Options can be found in knitr page and rpubs.com.

HTTP protocol

http://en.wikipedia.org/wiki/File:Http_request_telnet_ubuntu.png
Query string
How to capture http header? Use curl -i en.wikipedia.org.
Web Inspector. Build-in in Chrome. Right click on any page and choose 'Inspect Element'.
Web server
Simple TCP/IP web server
HTTP Made Really Easy
Illustrated Guide to HTTP
nweb: a tiny, safe Web server with 200 lines
Tiny HTTPd

An HTTP server is conceptually simple:

Open port 80 for listening
When contact is made, gather a little information (get mainly - you can ignore the rest for now)
Translate the request into a file request
Open the file and spit it back at the client

It gets more difficult depending on how much of HTTP you want to support - POST is a little more complicated, scripts, handling multiple requests, etc.

Example in R

> co <- socketConnection(port=8080, server=TRUE, blocking=TRUE) 
> # Now open a web browser and type http://localhost:8080/index.html
> readLines(co,1)
[1] "GET /index.html HTTP/1.1"
> readLines(co,1)
[1] "Host: localhost:8080"
> readLines(co,1)
[1] "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0"
> readLines(co,1)
[1] "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
> readLines(co,1)
[1] "Accept-Language: en-US,en;q=0.5"
> readLines(co,1)
[1] "Accept-Encoding: gzip, deflate"
> readLines(co,1)
[1] "Connection: keep-alive"
> readLines(co,1)
[1] ""

Example in C (Very simple http server written in C, 187 lines)

Create a simple hello world html page and save it as <index.html> in the current directory (/home/brb/Downloads/)

Launch the server program (assume we have done gcc http_server.c -o http_server)

$ ./http_server -p 50002
Server started at port no. 50002 with root directory as /home/brb/Downloads

Secondly open a browser and type http://localhost:50002/index.html. The server will respond

GET /index.html HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/index.html
GET /favicon.ico HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico
GET /favicon.ico HTTP/1.1
Host: localhost:50003
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico

The browser will show the page from <index.html> in server.

The only bad thing is the code does not close the port. For example, if I have use Ctrl+C to close the program and try to re-launch with the same port, it will complain socket() or bind(): Address already in use.

Another Example in C (55 lines)

http://mwaidyanatha.blogspot.com/2011/05/writing-simple-web-server-in-c.html

The response is embedded in the C code.

If we test the server program by opening a browser and type "http://localhost:15000/", the server received the follwing 7 lines

GET / HTTP/1.1
Host: localhost:15000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

If we include a non-executable file's name in the url, we will be able to download that file. Try "http://localhost:15000/client.c".

If we use telnet program to test, wee need to type anything we want

$ telnet localhost 15000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
ThisCanBeAnything        <=== This is what I typed in the client and it is also shown on server
HTTP/1.1 200 OK          <=== From here is what I got from server
Content-length: 37Content-Type: text/html

HTML_DATA_HERE_AS_YOU_MENTIONED_ABOVE <=== The html tags are not passed from server, interesting!
Connection closed by foreign host.
$

Others

http://rosettacode.org/wiki/Hello_world/ (Different languages)
http://kperisetla.blogspot.com/2012/07/simple-http-web-server-in-c.html (Windows web server)
http://css.dzone.com/articles/web-server-c (handling HTTP GET request, handling content types(txt, html, jpg, zip. rar, pdf, php etc.), sending proper HTTP error codes, serving the files from a web root, change in web root in a config file, zero copy optimization using sendfile method and php file handling.)
https://github.com/gtungatkar/Simple-HTTP-server
https://github.com/davidmoreno/onion

shiny

The following is what we see on a browser after we run an example from shiny package. See http://rstudio.github.com/shiny/tutorial/#hello-shiny. Note that the R session needs to be on; i.e. R command prompt will not be returned unless we press Ctrl+C or ESC.

shiny depends on websockets, caTools, bitops, digest packages.

Q & A:

Q: If we run runExample('01_hello') in Rserve from an R client, we can continue our work in R client without losing the functionality of the GUI from shiny. Question: how do we kill the job?
If I run the example "01_hello", the browser only shows the control but not graph on Firefox? A: Use Chrome or Opera as the default browser.
If I run the example "01_hello" on RHEL the first time, it works fine. But if I click 'Ctrl + C' to stop it and run it again, I got a message

Warning in .SOCK_SERVE(port) : R-Websockets(tcpserv): bind() failed. 
Error in createContext(port, webpage, is.binary = is.binary) : 
  Unable to bind socket on port 8100; is it realsy in use?

A simple solution is to close R and open it again.

Q: Deployment on web. A: Not ready yet. Shiny server platform is still under beta testing. Shiny apps are hosted using the R websockets package which acts more like a tcp server than a web server, and that architecture just doesn't fit with rApache, or even apache for that matter.

Q: How difficult to put the code in Gist:github? A: Just create an account. Do not even need to create a repository. Just go to http://gist.github.com and create a new gist. The new gist can be secret or public. A secret gist can not be edited again after it is created although it works fine when it was used in runGist() function.

Deploy to run locally

Follow the instruction here, we can do as following (Tested on Windows OS)

Create a desktop shortcut with target "C:\Program Files\R\R-3.0.2\bin\R.exe" -e "shiny::runExample('01_hello')" . We can name the shortcut as we like, e.g. R+shiny
Double click the shortcut. The Windows Firewall will be popped up and say it block some features of the program. It does not matter if we choose Allow access or Cancel.
Look at the command prompt window (black background console window), it will say something like
```
Listening on port 7510
```
at the last line of the console.
Open your browser (Chrome or Firefox works), and type the address http://localhost:7510. You will see something magic happen.
If we don't want to play with it, we can close the browser and close the command console (hit 'x')too.

Deploy to run remotely -shiny server

If we want to deploy our shiny apps to WWW, we need to install shiny server.

Following the guide on here, shiny-server is up smoothly on my Ubuntu machine. After I run the command sudo gdebi shiny-server-0.4.0.8-amd64.deb, shiny-server is started. Thanks to upstart in Ubuntu, shiny-server is automatically started whenever the machine is started.

Each app directory needs to be copied to /srv/shiny-server/ directory using sudo.

The default port is 3838. That is, the remote computer can access the website using http://xxx.xxx.x.xx:3838/AppName.

Last but not the least, according to its web page, shiny-server is Experimental quality. Use at your own risk!.

websocket

http://illposed.net/jsm2012.pdf

Gallery

Example of using googleVis: http://shinyeoda.cloudapp.net/
Integrate with Javascript: https://github.com/wch/shiny-jsdemo and https://github.com/trestletech/ShinyDash-Sample
interactiveDisplay (Bioconductor package, there is a STOP Application button too): http://www.bioconductor.org/packages/release/bioc/html/interactiveDisplay.html

httpuv

http and WebSocket library.

RApache

gWidgetsWWW

Rook

Since R 2.13, the internal web server was exposed.

Tutorual from useR2012 and Jeffrey Horner

Here is another one from http://www.rinfinance.com.

Rook is also supported by [rApache too. See http://rapache.net/manual.html.

Google group. https://groups.google.com/forum/?fromgroups#!forum/rrook

Advantage

the web applications are created on desktop, whether it is Windows, Mac or Linux.
No Apache is needed.
create multiple applications at the same time. This complements the limit of rApache.

4 lines of code example.

library(Rook)
s <- Rhttpd$new()
s$start(quiet=TRUE)
s$print()
s$browse(1)  # OR s$browse("RookTest")

Notice that after s$browse() command, the cursor will return to R because the command just a shortcut to open the web page http://127.0.0.1:10215/custom/RookTest.

We can add Rook application to the server; see ?Rhttpd.

s$add(
    app=system.file('exampleApps/helloworld.R',package='Rook'),name='hello'
)
s$add(
    app=system.file('exampleApps/helloworldref.R',package='Rook'),name='helloref'
)
s$add(
    app=system.file('exampleApps/summary.R',package='Rook'),name='summary'
)

s$print()

#Server started on 127.0.0.1:10221
#[1] RookTest http://127.0.0.1:10221/custom/RookTest
#[2] helloref http://127.0.0.1:10221/custom/helloref
#[3] summary  http://127.0.0.1:10221/custom/summary
#[4] hello    http://127.0.0.1:10221/custom/hello

#  Stops the server but doesn't uninstall the app
## Not run: 
s$stop()

## End(Not run)
s$remove(all=TRUE)
rm(s)

For example, the interface and the source code of summary app are given below

app <- function(env) {
    req <- Rook::Request$new(env)
    res <- Rook::Response$new()
    res$write('Choose a CSV file:\n')
    res$write('<form method="POST" enctype="multipart/form-data">\n')
    res$write('<input type="file" name="data">\n')
    res$write('<input type="submit" name="Upload">\n</form>\n<br>')

    if (!is.null(req$POST())){
	data <- req$POST()[['data']]
	res$write("<h3>Summary of Data</h3>");
	res$write("<pre>")
	res$write(paste(capture.output(summary(read.csv(data$tempfile,stringsAsFactors=FALSE)),file=NULL),collapse='\n'))
	res$write("</pre>")
	res$write("<h3>First few lines (head())</h3>");
	res$write("<pre>")
	res$write(paste(capture.output(head(read.csv(data$tempfile,stringsAsFactors=FALSE)),file=NULL),collapse='\n'))
	res$write("</pre>") 
    }
    res$finish()
}

More example:

http://lamages.blogspot.com/2012/08/rook-rocks-example-with-googlevis.html
Self-organizing map
Deploy Rook apps with rApache. First one and two.

sumo

Sumo is a fully-functional web application template that exposes an authenticated user's R session within java server pages. See the paper http://journal.r-project.org/archive/2012-1/RJournal_2012-1_Bergsma+Smith.pdf.

Stockplot

FastRWeb

Rwui

CGHWithR and WebDevelopR

CGHwithR is still working with old version of R although it is removed from CRAN. Its successor is WebDevelopR. Its The vignette (year 2013) provides a review of several available methods.

manipulate from RStudio

This is not a web application. But the manipulate package can be used to create interactive plot within R(Studio) environment easily. Its source is available at here.

Mathematica also has manipulate function for plotting; see here.

Creating local repository for CRAN and Bioconductor (focus on Windows binary packages only)

How to set up a local repository

CRAN specific: http://cran.r-project.org/mirror-howto.html
Bioconductor specific: http://www.bioconductor.org/about/mirrors/mirror-how-to/

General guide: http://cran.r-project.org/doc/manuals/R-admin.html#Setting-up-a-package-repository

Utilities such as install.packages can be pointed at any CRAN-style repository, and R users may want to set up their own. The ‘base’ of a repository is a URL such as http://www.omegahat.org/R/: this must be an URL scheme that download.packages supports (which also includes ‘ftp://’ and ‘file://’, but not on most systems ‘https://’). Under that base URL there should be directory trees for one or more of the following types of package distributions:

"source": located at src/contrib and containing .tar.gz files. Other forms of compression can be used, e.g. .tar.bz2 or .tar.xz files.
"win.binary": located at bin/windows/contrib/x.y for R versions x.y.z and containing .zip files for Windows.
"mac.binary.leopard": located at bin/macosx/leopard/contrib/x.y for R versions x.y.z and containing .tgz files.

Each terminal directory must also contain a PACKAGES file. This can be a concatenation of the DESCRIPTION files of the packages separated by blank lines, but only a few of the fields are needed. The simplest way to set up such a file is to use function write_PACKAGES in the tools package, and its help explains which fields are needed. Optionally there can also be a PACKAGES.gz file, a gzip-compressed version of PACKAGES—as this will be downloaded in preference to PACKAGES it should be included for large repositories. (If you have a mis-configured server that does not report correctly non-existent files you will need PACKAGES.gz.)

To add your repository to the list offered by setRepositories(), see the help file for that function.

A repository can contain subdirectories, when the descriptions in the PACKAGES file of packages in subdirectories must include a line of the form

Path: path/to/subdirectory

—once again write_PACKAGES is the simplest way to set this up.

Space requirement if we want to mirror WHOLE repository

Whole CRAN takes about 92GB (rsync -avn cran.r-project.org::CRAN > ~/Downloads/cran).
Bioconductor is big (> 64G for BioC 2.11). Please check the size of what will be transferred with e.g. (rsync -avn bioconductor.org::2.11 > ~/Downloads/bioc) and make sure you have enough room on your local disk before you start.

On the other hand, we if only care about Windows binary part, the space requirement is largely reduced.

CRAN: 2.7GB
Bioconductor: 28GB.

Misc notes

If the binary package was built on R 2.15.1, then it cannot be installed on R 2.15.2. But vice is OK.
Remember to issue "--delete" option in rsync, otherwise old version of package will be installed.
The repository still need src directory. If it is missing, we will get an error

Warning: unable to access index for repository http://arraytools.no-ip.org/CRAN/src/contrib
Warning message:
package ‘glmnet’ is not available (for R version 2.15.2)

The error was given by available.packages() function.

To bypass the requirement of src directory, I can use

install.packages("glmnet", contriburl = contrib.url(getOption('repos'), "win.binary"))

but there may be a problem when we use biocLite() command.

I find a workaround. Since the error comes from missing CRAN/src directory, we just need to make sure the directory CRAN/src/contrib exists AND either CRAN/src/contrib/PACKAGES or CRAN/src/contrib/PACKAGES.gz exists.

To create CRAN repository

Before creating a local repository please give a dry run first. You don't want to be surprised how long will it take to mirror a directory.

Dry run (-n option). Pipe out the process to a text file for an examination.

rsync -avn cran.r-project.org::CRAN > crandryrun.txt

To mirror only partial repository, it is necessary to create directories before running rsync command.

cd
mkdir -p ~/Rmirror/CRAN/bin/windows/contrib/2.15
rsync -rtlzv --delete cran.r-project.org::CRAN/bin/windows/contrib/2.15/ ~/Rmirror/CRAN/bin/windows/contrib/2.15
(one line with space before ~/Rmirror)

# src directory is very large (~27GB) since it contains source code for each R version. 
# We just need the files PACKAGES and PACKAGES.gz in CRAN/src/contrib. So I comment out the following line.
# rsync -rtlzv --delete cran.r-project.org::CRAN/src/ ~/Rmirror/CRAN/src/
mkdir -p ~/Rmirror/CRAN/src/contrib
rsync -rtlzv --delete cran.r-project.org::CRAN/src/contrib/PACKAGES ~/Rmirror/CRAN/src/contrib/
rsync -rtlzv --delete cran.r-project.org::CRAN/src/contrib/PACKAGES.gz ~/Rmirror/CRAN/src/contrib/

And optionally

library(tools)
write_PACKAGES("~/Rmirror/CRAN/bin/windows/contrib/2.15", type="win.binary")

and if we want to get src directory

rsync -rtlzv --delete cran.r-project.org::CRAN/src/contrib/*.tar.gz ~/Rmirror/CRAN/src/contrib/
rsync -rtlzv --delete cran.r-project.org::CRAN/src/contrib/2.15.3 ~/Rmirror/CRAN/src/contrib/

We can use du -h to check the folder size.

For example (as of 1/7/2013),

$ du -k ~/Rmirror --max-depth=1 --exclude ".*" | sort -nr | cut -f2 | xargs -d '\n' du -sh
30G	/home/brb/Rmirror
28G	/home/brb/Rmirror/Bioc
2.7G	/home/brb/Rmirror/CRAN

To create Bioconductor repository

Dry run

rsync -avn bioconductor.org::2.11 > biocdryrun.txt

Then creates directories before running rsync.

cd
mkdir -p ~/Rmirror/Bioc
wget -N http://www.bioconductor.org/biocLite.R -P ~/Rmirror/Bioc

where -N is to overwrite original file if the size or timestamp change and -P in wget means an output directory, not a file name.

Optionally, we can add the following in order to see the Bioconductor front page.

rsync -zrtlv  --delete bioconductor.org::2.11/BiocViews.html ~/Rmirror/Bioc/packages/2.11/
rsync -zrtlv  --delete bioconductor.org::2.11/index.html ~/Rmirror/Bioc/packages/2.11/

The software part (aka bioc directory) installation:

cd
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/bin/windows
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/src
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/bin/windows/ ~/Rmirror/Bioc/packages/2.11/bioc/bin/windows
# Either rsync whole src directory or just essential files
# rsync -zrtlv  --delete bioconductor.org::2.11/bioc/src/ ~/Rmirror/Bioc/packages/2.11/bioc/src
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/src/contrib/PACKAGES ~/Rmirror/Bioc/packages/2.11/bioc/src/contrib/
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/src/contrib/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/bioc/src/contrib/
# Optionally the html part
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/html
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/html/ ~/Rmirror/Bioc/packages/2.11/bioc/html
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/vignettes
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/vignettes/ ~/Rmirror/Bioc/packages/2.11/bioc/vignettes
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/news
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/news/ ~/Rmirror/Bioc/packages/2.11/bioc/news
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/licenses
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/licenses/ ~/Rmirror/Bioc/packages/2.11/bioc/licenses
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/manuals
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/manuals/ ~/Rmirror/Bioc/packages/2.11/bioc/manuals
mkdir -p ~/Rmirror/Bioc/packages/2.11/bioc/readmes
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/readmes/ ~/Rmirror/Bioc/packages/2.11/bioc/readmes

and annotation (aka data directory) part:

mkdir -p ~/Rmirror/Bioc/packages/2.11/data/annotation/bin/windows
mkdir -p ~/Rmirror/Bioc/packages/2.11/data/annotation/src/contrib
# one line for each of the following
rsync -zrtlv --delete bioconductor.org::2.11/data/annotation/bin/windows/ ~/Rmirror/Bioc/packages/2.11/data/annotation/bin/windows
rsync -zrtlv --delete bioconductor.org::2.11/data/annotation/src/contrib/PACKAGES ~/Rmirror/Bioc/packages/2.11/data/annotation/src/contrib/
rsync -zrtlv --delete bioconductor.org::2.11/data/annotation/src/contrib/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/data/annotation/src/contrib/

and experiment directory:

mkdir -p ~/Rmirror/Bioc/packages/2.11/data/experiment/bin/windows/contrib/2.15
mkdir -p ~/Rmirror/Bioc/packages/2.11/data/experiment/src/contrib
# one line for each of the following
# Note that we are cheating by only downloading PACKAGES and PACKAGES.gz files
rsync -zrtlv --delete bioconductor.org::2.11/data/experiment/bin/windows/contrib/2.15/PACKAGES ~/Rmirror/Bioc/packages/2.11/data/experiment/bin/windows/contrib/2.15/
rsync -zrtlv --delete bioconductor.org::2.11/data/experiment/bin/windows/contrib/2.15/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/data/experiment/bin/windows/contrib/2.15/
rsync -zrtlv --delete bioconductor.org::2.11/data/experiment/src/contrib/PACKAGES ~/Rmirror/Bioc/packages/2.11/data/experiment/src/contrib/
rsync -zrtlv --delete bioconductor.org::2.11/data/experiment/src/contrib/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/data/experiment/src/contrib/

and extra directory:

mkdir -p ~/Rmirror/Bioc/packages/2.11/extra/bin/windows/contrib/2.15
mkdir -p ~/Rmirror/Bioc/packages/2.11/extra/src/contrib
# one line for each of the following
# Note that we are cheating by only downloading PACKAGES and PACKAGES.gz files
rsync -zrtlv --delete bioconductor.org::2.11/extra/bin/windows/contrib/2.15/PACKAGES ~/Rmirror/Bioc/packages/2.11/extra/bin/windows/contrib/2.15/
rsync -zrtlv --delete bioconductor.org::2.11/extra/bin/windows/contrib/2.15/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/extra/bin/windows/contrib/2.15/
rsync -zrtlv --delete bioconductor.org::2.11/extra/src/contrib/PACKAGES ~/Rmirror/Bioc/packages/2.11/extra/src/contrib/
rsync -zrtlv --delete bioconductor.org::2.11/extra/src/contrib/PACKAGES.gz ~/Rmirror/Bioc/packages/2.11/extra/src/contrib/

To test local repository

Create soft links in Apache server

su
ln -s /home/brb/Rmirror/CRAN /var/www/html/CRAN
ln -s /home/brb/Rmirror/Bioc /var/www/html/Bioc
ls -l /var/www/html

The soft link mode should be 777.

To test CRAN

Replace the host name arraytools.no-ip.org by IP address 10.133.2.111 if necessary.

r <- getOption("repos"); r["CRAN"] <- "http://arraytools.no-ip.org/CRAN"
options(repos=r)
install.packages("glmnet")

We can test if the backup server is working or not by installing a package which was removed from the CRAN. For example, 'ForImp' was removed from CRAN in 11/8/2012, but I still a local copy built on R 2.15.2 (run rsync on 11/6/2012).

r <- getOption("repos"); r["CRAN"] <- "http://cran.r-project.org"
r <- c(r, BRB='http://arraytools.no-ip.org/CRAN')
#                        CRAN                            CRANextra                                  BRB 
# "http://cran.r-project.org" "http://www.stats.ox.ac.uk/pub/RWin"   "http://arraytools.no-ip.org/CRAN"
options(repos=r)
install.packages('ForImp')

Note by default, CRAN mirror is selected interactively.

> getOption("repos")
                                CRAN                            CRANextra 
                            "@CRAN@" "http://www.stats.ox.ac.uk/pub/RWin"

To test Bioconductor

# CRAN part:
r <- getOption("repos"); r["CRAN"] <- "http://arraytools.no-ip.org/CRAN"
options(repos=r)
# Bioconductor part:
options("BioC_mirror" = "http://arraytools.no-ip.org/Bioc")
source("http://bioconductor.org/biocLite.R")
# This source biocLite.R line can be placed either before or after the previous 2 lines
biocLite("aCGH")

If there is a connection problem, check folder attributes.

chmod -R 755 ~/CRAN/bin

Note that if a binary package was created for R 2.15.1, then it can be installed under R 2.15.1 but not R 2.15.2. The R console will show package xxx is not available (for R version 2.15.2).

For binary installs, the function also checks for the availability of a source package on the same repository, and reports if the source package has a later version, or is available but no binary version is.

So for example, if the mirror does not have contents under src directory, we need to run the following line in order to successfully run install.packages() function.

options(install.packages.check.source = "no")

If we only mirror the essential directories, we can run biocLite() successfully. However, the R console will give some warning

> biocLite("aCGH")
BioC_mirror: http://arraytools.no-ip.org/Bioc
Using Bioconductor version 2.11 (BiocInstaller 1.8.3), R version 2.15.
Installing package(s) 'aCGH'
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/data/experiment/src/contrib
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/extra/src/contrib
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/data/experiment/bin/windows/contrib/2.15
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/extra/bin/windows/contrib/2.15
trying URL 'http://arraytools.no-ip.org/Bioc/packages/2.11/bioc/bin/windows/contrib/2.15/aCGH_1.36.0.zip'
Content type 'application/zip' length 2431158 bytes (2.3 Mb)
opened URL
downloaded 2.3 Mb

package ‘aCGH’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\limingc\AppData\Local\Temp\Rtmp8IGGyG\downloaded_packages
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/data/experiment/bin/windows/contrib/2.15
Warning: unable to access index for repository http://arraytools.no-ip.org/Bioc/packages/2.11/extra/bin/windows/contrib/2.15
> library()

CRAN repository directory structure

The information below is specific to R 2.15.2. There are linux and macosx subdirecotries whenever there are windows subdirectory.

bin/winows/contrib/2.15
src/contrib
   /contrib/2.15.2
   /contrib/Archive
web/checks
   /dcmeta
   /packages
   /views

A clickable map [1]

Bioconductor repository directory structure

The information below is specific to Bioc 2.11. There are linux and macosx subdirecotries whenever there are windows subdirectory.

bioc/bin/windows/contrib/2.15
    /html
    /install
    /license
    /manuals 
    /news
    /src
    /vignettes
data/annotation/bin/windows/contrib/2.15
               /html
               /licenses
               /manuals
               /src
               /vignettes
     /experiment/bin/windows/contrib/2.15
                /html
                /manuals
                /src/contrib
                /vignettes
extra/bin/windows/contrib
     /html
     /src
     /vignettes

List all R packages from CRAN/Bioconductor

Check my daily result based on R 2.15 and Bioc 2.11 in [2]

Parallel Computing

Example code for the book Parallel R by McCallum and Weston.
An introduction to distributed memory parallelism in R and C
Processing: When does it worth?

Windows Security Warning

It seems it is safe to choose 'Cancel' when Windows Firewall tried to block R program when we use makeCluster() to create a socket cluster.

library(parallel)
cl <- makeCluster(2)
clusterApply(cl, 1:2, get("+"), 3)
stopCluster(cl)

If we like to see current firewall settings, just click Windows Start button, search 'Firewall' and choose 'Windows Firewall with Advanced Security'. In the 'Inbound Rules', we can see what programs (like, R for Windows GUI front-end, or Rserve) are among the rules. These rules are called 'private' in the 'Profile' column. Note that each of them may appear twice because one is 'TCP' protocol and the other one has a 'UDP' protocol.

parallel package

Parallel package was included in R 2.14.0. It is derived from the snow and multicore packages and provides many of the same functions as those packages.

The parallel package provides several *apply functions for R users to quickly modify their code using parallel computing.

makeCluster(makePSOCKcluster, makeForkCluster), stopCluster. Other cluster types are passed to package snow.
clusterCall, clusterEvalQ, clusterSplit
clusterApply, clusterApplyLB
clusterExport
clusterMap
parLapply, parSapply, parApply, parRapply, parCapply
parLapplyLB, parSapplyLB (load balance version)
clusterSetRNGStream, nextRNGStream, nextRNGSubStream

Examples (See ?clusterApply)

snow package

Supported cluster types are "SOCK", "PVM", "MPI", and "NWS".

multicore package

foreach package

This package depends on one of the following

doParallel - Foreach parallel adaptor for the parallel package
doSNOW - Foreach parallel adaptor for the snow package
doMC - Foreach parallel adaptor for the multicore package
doMPI - Foreach parallel adaptor for the Rmpi package
doRedis - Foreach parallel adapter for the rredis package

as a backend.

snowfall package

http://www.imbi.uni-freiburg.de/parallel/docs/Reisensburg2009_TutParallelComputing_Knaus_Porzelius.pdf

Rmpi package

Some examples/tutorials

Cloud Computing

Install R on Amazon EC2

http://randyzwitch.com/r-amazon-ec2/

Bioconductor on Amazon EC2

http://www.bioconductor.org/help/bioconductor-cloud-ami/

Big Data Analysis

http://blog.comsysto.com/2013/02/14/my-favorite-community-links/

Useful R packages

RInside

Ubuntu

With RInside, R can be embedded in a graphical application. For example, $HOME/R/x86_64-pc-linux-gnu-library/3.0/RInside/examples/qt directory includes source code of a Qt application to show a kernel density plot with various options like kernel functions, bandwidth and an R command text box to generate the random data. See my demo on Youtube. I have tested this qtdensity example successfully using Qt 4.8.5.

Follow the instruction cairoDevice to install required libraries for cairoDevice package and then cairoDevice itself.
Install Qt. Check 'qmake' command becomes available by typing 'whereis qmake' or 'which qmake' in terminal.
Open Qt Creator from Ubuntu start menu/Launcher. Open the project file $HOME/R/x86_64-pc-linux-gnu-library/3.0/RInside/examples/qt/qtdensity.pro in Qt Creator.
Under Qt Creator, hit 'Ctrl + R' or the big green triangle button on the lower-left corner to build/run the project. If everything works well, you shall see the interactive program qtdensity appears on your desktop.

.

With RInside + Wt web toolkit installed, we can also create a web application. To demonstrate the example in examples/wt directory, we can do

cd ~/R/x86_64-pc-linux-gnu-library/3.0/RInside/examples/wt
make
sudo ./wtdensity --docroot . --http-address localhost --http-port 8080

Then we can go to the browser's address bar and type http://localhost:8080 to see how it works (a screenshot is in here).

Windows 7

To make RInside works on Windows OS, try the following

Make sure R is installed under C:\ instead of C:\Program Files if we don't want to get an error like g++.exe: error: Files/R/R-3.0.1/library/RInside/include: No such file or directory.
Install RTools
Instal RInside package from source (the binary version will give an error )
Create a DOS batch file containing necessary paths in PATH environment variable

@echo off
set PATH=C:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin;%PATH%
set PATH=C:\R\R-3.0.1\bin\i386;%PATH%
set PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
set PKG_CPPFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
set R_HOME=C:\R\R-3.0.1
echo Setting environment for using R
cmd

In the Windows command prompt, run

cd C:\R\R-3.0.1\library\RInside\examples\standard
make -f Makefile.win

Now we can test by running any of executable files that make generates. For example, rinside_sample0.

rinside_sample0

As for the Qt application qdensity program, we need to make sure the same version of MinGW was used in building RInside/Rcpp and Qt. See some discussions in

So the Qt and Wt web tool applications on Windows may or may not be possible.

Hadoop (eg ~100 terabytes)

XML

On Ubuntu, we need to install libxml2-dev before we can install XML package.

sudo apt-get update
sudo apt-get install libxml2-dev

Create GUI

gWidgets

GenOrd: Generate ordinal and discrete variables with given correlation matrix and marginal distributions

here

rjson

http://heuristically.wordpress.com/2013/05/20/geolocate-ip-addresses-in-r/

RJSONIO

Plot IP on google map

http://thebiobucket.blogspot.com/2011/12/some-fun-with-googlevis-plotting-blog.html#more (RCurl, RJONIO, plyr, googleVis)
http://devblog.icans-gmbh.com/using-the-maxmind-geoip-api-with-r/ (RCurl, RJONIO, maps)
http://cran.r-project.org/web/packages/geoPlot/index.html (geoPlot package (deprecated as 8/12/2013))
http://archive09.linux.com/feature/135384 (Not R) ApacheMap
http://batchgeo.com/features/geolocation-ip-lookup/ (Not R) (Enter a spreadsheet of adress, city, zip or a column of IPs and it will show the location on google map)
http://code.google.com/p/apachegeomap/

The following example is modified from the first of above list.

require(RJSONIO) # fromJSON
require(RCurl)   # getURL

temp = getURL("https://gist.github.com/arraytools/6743826/raw/23c8b0bc4b8f0d1bfe1c2fad985ca2e091aeb916/ip.txt", 
                           ssl.verifypeer = FALSE)
ip <- read.table(textConnection(temp), as.is=TRUE)
names(ip) <- "IP"
nr = nrow(ip)
 
Lon <- as.numeric(rep(NA, nr))
Lat <- Lon
Coords <- data.frame(Lon, Lat)
 
ip2coordinates <- function(ip) {
  api <- "http://freegeoip.net/json/"
  get.ips <- getURL(paste(api, URLencode(ip), sep=""))
  # result <- ldply(fromJSON(get.ips), data.frame)
  result <- data.frame(fromJSON(get.ips))
  names(result)[1] <- "ip.address"
  return(result)
}

for (i in 1:nr){
  cat(i, "\n")
  try(
  Coords[i, 1:2] <- ip2coordinates(ip$IP[i])[c("longitude", "latitude")]
  )
}
 
# append to log-file:
logfile <- data.frame(ip, Lat = Coords$Lat, Long = Coords$Lon,
                                       LatLong = paste(round(Coords$Lat, 1), round(Coords$Lon, 1), sep = ":")) 
log_gmap <- logfile[!is.na(logfile$Lat), ]

require(googleVis) # gvisMap
gmap <- gvisMap(log_gmap, "LatLong",
                options = list(showTip = TRUE, enableScrollWheel = TRUE,
                               mapType = 'hybrid', useMapTypeControl = TRUE,
                               width = 1024, height = 800))
plot(gmap)

The plot.gvis() method in googleVis packages also teaches the startDynamicHelp() function in the tools package, which was used to launch a http server. See Jeffrey Horner's note about deploying Rook App.

googleVis

See an example from RJSONIO above.

Rcpp

Discussion archive

Example 1. convolution example

First, Rcpp package should be installed (I am working on Linux system). Next we try one example shipped in Rcpp package.

PS. If R was not available in global environment (such as built by ourselves), we need to modify 'Makefile' file by replacing 'R' command with its complete path (4 places).

cd ~/R/x86_64-pc-linux-gnu-library/3.0/Rcpp/examples/ConvolveBenchmarks/
make
R

Then type the following in an R session to see how it works. Note that we don't need to issue library(Rcpp) in R.

dyn.load("convolve3_cpp.so")
x <- .Call("convolve3cpp", 1:3, 4:6)
x # 4 13 28 27 18

If we have our own cpp file, we need to use the following way to create dynamic loaded library file. Note that the character (grave accent) ` is not (single quote)'. If you mistakenly use ', it won't work.

export PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
export PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
R CMD SHILB xxxx.cpp

Example 2. Use together with inline package

http://adv-r.had.co.nz/C-interface.html#calling-c-functions-from-r

library(inline)
src <-'
 Rcpp::NumericVector xa(a);
 Rcpp::NumericVector xb(b);
 int n_xa = xa.size(), n_xb = xb.size();

 Rcpp::NumericVector xab(n_xa + n_xb - 1);
 for (int i = 0; i < n_xa; i++)
 for (int j = 0; j < n_xb; j++)
 xab[i + j] += xa[i] * xb[j];
 return xab;
'
fun <- cxxfunction(signature(a = "numeric", b = "numeric"),
 src, plugin = "Rcpp")
fun(1:3, 1:4) 
# [1]  1  4 10 16 17 12

Example 3. Calling an R function

caret

xlsx package

ggplot2

Data Manipulation

stringr and plyr

http://martinsbioblogg.wordpress.com/2013/03/24/using-r-reading-tables-that-need-a-little-cleaning/

A data.frame is pretty much a list of vectors, so we use plyr to apply over the list and stringr to search and replace in the vectors.

reshape2

Use acast() function in reshape2 package. It will convert data.frame used for analysis to a table-like data.frame good for display.

http://lamages.blogspot.com/2013/10/creating-matrix-from-long-dataframe.html

jpeg

If we want to create the image on this wiki left hand side panel, we can use jpeg package to read an existing plot and then edit and save it.

cairoDevice

PS. Not sure the advantage of functions in this package compared to R's functions (eg. Cairo_svg() vs svg()) or even Cairo package.

For ubuntu OS, we need to install 2 libraries and 1 R package RGtk2.

sudo apt-get install libgtk2.0-dev libcairo2-dev

On Windows OS, we may got the error: unable to load shared object 'C:/Program Files/R/R-3.0.2/library/cairoDevice/libs/x64/cairoDevice.dll' . We need to follow the instruction in here.

iterators

Iterator is useful over for-loop if the data is already a collection. It can be used to iterate over a vector, data frame, matrix, file

Iterator can be combined to use with foreach package http://www.exegetic.biz/blog/2013/11/iterators-in-r/ has more elaboration.

Different ways of using R

R call C/C++

Mainly talks about .C() and .Call().

Embedding R

See Writing for R Extensions Manual Chapter 8.
Talk by Simon Urbanek in UseR 2004.
Technical report by Friedrich Leisch in 2007.
https://stat.ethz.ch/pipermail/r-help/attachments/20110729/b7d86ed7/attachment.pl

An very simple example (do not return from shell) from Writing R Extensions manual

The command-line R front-end, R_HOME/bin/exec/R, is one such example. Its source code is in file <src/main/Rmain.c>.

This example can be run by

R_HOME/bin/R CMD R_HOME/bin/exec/R

Note:

R_HOME/bin/exec/R is the R binary. However, it couldn't be launched directly unless R_HOME and LD_LIBRARY_PATH are set up. Again, this is explained in Writing R Extension manual.
R_HOME/bin/R is a shell-script front-end where users can invoke it. It sets up the environment for the executable. It can be copied to /usr/local/bin/R. When we run R_HOME/bin/R, it actually runs R_HOME/bin/R CMD R_HOME/bin/exec/R (see line 259 of R_HOME/bin/R as in R 3.0.2) so we know the important role of R_HOME/bin/exec/R.

More examples of embedding can be found in tests/Embedding directory. Read <index.html> for more information about these test examples.

An example from Bioconductor workshop

What is covered in this section is different from Create and use a standalone Rmath library.
Use eval() function. See R-Ext 8.1 and 8.2 and 5.11.
http://stackoverflow.com/questions/2463437/r-from-c-simplest-possible-helloworld (obtained from searching R_tryEval on google)
http://stackoverflow.com/questions/7457635/calling-r-function-from-c

Example: Create <embed.c> file

#include <Rembedded.h>
#include <Rdefines.h>

static void doSplinesExample();
int
main(int argc, char *argv[])
{
    Rf_initEmbeddedR(argc, argv);
    doSplinesExample();
    Rf_endEmbeddedR(0);
    return 0;
}
static void
doSplinesExample()
{
    SEXP e, result;
    int errorOccurred;

    // create and evaluate 'library(splines)'
    PROTECT(e = lang2(install("library"), mkString("splines")));
    R_tryEval(e, R_GlobalEnv, &errorOccurred);
    if (errorOccurred) {
        // handle error
    }
    UNPROTECT(1);

    // 'options(FALSE)' ...
    PROTECT(e = lang2(install("options"), ScalarLogical(0)));
    // ... modified to 'options(example.ask=FALSE)' (this is obscure)
    SET_TAG(CDR(e), install("example.ask"));
    R_tryEval(e, R_GlobalEnv, NULL);
    UNPROTECT(1);

    // 'example("ns")'
    PROTECT(e = lang2(install("example"), mkString("ns")));
    R_tryEval(e, R_GlobalEnv, &errorOccurred);
    UNPROTECT(1);
}

Then build the executable. Note that I don't need to create R_HOME variable.

cd 
tar xzvf 
cd R-3.0.1
./configure --enable-R-shlib
make
cd tests/Embedding
make
~/R-3.0.1/bin/R CMD ./Rtest

nano embed.c
# Using a single line will give an error and cannot not show the real problem.
# ../../bin/R CMD gcc -I../../include -L../../lib -lR embed.c
# A better way is to run compile and link separately
gcc -I../../include -c embed.c
gcc -o embed embed.o -L../../lib -lR -lRblas
../../bin/R CMD ./embed

Note that if we want to call the executable file ./embed directly, we shall set up R environment by specifying R_HOME variable and including the directories used in linking R in LD_LIBRARY_PATH. This is based on the inform provided by Writing R Extensions.

export R_HOME=/home/brb/Downloads/R-3.0.2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/brb/Downloads/R-3.0.2/lib
./embed # No need to include R CMD in front.

Question: Create a data frame in C? Answer: Use data.frame() via an eval() call from C. Or see the code is stats/src/model.c, as part of model.frame.default. Or using Rcpp as here.

Reference http://bioconductor.org/help/course-materials/2012/Seattle-Oct-2012/AdvancedR.pdf

Create a Simple Socket Server in R

This example is coming from this paper.

Create an R function

simpleServer <- function(port=6543)
{
  sock <- socketConnection ( port=port , server=TRUE)
  on.exit(close( sock ))
  cat("\nWelcome to R!\nR>" ,file=sock )
  while(( line <- readLines ( sock , n=1)) != "quit")
  {
    cat(paste("socket >" , line , "\n"))
    out<- capture.output (try(eval(parse(text=line ))))
    writeLines ( out , con=sock )
    cat("\nR> " ,file =sock )
  }
}

Then run simpleServer(). Open another terminal and try to communicate with the server

$ telnet localhost 6543
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Welcome to R!
R> summary(iris[, 3:5])
  Petal.Length    Petal.Width          Species  
 Min.   :1.000   Min.   :0.100   setosa    :50  
 1st Qu.:1.600   1st Qu.:0.300   versicolor:50  
 Median :4.350   Median :1.300   virginica :50  
 Mean   :3.758   Mean   :1.199                  
 3rd Qu.:5.100   3rd Qu.:1.800                  
 Max.   :6.900   Max.   :2.500                  

R> quit
Connection closed by foreign host.

Rserve

Note the way of launching Rserve is like the way we launch C program when R was embedded in C. See Call R from C/C++ or Example from Bioconductor workshop.

See my Rserve page.

(Commercial) StatconnDcom

R.NET

RJava

RCaller

(From RInside documentation) The RInside package makes it easier to embed R in your C++ applications. There is no code you would execute directly from the R environment. Rather, you write C++ programs that embed R which is illustrated by some the included examples.

The included examples are armadillo, eigen, mpi, qt, standard, threads and wt.

To run 'make' when we don't have a global R, we should modify the file <Makefile>. Also if we just want to create one executable file, we can do, for example, 'make rinside_sample1'.

To run any executable program, we need to specify LD_LIBRARY_PATH variable, something like

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/brb/Downloads/R-3.0.2/lib

The real build process looks like (check <Makefile> for completeness)

g++ -I/home/brb/Downloads/R-3.0.2/include \
    -I/home/brb/Downloads/R-3.0.2/library/Rcpp/include \
    -I/home/brb/Downloads/R-3.0.2/library/RInside/include -g -O2 -Wall \
    -I/usr/local/include   \
    rinside_sample0.cpp  \
    -L/home/brb/Downloads/R-3.0.2/lib -lR  -lRblas -lRlapack \
    -L/home/brb/Downloads/R-3.0.2/library/Rcpp/lib -lRcpp \
    -Wl,-rpath,/home/brb/Downloads/R-3.0.2/library/Rcpp/lib \
    -L/home/brb/Downloads/R-3.0.2/library/RInside/lib -lRInside \
    -Wl,-rpath,/home/brb/Downloads/R-3.0.2/library/RInside/lib \
    -o rinside_sample0

Hello World example of embedding R in C++.

#include <RInside.h>                    // for the embedded R via RInside

int main(int argc, char *argv[]) {

    RInside R(argc, argv);              // create an embedded R instance 

    R["txt"] = "Hello, world!\n";	// assign a char* (string) to 'txt'

    R.parseEvalQ("cat(txt)");           // eval the init string, ignoring any returns

    exit(0);
}

The above can be compared to the Hello world example in Qt.

#include <QApplication.h>
#include <QPushButton.h>

int main( int argc, char **argv )
{
    QApplication app( argc, argv );

    QPushButton hello( "Hello world!", 0 );
    hello.resize( 100, 30 );

    app.setMainWidget( &hello );
    hello.show();

    return app.exec();
}

RFortran

RFortran is an open source project with the following aim:

To provide an easy to use Fortran software library that enables Fortran programs to transfer data and commands to and from R.

It works only on Windows platform with Microsoft Visual Studio installed:(

Call R from other languages

JRI

http://www.rforge.net/JRI/

ryp2

http://rpy.sourceforge.net/rpy2.html

Create a standalone Rmath library

R has many math and statistical functions. We can easily use these functions in our C/C++/Fortran. The definite guide of doing this is on Chapter 9 "The standalone Rmath library" of R-admin manual.

Here is my experience based on R 3.0.2 on Windows OS.

Create a static library <libRmath.a> and a dynamic library <Rmath.dll>

Suppose we have downloaded R source code and build R from its source. See Build_R_from_its_source. Then the following 2 lines will generate files <libRmath.a> and <Rmath.dll> under C:\R\R-3.0.2\src\nmath\standalone directory.

cd C:\R\R-3.0.2\src\nmath\standalone
make -f Makefile.win

Use Rmath library in our code

set CPLUS_INCLUDE_PATH=C:\R\R-3.0.2\src\include
set LIBRARY_PATH=C:\R\R-3.0.2\src\nmath\standalone
# It is not LD_LIBRARY_PATH in above.

# Created <RmathEx1.cpp> from the book "Statistical Computing in C++ and R" web site
# http://math.la.asu.edu/~eubank/CandR/ch4Code.cpp
# It is OK to save the cpp file under any directory.

# Force to link against the static library <libRmath.a>
g++ RmathEx1.cpp -lRmath -lm -o RmathEx1.exe
# OR
g++ RmathEx1.cpp -Wl,-Bstatic -lRmath -lm -o RmathEx1.exe

# Force to link against dynamic library <Rmath.dll>
g++ RmathEx1.cpp Rmath.dll -lm -o RmathEx1Dll.exe

Test the executable program. Note that the executable program RmathEx1.exe can be transferred to and run in another computer without R installed. Isn't it cool!

c:\R>RmathEx1
Enter a argument for the normal cdf:
1
Enter a argument for the chi-squared cdf:
1
Prob(Z <= 1) = 0.841345
Prob(Chi^2 <= 1)= 0.682689

Below is the cpp program <RmathEx1.cpp>.

//RmathEx1.cpp
#define MATHLIB_STANDALONE 
#include <iostream>
#include "Rmath.h"

using std::cout; using std::cin; using std::endl;

int main()
{
  double x1, x2;
  cout << "Enter a argument for the normal cdf:" << endl;
  cin >> x1;
  cout << "Enter a argument for the chi-squared cdf:" << endl;
  cin >> x2;

  cout << "Prob(Z <= " << x1 << ") = " << 
    pnorm(x1, 0, 1, 1, 0)  << endl;
  cout << "Prob(Chi^2 <= " << x2 << ")= " << 
    pchisq(x2, 1, 1, 0) << endl;
  return 0;
}

Calling R.dll directly

See Chapter 8.2.2 of R Extensions. This is related to embedding R under Windows. The file <R.dll> on Windows is like <libR.so> on Linux.

Create HTML 5 web and slides

See here

Create HTML report

ReportingTools (Jason Hackney) from Bioconductor.

Create academic report

reports package in CRAN and in github repository. The youtube video gives an overview of the package.

Create Word report

knitr + pandoc

It is better to create rmd file in RStudio. Rstudio provides a template for rmd file and it also provides a quick reference to R markdown language.

# Idea:
#        knitr       pandoc
#   rmd -------> md --------> docx
library(knitr)
knit2html("example.rmd") #Create md and html files

and then

FILE <- "example"
system(paste0("pandoc -o ", FILE, ".docx ", FILE, ".md"))

Note. For example reason, if I play around the above 2 commands for several times, the knit2html() does not work well. However, if I click 'Knit HTML' button on the RStudio, it then works again.

Another way is

library(pander)
name = "demo"
knit(paste0(name, ".Rmd"), encoding = "utf-8")
Pandoc.brew(file = paste0(name, ".md"), output = paste0(-name, "docx"), convert = "docx")

Note that once we have used knitr command to create a md file, we can use pandoc shell command to convert it to different formats:

A pdf file: pandoc -s report.md -t latex -o report.pdf
A html file: pandoc -s report.md -o report.html (with the -c flag html files can be added easily)
Openoffice: pandoc report.md -o report.odt
Word docx: pandoc report.md -o report.docx

pander

Try pandoc[1] with a minimal reproducible example, you might give a try to my "pander" package [2] too:

library(pander)
Pandoc.brew(system.file('examples/minimal.brew', package='pander'),
            output = tempfile(), convert = 'docx')

Where the content of the "minimal.brew" file is something you might have got used to with Sweave - although it's using "brew" syntax instead. See the examples of pander [3] for more details. Please note that pandoc should be installed first, which is pretty easy on Windows.

R2wd

Use R2wd package. However, only 32-bit R is allowed and sometimes it can not produce all 'table's.

> library(R2wd)
> wdGet()
Loading required package: rcom
Loading required package: rscproxy
rcom requires a current version of statconnDCOM installed.
To install statconnDCOM type
     installstatconnDCOM()

This will download and install the current version of statconnDCOM

You will need a working Internet connection
because installation needs to download a file.
Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() : 
  argument is of length zero

The solution is to launch 32-bit R instead of 64-bit R since statconnDCOM does not support 64-bit R.

Convert from pdf to word

The best rendering of advanced tables is done by converting from pdf to Word. See http://biostat.mc.vanderbilt.edu/wiki/Main/SweaveConvert

rtf

Use rtf package for Rich Text Format (RTF) Output.

xtable

Package xtable will produce html output. If you save the file and then open it with Word, you will get serviceable results. I've had better luck copying the output from xtable and pasting it into Excel.

ReporteRs

Microsoft Word, Microsoft Powerpoint and HTML documents generation from R. The source code is hosted on https://github.com/davidgohel/ReporteRs

COM client or server

Client

RDCOMClient where excel.link depends on it.

Server

RDCOMServer

Use R under proxy

http://support.rstudio.org/help/kb/faq/configuring-r-to-use-an-http-proxy

What is the best place to save Rconsole on Windows platform

Put it in C:/Users/USERNAME/Documents folder so no matter how R was upgraded/downgraded, it always find my preference.

Web scraping

http://www.slideshare.net/schamber/web-data-from-r#btnNext

Launch Rstudio

If multiple versions of R was detected, Rstudio can not be launched successfully. A java-like clock will be spinning without a stop. The trick is to click Ctrl key and click the Rstudio at the same time. After done that, it will show up a selection of R to choose from.

List files using regular expression

Extension

list.files(pattern = "\\.txt$")

Start with

list.files(pattern = "^Something")

Using Sys.glob()"' as

> Sys.glob("~/Downloads/*.txt")
[1] "/home/brb/Downloads/ip.txt"       "/home/brb/Downloads/valgrind.txt"

Hidden tool: rsync in Rtools

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/a.exe" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
a.exe

sent 323142 bytes  received 31 bytes  646346.00 bytes/sec
total size is 1198416  speedup is 3.71

c:\Rtools\bin>

And rsync works best when we need to sync folder.

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/binary" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
binary/
binary/Eula.txt
binary/cherrytree.lnk
binary/depends64.chm
binary/depends64.dll
binary/depends64.exe
binary/mtputty.exe
binary/procexp.chm
binary/procexp.exe
binary/pscp.exe
binary/putty.exe
binary/sqlite3.exe
binary/wget.exe

sent 4115294 bytes  received 244 bytes  1175868.00 bytes/sec
total size is 8036311  speedup is 1.95

c:\Rtools\bin>rm c:\users\limingc\Documents\binary\procexp.exe
cygwin warning:
  MS-DOS style path detected: c:\users\limingc\Documents\binary\procexp.exe
  Preferred POSIX equivalent is: /cygdrive/c/users/limingc/Documents/binary/procexp.exe
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/binary" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
binary/
binary/procexp.exe

sent 1767277 bytes  received 35 bytes  3534624.00 bytes/sec
total size is 8036311  speedup is 4.55

c:\Rtools\bin>

Unforunately, if the destination is a network drive, I could get a permission denied (13) error. See also http://superuser.com/questions/69620/rsync-file-permissions-on-windows

Install rgdal package on ubuntu

sudo apt-get install libgdal1-dev libproj-dev
R
> install.packages("rgdal")

Set up Emacs on Windows

Edit the file C:\Program Files\GNU Emacs 23.2\site-lisp\site-start.el with something like

(setq-default inferior-R-program-name
              "c:/program files/r/r-2.15.2/bin/i386/rterm.exe")

Database

RMySQL

RSQLite

[http://blog.sobbayi.com/sql

MongoDB

Github

R source

https://github.com/wch/r-source/ Daily update, interesting, should be visited every day. Clicking 1000+ commits to look at daily changes.
https://github.com/SurajGupta/r-source (update for each R release)

github

https://github.com/languages/R

Send local repository to Github in R by using reports package

http://www.youtube.com/watch?v=WdOI_-aZV0Y

My collection

https://github.com/arraytools
https://gist.github.com/4383351 heatmap using leukemia data
https://gist.github.com/4382774 heatmap using sequential data
https://gist.github.com/4484270 biocLite

How to download

Clone ~ Download.

Command line

git clone https://gist.github.com/4484270.git

This will create a subdirectory called '4484270' with all cloned files there.

Within R

library(devtools)
source_gist("4484270")

or First download the json file from

https://api.github.com/users/MYUSERLOGIN/gists

and then

library(RJSONIO)
x <- fromJSON("~/Downloads/gists.json")
setwd("~/Downloads/")
gist.id <- lapply(x, "[[", "id")
lapply(gist.id, function(x){
  cmd <- paste0("git clone https://gist.github.com/", x, ".git")
  system(cmd)
})

Connect R with Arduino

Android App

R Instructor $4.84
Statistical Distribution (Not R related app)

Tricks

Change default R repository

Edit global Rprofile file. On *NIX platforms, it's located in /usr/lib/R/library/base/R/Rprofile although local .Rprofile settings take precedence.

For example, I can specify the R mirror I like by creating a single line <.Rprofile> file under my home directory.

options(repos = "http://cran.rstudio.com/")

Query about an R package

packageDescription("MASS")
packageVersion("MASS")
packageStatus() # Summarize information about installed packages

installed.packages()
available.packages()

The 'available.packages()' command is useful for understanding package dependency. Use setRepositories() to select repositories and options()$repos to check or change the repositories.

> packageStatus() 
Number of installed packages:
                                    
                                      ok upgrade unavailable
  C:/Program Files/R/R-3.0.1/library 110       0           1

Number of available packages (each package counted only once):
                                                                                   
                                                                                    installed not installed
  http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0                            76          4563
  http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.0                                0             5
  http://www.bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0                   16           625
  http://www.bioconductor.org/packages/2.12/data/annotation/bin/windows/contrib/3.0         4           686
> tmp <- available.packages()
> str(tmp)
 chr [1:5975, 1:17] "A3" "ABCExtremes" "ABCp2" "ACCLMA" "ACD" "ACNE" "ADGofTest" "ADM3" "AER" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5975] "A3" "ABCExtremes" "ABCp2" "ACCLMA" ...
  ..$ : chr [1:17] "Package" "Version" "Priority" "Depends" ...
> tmp[1:3,]
            Package       Version Priority Depends                     Imports LinkingTo Suggests             
A3          "A3"          "0.9.2" NA       "xtable, pbapply"           NA      NA        "randomForest, e1071"
ABCExtremes "ABCExtremes" "1.0"   NA       "SpatialExtremes, combinat" NA      NA        NA                   
ABCp2       "ABCp2"       "1.1"   NA       "MASS"                      NA      NA        NA                   
            Enhances License      License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File
A3          NA       "GPL (>= 2)" NA              NA                    NA      NA    NA     NA               NA  
ABCExtremes NA       "GPL-2"      NA              NA                    NA      NA    NA     NA               NA  
ABCp2       NA       "GPL-2"      NA              NA                    NA      NA    NA     NA               NA  
            Repository                                                     
A3          "http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0"
ABCExtremes "http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0"
ABCp2       "http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0"

And the following commands find which package depends on Rcpp and also which are from bioconductor repository.

> pkgName <- "Rcpp"
> rownames(tmp)[grep(pkgName, tmp[,"Depends"])]
> tmp[grep("Rcpp", tmp[,"Depends"]), "Depends"]

> ind <- intersect(grep(pkgName, tmp[,"Depends"]), grep("bioconductor", tmp[, "Repository"]))
> rownames(grep)[ind]
NULL
> rownames(tmp)[ind]
 [1] "ddgraph"            "DESeq2"             "GeneNetworkBuilder" "GOSemSim"           "GRENITS"           
 [6] "mosaics"            "mzR"                "pcaMethods"         "Rdisop"             "Risa"              
[11] "rTANDEM"

Editor

http://en.wikipedia.org/wiki/R_(programming_language)#Editors_and_IDEs

Emacs + ESS.
Rstudio - editor/R terminal/R graphics/file browser/package manager. The new version (0.98) also provides a new feature for debugging step-by-step.
geany - I like the feature that it shows defined functions on the side panel even for R code. RStudio can also do this (see the bottom of the code panel).
Rgedit which includes a feature of splitting screen into two panes and run R in the bottom panel. See here.
Komodo IDE with browser preview http://www.youtube.com/watch?v=wv89OOw9roI at 4:06 and http://docs.activestate.com/komodo/4.4/editor.html

GUI for Data Analysis

Rcmdr

http://cran.r-project.org/web/packages/Rcmdr/index.html

Deducer

http://cran.r-project.org/web/packages/Deducer/index.html

Create a new R package, namespace, documentation

https://stat.ethz.ch/pipermail/r-devel/2013-July/066975.html
Benefit of import in a namespace
This youtube video from Tyler Rinker teaches how to use RStudio to develop an R package and also use Git to do version control. Very useful!

Create R package from R code with roxyPackage

http://lamages.blogspot.com/2013/03/create-r-package-from-single-r-file.html

Minimal R package for submission

https://stat.ethz.ch/pipermail/r-devel/2013-August/067257.html and CRAN Repository Policy.

Apply family

Vectorize, aggregate, apply, by, eapply, lapply, mapply, replicate, scale, sapply, split, tapply, and vapply.

Check out this.

However, apply is just a wrap of a loop. The performance is not better than a for loop. See http://tolstoy.newcastle.edu.au/R/help/06/05/27255.html.

plyr package

Paper in J. Stat Software.

A quick introduction to plyr with a summary of apply functions in R and compare them with functions in plyr package.

plyr has a common syntax -- easier to remember
plyr requires less code since it takes care of the input and output format
plyr can easily be run in parallel -- faster

llply()

llply is equivalent to lapply except that it will preserve labels and can display a progress bar. This is handy if we want to do a crazy thing.

LLID2GOIDs <- lapply(rLLID, function(x) get("org.Hs.egGO")[[x]])

where rLLID is a list of entrez ID. For example,

get("org.Hs.egGO")[["6772"]]

returns a list of 49 GOs.

ddply()

http://lamages.blogspot.com/2012/06/transforming-subsets-of-data-in-r-with.html

ldply()

An R Script to Automatically download PubMed Citation Counts By Year of Publication

mclapply() from paralle package is a mult-core version of lapply()

Note that Windows OS can not take advantage of it.

Another choice for Windows OS is to use parLapply() function in parallel package.

ncores <- as.integer( Sys.getenv('NUMBER_OF_PROCESSORS') )
cl <- makeCluster(getOption("cl.cores", ncores))
LLID2GOIDs2 <- parLapply(cl, rLLID, function(x) {
                                    library(org.Hs.eg.db); get("org.Hs.egGO")[[x]]} 
                        )
stopCluster(cl)

It does work. Cut the computing time from 100 sec to 29 sec on 4 cores.

regular expression

Not specific to R

Example

grep("\\.zip$", pkgs) or grep("\\.tar.gz$", pkgs)

Clipboard

source("clipboard")
read.table("clipboard")

read/manipulate binary data

x <- readBin(fn, raw(), file.info(fn)$size)
rawToChar(x[1:16])
See Biostrings C API

read/download/source a file from internet

Simple text file http

retail <- read.csv("http://robjhyndman.com/data/ausretail.csv",header=FALSE)

Zip file

con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
source(con)
close(con)

Google drive file based on https using RCurl package

require(RCurl)
myCsv <- getURL("https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AkuuKBh0jM2TdGppUFFxcEdoUklCQlJhM2kweGpoUUE&single=true&gid=0&output=csv")
read.csv(textConnection(myCsv))

Github files https using RCurl package

x = getURL("https://gist.github.com/arraytools/6671098/raw/c4cb0ca6fe78054da8dbe253a05f7046270d5693/GeneIDs.txt", 
            ssl.verifypeer = FALSE)
read.table(text=x)

Create publication tables using tables package

See p13 for example in http://www.ianwatson.com.au/stata/tabout_tutorial.pdf

R's tables packages is the best solution. For example,

> library(tables)
> tabular( (Species + 1) ~ (n=1) + Format(digits=2)*
+          (Sepal.Length + Sepal.Width)*(mean + sd), data=iris )
                                                  
                Sepal.Length      Sepal.Width     
 Species    n   mean         sd   mean        sd  
 setosa      50 5.01         0.35 3.43        0.38
 versicolor  50 5.94         0.52 2.77        0.31
 virginica   50 6.59         0.64 2.97        0.32
 All        150 5.84         0.83 3.06        0.44
> str(iris)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

and

# This example shows some of the less common options         
> Sex <- factor(sample(c("Male", "Female"), 100, rep=TRUE))
> Status <- factor(sample(c("low", "medium", "high"), 100, rep=TRUE))
> z <- rnorm(100)+5
> fmt <- function(x) {
  s <- format(x, digits=2)
  even <- ((1:length(s)) %% 2) == 0
  s[even] <- sprintf("(%s)", s[even])
  s
}
> tabular( Justify(c)*Heading()*z*Sex*Heading(Statistic)*Format(fmt())*(mean+sd) ~ Status )
                  Status              
 Sex    Statistic high   low    medium
 Female mean       4.88   4.96   5.17 
        sd        (1.20) (0.82) (1.35)
 Male   mean       4.45   4.31   5.05 
        sd        (1.01) (0.93) (0.75)

See also a collection of R packages related to reproducible research in http://cran.r-project.org/web/views/ReproducibleResearch.html

Create flat tables in R console using ftable()

> ftable(Titanic, row.vars = 1:3)
                   Survived  No Yes
Class Sex    Age                   
1st   Male   Child            0   5
             Adult          118  57
      Female Child            0   1
             Adult            4 140
2nd   Male   Child            0  11
             Adult          154  14
      Female Child            0  13
             Adult           13  80
3rd   Male   Child           35  13
             Adult          387  75
      Female Child           17  14
             Adult           89  76
Crew  Male   Child            0   0
             Adult          670 192
      Female Child            0   0
             Adult            3  20
> ftable(Titanic, row.vars = 1:2, col.vars = "Survived")
             Survived  No Yes
Class Sex                    
1st   Male            118  62
      Female            4 141
2nd   Male            154  25
      Female           13  93
3rd   Male            422  88
      Female          106  90
Crew  Male            670 192
      Female            3  20
> ftable(Titanic, row.vars = 2:1, col.vars = "Survived")
             Survived  No Yes
Sex    Class                 
Male   1st            118  62
       2nd            154  25
       3rd            422  88
       Crew           670 192
Female 1st              4 141
       2nd             13  93
       3rd            106  90
       Crew             3  20
> str(Titanic)
 table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Male" "Female"
  ..$ Age     : chr [1:2] "Child" "Adult"
  ..$ Survived: chr [1:2] "No" "Yes"
> x <- ftable(mtcars[c("cyl", "vs", "am", "gear")])
> x
          gear  3  4  5
cyl vs am              
4   0  0        0  0  0
       1        0  0  1
    1  0        1  2  0
       1        0  6  1
6   0  0        0  0  0
       1        0  2  1
    1  0        2  2  0
       1        0  0  0
8   0  0       12  0  0
       1        0  0  2
    1  0        0  0  0
       1        0  0  0
> ftable(x, row.vars = c(2, 4))
        cyl  4     6     8   
        am   0  1  0  1  0  1
vs gear                      
0  3         0  0  0  0 12  0
   4         0  0  0  2  0  0
   5         0  1  0  1  0  2
1  3         1  0  2  0  0  0
   4         2  6  2  0  0  0
   5         0  1  0  0  0  0
> 
> ## Start with expressions, use table()'s "dnn" to change labels
> ftable(mtcars$cyl, mtcars$vs, mtcars$am, mtcars$gear, row.vars = c(2, 4),
         dnn = c("Cylinders", "V/S", "Transmission", "Gears"))

          Cylinders     4     6     8   
          Transmission  0  1  0  1  0  1
V/S Gears                               
0   3                   0  0  0  0 12  0
    4                   0  0  0  2  0  0
    5                   0  1  0  1  0  2
1   3                   1  0  2  0  0  0
    4                   2  6  2  0  0  0
    5                   0  1  0  0  0  0

tracemem, data type, copy

How to avoid copying a long vector

Tell if the current R is running in 32-bit or 64-bit mode

8 * .Machine$sizeof.pointer

where sizeof.pointer returns the number of *bytes* in a C SEXP type and '8' means number of bits per byte.

Handling length 2^31 and more in R 3.0.0

From R News for 3.0.0 release:

There is a subtle change in behaviour for numeric index values 2^31 and larger. These never used to be legitimate and so were treated as NA, sometimes with a warning. They are now legal for long vectors so there is no longer a warning, and x[2^31] <- y will now extend the vector on a 64-bit platform and give an error on a 32-bit one.

In R 2.15.2, if I try to assign a vector of length 2^31, I will get an error

> x <- seq(1, 2^31)
Error in from:to : result would be too long a vector

However, for R 3.0.0 (tested on my 64-bit Ubuntu with 16GB RAM. The R was compiled by myself):

> system.time(x <- seq(1,2^31))
   user  system elapsed
  8.604  11.060 120.815
> length(x)
[1] 2147483648
> length(x)/2^20
[1] 2048
> gc()
             used    (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells     183823     9.9     407500    21.8     350000    18.7
Vcells 2147764406 16386.2 2368247221 18068.3 2148247383 16389.9
>

Note:

2^31 length is about 2 Giga length. It takes about 16 GB (2^31*8/2^20 MB) memory.
On Windows, it is almost impossible to work with 2^31 length of data if the memory is less than 16 GB because virtual disk on Windows does not work well. For example, when I tested on my 12 GB Windows 7, the whole Windows system freezes for several minutes before I force to power off the machine.
My slide in http://goo.gl/g7sGX shows the screenshots of running the above command on my Ubuntu and RHEL machines. As you can see the linux is pretty good at handling large (> system RAM) data. That said, as long as your linux system is 64-bit, you can possibly work on large data without too much pain.
For large dataset, it makes sense to use database or specially crafted packages like bigmemory or ff.

NA in index

Question: what is seq(1, 3)[c(1, 2, NA)]?

Answer: It will reserve the element with NA in indexing and return the value NA for it.

Question: What is TRUE & NA?

Answer: NA

Question: What is FALSE & NA?

Answer: FALSE

Question: c("A", "B", NA) != "" ?

Answer: TRUE TRUE NA

Question: which(c("A", "B", NA) != "") ?

Answer: 1 2

Question: c(1, 2, NA) != "" & !is.na(c(1, 2, NA)) ?

Answer: TRUE TRUE FALSE

Question: c("A", "B", NA) != "" & !is.na(c("A", "B", NA)) ?

Answer: TRUE TRUE FALSE

Conclusion: In order to exclude empty or NA for numerical or character data type, we can use which() or a convenience function keep.complete(x) <- function(x) x != "" & !is.na(x). This will guarantee return logical values and not contain NAs.

Don't just use x != "" OR !is.na(x).

Creating publication quality graphs in R

http://teachpress.environmentalinformatics-marburg.de/2013/07/creating-publication-quality-graphs-in-r-7/

with() and within() functions

within() is similar to with() except it is used to create new columns and merge them with the original data sets. See youtube video.

closePr <- with(mariokart, totalPr - shipPr)
head(closePr, 20)

mk <- within(mariokart, {
             closePr <- totalPr - shipPr
     })
head(mk) # new column closePr

mk <- mariokart
aggregate(. ~ wheels + cond, mk, mean)
# create mean according to each level of (wheels, cond)

aggregate(totalPr ~ wheels + cond, mk, mean)

tapply(mk$totalPr, mk[, c("wheels", "cond")], mean)

Draw Color Palette

http://teachpress.environmentalinformatics-marburg.de/2013/07/creating-publication-quality-graphs-in-r-7/

Read excel files

http://www.milanor.net/blog/?p=779

My experience is to save the Excel file as csv file (before it can be read into R) if it is possible.

Serialization

If we want to pass an R object to C (use recv() function), we can use writeBin() to output the stream size and then use serialize() function to output the stream to a file. See the post on R mailing list.

> a <- list(1,2,3)
> a_serial <- serialize(a, NULL)
> a_length <- length(a_serial)
> a_length
[1] 70
> writeBin(as.integer(a_length), connection, endian="big")
> serialize(a, connection)

In C++ process, I receive one int variable first to get the length, and then read <length> bytes from the connection.

socketConnection

See ?socketconnection.

Simple example

from the socketConnection's manual.

Open one R session

con1 <- socketConnection(port = 22131, server = TRUE) # wait until a connection from some client
writeLines(LETTERS, con1)
close(con1)

Open another R session (client)

con2 <- socketConnection(Sys.info()["nodename"], port = 22131)
# as non-blocking, may need to loop for input
readLines(con2)
while(isIncomplete(con2)) {
   Sys.sleep(1)
   z <- readLines(con2)
   if(length(z)) print(z)
}
close(con2)

Use nc in client

The client does not have to be the R. We can use telnet, nc, etc. See the post here. For example, on the client machine, we can issue

nc localhost 22131   [ENTER]

Then the client will wait and show anything written from the server machine. The connection from nc will be terminated once close(con1) is given.

If I use the command

nc -v -w 2 localhost -z 22130-22135

then the connection will be established for a short time which means the cursor on the server machine will be returned. If we issue the above nc command again on the client machine it will show the connection to the port 22131 is refused. PS. "-w" switch denotes the number of seconds of the timeout for connects and final net reads.

Some post I don't have a chance to read. http://digitheadslabnotebook.blogspot.com/2010/09/how-to-send-http-put-request-from-r.html

Use curl command in client

On the server,

con1 <- socketConnection(port = 8080, server = TRUE)

On the client,

curl --trace-ascii debugdump.txt http://localhost:8080/

Then go to the server,

while(nchar(x <- readLines(con1, 1)) > 0) cat(x, "\n")

close(con1) # return cursor in the client machine

Use telnet command in client

On the server,

con1 <- socketConnection(port = 8080, server = TRUE)

On the client,

sudo apt-get install telnet
telnet localhost 8080
abcdefg
hijklmn
qestst

Go to the server,

readLines(con1, 1)
readLines(con1, 1)
readLines(con1, 1)
close(con1) # return cursor in the client machine

Some tutorial about using telnet on http request. And this is a summary of using telnet.

Warning: cannot remove prior installation of package

http://stackoverflow.com/questions/15932152/unloading-and-removing-a-loaded-package-withouth-restarting-r

For example,

# Install the latest hgu133plus2cdf package
# Remove/Uninstall hgu133plus2.db package
# Put/Install an old version of IRanges (eg version 1.18.2 while currently it is version 1.18.3)
# Test on R 3.0.1
library(hgu133plus2cdf) # hgu133pluscdf does not depend or import IRanges
source("http://bioconductor.org/biocLite.R")
biocLite("hgu133plus2.db", ask=FALSE) # hgu133plus2.db imports IRanges
# Warning:cannot remove prior installation of package 'IRanges'
# Open Windows Explorer and check IRanges folder. Only see libs subfolder.

Note:

In the above example, all packages were installed under C:\Program Files\R\R-3.0.1\library\.
In another instance where I cannot reproduce the problem, new R packages were installed under C:\Users\xxx\Documents\R\win-library\3.0\. The different thing is IRanges package CAN be updated but if I use packageVersion("IRanges") command in R, it still shows the old version.
The above were tested on a desktop.
When working on virtualbox VM, sometimes (sort of frequently) I will get an error Warning: unable to move temporary installation `C:\Users\brb\Documents\R\win-library\3.0\fileed8270978f5\quadprog` to `C:\Users\brb\Documents\R\win-library\3.0\quadprog` when I try to run 'install.packages("forecast").

R package depends vs imports

In the namespace era Depends is never really needed. All modern packages have no technical need for Depends anymore. Loosely speaking the only purpose of Depends today is to expose other package's functions to the user without re-exporting them.

load = functions exported in myPkg are available to interested parties as myPkg::foo or via direct imports - essentially this means the package can now be used

attach = the namespace (and thus all exported functions) is attached to the search path - the only effect is that you have now added the exported functions to the global pool of functions - sort of like dumping them in the workspace (for all practical purposes, not technically)

import a function into a package = make sure that this function works in my package regardless of the search path (so I can write fn1 instead of pkg1::fn1 and still know it will come from pkg1 and not someone's workspace or other package that chose the same name)

https://stat.ethz.ch/pipermail/r-devel/2013-September/067451.html

The distinction is between "loading" and "attaching" a package. Loading it (which would be done if you had MASS::loglm, or imported it) guarantees that the package is initialized and in memory, but doesn't make it visible to the user without the explicit MASS:: prefix. Attaching it first loads it, then modifies the user's search list so the user can see it.

Loading is less intrusive, so it's preferred over attaching. Both library() and require() would attach it.

Subsetting

Subset assignment of R Language Definition and Manipulation of functions.

The result of the command x[3:5] <- 13:15 is as if the following had been executed

`*tmp*` <- x
x <- "[<-"(`*tmp*`, 3:5, value=13:15)
rm(`*tmp*`)

S3 and S4

Software for Data Analysis: Programming with R by John Chambers
Programming with Data: A Guide to the S Language by John Chambers
https://www.rmetrics.org/files/Meielisalp2009/Presentations/Chalabi1.pdf
https://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objects.pdf

To get the source code of S4 methods, we can use showMethod(), getMethod() and showMethod(). For example

library(qrqc)
showMethods("gcPlot")
getMethod("gcPlot", "FASTQSummary") # get an error
showMethod("gcPlot", "FASTQSummary") # good.

findInterval()

Related functions are cuts() and split(). See also

@@ Line 1,035: / Line 1,035: @@
 # [http://shop.oreilly.com/product/0636920021421.do Example code] for the book Parallel R by McCallum and Weston.
 # [http://www.stat.berkeley.edu/scf/paciorek-distribComp.pdf An introduction to distributed memory parallelism in R and C]
-# [http://danielmarcelino.com/parallel-processing/Parallel Processing: When does it worth?」
+# [http://danielmarcelino.com/parallel-processing/Parallel Processing: When does it worth?]
 === Windows Security Warning ===