R

From 太極
Revision as of 16:27, 7 November 2012 by Brb (talk | contribs)
Jump to navigation Jump to search

CRAN local repository

How to set up a package repository

Guide: http://cran.r-project.org/doc/manuals/R-admin.html#Setting-up-a-package-repository

To create CRAN repository

Before creating a local repository please give a dry run first. You don't want to be surprised how long will it take to mirror a directory.

Dry run

rsync -avn cran.r-project.org::CRAN > crandryrun.txt

To mirror only partial repository, it is necessary to create directories before running rsync command.

mkdir /media/WD640/CRAN
mkdir /media/WD640/CRAN/bin
mkdir /media/WD640/CRAN/bin/windows
mkdir /media/WD640/CRAN/bin/windows/contrib
mkdir /media/WD640/CRAN/bin/windows/contrib/2.15
rsync -rtlzv --delete cran.r-project.org::CRAN/bin/windows/contrib/2.15/ /media/WD640/CRAN/bin/windows/contrib/2.15
(one line with space before /media)
rsync -rtlzv --delete cran.r-project.org::CRAN/src/ /media/WD640/CRAN/src/

And optionally

library(tools)
write_PACKAGES("/var/www/CRAN/bin/windows/contrib/2.15", type="win.binary") 

We can use du -h to check the folder size.

To create Bioconductor repository

Dry run

rsync -avn bioconductor.org::2.11 > biocdryrun.txt

Then creates directories before running rsync.

The software part:

mkdir /media/WD640/Bioc/packages/2.11/bioc
mkdir /media/WD640/Bioc/packages/2.11/bioc/bin
mkdir /media/WD640/Bioc/packages/2.11/bioc/bin/windows
rsync -zrtlv  --delete bioconductor.org::2.11/bioc/bin/windows/ /media/WD640/Bioc/packages/2.11/bioc/bin/windows

and annotation part:

mkdir /media/WD640/Bioc/packages/2.11/data
mkdir /media/WD640/Bioc/packages/2.11/data/annotation
mkdir /media/WD640/Bioc/packages/2.11/data/annotation/bin
mkdir /media/WD640/Bioc/packages/2.11/data/annotation/bin/windows
rsync -zrtlv --delete bioconductor.org::2.11/data/annotation/bin/windows/ /media/WD640/Bioc/packages/2.11/data/annotation/bin/windows

To test local repository

To test CRAN

r <- getOption("repos"); r["CRAN"] <- "http://arraytools.no-ip.org/CRAN"
options(repos=r)
install.packages("pamr")

To test Bioconductor

options("BioC_mirror" = "http://arraytools.no-ip.org/Bioc")
source("http://bioconductor.org/biocLite.R")
# This source biocLite.R line can be placed either before or after the previous 2 lines
biocLite("aCGH")

If there is a connection problem, check folder attributes.

chmod -R 755 ~/CRAN/bin

Note that if a binary package was created for R 2.15.1, then it can be installed under R 2.15.1 but not R 2.15.2. The R console will show package xxx is not available (for R version 2.15.2).

For binary installs, the function also checks for the availability of a source package on the same repository, and reports if the source package has a later version, or is available but no binary version is. So for example, if the mirror does not have contents under src directory, we need to run the following line in order to successfully run install.packages() function.

options(install.packages.check.source = "no")

Hidden tool: rsync in Rtools

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/a.exe" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
a.exe

sent 323142 bytes  received 31 bytes  646346.00 bytes/sec
total size is 1198416  speedup is 3.71

c:\Rtools\bin>

And rsync works best when we need to sync folder.

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/binary" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
binary/
binary/Eula.txt
binary/cherrytree.lnk
binary/depends64.chm
binary/depends64.dll
binary/depends64.exe
binary/mtputty.exe
binary/procexp.chm
binary/procexp.exe
binary/pscp.exe
binary/putty.exe
binary/sqlite3.exe
binary/wget.exe

sent 4115294 bytes  received 244 bytes  1175868.00 bytes/sec
total size is 8036311  speedup is 1.95

c:\Rtools\bin>rm c:\users\limingc\Documents\binary\procexp.exe
cygwin warning:
  MS-DOS style path detected: c:\users\limingc\Documents\binary\procexp.exe
  Preferred POSIX equivalent is: /cygdrive/c/users/limingc/Documents/binary/procexp.exe
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames

c:\Rtools\bin>rsync -avz "/cygdrive/c/users/limingc/Downloads/binary" "/cygdrive/c/users/limingc/Documents/"
sending incremental file list
binary/
binary/procexp.exe

sent 1767277 bytes  received 35 bytes  3534624.00 bytes/sec
total size is 8036311  speedup is 4.55

c:\Rtools\bin>

Unforunately, if the destination is a network drive, I could get a permission denied (13) error. See also http://superuser.com/questions/69620/rsync-file-permissions-on-windows

Install rgdal package on ubuntu

sudo apt-get install libgdal1-dev libproj-dev
R
> install.packages("rgdal")

Embedding R

Reference http://bioconductor.org/help/course-materials/2012/Seattle-Oct-2012/AdvancedR.pdf

mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ export R_HOME=/home/mli/Downloads/R-2.15.2
mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/mli/Downloads/R-2.15.2/lib
mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ g++ embed.c -I/home/mli/Downloads/R-2.15.2/include -L/home/mli/Downloads/R-2.15.2/lib -lR

mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ R CMD ./a.out
WARNING: ignoring environment value of R_HOME

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


ns> require(stats); require(graphics)

ns> ns(women$height, df = 5)
                 1            2           3          4             5
 [1,] 0.000000e+00 0.000000e+00  0.00000000 0.00000000  0.0000000000
 [2,] 7.592323e-03 0.000000e+00 -0.08670223 0.26010669 -0.1734044626
 [3,] 6.073858e-02 0.000000e+00 -0.15030440 0.45091320 -0.3006088020
 [4,] 2.047498e-01 6.073858e-05 -0.16778345 0.50335034 -0.3355668952
 [5,] 4.334305e-01 1.311953e-02 -0.13244035 0.39732106 -0.2648807067
 [6,] 6.256681e-01 8.084305e-02 -0.07399720 0.22199159 -0.1479943948
 [7,] 6.477162e-01 2.468416e-01 -0.02616007 0.07993794 -0.0532919575
 [8,] 4.791667e-01 4.791667e-01  0.01406302 0.02031093 -0.0135406187
 [9,] 2.468416e-01 6.477162e-01  0.09733619 0.02286023 -0.0152401533
[10,] 8.084305e-02 6.256681e-01  0.27076826 0.06324188 -0.0405213106
[11,] 1.311953e-02 4.334305e-01  0.48059836 0.12526031 -0.0524087186
[12,] 6.073858e-05 2.047498e-01  0.59541597 0.19899261  0.0007809246
[13,] 0.000000e+00 6.073858e-02  0.50097182 0.27551020  0.1627793975
[14,] 0.000000e+00 7.592323e-03  0.22461127 0.35204082  0.4157555879
[15,] 0.000000e+00 0.000000e+00 -0.14285714 0.42857143  0.7142857143
attr(,"degree")
[1] 3
attr(,"knots")
 20%  40%  60%  80%
60.8 63.6 66.4 69.2
attr(,"Boundary.knots")
[1] 58 72
attr(,"intercept")
[1] FALSE
attr(,"class")
[1] "ns"     "basis"  "matrix"

ns> summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women))

Call:
lm(formula = weight ~ ns(height, df = 5), data = women)

Residuals:
     Min       1Q   Median       3Q      Max
-0.38333 -0.12585  0.07083  0.15401  0.30426

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)         114.7447     0.2338  490.88  < 2e-16 ***
ns(height, df = 5)1  15.9474     0.3699   43.12 9.69e-12 ***
ns(height, df = 5)2  25.1695     0.4323   58.23 6.55e-13 ***
ns(height, df = 5)3  33.2582     0.3541   93.93 8.91e-15 ***
ns(height, df = 5)4  50.7894     0.6062   83.78 2.49e-14 ***
ns(height, df = 5)5  45.0363     0.2784  161.75  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2645 on 9 degrees of freedom
Multiple R-squared: 0.9998,     Adjusted R-squared: 0.9997
F-statistic:  9609 on 5 and 9 DF,  p-value: < 2.2e-16


ns> ## example of safe prediction
ns> plot(women, xlab = "Height (in)", ylab = "Weight (lb)")

ns> ht <- seq(57, 73, length.out = 200)

ns> lines(ht, predict(fm1, data.frame(height=ht)))

ns> ## Don't show:
ns> ## Consistency:
ns> x <- c(1:3,5:6)

ns> stopifnot(identical(ns(x), ns(x, df = 1)),
ns+           identical(ns(x, df=2), ns(x, df=2, knots=NULL)),# not true till 2.15.2
ns+           !is.null(kk <- attr(ns(x), "knots")),# not true till 1.5.1
ns+           length(kk) == 0)

ns> ## End Don't show
ns>
ns>
ns>

The above result can be compared with running

mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ R WARNING: ignoring environment value of R_HOME

R version 2.15.2 (2012-10-26) -- "Trick or Treat" Copyright (C) 2012 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

> library(splines) > example("ns")

ns> require(stats); require(graphics)

ns> ns(women$height, df = 5)

                1            2           3          4             5
[1,] 0.000000e+00 0.000000e+00  0.00000000 0.00000000  0.0000000000
[2,] 7.592323e-03 0.000000e+00 -0.08670223 0.26010669 -0.1734044626
[3,] 6.073858e-02 0.000000e+00 -0.15030440 0.45091320 -0.3006088020
[4,] 2.047498e-01 6.073858e-05 -0.16778345 0.50335034 -0.3355668952
[5,] 4.334305e-01 1.311953e-02 -0.13244035 0.39732106 -0.2648807067
[6,] 6.256681e-01 8.084305e-02 -0.07399720 0.22199159 -0.1479943948
[7,] 6.477162e-01 2.468416e-01 -0.02616007 0.07993794 -0.0532919575
[8,] 4.791667e-01 4.791667e-01  0.01406302 0.02031093 -0.0135406187
[9,] 2.468416e-01 6.477162e-01  0.09733619 0.02286023 -0.0152401533

[10,] 8.084305e-02 6.256681e-01 0.27076826 0.06324188 -0.0405213106 [11,] 1.311953e-02 4.334305e-01 0.48059836 0.12526031 -0.0524087186 [12,] 6.073858e-05 2.047498e-01 0.59541597 0.19899261 0.0007809246 [13,] 0.000000e+00 6.073858e-02 0.50097182 0.27551020 0.1627793975 [14,] 0.000000e+00 7.592323e-03 0.22461127 0.35204082 0.4157555879 [15,] 0.000000e+00 0.000000e+00 -0.14285714 0.42857143 0.7142857143 attr(,"degree") [1] 3 attr(,"knots")

20%  40%  60%  80%

60.8 63.6 66.4 69.2 attr(,"Boundary.knots") [1] 58 72 attr(,"intercept") [1] FALSE attr(,"class") [1] "ns" "basis" "matrix"

ns> summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women))

Call: lm(formula = weight ~ ns(height, df = 5), data = women)

Residuals:

    Min       1Q   Median       3Q      Max

-0.38333 -0.12585 0.07083 0.15401 0.30426

Coefficients:

                   Estimate Std. Error t value Pr(>|t|)

(Intercept) 114.7447 0.2338 490.88 < 2e-16 *** ns(height, df = 5)1 15.9474 0.3699 43.12 9.69e-12 *** ns(height, df = 5)2 25.1695 0.4323 58.23 6.55e-13 *** ns(height, df = 5)3 33.2582 0.3541 93.93 8.91e-15 *** ns(height, df = 5)4 50.7894 0.6062 83.78 2.49e-14 *** ns(height, df = 5)5 45.0363 0.2784 161.75 < 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2645 on 9 degrees of freedom Multiple R-squared: 0.9998, Adjusted R-squared: 0.9997 F-statistic: 9609 on 5 and 9 DF, p-value: < 2.2e-16


ns> ## example of safe prediction ns> plot(women, xlab = "Height (in)", ylab = "Weight (lb)")

ns> ht <- seq(57, 73, length.out = 200)

ns> lines(ht, predict(fm1, data.frame(height=ht)))

ns> ## Don't show: ns> ## Consistency: ns> x <- c(1:3,5:6)

ns> stopifnot(identical(ns(x), ns(x, df = 1)), ns+ identical(ns(x, df=2), ns(x, df=2, knots=NULL)),# not true till 2.15.2 ns+ !is.null(kk <- attr(ns(x), "knots")),# not true till 1.5.1 ns+ length(kk) == 0)

ns> ## End Don't show ns> ns> ns>

Note that if I follow the instruction to put embed.c at the end of g++ command, I will get an error.

mli@PhenomIIx6:~/Downloads/R-2.15.2/library/AdvancedR/embedding$ g++ -I/home/mli/Downloads/R-2.15.2/include -L/home/mli/Downloads/R-2.15.2/lib -lR embed.c
/tmp/cc7Vum5j.o: In function `main':
embed.c:(.text+0x1c): undefined reference to `Rf_initEmbeddedR'
embed.c:(.text+0x2b): undefined reference to `Rf_endEmbeddedR'
/tmp/cc7Vum5j.o: In function `doSplinesExample()':
embed.c:(.text+0x45): undefined reference to `Rf_mkString'
embed.c:(.text+0x52): undefined reference to `Rf_install'
embed.c:(.text+0x5d): undefined reference to `Rf_lang2'
embed.c:(.text+0x6d): undefined reference to `Rf_protect'
embed.c:(.text+0x74): undefined reference to `R_GlobalEnv'
embed.c:(.text+0x87): undefined reference to `R_tryEval'
embed.c:(.text+0x91): undefined reference to `Rf_unprotect'
embed.c:(.text+0x9b): undefined reference to `Rf_ScalarLogical'
embed.c:(.text+0xa8): undefined reference to `Rf_install'
embed.c:(.text+0xb3): undefined reference to `Rf_lang2'
embed.c:(.text+0xc3): undefined reference to `Rf_protect'
embed.c:(.text+0xcd): undefined reference to `Rf_install'
embed.c:(.text+0xdc): undefined reference to `CDR'
embed.c:(.text+0xe7): undefined reference to `SET_TAG'
embed.c:(.text+0xee): undefined reference to `R_GlobalEnv'
embed.c:(.text+0x102): undefined reference to `R_tryEval'
embed.c:(.text+0x10c): undefined reference to `Rf_unprotect'
embed.c:(.text+0x116): undefined reference to `Rf_mkString'
embed.c:(.text+0x123): undefined reference to `Rf_install'
embed.c:(.text+0x12e): undefined reference to `Rf_lang2'
embed.c:(.text+0x13e): undefined reference to `Rf_protect'
embed.c:(.text+0x145): undefined reference to `R_GlobalEnv'
embed.c:(.text+0x158): undefined reference to `R_tryEval'
embed.c:(.text+0x162): undefined reference to `Rf_unprotect'
collect2: ld returned 1 exit status