Rcpp

From 太極
Jump to navigation Jump to search

Misc

sudo apt-get install libblas-dev liblapack-dev
sudo apt-get install gfortran

Speed Comparison

  • A comparison of high-performance computing techniques in R. It compares Rcpp to an R looping operator (like mapply), a parallelized version of a looping operator (like mcmapply), explicit parallelization, via the parallel package or the ParallelR suite.
  • In the following example, C++ avoids the overhead of creating an intermediate object (eg vector of the same length as the original vector). The c++ uses an intermediate scalar. So C++ wins R over memory management in this case.
    # http://blog.mckuhn.de/2016/03/avoiding-unnecessary-memory-allocations.html 
    library(Rcpp)
    
    `%count<%` <- cppFunction('
    size_t count_less(NumericVector x, NumericVector y) {
      const size_t nx = x.size();
      const size_t ny = y.size();
      if (nx > 1 & ny > 1) stop("Only one parameter can be a vector!");
      size_t count = 0;
      if (nx == 1) {
        double c = x[0];
        for (int i = 0; i < ny; i++) count += c < y[i];
      } else {
        double c = y[0];
        for (int i = 0; i < nx; i++) count += x[i] < c;
      }
      return count;
    }
    ')
    
    set.seed(42)
    
    N <- 10^7
    v <- runif(N, 0, 10000)
    
    # Testing on my ODroid xu4 running ubuntu 15.10
    system.time(sum(v < 5000))
    #   user  system elapsed
    #  1.135   0.305   1.453
    system.time(v %count<% 5000)
    #   user  system elapsed
    #  0.535   0.000   0.540
  • Why R for data science – and not Python?
    library(Rcpp)
     
    bmi_R <- function(weight, height) {
      weight / (height * height)
    }
    bmi_R(80, 1.85) # body mass index of person with 80 kg and 185 cm
    ## [1] 23.37473
     
    cppFunction("
      float bmi_cpp(float weight, float height) {
        return weight / (height * height);
      }
    ")
    bmi_cpp(80, 1.85) # same with cpp function
    ## [1] 23.37473
  • Boost the speed of R calls from Rcpp

Use Rcpp in RStudio

RStudio makes it easy to use Rcpp package.

Open RStudio, click New File -> C++ File. It will create a C++ template on the RStudio editor

#include <Rcpp.h>
using namespace Rcpp;

// Below is a simple example of exporting a C++ function to R. You can
// source this function into an R session using the Rcpp::sourceCpp 
// function (or via the Source button on the editor toolbar)

// For more on using Rcpp click the Help button on the editor toolbar

// [[Rcpp::export]]
int timesTwo(int x) {
   return x * 2;
}

Now in R console, type

library(Rcpp)
sourceCpp("~/Downloads/timesTwo.cpp")
timesTwo(9)
# [1] 18

See more examples on http://adv-r.had.co.nz/Rcpp.html and Calculating a fuzzy kmeans membership matrix

If we wan to test Boost library, we can try it in RStudio. Consider the following example in stackoverflow.com.

// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/foreach.hpp>
#include <boost/math/special_functions/gamma.hpp>

#define foreach BOOST_FOREACH

using namespace boost::math;

//[[Rcpp::export]]
Rcpp::NumericVector boost_gamma( Rcpp::NumericVector x ) {
  foreach( double& elem, x ) {
    elem = boost::math::tgamma(elem);
  };

  return x;
}

Then the R console

boost_gamma(0:10 + 1)
#  [1]       1       1       2       6      24     120     720    5040   40320
# [10]  362880 3628800

identical( boost_gamma(0:10 + 1), factorial(0:10) )
# [1] TRUE

From the useR 2019 tutorial

useR 2019 Rcpp tutorial

evalCpp()

evalCpp("2 + 2")

set.seed(42)
evalCpp("Rcpp::rnorm(2)")

library(Rcpp)
evalCpp("std::numeric_limits<double>::max()")

cppFunction()

https://www.rdocumentation.org/packages/Rcpp/versions/1.0.1/topics/cppFunction

Note that the c/cpp function can be used as a regular R function without the need of .Call() function.

cppFunction("
int exampleCpp11() {
auto x = 10; // guesses type
return x;
}", plugins=c("cpp11"))
// exampleCpp11()

Rcpp::cppFunction("Rcpp::NumericVector f(int n) { 
  srand(1234); 
  Rcpp::NumericVector res(n); 
  for(int i=0; i<n; i++) res[i] = rand(); 
  return(res); 
}")
//  f(); different result on different platform

cppFunction("int f(int a, int b) { return(a + b); }")
// f(21, 21)

// fibonacci
cppFunction('int g(int n) {
if (n < 2) return(n);
return(g(n-1) + g(n-2)); }')
// sapply(0:10, g)

sourceCpp(): the actual workhorse behind evalCpp() and cppFunction()

https://www.rdocumentation.org/packages/Rcpp/versions/1.0.1/topics/sourceCpp

Create the following in a file called testrcpp.cpp and in R run Rcpp::sourceCpp("testrcpp.cpp"). Any exported functions in .cpp will be available in an R session.

#include <Rcpp.h>
using namespace Rcpp;
// This is a simple example of exporting a C++ function to R. You can
// source this function into an R session using the Rcpp::sourceCpp()
// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
return x * 2;
}
// [[Rcpp::export]]
int g(int n) {
if (n < 2) return(n);
return(g(n-1) + g(n-2)); 
}
// timesTwo(c(2, 5.3))
// sapply(0:10, g)

g++ command line options

  • header location via -I,
  • library location via -L,
  • library via -llibraryname
g++ -I/usr/include -c qnorm_rmath.cpp
g++ -o qnorm_rmath qnorm_rmath.o -L/usr/lib -lRmath

C++ and R types

http://dirk.eddelbuettel.com/papers/useR2019_rcpp_tutorial.pdf#page=56

5 Types of Vectors

In R everything is a vector

// Vector
// [[Rcpp::export]]
double getMax(NumericVector v) {
    int n = v.size(); // vectors are describing
    double m = v[0]; // initialize
    for (int i=0; i<n; i++) {
        if v[i] > m {
            Rcpp::Rcout << ”Now ” << m << std::endl;
            m = v[i];
        }
    }
    return(m);
}

// Matrix
// [[Rcpp::export]]
Rcpp::NumericVector colSums(Rcpp::NumericMatrix mat) {
  size_t cols = mat.cols();
  Rcpp::NumericVector res(cols);
  for (size_t i=0; i<cols; i++) {
    res[i] = sum(mat.column(i));
  }
  return(res);
}

STL vectors (it copies data)

cppFunction("double getMax2(std::vector<double> v) {
  int n = v.size(); // vectors are describing
  double m = v[0]; // initialize
  for (int i=0; i<n; i++) {
    if (v[i] > m) {
      m = v[i];
    }
  }
  return(m);
}")

getMax2(c(4,5,2))

Sugar functions

max()

Packages

  • Overview
    1. Rcpp.package.skeleton("mypackage"). This will create a cpp source file src/rcpp_hello_world.cpp.
    2. Run Rcpp::compileAttributes() which generates src/RcppExports.cpp and R/RcppExports.R (do not edit by hand)
    3. Edit DESCRIPTION
    4. optional: Makevars and Makevars.win
    5. NAMESPACE
    6. rcpp_hello_world.Rd
    7. mypackage-package.Rd
  • Writing a package that uses Rcpp vignette
  • Read mypackage/Read-and-delete-me
  • The new package can be installed by devtools::load_all() or install.package("mypackage", repos = NULL, type = "source")
  • From Advanced R. Before building the package, you’ll need to run Rcpp::compileAttributes(). This function scans the C++ files for Rcpp::export attributes and generates the code required to make the functions available in R. Re-run compileAttributes() whenever functions are added, removed, or have their signatures changed. This is done automatically by the devtools package and by Rstudio.
  • Examples

Armadillo

Machine Learning

Examples

Example 1. convolution example

First, Rcpp package should be installed (I am working on Linux system). Next we try one example shipped in Rcpp package.

PS. If R was not available in global environment (such as built by ourselves), we need to modify 'Makefile' file by replacing 'R' command with its complete path (4 places).

cd ~/R/x86_64-pc-linux-gnu-library/3.0/Rcpp/examples/ConvolveBenchmarks/
make
R

Then type the following in an R session to see how it works. Note that we don't need to issue library(Rcpp) in R.

dyn.load("convolve3_cpp.so")
x <- .Call("convolve3cpp", 1:3, 4:6)
x # 4 13 28 27 18

If we have our own cpp file, we need to use the following way to create dynamic loaded library file. Note that the character (grave accent) ` is not (single quote)'. If you mistakenly use ', it won't work.

export PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
export PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
R CMD SHLIB xxxx.cpp

Example 2. Use together with inline package

library(inline)
src <-'
 Rcpp::NumericVector xa(a);
 Rcpp::NumericVector xb(b);
 int n_xa = xa.size(), n_xb = xb.size();

 Rcpp::NumericVector xab(n_xa + n_xb - 1);
 for (int i = 0; i < n_xa; i++)
 for (int j = 0; j < n_xb; j++)
 xab[i + j] += xa[i] * xb[j];
 return xab;
'
fun <- cxxfunction(signature(a = "numeric", b = "numeric"),
 src, plugin = "Rcpp")
fun(1:3, 1:4) 
# [1]  1  4 10 16 17 12
  • From Section 1.3.3 C++ Solution of the book 'Seamless R and C++ Integration with Rcpp'.
a <- matrix(c(0.5, 0.1, 0.1, 0.5), nrow = 2)
u <- matrix(rnorm(10000), ncol=2)
code <- '
  arma::mat coeff = Rcpp::as<arma::mat> (a);
  arma::mat errors = Rcpp::as<arma::mat>(u);
  int m = errors.n_rows;
  int n = errors.n_cols;
  arma::mat simdata(m, n);
  simdata.row(0) = arma::zeros<arma::mat>(1, n);
  for (int row=1; row<m; row++) {
    simdata.row(row) = simdata.row(row-1)*trans(coeff) +
                       errors.row(row);
  }
  return Rcpp::wrap(simdata);
'
rcppSim <- inline::cxxfunction(signature(a = "numeric", u = "numeric"),
                       code, plugin = "RcppArmadillo")
rcppData <- rcppSim(a, u)

Example 3. Calling Rmath functions, random number generation

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector mypnorm(NumericVector x) {
  int n = x.size();
  NumericVector y(n);
  for (int i=0; i<n; i++)
    y[i] = R::pnorm(x[i], 0.0, 1.0, 1, 0);
  return y;
}
sourceCpp("code/using-rmath-rng.cpp")
x <- seq(-2, 2)
mypnorm(x)
pnorm(x)
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix rngCpp(const int N) {
  NumericMatrix X(N, 4);
  X(_, 0) = runif(N);
  X(_, 1) = rnorm(N);
  X(_, 2) = rt(N, 5);
  X(_, 3) = rbeta(N, 1, 1);
  return X;
}
set.seed(42)     # setting seed
 M1 <- rngCpp(5)
 M1
set.seed(42)	  # resetting seed
 M2 <- cbind( runif(5), rnorm(5), rt(5, 5), rbeta(5, 1, 1))
 M2
all.equal(M1, M2)

Rcpp sugar

Rcpp syntactic sugar

The motivation of Rcpp sugar is to bring a subset of the high-level R syntax in C++. For example, instead of writing

RcppExport SEXP foo( SEXP x, SEXP y){
  Rcpp::NumericVector xx(x), yy(y) ;
  int n = xx.size() ;
  Rcpp::NumericVector res( n ) ;
  double x_ = 0.0, y_ = 0.0 ;
  for( int i=0; i<n; i++){
    x_ = xx[i] ;
    y_ = yy[i] ;
    if( x_ < y_ ){
      res[i] = x_ * x_ ;
    } else {
      res[i] = -( y_ * y_) ;
    }
  }
  return res ;
}

, we can use

RcppExport SEXP foo( SEXP xs, SEXP ys){
  Rcpp::NumericVector x(xs) ;
  Rcpp::NumericVector y(ys) ;
  return Rcpp::wrap( ifelse( x < y, x*x, -(y*y) )) ;
}

Operators

  • Binary arithmetic operators,
  • Binary logical operators,
  • Unary operators

Functions

  • Functions producing a single logical result
  • Functions producing sugar expressions
  • Mathematical functions
  • The d/q/p/r statistical functions

Implementation

  • The curiously recurring template pattern
  • The VectorBase class
  • Example : sapply

RcppParallel

rTRNG: parallel RNG in R

rTRNG: ADVANCED PARALLEL RNG IN R