Reproducible: Difference between revisions

From 太極
Line 10:
= Misc =
* digest: Create Compact Hash Digests of R Objects
* memoise: Memoisation of Functions
* [https://cran.r-project.org/web/packages/memoise/index.html memoise]: [https://www.rdocumentation.org/packages/memoise/versions/1.1.0/topics/memoise Memoisation of Functions]. You need to understand how it works in order to take advantage of it. I modified the example from [https://csgillespie.github.io/efficientR/caching-variables.html Efficient R] by moving the data out of the function and passing it in as an argument. The cache takes effect on the 2nd call with the same arguments. I don't use the benchmark() function since it repeats the same call many times, which favors memoise and masks the cost of the first, uncached call. <syntaxhighlight lang='rsplus'>
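library(ggplot2) # provides the mpg data set
library(memoise)
# The data frame is an argument, so it becomes part of the cache key
plot_mpg2 <- function(mpgdf, row_to_remove) {
  mpgdf <- mpgdf[-row_to_remove,]
  plot(mpgdf$cty, mpgdf$hwy)
  lines(lowess(mpgdf$cty, mpgdf$hwy), col=2)
}
m_plot_mpg2 <- memoise(plot_mpg2)
# 1st memoised call: cache miss, about as slow as the plain function
system.time(m_plot_mpg2(mpg, 12))
#  user  system elapsed
#  0.019  0.003  0.025
# Plain function, for comparison
system.time(plot_mpg2(mpg, 12))
#  user  system elapsed
#  0.018  0.003  0.024
# 2nd memoised call with the same arguments: cache hit.
# The function body is not re-run, so the plot is not redrawn.
system.time(m_plot_mpg2(mpg, 12))
#  user  system elapsed
#  0.000  0.000  0.001
# The plain function does the full work every time
system.time(plot_mpg2(mpg, 12))
#  user  system elapsed
#  0.032  0.008  0.047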
</syntaxhighlight>
* [https://cran.rstudio.com/web/packages/reproducible/index.html reproducible]: A Set of Tools that Enhance Reproducibility Beyond Package Management

Revision as of 11:14, 1 July 2019

Rmarkdown

Rmarkdown package

packrat

R packages → packrat

Docker & Singularity

Docker

Misc

  • digest: Create Compact Hash Digests of R Objects
  • memoise: Memoisation of Functions. You need to understand how it works in order to take advantage of it. I modified the example from Efficient R by moving the data out of the function and passing it in as an argument. The cache takes effect on the 2nd call with the same arguments. I don't use the benchmark() function since it repeats the same call many times, which favors memoise and masks the cost of the first, uncached call.
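    library(ggplot2) # provides the mpg data set
    library(memoise)
    # The data frame is an argument, so it becomes part of the cache key
    plot_mpg2 <- function(mpgdf, row_to_remove) {
      mpgdf <- mpgdf[-row_to_remove,]
      plot(mpgdf$cty, mpgdf$hwy)
      lines(lowess(mpgdf$cty, mpgdf$hwy), col=2)
    }
    m_plot_mpg2 <- memoise(plot_mpg2)
    # 1st memoised call: cache miss, about as slow as the plain function
    system.time(m_plot_mpg2(mpg, 12))
    #   user  system elapsed
    #  0.019   0.003   0.025
    # Plain function, for comparison
    system.time(plot_mpg2(mpg, 12))
    #   user  system elapsed
    #  0.018   0.003   0.024
    # 2nd memoised call with the same arguments: cache hit.
    # The function body is not re-run, so the plot is not redrawn.
    system.time(m_plot_mpg2(mpg, 12))
    #   user  system elapsed
    #  0.000   0.000   0.001
    # The plain function does the full work every time
    system.time(plot_mpg2(mpg, 12))
    #   user  system elapsed
    #  0.032   0.008   0.047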
  • reproducible: A Set of Tools that Enhance Reproducibility Beyond Package Management
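
The memoise note above says the cache only pays off from the second call on. Timings can be noisy, so here is a minimal sketch that counts how often the function body really runs instead. The names slow_square, m_slow_square and n_runs are hypothetical and not part of the Efficient R example; memoise() and forget() are the only package functions used.

```r
library(memoise)

# Hypothetical counter: incremented each time the function body actually runs
n_runs <- 0
slow_square <- function(x) {
  n_runs <<- n_runs + 1  # side effect, happens only on a cache miss
  x^2
}
m_slow_square <- memoise(slow_square)

m_slow_square(4)  # cache miss: body runs
m_slow_square(4)  # same argument: cache hit, body is skipped
m_slow_square(5)  # new argument: cache miss
n_runs            # 2

forget(m_slow_square)  # clear the cache of the memoised function
m_slow_square(4)       # body runs again
n_runs                 # 3
```

The counter also shows why side effects (such as drawing a plot) do not happen on a cache hit: only the stored return value comes back.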