R web

R Web Applications

See also CRAN Task View: Web Technologies and Services

Rmarkdown: create HTML5 web, slides and more

Rmarkdown

HTTP protocol

An HTTP server is conceptually simple:

  1. Open port 80 for listening
  2. When contact is made, gather a little information (mainly the GET request line; you can ignore the rest for now)
  3. Translate the request into a file request
  4. Open the file and spit it back at the client

It gets more difficult depending on how much of HTTP you want to support - POST is a little more complicated, scripts, handling multiple requests, etc.

Example in R

> co <- socketConnection(port=8080, server=TRUE, blocking=TRUE) 
> # Now open a web browser and type http://localhost:8080/index.html
> readLines(co,1)
[1] "GET /index.html HTTP/1.1"
> readLines(co,1)
[1] "Host: localhost:8080"
> readLines(co,1)
[1] "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0"
> readLines(co,1)
[1] "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
> readLines(co,1)
[1] "Accept-Language: en-US,en;q=0.5"
> readLines(co,1)
[1] "Accept-Encoding: gzip, deflate"
> readLines(co,1)
[1] "Connection: keep-alive"
> readLines(co,1)
[1] ""
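
The four steps listed above can be sketched in base R as follows. This is only a minimal illustration, not a production server: the helper names parse_request_line(), build_response() and serve_once() are made up for this example, it handles a single GET request and then closes the port, and it only serves files that sit directly in the chosen root directory.

```r
# Parse an HTTP request line such as "GET /index.html HTTP/1.1".
parse_request_line <- function(line) {
  parts <- strsplit(line, " ", fixed = TRUE)[[1]]
  list(method = parts[1], path = parts[2], version = parts[3])
}

# Build a complete HTTP/1.1 response: status line, headers, blank line, body.
build_response <- function(body, status = "200 OK", type = "text/html") {
  paste0("HTTP/1.1 ", status, "\r\n",
         "Content-Type: ", type, "\r\n",
         "Content-Length: ", nchar(body, type = "bytes"), "\r\n",
         "Connection: close\r\n",
         "\r\n",
         body)
}

# Steps 1-4: listen, read one request, map it to a file, send the file back.
# (hypothetical helper; blocks until a client connects)
serve_once <- function(port = 8080, root = getwd()) {
  con <- socketConnection(port = port, server = TRUE, blocking = TRUE, open = "r+")
  on.exit(close(con))  # release the port when done

  line <- readLines(con, n = 1)
  if (length(line) == 0) return(invisible(NULL))  # client went away
  request <- parse_request_line(line)

  # Ignore the remaining request headers, up to the blank line
  repeat {
    header <- readLines(con, n = 1)
    if (length(header) == 0 || header == "") break
  }

  # basename() drops any directory part, so only files in `root` are served
  file <- file.path(root, basename(request$path))
  if (identical(request$method, "GET") && file.exists(file) && !dir.exists(file)) {
    body <- paste(readLines(file, warn = FALSE), collapse = "\n")
    response <- build_response(body)
  } else {
    response <- build_response("<h1>404 Not Found</h1>", status = "404 Not Found")
  }
  cat(response, file = con, sep = "")
  invisible(request)
}
```

Calling serve_once(8080) and then opening http://localhost:8080/index.html in a browser should return index.html from the current directory, or a 404 page if the file does not exist.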

Example in C (Very simple http server written in C, 187 lines)

Create a simple hello world html page and save it as <index.html> in the current directory (/home/brb/Downloads/)

Launch the server program (assume we have done gcc http_server.c -o http_server)

$ ./http_server -p 50002
Server started at port no. 50002 with root directory as /home/brb/Downloads

Next, open a browser and type http://localhost:50002/index.html. The server prints each request it receives, followed by the file it serves:

GET /index.html HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/index.html
GET /favicon.ico HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico
GET /favicon.ico HTTP/1.1
Host: localhost:50003
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico

The browser then shows the <index.html> page sent by the server.

The only bad thing is that the code does not close the port. For example, if I use Ctrl+C to stop the program and then try to re-launch it with the same port, it complains socket() or bind(): Address already in use.

Another Example in C (55 lines)

http://mwaidyanatha.blogspot.com/2011/05/writing-simple-web-server-in-c.html

The response is embedded in the C code.

If we test the server program by opening a browser and typing "http://localhost:15000/", the server receives the following 7 lines

GET / HTTP/1.1
Host: localhost:15000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

If we include a non-executable file's name in the url, we will be able to download that file. Try "http://localhost:15000/client.c".

If we use the telnet program to test, we can type anything we want

$ telnet localhost 15000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
ThisCanBeAnything        <=== This is what I typed in the client and it is also shown on server
HTTP/1.1 200 OK          <=== From here is what I got from server
Content-length: 37Content-Type: text/html

HTML_DATA_HERE_AS_YOU_MENTIONED_ABOVE <=== The html tags are not passed from server, interesting!
Connection closed by foreign host.
$ 
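
The same test can be run from R instead of telnet. The sketch below assumes the C server from this section is listening on localhost:15000 and closes the connection after replying; build_get_request() and raw_http_get() are hypothetical helper names, not part of any package.

```r
# Build a minimal HTTP/1.1 GET request as a single string.
build_get_request <- function(path = "/", host = "localhost") {
  paste0("GET ", path, " HTTP/1.1\r\n",
         "Host: ", host, "\r\n",
         "Connection: close\r\n",
         "\r\n")
}

# Send the request over a raw socket and return the response lines.
# readLines() returns once the server closes the connection.
raw_http_get <- function(host = "localhost", port = 15000, path = "/") {
  con <- socketConnection(host = host, port = port, blocking = TRUE,
                          open = "r+", timeout = 10)
  on.exit(close(con))
  cat(build_get_request(path, paste0(host, ":", port)), file = con, sep = "")
  readLines(con, warn = FALSE)
}
```

With the server running, raw_http_get() should return the status line, the headers and the embedded body as a character vector.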

See more examples on the C page.

Others

shiny

See Shiny.

plumber: Turning your R code into a RESTful Web API

  1. Too heavy to install
  2. I get a dependency error when I try it on Ubuntu 16.04
  3. Check out the servr package

Docker

httpuv and servr

httpuv is more low-level and flexible, while servr is higher-level and easier to use for specific tasks.

See also the servr package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.

R built in Web server, servr::httw() to serve a local directory as a website

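# In an R session: serve DIRECTORY and watch it for changes
servr::httw("DIRECTORY")

# From a shell: serve the current directory (default port 4321)
Rscript -e "servr::httd()"
# If index.html is found it will be used
# Otherwise, it will serve the files under the current directory

# In another terminal: serve /tmp on port 4000
Rscript -e "servr::httd('/tmp')" -p4000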

Question: Why not just open an html file in a browser? Answer: While opening an HTML file directly in a browser can be fine for simple, static pages, using a local server like `servr` provides a more accurate testing environment for more complex websites:

  • Links: Plain relative links usually still work when a file is opened directly, but root-relative links (those starting with "/", such as /css/style.css) are resolved against the root of the file system and break. The `servr` package serves the whole directory as a website, so the directory becomes the site root and these links resolve as they would on the deployed site.
  • Dynamic content: Browsers restrict what a page loaded from file:// can do. For example, JavaScript that fetches data files from the same directory is typically blocked. Serving the page over http://localhost with `servr` lets you test such features locally before deploying your site.
  • Mimic production environment: Using `servr` lets you view the site over HTTP, as visitors will, which can help catch issues that are not apparent when simply opening an HTML file in a browser. Note that `servr` serves static files (and can rebuild dynamic documents such as R Markdown); it does not execute server-side scripts such as PHP or ASP.NET, so sites that depend on them still need the corresponding server for a full test.

beakr

workflowr

httr2

httr2 package - Perform HTTP Requests and Process the Responses.

httptest2

opencpu

RApache

gWidgetsWWW

Rook

See Rook.

sumo

Sumo is a fully-functional web application template that exposes an authenticated user's R session within java server pages. See the paper http://journal.r-project.org/archive/2012-1/RJournal_2012-1_Bergsma+Smith.pdf.

Stockplot

FastRWeb

http://cran.r-project.org/web/packages/FastRWeb/index.html

WebDriver

'WebDriver' Client for 'PhantomJS'

https://github.com/rstudio/webdriver

Rwui

CGHWithR and WebDevelopR

CGHwithR still works with old versions of R although it has been removed from CRAN. Its successor is WebDevelopR. The vignette (2013) provides a review of several available methods.

manipulate from RStudio

This is not a web application, but the manipulate package can be used to create interactive plots within the R(Studio) environment easily. Its source is available here.

Mathematica also has manipulate function for plotting; see here.

RCloud

RCloud is an environment for collaboratively creating and sharing data analysis scripts. RCloud lets you mix analysis code in R, HTML5, Markdown, Python, and others. Much like Sage, iPython notebooks and Mathematica, RCloud provides a notebook interface that lets you easily record a session and annotate it with text, equations, and supporting images.

See also the Talk in UseR 2014.

cloudyr and flyio - Input Output Files in R from Cloud or Local

Announcing flyio, an R Package to Interact with Data in the Cloud: https://blog.socialcops.com/inside-sc/announcements/flyio-r-package-interact-data-cloud/

Dropbox access

rdrop2 package

Javascript

JavaScript for R — ebook https://book.javascript-for-r.com/

Sketch

Sketch Package looks to add JavaScript to R packages

Web page scraping

http://www.slideshare.net/schamber/web-data-from-r#btnNext

xml2 package

rvest package depends on xml2.

rvest, extract tables

Reading Remote Data Files: rvest

Reading Remote Data Files

V8: Embedded JavaScript Engine for R

R⁶ — General (Attys) Distributions: V8, rvest, ggbeeswarm, hrbrthemes and tidyverse packages are used.

pubmed.mineR

Text mining of PubMed Abstracts (http://www.ncbi.nlm.nih.gov/pubmed). The algorithms are designed for two formats (text and XML) from PubMed.

R code for scraping the P-values from pubmed, calculating the Science-wise False Discovery Rate, et al (Jeff Leek)

Get API data

How to get API data with R. See how to write your own R code to pull data from an API using API key authentication.
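
As a rough sketch of what API key authentication can look like with the httr2 package mentioned above: the base URL, the endpoint, the X-API-Key header name and the helper name build_api_request() are all placeholders invented for this example. Each real API documents its own URL scheme and how it expects the key to be passed.

```r
# Build (but do not send) a GET request that carries an API key in a header.
# Everything about the target API here is hypothetical.
build_api_request <- function(base_url, endpoint, api_key, ...) {
  req <- httr2::request(base_url)
  req <- httr2::req_url_path_append(req, endpoint)
  req <- httr2::req_url_query(req, ...)                 # query parameters
  req <- httr2::req_headers(req, `X-API-Key` = api_key) # header name is an assumption
  req
}

# Usage (not run; needs a real API and key):
# req  <- build_api_request("https://api.example.com", "v1/items",
#                           api_key = Sys.getenv("MY_API_KEY"), q = "r")
# resp <- httr2::req_perform(req)
# data <- httr2::resp_body_json(resp)
```

Reading the key from an environment variable, as in the usage lines, keeps it out of the script itself.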

webshot2 - take a screenshot of web page

webshot2. It uses headless Chrome via the Chromote package. You also need to have the Chrome browser installed on your system. You can also use other browsers based on Chromium, such as Chromium itself, Edge, Vivaldi, Brave, or Opera.

These R packages import sports, weather, stock data and more

Diving Into Dynamic Website Content with splashr

https://rud.is/b/2017/02/09/diving-into-dynamic-website-content-with-splashr/

Network

netstat

https://cran.r-project.org/web/packages/netstat/index.html

Send email

curl

Note

  • As we can see in the example snippet, the sender's and recipient's emails each appear twice: once as arguments to send_mail() and once inside the message string.
  • It seems the recipients and sender arguments are the real ones.
  • In the message string, we can skip the 'From' and 'To' lines.
    • The sender's email entered in the message is not used for delivery
    • The recipient's email entered in the message is not used for delivery
  • Experiment: recipients & sender are correct but the two email addresses in the message are wrong.
    • Answer: I can still receive the email.
    • (In the recipient's inbox) the email address in the 'From' part is replaced with the correct one, but the name (e.g. "R (curl package)") is the one we used in the message
    • (In the recipient's inbox) the email address & the name in the 'To' part are the ones we used in the message (interesting)
  • Question: does the sender's email have to be gmail.com?

library(curl)
recipients <- "[email protected]"
sender <- '[email protected]'

# Full email message in RFC2822 format
message <- 'From: "R (curl package)" <[email protected]>
To: "Roger Recipient" <[email protected]>
Subject: Hello R user!

Dear R user,

I am sending this email using curl.'

# Send the email
send_mail(sender, recipients, message, smtp_server = 'smtps://smtp.gmail.com',
  username = 'curlpackage', password  = 'qyyjddvphjsrbnlm')
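
One way to avoid typing the addresses twice is to generate the message headers from the same variables that are passed to send_mail(). The helper below is only a sketch; build_message() is a made-up name and the example.com addresses are placeholders.

```r
# Build the message text so that its From/To headers match the
# sender and recipients passed to curl::send_mail().
build_message <- function(sender, recipients, subject, body, sender_name = NULL) {
  from <- if (is.null(sender_name)) sender else sprintf('"%s" <%s>', sender_name, sender)
  paste(
    paste0("From: ", from),
    paste0("To: ", paste(recipients, collapse = ", ")),
    paste0("Subject: ", subject),
    "",          # blank line separates headers from body
    body,
    sep = "\n"
  )
}

sender     <- "sender@example.com"      # placeholder address
recipients <- "recipient@example.com"   # placeholder address
msg <- build_message(sender, recipients, "Hello R user!",
                     "Dear R user,\n\nI am sending this email using curl.",
                     sender_name = "R (curl package)")
cat(msg)

# Not run: needs real credentials
# curl::send_mail(sender, recipients, msg, smtp_server = 'smtps://smtp.gmail.com',
#   username = '...', password = '...')
```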

emayili

  • https://cran.r-project.org/web/packages/emayili/index.html
  • emayili: Sending Email from R. The post also shows how to send emails with the command-line tool curl if we don't want to use R to do the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part.
  • You can add Cc, Bcc and Reply-To header fields using the cc(), bcc() and reply() methods. Files can be attached using the attachment() method.
  • Note that if the email in envelope(From) is different from the one in server(username), I see that the email in envelope(From) is ignored. So in the following simplified example I skip envelope(From).
  • If I change the 'to' email to a yahoo account, it works but the email went to the Trash folder.
  • Rendering R Markdown
    library(emayili)
    
    email <- envelope(
      to = "[email protected]",
      subject = "This is a plain text message!",
      text = "Hello!\nHello2"
    )
    smtp <- server(host = "smtp.gmail.com",
                   port = 465,
                   username = "[email protected]",
                   password = "bd40ef6d4a9413de9c1318a65cbae5d7")
    smtp(email, verbose = TRUE)
    
  • Creating Email Threads

blastula (RStudio)

mailR

Easiest. Requires the rJava package (not trivial to install, see rJava). mailR is an interface to Apache Commons Email for sending emails from within R. See also send bulk email

Before using the mailR package with Gmail, we followed the instructions here to turn 'Allow less secure apps' ON; otherwise you might get the error Error: EmailException (Java): Sending the email to the following server failed : smtp.gmail.com:465. After turning on this option, we may get an email notifying us of the change. Note that the recipient does not have to use Gmail.

> send.mail(from = "[email protected]",
  to = c("[email protected]", "Recipient 2 <[email protected]>"),
  replyTo = c("Reply to someone else <[email protected]>"),
  subject = "Subject of the email",
  body = "Body of the email",
  smtp = list(host.name = "smtp.gmail.com", port = 465, user.name = "gmail_username", passwd = "password", ssl = TRUE),
  attach.files ="./myattachment.txt",
  authenticate = TRUE,
  send = TRUE)
[1] "Java-Object{org.apache.commons.mail.SimpleEmail@7791a895}"

gmailr

More complicated. gmailr provides access to Google's Gmail RESTful API. See the vignette and an example here. Note that it does not use a password; it uses a JSON file for OAuth authentication downloaded from https://console.cloud.google.com/. See also https://github.com/jimhester/gmailr/issues/1.

library(gmailr)
gmail_auth('mysecret.json', scope = 'compose') 

test_email <- mime() %>%
  to("[email protected]") %>%
  from("[email protected]") %>%
  subject("This is a subject") %>%
  html_body("<html><body>I wish this was bold</body></html>")
send_message(test_email)

sendmailR

sendmailR provides a simple SMTP client. It is not clear how to use the package (i.e. where to enter the password).

json

jsonlite

rjson

http://heuristically.wordpress.com/2013/05/20/geolocate-ip-addresses-in-r/

RJSONIO

Accessing Bitcoin Data with R

http://blog.revolutionanalytics.com/2015/11/accessing-bitcoin-data-with-r.html

Plot IP on google map

The following example is modified from the geolocation post linked under rjson above.

require(RJSONIO) # fromJSON
require(RCurl)   # getURL

temp = getURL("https://gist.github.com/arraytools/6743826/raw/23c8b0bc4b8f0d1bfe1c2fad985ca2e091aeb916/ip.txt", 
                           ssl.verifypeer = FALSE)
ip <- read.table(textConnection(temp), as.is=TRUE)
names(ip) <- "IP"
nr = nrow(ip)
 
Lon <- as.numeric(rep(NA, nr))
Lat <- Lon
Coords <- data.frame(Lon, Lat)
 
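# Look up one IP address via the freegeoip.net JSON API
# (the service used when this was written; it may no longer be available)
ip2coordinates <- function(ip) {
  api <- "http://freegeoip.net/json/"
  get.ips <- getURL(paste0(api, URLencode(ip)))
  # result <- ldply(fromJSON(get.ips), data.frame)
  result <- data.frame(fromJSON(get.ips))
  names(result)[1] <- "ip.address"
  result
}

# Fill in longitude/latitude row by row; failed lookups stay NA
for (i in seq_len(nr)) {
  cat(i, "\n")
  try(
    Coords[i, 1:2] <- ip2coordinates(ip$IP[i])[c("longitude", "latitude")]
  )
}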
 
# append to log-file:
logfile <- data.frame(ip, Lat = Coords$Lat, Long = Coords$Lon,
                                       LatLong = paste(round(Coords$Lat, 1), round(Coords$Lon, 1), sep = ":")) 
log_gmap <- logfile[!is.na(logfile$Lat), ]

require(googleVis) # gvisMap
gmap <- gvisMap(log_gmap, "LatLong",
                options = list(showTip = TRUE, enableScrollWheel = TRUE,
                               mapType = 'hybrid', useMapTypeControl = TRUE,
                               width = 1024, height = 800))
plot(gmap)

File:GoogleVis.png

The plot.gvis() method in the googleVis package is also a good way to learn about the startDynamicHelp() function in the tools package, which it uses to launch an HTTP server. See Jeffrey Horner's note about deploying Rook App.

Convert JSON to CSV using Linux shell

How to convert JSON to CSV using Linux / Unix shell

GEO (Gene Expression Omnibus)

See this internal link.

Interactive html output

webr

sendplot

RIGHT

The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.

In addition to tooltip boxes, the package can create a table showing all information about selected nodes.

r2d3

r2d3 - R Interface to D3 Visualizations

d3Network

library(d3Network)

Source <- c("A", "A", "A", "A", "B", "B", "C", "C", "D") 
Target <- c("B", "C", "D", "J", "E", "F", "G", "H", "I") 
NetworkData <- data.frame(Source, Target) 

d3SimpleNetwork(NetworkData, height = 800, width = 1024, file="tmp.html")

htmlwidgets for R

Embed widgets in R Markdown documents and Shiny web applications.

igraph

visNetwork

networkD3

plotly

library(igraph)
library(ggplot2)
library(plotly)

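# Create a data frame that represents edges.
# graph_from_data_frame() treats the first two columns as an edge list,
# so this gives 3 edges (Alice-48, Bob-33, Cecil-45) and 6 vertices
dat <- data.frame(name=c("Alice", "Bob", "Cecil"), age=c(48,33,45))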


# Step 1: Create an igraph object
g <- graph_from_data_frame(dat, directed = FALSE)

# Get the layout of the graph
layout <- layout_nicely(g)

# Create a data frame for the vertices
vertices <- data.frame(id = V(g)$name, x = layout[,1], y = layout[,2],
                       var1=V(g)$name, var2=LETTERS[1:6])

# Get the edges and convert vertex names to coordinates
edges <- get.data.frame(g, what = "edges")
edges <- merge(edges, vertices, by.x = "from", by.y = "id")
edges <- merge(edges, vertices, by.x = "to", by.y = "id", suffixes = c(".from", ".to"))

# Step 2: Create a ggplot object
p <- ggplot(vertices, aes(x = x, y = y)) +
  geom_segment(data = edges, aes(x = x.from, y = y.from, xend = x.to, yend = y.to)) +
  geom_point(aes(text = paste("Var1:", var1, "\nVar2:", var2))) +
  geom_text(aes(x = x, y = y, label = id), vjust = 1, hjust = 0, nudge_x=.2) +
  expand_limits(x = c(-2, 1.5)) + 
  theme(
    axis.line = element_blank(),  # Hide axis lines
    axis.text = element_blank(),  # Hide axis text
    axis.ticks = element_blank(),  # Hide axis ticks
    axis.title = element_blank(),  # Hide axis labels
    panel.grid.major = element_blank(),  # Hide major grid
    panel.grid.minor = element_blank(),  # Hide minor grid
    panel.background = element_rect(fill = "white")  # Set background to white
  )


# Print the plot
print(p)

# Step 3: Convert the ggplot object to a plotly object
#       Tooltip works only on 'points', not on labels.
ggplotly(p, tooltip = "text")

ggiraph

Convert ggraph to interactive plot with plotly or networkD3

scatterD3

scatterD3 is an HTML R widget for interactive scatter plots visualization. It is based on the htmlwidgets R package and on the d3.js javascript library.

dygraphs

rthreejs - Create interactive 3D scatter plots, network plots, and globes

Examples

rayshader: 2D and 3D mapping and data visualization with shades

https://github.com/tylermorganwall/rayshader

On RStudio Server, we need options(rgl.printRglwidget = TRUE); see Why is my 3D plot not showing up in R Studio plot viewer?.

d3heatmap

See R

collapsibleTree

svgPanZoom

This 'htmlwidget' provides pan and zoom interactivity to R graphics, including 'base', 'lattice', and 'ggplot2'. The interactivity is provided through the 'svg-pan-zoom.js' library.

DT: An R interface to the DataTables library

reactable: interactive table with rows that expand when clicked

How to create tables in R with expandable rows

getable: creating a 'dynamic' HTML table

Getting Tabular Data Through JavaScript in Compiled R Markdown Documents.

The content stays static while the data could be updated independently without rewriting or recompiling the HTML document. This could be done by utilizing JavaScript’s ability to asynchronously fetch data from the web and generate DOM elements based on these data.

plotly

# b is an existing ggplot object
p <- plotly::ggplotly(b)
# Save the interactive plot as an HTML file
htmlwidgets::saveWidget(p, "index.html")

highcharter: alternative to plotly

https://cran.r-project.org/web/packages/highcharter/ Good for time series plot.

Amazon

Download product information and reviews from Amazon.com (2016; get_reviews() no longer works as of 6/7/2023)

sudo apt-get install libxml2-dev
sudo apt-get install libcurl4-openssl-dev

and in R

install.packages("devtools")
install.packages("XML")
install.packages("pbapply")
install.packages("dplyr")
devtools::install_github("56north/Rmazon")
product_info <- Rmazon::get_product_info("1593273843")
reviews <- Rmazon::get_reviews("1593273843")
reviews[1,6] # only show partial characters from the 1st review
nchar(reviews[1,6])
as.character(reviews[1,6]) # show the complete text from the 1st review

reviews <- Rmazon::get_reviews("B07BNGJXGS")
# Fetching 30 reviews of 'BOOX Note Ereader,Android 6.0 32 GB 10.3" Dual Touch HD Display'
#   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 02s
reviews
# A tibble: 30 x 6
   reviewRating reviewDate reviewFormat Verified_Purcha… reviewHeadline
          <dbl> <chr>      <lgl>        <lgl>            <chr>         
 1            4 May 23, 2… NA           TRUE             Good for PDF …
 2            3 May 8, 20… NA           FALSE            The reading s…
 3            5 May 17, 2… NA           TRUE             E-reader and …
 4            3 May 24, 2… NA           TRUE             Good hardware…
 5            3 June 21, … NA           TRUE             Poor QC       
 6            5 August 5,… NA           TRUE             Excellent for…
 7            5 May 31, 2… NA           TRUE             Especially li…
 8            5 July 4, 2… NA           TRUE             Android 6 rea…
 9            4 July 15, … NA           TRUE             Remember the …
10            4 June 9, 2… NA           TRUE             Overall fanta…
# ... with 20 more rows, and 1 more variable: reviewText <chr>
reviews[1, 6] # 6-th column is the review text

Vignette: Scraping Amazon Reviews in R 2019

library(rvest)
library(dplyr)

url <- "https://www.amazon.com/dp/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
page <- read_html(url)

reviews <- page %>%
  html_nodes(".review") %>%
  html_text()

titles <- page %>%
  html_nodes(".review-title-content") %>%
  html_text()

ratings <- page %>%
  html_nodes(".review-rating") %>%
  html_text()

df <- data.frame(reviews, titles, ratings)

Scraping Amazon Customer Reviews (almost). The url is obtained by clicking "See all reviews", like this one.

Google Bard answers: first 10 reviews.

library(rvest)

# Get the product ASIN code
asin <- "B07HSKPDBV"

# Create a URL for the product reviews page
url <- paste0("https://www.amazon.com/product-reviews/", asin)

# Read the HTML content of the product reviews page
html <- read_html(url)

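# Extract the title, body, and rating of each review.
# The selectors are the ones used in the example above; Amazon may change
# them or block automated requests.
nodes <- html %>% html_nodes(".review")
reviews <- data.frame(
  title  = nodes %>% html_node(".review-title-content") %>% html_text(trim = TRUE),
  body   = nodes %>% html_node(".review-text") %>% html_text(trim = TRUE),
  rating = nodes %>% html_node(".review-rating") %>% html_text(trim = TRUE)
)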

# Print the first 5 reviews
head(reviews, 5)

All reviews.

library(rvest)

# Get the product ASIN code
asin <- "B07HSKPDBV"

# Scrape the title, body, and rating of every review on a single page.
# The selectors are the ones used in the vignette example above.
scrape_reviews <- function(url) {
  nodes <- read_html(url) %>% html_nodes(".review")
  data.frame(
    title  = nodes %>% html_node(".review-title-content") %>% html_text(trim = TRUE),
    body   = nodes %>% html_node(".review-text") %>% html_text(trim = TRUE),
    rating = nodes %>% html_node(".review-rating") %>% html_text(trim = TRUE)
  )
}

# Create a vector of URLs for the first 10 review pages (pageNumber = 1, ..., 10)
urls <- paste0("https://www.amazon.com/product-reviews/", asin,
               "?pageNumber=", 1:10)

# Scrape the reviews for all of the pages and stack them into one data frame
reviews <- do.call(rbind, lapply(urls, scrape_reviews))

# Print the first 5 reviews
head(reviews, 5)

gutenbergr

Edinbr: Text Mining with R

Feed

feedeR - Feed Reader Package for R

File sharing

{filebin} Quick & Easy File Sharing

Twitter

Faces of #rstats Twitter

OCR

Online

Wikipedia

WikipediR: R's MediaWiki API client library