R web: Difference between revisions

From 太極
 
(42 intermediate revisions by the same user not shown)
Line 167: Line 167:


= [http://cran.r-project.org/web/packages/httpuv/index.html httpuv] and servr =
= [http://cran.r-project.org/web/packages/httpuv/index.html httpuv] and servr =
http and WebSocket library.
'''httpuv''' is more low-level and flexible, while '''servr''' is higher-level and easier to use for specific tasks.


See also the [https://cran.r-project.org/web/packages/servr/index.html servr] package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.
See also the [https://cran.r-project.org/web/packages/servr/index.html servr] package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.


[https://stackoverflow.com/a/59691029 R built in Web server]  
[https://stackoverflow.com/a/59691029 R built in Web server], [https://www.r-bloggers.com/2023/09/3-r-functions-that-i-enjoy/ servr::httw() to serve a local directory as a website]  


<pre>
servr::httw("DIRECTORY")
</pre>
<pre>
<pre>
Rscript -e "servr::httd()" # default port 4321
Rscript -e "servr::httd()" # default port 4321
Line 181: Line 184:
Rscript -e "servr::httd('/tmp')" -p4000
Rscript -e "servr::httd('/tmp')" -p4000
</pre>
</pre>
Question: Why not just open an html file in a browser? Answer: While opening an HTML file directly in a browser can be fine for simple, static pages, using a local server like `servr` provides a more accurate and robust testing environment for more complex websites:
* '''Relative links''': Links written relative to the current page usually still work when a file is opened directly, but root-relative links (those starting with `/`, such as `/css/style.css`) resolve against the file system root instead of the site root and break. The `servr` package serves the entire directory as a website, so both kinds of links resolve the way they will after deployment.
* '''Content loaded by JavaScript''': Browsers restrict pages opened from `file://`, so scripts that load other files with `fetch()` or XMLHttpRequest (for example, a client-side search index stored as JSON) typically fail. When the site is served over HTTP by `servr`, these requests work and can be tested locally before deploying the site.
* '''Mimic production environment''': `servr` delivers pages over HTTP with proper content types, which is closer to how a static site is hosted and can reveal issues that do not appear when a file is simply opened in a browser. Note that `servr` only serves files (and rebuilds dynamic documents such as R Markdown); it does not execute server-side code such as PHP or ASP.NET, so testing that kind of functionality requires the corresponding server.
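
A minimal sketch of the points above, assuming the `servr` package is installed. The directory name and page contents are made up for illustration, and the server is only started in an interactive session so that the script never blocks.

```r
# Build a tiny two-page site whose pages link to each other.
site <- file.path(tempdir(), "demo-site")   # hypothetical location
dir.create(site, showWarnings = FALSE)
writeLines('<p><a href="about.html">About</a></p>', file.path(site, "index.html"))
writeLines('<p><a href="/index.html">Home (root-relative link)</a></p>',
           file.path(site, "about.html"))

# Serve the directory in the background (assumes servr is available).
if (requireNamespace("servr", quietly = TRUE) && interactive()) {
  servr::httd(site, daemon = TRUE, browser = FALSE, port = 4321)
  # Visit http://127.0.0.1:4321/ in a browser, then stop all daemonized servers:
  servr::daemon_stop()
}
```

Opening `about.html` directly from disk makes the "Home" link point at the file system root, while the same link works when the directory is served.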


== beakr ==
== beakr ==
Line 187: Line 194:


== workflowr ==
== workflowr ==
= httr2 =
[https://cran.r-project.org/web//packages/httr2/index.html httr2] package - Perform HTTP Requests and Process the Responses.
== httptest2 ==
* [https://cran.r-project.org/web//packages/httptest2/index.html httptest2] - Test Helpers for 'httr2'
* [https://books.ropensci.org/http-testing/ HTTP testing in R], [https://www.r-consortium.org/blog/2023/05/15/better-understanding-your-tools-choices-online-book-http-testing-r Better Understanding Your Tools Choices with Online Book HTTP Testing in R]


= opencpu =
= opencpu =
Line 239: Line 253:
= Javascript =
= Javascript =
[https://paulvanderlaken.com/2020/12/01/javascript-for-r-ebook/ JavaScript for R — ebook] https://book.javascript-for-r.com/
[https://paulvanderlaken.com/2020/12/01/javascript-for-r-ebook/ JavaScript for R — ebook] https://book.javascript-for-r.com/
== Sketch ==
[https://www.r-consortium.org/blog/2023/04/26/sketch-package-looks-to-add-javascript-to-r-packages Sketch Package looks to add JavaScript to R packages]


= Web page scraping =
= Web page scraping =
Line 246: Line 263:
rvest package depends on xml2.
rvest package depends on xml2.


== [https://cran.r-project.org/web/packages/rvest/index.html rvest] ==
== [https://cran.r-project.org/web/packages/rvest/index.html rvest], extract tables ==
[http://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/ Easy web scraping with R]
<ul>
<li>[http://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/ Easy web scraping with R]
<li>[https://twitter.com/rlbarter/status/1646267698857541632 Blown away that web scraping in R with the rvest package is as easy as]
<pre>
page <- read_html("https://en.m.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
tables <- html_table(page)
tables[[2]]


On Ubuntu, we need to install two packages first!
# A tibble: 243 × 7
{{Pre}}
  Rank  `Country / Dependency` Population  Population Date  Source (official or …¹ Notes
  <chr> <chr>                  <chr>      <chr>      <chr> <chr>                  <chr>
1 Rank  Country / Dependency  Numbers    % of the … Date  Source (official or f… "Not…
2 –    World                  8,026,177,… 100%      16 A… UN projection[3]      "" 
3 1    China                  1,411,750,… 17.6%      31 D… Official estimate[4]  "[b]"
4 2    India                  1,392,329,… 17.3%      1 Ma… Official projection[5] "[c]…
5 3    United States          334,631,000 4.17%      16 A… National population c… "[d]"
6 4    Indonesia              275,773,800 3.44%      1 Ju… Official estimate[7]  "" 
7 5    Pakistan              235,825,000 2.94%      1 Ju… UN projection[3]      "[e]"
8 6    Nigeria                218,541,000 2.72%      1 Ju… UN projection[3]      "" 
9 7    Brazil                216,024,545 2.69%      16 A… National population c… "" 
10 8    Bangladesh            169,828,911 2.12%      15 J… 2022 final census res… "" 
# ℹ 233 more rows
# ℹ abbreviated name: ¹​`Source (official or from the United Nations)`
# ℹ Use `print(n = ...)` to see more rows
</pre>
<li>On Ubuntu, we need to install two packages first!
<syntaxhighlight lang="sh">
sudo apt-get install libcurl4-openssl-dev # OR libcurl4-gnutls-dev
sudo apt-get install libcurl4-openssl-dev # OR libcurl4-gnutls-dev


sudo apt-get install libxml2-dev
sudo apt-get install libxml2-dev
</pre>
</syntaxhighlight>
<li>https://github.com/tidyverse/rvest, [https://github.com/tidyverse/rvest/blob/master/NEWS.md NEWS]
<li>[http://datascienceplus.com/visualizing-obesity-across-united-states-by-using-data-from-wikipedia/ Visualizing obesity across United States by using data from Wikipedia]
<li>[https://stat4701.github.io/edav/2015/04/02/rvest_tutorial/ rvest tutorial: scraping the web using R]
<li>https://renkun.me/pipeR-tutorial/Examples/rvest.html
<li>http://zevross.com/blog/2015/05/19/scrape-website-data-with-the-new-r-package-rvest/
<li>[https://datascienceplus.com/google-scholar-scraping-with-rvest/ Google scholar scraping with rvest package]
<li>[https://www.radmuzom.com/2020/05/03/an-update-to-an-adventure-in-downloading-books/ An update to "An adventure in downloading books"]
<li>[https://www.r-bloggers.com/2024/09/how-to-webscrape-in-r/ How to webscrape in R?]
</ul>
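
The table-extraction idea above can be tried without network access. This is a small sketch that uses `rvest::minimal_html()` to stand in for a downloaded page; the table contents are invented placeholders, not real figures.

```r
library(rvest)

# An inline page stands in for read_html("https://...") (hypothetical data).
page <- minimal_html('
  <table>
    <tr><th>Country</th><th>Population</th></tr>
    <tr><td>A</td><td>1,000</td></tr>
    <tr><td>B</td><td>2,500,000</td></tr>
  </table>')

tables <- html_table(page)   # one tibble per <table> element on the page
pop <- tables[[1]]

# Values with thousands separators come back as text; strip the commas.
pop$Population <- as.numeric(gsub(",", "", pop$Population))
pop
```

On a real page the index into `tables` has to be found by inspection, as with `tables[[2]]` in the Wikipedia example.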


* https://github.com/tidyverse/rvest, [https://github.com/tidyverse/rvest/blob/master/NEWS.md NEWS]
== Reading Remote Data Files: rvest ==
* [http://datascienceplus.com/visualizing-obesity-across-united-states-by-using-data-from-wikipedia/ Visualizing obesity across United States by using data from Wikipedia]
[https://kieranhealy.org/blog/archives/2023/03/25/reading-remote-data-files/ Reading Remote Data Files]
* [https://stat4701.github.io/edav/2015/04/02/rvest_tutorial/ rvest tutorial: scraping the web using R]
* https://renkun.me/pipeR-tutorial/Examples/rvest.html
* http://zevross.com/blog/2015/05/19/scrape-website-data-with-the-new-r-package-rvest/
* [https://datascienceplus.com/google-scholar-scraping-with-rvest/ Google scholar scraping with rvest package]
* [https://www.radmuzom.com/2020/05/03/an-update-to-an-adventure-in-downloading-books/ An update to "An adventure in downloading books"]


== [https://cran.r-project.org/web/packages/V8/index.html V8]: Embedded JavaScript Engine for R ==
== [https://cran.r-project.org/web/packages/V8/index.html V8]: Embedded JavaScript Engine for R ==
Line 274: Line 318:
== Get API data ==
== Get API data ==
[https://youtu.be/tlaJf0CHbFE How to get API data with R]. See how to write your own R code to pull data from an API using API key authentication.
[https://youtu.be/tlaJf0CHbFE How to get API data with R]. See how to write your own R code to pull data from an API using API key authentication.
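
As a rough sketch of API key authentication (not necessarily the approach used in the video), a request can be assembled with the httr2 package listed above. The endpoint, query parameters, header name and environment variable are all hypothetical; a real API documents its own.

```r
library(httr2)

# Read the key from an environment variable instead of hard-coding it.
api_key <- Sys.getenv("MY_API_KEY", unset = "demo-key")   # hypothetical variable

req <- request("https://api.example.com/v1/weather")      # hypothetical endpoint
req <- req_url_query(req, city = "Taipei", units = "metric")
req <- req_headers(req, `X-API-Key` = api_key)            # header name is an assumption

req$url   # inspect the URL that would be requested

# Sending the request needs a real endpoint, so it is left commented out:
# resp <- req_perform(req)
# dat  <- resp_body_json(resp)
```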
== webshot2 - take a screenshot of web page ==
[https://rstudio.github.io/webshot2/ webshot2]. It uses headless Chrome via the Chromote package. You also need to have the Chrome browser installed on your system. You can also use other browsers based on Chromium, such as Chromium itself, Edge, Vivaldi, Brave, or Opera.


= These R packages import sports, weather, stock data and more =
= These R packages import sports, weather, stock data and more =
Line 281: Line 328:
* https://cran.r-project.org/web/packages/rnoaa/index.html. A personal API key (token) is required. 10,000 requests per day
* https://cran.r-project.org/web/packages/rnoaa/index.html. A personal API key (token) is required. 10,000 requests per day
* <strike>http://ram-n.github.io/weatherData/ </strike>  (not working)
* <strike>http://ram-n.github.io/weatherData/ </strike>  (not working)
* [https://datawookie.dev/blog/2022/08/historical-weather-data/ Historical Weather Data]


= Diving Into Dynamic Website Content with splashr =
= Diving Into Dynamic Website Content with splashr =
Line 293: Line 341:
* [https://www.rdocumentation.org/packages/curl/versions/4.2/topics/send_mail curl::send_mail()].  
* [https://www.rdocumentation.org/packages/curl/versions/4.2/topics/send_mail curl::send_mail()].  
* [https://petermeissner.de/blog/2020/09/07/web-send-mail-windows/ R Internet: Yet Another Way To Send Emails On Windows/Sending Emails with {curl} and Docker]
* [https://petermeissner.de/blog/2020/09/07/web-send-mail-windows/ R Internet: Yet Another Way To Send Emails On Windows/Sending Emails with {curl} and Docker]
* [https://blog.edmdesigner.com/send-email-from-linux-command-line/ 16 Command Examples to Send Email From The Linux Command Line]
* [https://medium.com/geekculture/5-extra-uses-for-curl-that-dont-involve-web-requests-6780a345877f 5 Extra Uses For Curl That Don’t Involve Web Requests]. Send an email, check for open ports, upload files using TFTP, resume downloads, and download files from an SMB share.


Note
Note
Line 326: Line 376:


=== emayili ===
=== emayili ===
* https://cran.r-project.org/web/packages/emayili/index.html
<ul>
* [https://datawookie.netlify.com/blog/2019/05/emayili-sending-email-from-r/ emayili: Sending Email from R]. This shows how to send emails from the terminal command line tool '''curl''' if we don't want to use '''R''' to do the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part.
<li>https://cran.r-project.org/web/packages/emayili/index.html
* You can add Cc, Bcc and Reply-To header fields using the '''cc()''', '''bcc()''' and '''reply()''' methods. Files can be attached using the '''attachment()''' method.
<li>[https://datawookie.netlify.com/blog/2019/05/emayili-sending-email-from-r/ emayili: Sending Email from R]. This shows how to send emails from the terminal command line tool '''curl''' if we don't want to use '''R''' to do the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part.
* Note that if the address in envelope(From) differs from the one in server(username), the address in envelope(From) is ignored. So the following simplified example skips envelope(From).
<li>You can add Cc, Bcc and Reply-To header fields using the '''cc()''', '''bcc()''' and '''reply()''' methods. Files can be attached using the '''attachment()''' method.
* If I change the 'to' email to an yahoo account, it does not go through.
<li>Note that if the address in envelope(From) differs from the one in server(username), the address in envelope(From) is ignored. So the following simplified example skips envelope(From).
<li>If I change the 'to' email to a yahoo account, it works but the email went to the Trash folder.
<li>[https://datawookie.dev/blog/2021/09/emayili-rendering-r-markdown/ Rendering R Markdown]
<pre>
<pre>
library(emayili)
library(emayili)
Line 345: Line 397:
smtp(email, verbose = TRUE)
smtp(email, verbose = TRUE)
</pre>
</pre>
<li>[https://www.r-bloggers.com/2024/06/creating-email-threads/ Creating Email Threads]
</ul>


== [https://cran.r-project.org/web/packages/blastula/index.html blastula] (RStudio) ==
== [https://cran.r-project.org/web/packages/blastula/index.html blastula] (RStudio) ==
Line 454: Line 508:
The plot.gvis() method in the googleVis package also demonstrates the startDynamicHelp() function in the tools package, which is used to launch an HTTP server. See 
The plot.gvis() method in the googleVis package also demonstrates the startDynamicHelp() function in the tools package, which is used to launch an HTTP server. See 
[http://jeffreyhorner.tumblr.com/page/3 Jeffrey Horner's note about deploying Rook App].
[http://jeffreyhorner.tumblr.com/page/3 Jeffrey Horner's note about deploying Rook App].
== Convert JSON to CSV using Linux shell ==
[https://www.cyberciti.biz/faq/how-to-convert-json-to-csv-using-linux-unix-shell/ How to convert JSON to CSV using Linux / Unix shell]
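
The same conversion can also be done from R. This is a minimal sketch assuming the jsonlite package and a JSON array of flat objects; the sample input is made up.

```r
library(jsonlite)

json <- '[{"name":"Alice","age":48},{"name":"Bob","age":33}]'   # hypothetical input

df <- fromJSON(json)                      # array of objects -> data frame
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
readLines(csv_file)
```

For nested JSON, `fromJSON(json, flatten = TRUE)` may be needed before the result fits a CSV layout.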


= [http://www.ncbi.nlm.nih.gov/geo/ GEO (Gene Expression Omnibus)] =
= [http://www.ncbi.nlm.nih.gov/geo/ GEO (Gene Expression Omnibus)] =
Line 459: Line 516:


= Interactive html output =
= Interactive html output =
== webr ==
* [https://r.iresmi.net/posts/2024/webr/index.html Playing with webr] R in your browser.
* https://github.com/coatless/quarto-webr?tab=readme-ov-file
* [https://nrennie.rbind.io/blog/webr-shiny-tidytuesday/ A webR powered Shiny app for browsing TidyTuesday plots]
== [http://cran.r-project.org/web/packages/sendplot/index.html sendplot] ==
== [http://cran.r-project.org/web/packages/sendplot/index.html sendplot] ==
== [http://cran.r-project.org/web/packages/RIGHT/index.html RIGHT] ==
== [http://cran.r-project.org/web/packages/RIGHT/index.html RIGHT] ==
The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.
The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.
Line 487: Line 550:
* [http://deanattali.com/blog/htmlwidgets-tips/ How to write a useful htmlwidgets in R: tips and walk-through a real example]
* [http://deanattali.com/blog/htmlwidgets-tips/ How to write a useful htmlwidgets in R: tips and walk-through a real example]


== [http://cran.r-project.org/web/packages/networkD3/index.html networkD3] ==
== igraph ==
This is a port of Christopher Gandrud's [http://christophergandrud.github.io/d3Network/ d3Network] package to the htmlwidgets framework.
<ul>
<li>https://cran.r-project.org/web/packages/igraph/index.html
* On Ubuntu, run <syntaxhighlight lang='sh' inline>apt install libglpk-dev </syntaxhighlight>
<li>[https://www.geeksforgeeks.org/creating-an-igraph-object-in-r/ Creating an igraph object in R]
<li>[https://robwiederstein.github.io/network_analysis/igraph.html Network Analysis in R]
<li>[https://shiring.github.io/genome/2016/12/14/homologous_genes_part2_post creating directed networks with igraph]
<li>Tips
* [https://stackoverflow.com/a/14400780 Can we vary the text size along with node size in R-igraph?]
* [https://www.reddit.com/r/Rlanguage/comments/yv8jjg/igraph_how_do_i_make_the_font_smaller_to_fit/ igraph: How do I make the font smaller to fit inside of the node so that it's readable?]
* [https://stackoverflow.com/a/38452176 Add legend in igraph to annotate difference vertices size]
* Compare two graphs: https://igraph.org/c/doc/igraph-Isomorphism.html. In simple terms, two graphs are isomorphic if they become indistinguishable from each other once their vertex labels are removed.
<pre>
library(igraph)
g1 <- make_ring(10)
g2 <- make_ring(10)
all.equal(g1, g2) # FALSE, checking nearly equal
identical(g1, g2) # FALSE
 
# Check if the graphs are isomorphic
isomorphic(g1, g2) # TRUE
</pre>
<li>Extract coordinates: there are different layouts. The default is layout_nicely() (formerly layout.auto()).
<pre>
library(igraph)
data(karate, package="igraphdata")
G <- upgrade_graph(karate)
 
plot(G) # same as
plot(G, layout = layout_nicely(G))
plot(G, layout = layout_with_fr(G))     # Fruchterman-Reingold
plot(G, layout = layout_in_circle(G))   # not good if there are too many vertices
plot(G, layout = layout_on_sphere(G))   # complicated
plot(G, layout = layout_randomly(G))    # complicated
 
L <- layout_with_fr(G)
dim(L) # 34  2
</pre>
<li>Message when I use load() to load igraph objects created from igraph version 1.4.2 in R with igraph 2.0.3.
<pre>
This graph was created by an old(er) igraph version.
  Call upgrade_graph() on it to use with the current igraph version
  For now we convert it on the fly...
</pre>
</ul>
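
To illustrate the remark that isomorphic graphs become indistinguishable once vertex labels are removed, here is a small sketch that relabels the vertices of a ring with a random permutation. The seed and graph sizes are arbitrary choices.

```r
library(igraph)

g1 <- make_ring(10)

# Relabel the vertices: same structure, different vertex ids.
set.seed(1)
g2 <- permute(g1, sample(vcount(g1)))

isomorphic(g1, g2)                                   # TRUE: only the labels differ
isomorphic(g1, make_star(10, mode = "undirected"))   # FALSE: 10 edges vs 9 edges
```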
 
=== visNetwork ===
* https://cran.r-project.org/web/packages/visNetwork/index.html
* [https://www.youtube.com/watch?v=hgUJ-UFv4YY Create Interactive networks using R programming]
* It is used by [https://bioconductor.org/packages/release/bioc/html/FELLA.html FELLA::launchApp()].
 
=== [http://cran.r-project.org/web/packages/networkD3/index.html networkD3] ===
* This is a port of Christopher Gandrud's [http://christophergandrud.github.io/d3Network/ d3Network] package to the htmlwidgets framework.
* [https://datasandbox.netlify.app/posts/2022-07-12-network-graphs-in-r/index.en.html Network Graphs in R] from ''The Data Sandbox''.
 
=== plotly ===
* [https://minimaxir.com/notebooks/interactive-network/ How to Create an Interactive WebGL Network Graph Using R and Plotly]
* A working example
<syntaxhighlight lang='r'>
library(igraph)
library(ggplot2)
library(plotly)
 
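# Create a data frame that represents edges. graph_from_data_frame() reads the
# first two columns as from/to, so this yields 6 vertices (3 names and 3 ages)
dat <- data.frame(name=c("Alice", "Bob", "Cecil"), age=c(48,33,45))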
 
 
# Step 1: Create an igraph object
g <- graph_from_data_frame(dat, directed = FALSE)
 
# Get the layout of the graph
layout <- layout_nicely(g)
 
# Create a data frame for the vertices
vertices <- data.frame(id = V(g)$name, x = layout[,1], y = layout[,2],
                      var1=V(g)$name, var2=LETTERS[1:6])
 
# Get the edges and convert vertex names to coordinates
edges <- igraph::as_data_frame(g, what = "edges")
edges <- merge(edges, vertices, by.x = "from", by.y = "id")
edges <- merge(edges, vertices, by.x = "to", by.y = "id", suffixes = c(".from", ".to"))
 
# Step 2: Create a ggplot object
p <- ggplot(vertices, aes(x = x, y = y)) +
  geom_segment(data = edges, aes(x = x.from, y = y.from, xend = x.to, yend = y.to)) +
  geom_point(aes(text = paste("Var1:", var1, "\nVar2:", var2))) +
  geom_text(aes(x = x, y = y, label = id), vjust = 1, hjust = 0, nudge_x=.2) +
  expand_limits(x = c(-2, 1.5)) +
  theme(
    axis.line = element_blank(),  # Hide axis lines
    axis.text = element_blank(),  # Hide axis text
    axis.ticks = element_blank(),  # Hide axis ticks
    axis.title = element_blank(),  # Hide axis labels
    panel.grid.major = element_blank(),  # Hide major grid
    panel.grid.minor = element_blank(),  # Hide minor grid
    panel.background = element_rect(fill = "white")  # Set background to white
  )
 
 
# Print the plot
print(p)
 
# Step 3: Convert the ggplot object to a plotly object
#      Tooltip works only on 'points', not on labels.
ggplotly(p, tooltip = "text")
</syntaxhighlight>
 
=== ggiraph ===
[https://stackoverflow.com/a/64959301 Convert ggraph to interactive plot with plotly or networkD3]


== [http://cran.r-project.org/web/packages/scatterD3/index.html scatterD3] ==
== [http://cran.r-project.org/web/packages/scatterD3/index.html scatterD3] ==
Line 527: Line 695:


== plotly ==
== plotly ==
<ul>
<li>[https://plotly.com/r/3d-scatter-plots/ How to make interactive 3D scatter plots], https://plotly.com/r/reference/#scatter-mode
{{Pre}}
library(plotly)
mtcars$am[which(mtcars$am == 0)] <- 'Automatic'
mtcars$am[which(mtcars$am == 1)] <- 'Manual'
mtcars$am <- as.factor(mtcars$am)
fig <- plot_ly(mtcars, x = ~wt, y = ~hp, z = ~qsec, color = ~am, colors = c('#BF382A', '#0C4B8E'))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(xaxis = list(title = 'Weight'),
                    yaxis = list(title = 'Gross horsepower'),
                    zaxis = list(title = '1/4 mile time')))
fig
</pre>
[[File:Plotly3d.png|350px]]
{{Pre}}
x <- rnorm(100); y <- rnorm(100) ; z <- rnorm(100)
labels <- paste("point", 1:100)
w <- sample(scales::hue_pal()(3), 100, replace=TRUE)  # use ggplot2's default color scheme
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers", marker = list(color = z, colorscale ="Viridis"))
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers", marker = list(color = w))
</pre>
<li>[https://community.rstudio.com/t/how-to-add-more-data-to-tooltip-in-r-plotly-besides-the-text-argument/91971/2 How to add more data to tooltip in R plotly (besides the `text` argument)] </li>
<li>[https://plotly.com/r/line-and-scatter/ Data Labels on Hover] (Hover text), [https://plotly.com/r/hover-text-and-formatting/ Hover Text and Formatting in R] </li>
<li>[https://plotly.com/ggplot2/hover-text-and-formatting/ Hover Text and Formatting in ggplot2] </li>
<pre>
# df is a hypothetical data frame with columns var1, var2 and var3
gobj <- ggplot(df, aes(x=var1, y=var2, text = paste0("another :", var3))) + geom_jitter()
ggplotly(gobj) # var3 will be shown on the tooltip/hover text
</pre>
</ul>
* [http://moderndata.plot.ly/power-curves-r-plotly-ggplot2/ Power curves] and ggplot2.
* [http://moderndata.plot.ly/power-curves-r-plotly-ggplot2/ Power curves] and ggplot2.
* [http://moderndata.plot.ly/time-series-charts-by-the-economist-in-r-using-plotly/ TIME SERIES CHARTS BY THE ECONOMIST IN R USING PLOTLY] & [https://moderndata.plot.ly/interactive-r-visualizations-with-d3-ggplot2-rstudio/ FIVE INTERACTIVE R VISUALIZATIONS WITH D3, GGPLOT2, & RSTUDIO]
* [http://moderndata.plot.ly/time-series-charts-by-the-economist-in-r-using-plotly/ TIME SERIES CHARTS BY THE ECONOMIST IN R USING PLOTLY] & [https://moderndata.plot.ly/interactive-r-visualizations-with-d3-ggplot2-rstudio/ FIVE INTERACTIVE R VISUALIZATIONS WITH D3, GGPLOT2, & RSTUDIO]
Line 538: Line 739:
* [https://www.displayr.com/how-to-add-trend-lines-in-r-using-plotly/?utm_medium=Feed&utm_source=Syndication How to add Trend Lines in R Using Plotly]
* [https://www.displayr.com/how-to-add-trend-lines-in-r-using-plotly/?utm_medium=Feed&utm_source=Syndication How to add Trend Lines in R Using Plotly]
* [https://blog.methodsconsultants.com/posts/introduction-to-interactive-graphics-in-r-with-plotly/ Introduction to Interactive Graphics in R with plotly]
* [https://blog.methodsconsultants.com/posts/introduction-to-interactive-graphics-in-r-with-plotly/ Introduction to Interactive Graphics in R with plotly]
* [https://plotly-r.com/controlling-tooltips.html#tooltip-text-ggplotly Tooltip]
* [https://plotly-r.com/ Interactive web-based data visualization with R, plotly, and shiny] (ebook) by Carson Sievert
** [https://stackoverflow.com/a/43571726 Formatting mouse over labels in plotly when using ggplotly]
** [https://plotly-r.com/controlling-tooltips.html#tooltip-text-ggplotly Tooltip], [https://stackoverflow.com/a/43571726 Formatting mouse over labels in plotly when using ggplotly]
* [https://plotly.com/r/3d-scatter-plots/ 3D Scatter Plots]
**[https://plotly.com/r/3d-scatter-plots/ 3D Scatter Plots], [https://shirinsplayground.netlify.app/2021/03/kmeans_101/ k-Means 101: An introductory guide to k-Means clustering in R]. Note the 3D plot is displayed on a browser.
** [https://shirinsplayground.netlify.app/2021/03/kmeans_101/ k-Means 101: An introductory guide to k-Means clustering in R]. Note the 3D plot is displayed on a browser.
* Save interactive plots as HTML
:<syntaxhighlight lang='rsplus'>
p <- plotly::ggplotly(b)
htmlwidgets::saveWidget(p, "index.html")
</syntaxhighlight>
* Use plotly in shiny: [https://rdrr.io/cran/plotly/man/plotly-shiny.html plotlyOutput()]


== highcharter ==
== highcharter: alternative to plotly ==
https://cran.r-project.org/web/packages/highcharter/ Good for time series plot.
https://cran.r-project.org/web/packages/highcharter/ Good for time series plot.


= Amazon =
= Amazon =
[https://github.com/56north/Rmazon Download product information and reviews from Amazon.com]
[https://github.com/56north/Rmazon Download product information and reviews from Amazon.com] (2016, not working in get_reviews() as of 6/7/2023)
{{Pre}}
{{Pre}}
sudo apt-get install libxml2-dev
sudo apt-get install libxml2-dev
Line 584: Line 790:
# ... with 20 more rows, and 1 more variable: reviewText <chr>
# ... with 20 more rows, and 1 more variable: reviewText <chr>
reviews[1, 6] # 6-th column is the review text
reviews[1, 6] # 6-th column is the review text
</pre>
[https://martinctc.github.io/blog/vignette-scraping-amazon-reviews-in-r/ Vignette: Scraping Amazon Reviews in R] 2019
<pre>
library(rvest)
library(dplyr)
url <- "https://www.amazon.com/dp/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
page <- read_html(url)
reviews <- page %>%
  html_nodes(".review") %>%
  html_text()
titles <- page %>%
  html_nodes(".review-title-content") %>%
  html_text()
ratings <- page %>%
  html_nodes(".review-rating") %>%
  html_text()
df <- data.frame(reviews, titles, ratings)
</pre>
[https://stackoverflow.com/a/42656204 Scraping Amazon Customer Reviews] (almost). The url is obtained by clicking "See all reviews", like [https://www.amazon.com/Art-Programming-Statistical-Software-Design/product-reviews/1593273843/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews this one].
Google Bard answers: first 10 reviews.
<pre>
library(rvest)
# Get the product ASIN code
asin <- "B07HSKPDBV"
# Create a URL for the product reviews page
url <- paste0("https://www.amazon.com/product-reviews/", asin)
# Read the HTML content of the product reviews page
html <- read_html(url)
# Extract the review bodies; titles and ratings cannot be recovered from the
# .review-text nodes and would need their own selectors
reviews <- data.frame(
  body = html %>%
    html_nodes(".review-text") %>%
    html_text(trim = TRUE)
)
# Print the first 5 reviews
head(reviews, 5)
</pre>
All reviews.
<pre>
library(rvest)
library(purrr)  # map() and map_chr()
# Get the product ASIN code
asin <- "B07HSKPDBV"
# Create a function to scrape the review bodies for a single page
scrape_reviews <- function(url) {
  html <- read_html(url)
  data.frame(
    body = html %>%
      html_nodes(".review-text") %>%
      html_text(trim = TRUE)
  )
}
# Create a vector of URLs for the first 10 review pages
urls <- map_chr(1:10, function(x)
  paste0("https://www.amazon.com/product-reviews/", asin, "?pageNumber=", x))
# Scrape the reviews for all of the pages and stack them
reviews <- urls %>%
  map(scrape_reviews) %>%
  do.call(rbind, .)
# Print the first 5 reviews
head(reviews, 5)
</pre>
</pre>


Line 591: Line 882:
= Feed =
= Feed =
[https://github.com/datawookie/feeder feedeR] - Feed Reader Package for R
[https://github.com/datawookie/feeder feedeR] - Feed Reader Package for R
= File sharing =
[https://datawookie.dev/blog/2021/11/filebin-quick-easy-file-sharing/ {filebin} Quick & Easy File Sharing]


= Twitter =
= Twitter =
Line 596: Line 890:


= OCR =
= OCR =
* [http://ropensci.org/blog/blog/2016/11/16/tesseract Tesseract package: High Quality OCR in R], [https://www.r-bloggers.com/how-to-do-optical-character-recognition-ocr-of-non-english-documents-in-r-using-tesseract/ How to do Optical Character Recognition (OCR) of non-English documents in R using Tesseract?]
* [https://cran.r-project.org/web/packages/tesseract/vignettes/intro.html Using the Tesseract OCR engine in R]
** [http://ropensci.org/blog/blog/2016/11/16/tesseract Tesseract package: High Quality OCR in R], [https://www.r-bloggers.com/how-to-do-optical-character-recognition-ocr-of-non-english-documents-in-r-using-tesseract/ How to do Optical Character Recognition (OCR) of non-English documents in R using Tesseract?]
* https://cran.r-project.org/web/packages/abbyyR/index.html
* https://cran.r-project.org/web/packages/abbyyR/index.html
== Online ==
* https://www.onlineocr.net/ (works)
* https://ocr.space/ (not expected)
* https://www.newocr.com/ (not expected)


= Wikipedia =
= Wikipedia =
[https://github.com/ironholds/wikipedir WikipediR]: R's MediaWiki API client library
[https://github.com/ironholds/wikipedir WikipediR]: R's MediaWiki API client library

Latest revision as of 08:12, 10 September 2024

R Web Applications

See also CRAN Task View: Web Technologies and Services

Rmarkdown: create HTML5 web, slides and more

Rmarkdown

HTTP protocol

An HTTP server is conceptually simple:

  1. Open port 80 for listening
  2. When contact is made, gather a little information (mainly the GET request line; you can ignore the rest for now)
  3. Translate the request into a file request
  4. Open the file and spit it back at the client

It gets more difficult depending on how much of HTTP you want to support - POST is a little more complicated, scripts, handling multiple requests, etc.

Example in R

> co <- socketConnection(port=8080, server=TRUE, blocking=TRUE) 
> # Now open a web browser and type http://localhost:8080/index.html
> readLines(co,1)
[1] "GET /index.html HTTP/1.1"
> readLines(co,1)
[1] "Host: localhost:8080"
> readLines(co,1)
[1] "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0"
> readLines(co,1)
[1] "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
> readLines(co,1)
[1] "Accept-Language: en-US,en;q=0.5"
> readLines(co,1)
[1] "Accept-Encoding: gzip, deflate"
> readLines(co,1)
[1] "Connection: keep-alive"
> readLines(co,1)
[1] ""
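
The four steps listed earlier can be sketched in base R by adding a response to the session above. This is only an illustration under simplifying assumptions: it handles a single GET request, does not guard against paths containing "..", and the helper names are made up.

```r
# Step 2: parse the request line, e.g. "GET /index.html HTTP/1.1"
parse_request_line <- function(line) {
  parts <- strsplit(line, " ", fixed = TRUE)[[1]]
  list(method = parts[1], path = parts[2], version = parts[3])
}

# Step 3: translate the URL path into a file under a root directory
resolve_path <- function(path, root = ".") {
  if (path == "/") path <- "/index.html"
  file.path(root, sub("^/", "", path))
}

# Step 4: wrap the file contents in a minimal HTTP/1.1 response
build_response <- function(body, status = "200 OK") {
  paste0("HTTP/1.1 ", status, "\r\n",
         "Content-Type: text/html\r\n",
         "Content-Length: ", nchar(body, type = "bytes"), "\r\n",
         "Connection: close\r\n\r\n",
         body)
}

# Steps 1-4 together: wait for one browser request and answer it
serve_once <- function(port = 8080, root = ".") {
  con <- socketConnection(port = port, server = TRUE, blocking = TRUE)
  on.exit(close(con))
  req <- parse_request_line(readLines(con, 1))
  # Skip the remaining header lines up to the blank line
  while (length(l <- readLines(con, 1)) > 0 && nzchar(l)) {}
  f <- resolve_path(req$path, root)
  if (file.exists(f)) {
    body <- paste(readLines(f, warn = FALSE), collapse = "\n")
    writeLines(build_response(body), con, sep = "")
  } else {
    writeLines(build_response("<h1>Not found</h1>", "404 Not Found"), con, sep = "")
  }
}

# serve_once(8080)   # blocks until a browser visits http://localhost:8080/index.html
```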

Example in C (Very simple http server written in C, 187 lines)

Create a simple hello world html page and save it as <index.html> in the current directory (/home/brb/Downloads/)

Launch the server program (assume we have done gcc http_server.c -o http_server)

$ ./http_server -p 50002
Server started at port no. 50002 with root directory as /home/brb/Downloads

Next, open a browser and go to http://localhost:50002/index.html. The server prints each request it receives, followed by the file it serves:

GET /index.html HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/index.html
GET /favicon.ico HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico
GET /favicon.ico HTTP/1.1
Host: localhost:50003
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

file: /home/brb/Downloads/favicon.ico

The browser will show the page from <index.html> served by the program.

The one drawback is that the code does not release the port. For example, if I use Ctrl+C to stop the program and try to re-launch it with the same port, it complains socket() or bind(): Address already in use. Setting the SO_REUSEADDR socket option before bind() is the usual way to avoid this.

Another Example in C (55 lines)

http://mwaidyanatha.blogspot.com/2011/05/writing-simple-web-server-in-c.html

The response is embedded in the C code.

If we test the server program by opening a browser and typing "http://localhost:15000/", the server receives the following 7 lines

GET / HTTP/1.1
Host: localhost:15000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
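
The lines above follow the usual HTTP layout: a request line (method, path, version) followed by "Name: value" header fields. As a minimal sketch in R of how a server could split them up (the function name parse_http_request() is my own, not part of any package, and the sample request is shortened):

```r
# Parse raw HTTP request lines into method, path, version and named headers.
parse_http_request <- function(lines) {
  # The header block ends at the first empty line (if present)
  blank <- which(lines == "")
  if (length(blank) > 0) lines <- lines[seq_len(blank[1] - 1)]

  # Request line: METHOD SP PATH SP VERSION
  parts <- strsplit(lines[1], " ", fixed = TRUE)[[1]]

  # Header lines: "Name: value" (split on the first colon only)
  hdr <- lines[-1]
  pos <- regexpr(":", hdr, fixed = TRUE)
  values <- trimws(substring(hdr, pos + 1))
  names(values) <- substring(hdr, 1, pos - 1)

  list(method = parts[1], path = parts[2], version = parts[3], headers = values)
}

request <- c(
  "GET / HTTP/1.1",
  "Host: localhost:15000",
  "Accept-Language: en-US,en;q=0.5",
  "Accept-Encoding: gzip, deflate",
  "Connection: keep-alive",
  ""
)
req <- parse_http_request(request)
req$method             # "GET"
req$headers[["Host"]]  # "localhost:15000"
```

Splitting on the first colon only matters here because the Host value itself contains a colon (the port).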

If we include a non-executable file's name in the url, we will be able to download that file. Try "http://localhost:15000/client.c".

If we use the telnet program to test, we can type anything we want

$ telnet localhost 15000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
ThisCanBeAnything        <=== This is what I typed in the client and it is also shown on server
HTTP/1.1 200 OK          <=== From here is what I got from server
Content-length: 37Content-Type: text/html

HTML_DATA_HERE_AS_YOU_MENTIONED_ABOVE <=== The html tags are not passed from server, interesting!
Connection closed by foreign host.
$ 
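
In the transcript above the two headers run together ("Content-length: 37Content-Type: text/html"), which suggests the response string in the C code lacks a line break between them. HTTP expects each header line to end with CRLF and an empty line before the body. A minimal sketch in R of assembling such a response (http_response() is a hypothetical helper, not taken from the C example):

```r
# Assemble a minimal HTTP/1.1 response as a single string.
http_response <- function(body, content_type = "text/html") {
  header <- c(
    "HTTP/1.1 200 OK",
    paste0("Content-Length: ", nchar(body, type = "bytes")),  # length in bytes
    paste0("Content-Type: ", content_type)
  )
  # Header lines are joined with CRLF; an empty line separates headers from body
  paste0(paste(header, collapse = "\r\n"), "\r\n\r\n", body)
}

resp <- http_response("<html><body>Hello</body></html>")
cat(resp)
```

Content-Length is computed in bytes rather than characters, so it stays correct for non-ASCII bodies.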

See also more examples under C page.

Others

shiny

See Shiny.

plumber: Turning your R code into a RESTful Web API

  • Too heavy to install.
  • I got a dependency error when I tried it on Ubuntu 16.04.
  • Check out the servr package.

Docker

httpuv and servr

httpuv is more low-level and flexible, while servr is higher-level and easier to use for specific tasks.

See also the servr package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.

R built in Web server, servr::httw() to serve a local directory as a website

servr::httw("DIRECTORY")
Rscript -e "servr::httd()" # default port 4321
# If index.html was found it will be used
# Otherwise, it will serve files under the cur dir

# Open another terminal
Rscript -e "servr::httd('/tmp')" -p4000

Question: Why not just open an html file in a browser? Answer: While opening an HTML file directly in a browser can be fine for simple, static pages, using a local server like `servr` provides a more accurate and robust testing environment for more complex websites:

  • Relative links: Links relative to the site root (e.g. `/css/style.css`) usually break when you open an HTML file directly in your browser, because under `file://` they are resolved against the file system root rather than the site directory. The `servr` package solves this problem by serving the entire directory as a website, preserving the correct structure and allowing such links to function as intended.
  • Dynamic content: Pages that load data with JavaScript (e.g. fetch() or XMLHttpRequest for a JSON file or a search index) are often blocked by the browser's same-origin rules when opened from `file://`. Served over HTTP by `servr`, these requests work, so you can test such features locally before deploying your site. Note that `servr` itself does not process form submissions; that still requires a backend.
  • Mimic production environment: Serving the site over HTTP is closer to the environment in which your website will be deployed. This can help catch issues that might not be apparent when simply opening an HTML file in a browser. Keep in mind that `servr` is a static file server: it does not execute server-side code (like PHP or ASP.NET), so a site that depends on server-side scripting still needs the corresponding server to be tested fully.
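
To try the points above, here is a small sketch that writes a two-page site into a temporary directory and serves it. The file names and the port are arbitrary choices of mine; port, daemon and browser are passed on to servr's server configuration, and the server is only started in an interactive session with servr installed.

```r
# Build a tiny two-page site in a temporary directory (hypothetical file names)
site <- file.path(tempdir(), "demo_site")
dir.create(site, showWarnings = FALSE)
writeLines('<html><body><a href="about.html">About</a></body></html>',
           file.path(site, "index.html"))
writeLines('<html><body><a href="index.html">Home</a></body></html>',
           file.path(site, "about.html"))

# Serve it in the background; skipped when servr is missing or R is non-interactive
if (interactive() && requireNamespace("servr", quietly = TRUE)) {
  servr::httd(site, port = 4321, daemon = TRUE, browser = FALSE)
  # Now browse http://127.0.0.1:4321/ and follow the links.
  # When done, stop the background server with: servr::daemon_stop()
}
```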

beakr

workflowr

httr2

httr2 package - Perform HTTP Requests and Process the Responses.

httptest2

opencpu

RApache

gWidgetsWWW

Rook

See Rook.

sumo

Sumo is a fully-functional web application template that exposes an authenticated user's R session within java server pages. See the paper http://journal.r-project.org/archive/2012-1/RJournal_2012-1_Bergsma+Smith.pdf.

Stockplot

FastRWeb

http://cran.r-project.org/web/packages/FastRWeb/index.html

WebDriver

'WebDriver' Client for 'PhantomJS'

https://github.com/rstudio/webdriver

Rwui

CGHWithR and WebDevelopR

CGHwithR still works with old versions of R although it has been removed from CRAN. Its successor is WebDevelopR. Its vignette (year 2013) provides a review of several available methods.

manipulate from RStudio

This is not a web application, but the manipulate package can be used to easily create interactive plots within the R(Studio) environment. Its source is available here.

Mathematica also has manipulate function for plotting; see here.

RCloud

RCloud is an environment for collaboratively creating and sharing data analysis scripts. RCloud lets you mix analysis code in R, HTML5, Markdown, Python, and others. Much like Sage, iPython notebooks and Mathematica, RCloud provides a notebook interface that lets you easily record a session and annotate it with text, equations, and supporting images.

See also the Talk in UseR 2014.

cloudyr and flyio - Input Output Files in R from Cloud or Local

Announcing flyio, an R Package to Interact with Data in the Cloud: https://blog.socialcops.com/inside-sc/announcements/flyio-r-package-interact-data-cloud/

Dropbox access

rdrop2 package

Javascript

JavaScript for R — ebook https://book.javascript-for-r.com/

Sketch

Sketch Package looks to add JavaScript to R packages

Web page scraping

http://www.slideshare.net/schamber/web-data-from-r#btnNext

xml2 package

rvest package depends on xml2.

rvest, extract tables

Reading Remote Data Files: rvest

Reading Remote Data Files

V8: Embedded JavaScript Engine for R

R⁶ — General (Attys) Distributions: V8, rvest, ggbeeswarm, hrbrthemes and tidyverse packages are used.

pubmed.mineR

Text mining of PubMed Abstracts (http://www.ncbi.nlm.nih.gov/pubmed). The algorithms are designed for two formats (text and XML) from PubMed.

R code for scraping the P-values from pubmed, calculating the Science-wise False Discovery Rate, et al (Jeff Leek)

Get API data

How to get API data with R. See how to write your own R code to pull data from an API using API key authentication.
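
As a rough sketch of the key-authentication idea in base R: the key is read from an environment variable and appended to the query string. Everything specific here is an assumption for illustration (the helper build_api_url(), the variable MY_API_KEY, the parameter name api_key and the api.example.com endpoint); real APIs differ in how they expect the key, and some want it in a header instead.

```r
# Build a request URL with query parameters plus an API key from the environment.
build_api_url <- function(base_url, params, key_env = "MY_API_KEY") {
  key <- Sys.getenv(key_env)
  if (!nzchar(key)) stop("Set the ", key_env, " environment variable first")
  params$api_key <- key  # 'api_key' is a hypothetical parameter name
  encoded <- vapply(params,
                    function(v) URLencode(as.character(v), reserved = TRUE),
                    character(1))
  paste0(base_url, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

Sys.setenv(MY_API_KEY = "demo-key")  # illustration only; normally set in ~/.Renviron
url <- build_api_url("https://api.example.com/v1/data",
                     list(city = "New York", units = "metric"))
url
# [1] "https://api.example.com/v1/data?city=New%20York&units=metric&api_key=demo-key"
```

Keeping the key in an environment variable avoids hard-coding it in scripts that may be shared.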

webshot2 - take a screenshot of web page

webshot2. It uses headless Chrome via the Chromote package. You also need to have the Chrome browser installed on your system. You can also use other browsers based on Chromium, such as Chromium itself, Edge, Vivaldi, Brave, or Opera.

These R packages import sports, weather, stock data and more

Diving Into Dynamic Website Content with splashr

https://rud.is/b/2017/02/09/diving-into-dynamic-website-content-with-splashr/

Network

netstat

https://cran.r-project.org/web/packages/netstat/index.html

Send email

curl

Note

  • As the example snippet below shows, the sender's and recipient's emails appear twice: once in the sender/recipients arguments and once in the message string.
  • It seems the recipients and sender arguments are the ones actually used.
  • In the message string, we can skip the 'From' and 'To' lines.
    • The sender's email entered in the message is not used
    • The recipient's email entered in the message is not used
  • Experiment: recipients & sender are correct but the two email addresses in the message are wrong.
    • Answer: I can still receive the email.
    • (In the recipient's inbox) the email address in the 'From' part is replaced with the correct one, but the name (e.g. "R (curl package)") is the one used in the message
    • (In the recipient's inbox) the email address & the name in the 'To' part are the ones used in the message (interesting)
  • Question: does the sender's email have to be gmail.com?
library(curl)
recipients <- "[email protected]"
sender <- '[email protected]'

# Full email message in RFC2822 format
message <- 'From: "R (curl package)" <[email protected]>
To: "Roger Recipient" <[email protected]>
Subject: Hello R user!

Dear R user,

I am sending this email using curl.'

# Send the email
send_mail(sender, recipients, message, smtp_server = 'smtps://smtp.gmail.com',
  username = 'curlpackage', password  = 'qyyjddvphjsrbnlm')
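
Since the addresses have to be given both as arguments and inside the message, one way to keep them in sync is to generate the message text from the same variables. A minimal sketch (build_message() is a hypothetical helper and the example.com addresses are placeholders; the send_mail() call is left commented out because it needs real credentials):

```r
# Build a simple RFC 2822 style message from the same values used for delivery.
build_message <- function(from, to, subject, body, from_name = NULL) {
  from_hdr <- if (is.null(from_name)) from else sprintf('"%s" <%s>', from_name, from)
  paste(
    c(paste0("From: ", from_hdr),
      paste0("To: ", paste(to, collapse = ", ")),
      paste0("Subject: ", subject),
      "",   # blank line separates headers from body
      body),
    collapse = "\n"
  )
}

sender <- "sender@example.com"
recipients <- c("alice@example.com", "bob@example.com")
msg <- build_message(sender, recipients, "Hello R user!",
                     "Dear R user,\n\nI am sending this email using curl.",
                     from_name = "R (curl package)")
cat(msg)

# Not run: the same variables then feed the delivery arguments
# curl::send_mail(sender, recipients, msg, smtp_server = "smtps://smtp.gmail.com",
#                 username = "USERNAME", password = "PASSWORD")
```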

emayili

  • https://cran.r-project.org/web/packages/emayili/index.html
  • emayili: Sending Email from R. It also shows how to send emails with the command-line tool curl if we don't want to use R for the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part.
  • You can add Cc, Bcc and Reply-To header fields using the cc(), bcc() and reply() methods. Files can be attached using the attachment() method.
  • Note that if the email in envelope(From) is different from the one in server(username), the email in envelope(From) is ignored. So in the following simplified example I skip envelope(From).
  • If I change the 'to' email to a Yahoo account, it works but the email goes to the Trash folder.
  • Rendering R Markdown
    library(emayili)
    
    email <- envelope(
      to = "[email protected]",
      subject = "This is a plain text message!",
      text = "Hello!\nHello2"
    )
    smtp <- server(host = "smtp.gmail.com",
                   port = 465,
                   username = "[email protected]",
                   password = "bd40ef6d4a9413de9c1318a65cbae5d7")
    smtp(email, verbose = TRUE)
    
  • Creating Email Threads

blastula (RStudio)

mailR

Easiest. Requires the rJava package (not trivial to install, see rJava). mailR is an interface to Apache Commons Email to send emails from within R. See also send bulk email.

Before using the mailR package, we followed the instructions here to set Allow less secure apps: 'ON'; otherwise you might get the error Error: EmailException (Java): Sending the email to the following server failed : smtp.gmail.com:465. Once we turn on this option, we may get an email notifying us of the change. Note that the recipient does not have to be a Gmail address.

> send.mail(from = "[email protected]",
  to = c("[email protected]", "Recipient 2 <[email protected]>"),
  replyTo = c("Reply to someone else <[email protected]>"),
  subject = "Subject of the email",
  body = "Body of the email",
  smtp = list(host.name = "smtp.gmail.com", port = 465, user.name = "gmail_username", passwd = "password", ssl = TRUE),
  attach.files ="./myattachment.txt",
  authenticate = TRUE,
  send = TRUE)
[1] "Java-Object{org.apache.commons.mail.SimpleEmail@7791a895}"

gmailr

More complicated. gmailr provides access to Google's Gmail RESTful API. See the vignette and an example here. Note that it does not use a password; it uses a JSON file for OAuth authentication downloaded from https://console.cloud.google.com/. Newer gmailr releases prefix these functions with gm_ (e.g. gm_auth(), gm_mime(), gm_send_message()), so the snippet below reflects the older API. See also https://github.com/jimhester/gmailr/issues/1.

library(gmailr)
gmail_auth('mysecret.json', scope = 'compose') 

test_email <- mime() %>%
  to("[email protected]") %>%
  from("[email protected]") %>%
  subject("This is a subject") %>%
  html_body("<html><body>I wish this was <b>bold</b></body></html>")
send_message(test_email)

sendmailR

sendmailR provides a simple SMTP client. It is not clear how to use the package (i.e. where to enter the password).

json

jsonlite

rjson

http://heuristically.wordpress.com/2013/05/20/geolocate-ip-addresses-in-r/

RJSONIO

Accessing Bitcoin Data with R

http://blog.revolutionanalytics.com/2015/11/accessing-bitcoin-data-with-r.html

Plot IP on google map

The following example is modified from the first item in the list above.

require(RJSONIO) # fromJSON
require(RCurl)   # getURL

temp = getURL("https://gist.github.com/arraytools/6743826/raw/23c8b0bc4b8f0d1bfe1c2fad985ca2e091aeb916/ip.txt", 
                           ssl.verifypeer = FALSE)
ip <- read.table(textConnection(temp), as.is=TRUE)
names(ip) <- "IP"
nr = nrow(ip)
 
Lon <- as.numeric(rep(NA, nr))
Lat <- Lon
Coords <- data.frame(Lon, Lat)
 
ip2coordinates <- function(ip) {
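  # NOTE: the freegeoip.net service may no longer be available;
  # substitute another geolocation API that returns JSON if needed
  api <- "http://freegeoip.net/json/"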
  get.ips <- getURL(paste(api, URLencode(ip), sep=""))
  # result <- ldply(fromJSON(get.ips), data.frame)
  result <- data.frame(fromJSON(get.ips))
  names(result)[1] <- "ip.address"
  return(result)
}

for (i in 1:nr){
  cat(i, "\n")
  try(
  Coords[i, 1:2] <- ip2coordinates(ip$IP[i])[c("longitude", "latitude")]
  )
}
 
# append to log-file:
logfile <- data.frame(ip, Lat = Coords$Lat, Long = Coords$Lon,
                                       LatLong = paste(round(Coords$Lat, 1), round(Coords$Lon, 1), sep = ":")) 
log_gmap <- logfile[!is.na(logfile$Lat), ]

require(googleVis) # gvisMap
gmap <- gvisMap(log_gmap, "LatLong",
                options = list(showTip = TRUE, enableScrollWheel = TRUE,
                               mapType = 'hybrid', useMapTypeControl = TRUE,
                               width = 1024, height = 800))
plot(gmap)

File:GoogleVis.png

The plot.gvis() method in the googleVis package also illustrates the startDynamicHelp() function in the tools package, which it uses to launch an HTTP server. See Jeffrey Horner's note about deploying a Rook app.

Convert JSON to CSV using Linux shell

How to convert JSON to CSV using Linux / Unix shell

GEO (Gene Expression Omnibus)

See this internal link.

Interactive html output

webr

sendplot

RIGHT

The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.

In addition to tooltip boxes, the package can create a table showing all information about selected nodes.

r2d3

r2d3 - R Interface to D3 Visualizations

d3Network

library(d3Network)

Source <- c("A", "A", "A", "A", "B", "B", "C", "C", "D") 
Target <- c("B", "C", "D", "J", "E", "F", "G", "H", "I") 
NetworkData <- data.frame(Source, Target) 

d3SimpleNetwork(NetworkData, height = 800, width = 1024, file="tmp.html")

htmlwidgets for R

Embed widgets in R Markdown documents and Shiny web applications.

igraph

visNetwork

networkD3

plotly

library(igraph)
library(ggplot2)
library(plotly)

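# Create a data frame; graph_from_data_frame() treats the first two columns
# as the edge list, so each name is linked to its age (6 vertices in total)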
dat <- data.frame(name=c("Alice", "Bob", "Cecil"), age=c(48,33,45))


# Step 1: Create an igraph object
g <- graph_from_data_frame(dat, directed = FALSE)

# Get the layout of the graph
layout <- layout_nicely(g)

# Create a data frame for the vertices
vertices <- data.frame(id = V(g)$name, x = layout[,1], y = layout[,2],
                       var1=V(g)$name, var2=LETTERS[1:6])

# Get the edges and convert vertex names to coordinates
edges <- get.data.frame(g, what = "edges")
edges <- merge(edges, vertices, by.x = "from", by.y = "id")
edges <- merge(edges, vertices, by.x = "to", by.y = "id", suffixes = c(".from", ".to"))

# Step 2: Create a ggplot object
p <- ggplot(vertices, aes(x = x, y = y)) +
  geom_segment(data = edges, aes(x = x.from, y = y.from, xend = x.to, yend = y.to)) +
  geom_point(aes(text = paste("Var1:", var1, "\nVar2:", var2))) +
  geom_text(aes(x = x, y = y, label = id), vjust = 1, hjust = 0, nudge_x=.2) +
  expand_limits(x = c(-2, 1.5)) + 
  theme(
    axis.line = element_blank(),  # Hide axis lines
    axis.text = element_blank(),  # Hide axis text
    axis.ticks = element_blank(),  # Hide axis ticks
    axis.title = element_blank(),  # Hide axis labels
    panel.grid.major = element_blank(),  # Hide major grid
    panel.grid.minor = element_blank(),  # Hide minor grid
    panel.background = element_rect(fill = "white")  # Set background to white
  )


# Print the plot
print(p)

# Step 3: Convert the ggplot object to a plotly object
#       Tooltip works only on 'points', not on labels.
ggplotly(p, tooltip = "text")

ggiraph

Convert ggraph to an interactive plot with plotly or networkD3

scatterD3

scatterD3 is an HTML R widget for interactive scatter plots visualization. It is based on the htmlwidgets R package and on the d3.js javascript library.

dygraphs

rthreejs - Create interactive 3D scatter plots, network plots, and globes

Examples

rayshader: 2D and 3D mapping and data visualization with shades

https://github.com/tylermorganwall/rayshader

On Rstudio server, we need options(rgl.printRglwidget = TRUE) ; see Why is my 3D plot not showing up in R Studio plot viewer?.

d3heatmap

See R

collapsibleTree

svgPanZoom

This 'htmlwidget' provides pan and zoom interactivity to R graphics, including 'base', 'lattice', and 'ggplot2'. The interactivity is provided through the 'svg-pan-zoom.js' library.

DT: An R interface to the DataTables library

reactable: interactive table with rows that expand when clicked

How to create tables in R with expandable rows

getable: creating a 'dynamic' HTML table

Getting Tabular Data Through JavaScript in Compiled R Markdown Documents.

The content stays static while the data could be updated independently without rewriting or recompiling the HTML document. This could be done by utilizing JavaScript’s ability to asynchronously fetch data from the web and generate DOM elements based on these data.

plotly

p <- plotly::ggplotly(b)  # b is an existing ggplot object
htmlwidgets::saveWidget(p, "index.html")

highcharter: alternative to plotly

https://cran.r-project.org/web/packages/highcharter/ Good for time series plots.

Amazon

Download product information and reviews from Amazon.com (2016; get_reviews() no longer worked as of 6/7/2023)

sudo apt-get install libxml2-dev
sudo apt-get install libcurl4-openssl-dev

and in R

install.packages("devtools")
install.packages("XML")
install.packages("pbapply")
install.packages("dplyr")
devtools::install_github("56north/Rmazon")
product_info <- Rmazon::get_product_info("1593273843")
reviews <- Rmazon::get_reviews("1593273843")
reviews[1,6] # only show partial characters from the 1st review
nchar(reviews[1,6])
as.character(reviews[1,6]) # show the complete text from the 1st review

reviews <- Rmazon::get_reviews("B07BNGJXGS")
# Fetching 30 reviews of 'BOOX Note Ereader,Android 6.0 32 GB 10.3" Dual Touch HD Display'
#   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 02s
reviews
# A tibble: 30 x 6
   reviewRating reviewDate reviewFormat Verified_Purcha… reviewHeadline
          <dbl> <chr>      <lgl>        <lgl>            <chr>         
 1            4 May 23, 2… NA           TRUE             Good for PDF …
 2            3 May 8, 20… NA           FALSE            The reading s…
 3            5 May 17, 2… NA           TRUE             E-reader and …
 4            3 May 24, 2… NA           TRUE             Good hardware…
 5            3 June 21, … NA           TRUE             Poor QC       
 6            5 August 5,… NA           TRUE             Excellent for…
 7            5 May 31, 2… NA           TRUE             Especially li…
 8            5 July 4, 2… NA           TRUE             Android 6 rea…
 9            4 July 15, … NA           TRUE             Remember the …
10            4 June 9, 2… NA           TRUE             Overall fanta…
# ... with 20 more rows, and 1 more variable: reviewText <chr>
reviews[1, 6] # 6-th column is the review text

Vignette: Scraping Amazon Reviews in R 2019

library(rvest)
library(dplyr)

url <- "https://www.amazon.com/dp/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
page <- read_html(url)

reviews <- page %>%
  html_nodes(".review") %>%
  html_text()

titles <- page %>%
  html_nodes(".review-title-content") %>%
  html_text()

ratings <- page %>%
  html_nodes(".review-rating") %>%
  html_text()

df <- data.frame(reviews, titles, ratings)

Scraping Amazon Customer Reviews (almost). The url is obtained by clicking "See all reviews", like this one.

Google Bard answers: first 10 reviews.

library(rvest)

# Get the product ASIN code
asin <- "B07HSKPDBV"

# Create a URL for the product reviews page
url <- paste0("https://www.amazon.com/product-reviews/", asin)

# Read the HTML content of the product reviews page
html <- read_html(url)

# Extract the review bodies. The data.frame() step in Bard's answer did not
# run as written; titles and ratings are not part of the ".review-text" nodes.
review_text <- html %>%
  html_nodes(".review-text") %>%
  html_text(trim = TRUE)  # so far is OK
reviews <- data.frame(body = review_text, stringsAsFactors = FALSE)

# Print the first 5 reviews
head(reviews, 5)

All reviews.

library(rvest)
library(purrr)  # map(), map_chr()

# Get the product ASIN code
asin <- "B07HSKPDBV"

# Create a function to scrape the review bodies for a single page
# (titles and ratings are not part of the ".review-text" nodes, so the
#  data.frame() step in Bard's answer did not run as written)
scrape_reviews <- function(url) {
  html <- read_html(url)
  review_text <- html %>%
    html_nodes(".review-text") %>%
    html_text(trim = TRUE)
  data.frame(body = review_text, stringsAsFactors = FALSE)
}

# Create a vector of URLs for the review pages
# (pageNumber values 1, 11, ..., 91 are kept from the original answer)
urls <- seq(from = 1, to = 100, by = 10) %>%
  map_chr(function(x) paste0("https://www.amazon.com/product-reviews/", asin, "?pageNumber=", x))

# Scrape the reviews for all of the pages
reviews <- urls %>%
  map(scrape_reviews) %>%
  do.call(rbind, .)

# Print the first 5 reviews
head(reviews, 5)

gutenbergr

Edinbr: Text Mining with R

Feed

feedeR - Feed Reader Package for R

File sharing

{filebin} Quick & Easy File Sharing

Twitter

Faces of #rstats Twitter

OCR

Online

Wikipedia

WikipediR: R's MediaWiki API client library