R web: Difference between revisions
= [https://www.rplumber.io/ plumber]: Turning your R code into a RESTful Web API =
Notes: (1) plumber is heavy to install; (2) I got a dependency error when trying it on Ubuntu 16.04; (3) check out the '''servr''' package.
* https://github.com/trestletech/plumber
* [https://blog.learningtree.com/creating-web-service-in-r/ Creating a Web Service in R]
* https://www.rstudio.com/resources/videos/plumber-turning-your-r-code-into-an-api/
* [https://blog.rstudio.com/2018/10/23/rstudio-1-2-preview-plumber-integration/ RStudio 1.2 Preview: Plumber Integration]
* [https://www.statworx.com/de/blog/running-your-r-script-in-docker/ Running your R script in Docker]. Goal: containerizing an R script to eventually execute it automatically each time the container is started, without any user interaction. An enhanced version of the instruction is at [https://github.com/arraytools/RinDocker this page].
= [http://cran.r-project.org/web/packages/httpuv/index.html httpuv] and servr =
'''httpuv''' is more low-level and flexible, while '''servr''' is higher-level and easier to use for specific tasks.
See also the [https://cran.r-project.org/web/packages/servr/index.html servr] package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.
[https://stackoverflow.com/a/59691029 R built in Web server], [https://www.r-bloggers.com/2023/09/3-r-functions-that-i-enjoy/ servr::httw() to serve a local directory as a website]
<pre> | |||
servr::httw("DIRECTORY") | |||
</pre> | |||
<pre> | |||
Rscript -e "servr::httd()" # default port 4321 | |||
# If index.html was found it will be used | |||
# Otherwise, it will serve files under the cur dir | |||
# Open another terminal | |||
Rscript -e "servr::httd('/tmp')" -p4000 | |||
</pre> | |||
Question: Why not just open an HTML file in a browser? Answer: Opening the file directly is fine for simple, static pages, but a local server such as '''servr''' gives a more accurate test environment for anything more complex:
* '''Links and resources''': Links that start with a slash (root-relative links) resolve against the filesystem root when a page is opened via file://, so they usually break. Serving the directory over HTTP makes that directory the site root, so these links resolve the way they will after deployment.
* '''Browser restrictions''': Browsers restrict pages loaded from file://. For example, JavaScript requests for other local files (such as a search index or JSON data) are typically blocked. Served over http://, these features can be tested locally before deploying the site.
* '''Mimic the production environment''': Serving over HTTP is closer to how the site will be deployed, which helps catch issues that do not show up when opening a file directly. Note that '''servr''' serves static files (and can rebuild dynamic documents such as R Markdown); it does not execute server-side code such as PHP or ASP.NET, so those parts still need the appropriate server.
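The command-line calls above can also be run from inside an R session. The following is a minimal sketch, assuming the servr package is installed; the temporary directory and the page content are made up for illustration. It starts the server as a daemon so the R prompt stays usable, then stops it again.

```r
# Assumes the servr package is installed; directory and page content are placeholders.
site_dir <- file.path(tempdir(), "site")
dir.create(site_dir, showWarnings = FALSE)
writeLines("<html><body><h1>Hello from servr</h1></body></html>",
           file.path(site_dir, "index.html"))

# daemon = TRUE returns control to the R session;
# browser = FALSE avoids opening a browser window.
servr::httd(site_dir, daemon = TRUE, browser = FALSE)

# ... open the URL printed by httd() while the server is running ...

# Stop the daemonized server(s) started in this session.
servr::daemon_stop()
```

Running the server as a daemon is convenient when editing files and checking the result from the same session.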
== beakr == | |||
* [https://working-with-data.mazamascience.com/2020/10/30/beakr-a-small-web-framework-for-r/ What is beakr?] | |||
* [https://working-with-data.mazamascience.com/2020/11/11/web-frameworks-for-r-a-brief-overview/ Web Frameworks for R – A Brief Overview] and hello world examples. | |||
== workflowr == | |||
= httr2 = | |||
[https://cran.r-project.org/web//packages/httr2/index.html httr2] package - Perform HTTP Requests and Process the Responses. | |||
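A minimal sketch of how a request is assembled with httr2, assuming the package is installed. The URL, query values and user-agent string are placeholders, and the request is only built, not sent, so the example runs without network access.

```r
library(httr2)

# Build a request step by step; nothing is sent yet.
req <- request("https://httpbin.org/get")          # placeholder URL
req <- req_url_query(req, q = "rvest", page = 1)   # adds ?q=rvest&page=1
req <- req_user_agent(req, "r-web-notes-example")  # placeholder user agent
req <- req_headers(req, Accept = "application/json")

# Inspect the URL that would be requested.
print(req$url)

# To actually perform the call (needs network access):
# resp <- req_perform(req)
# resp_status(resp)
# str(resp_body_json(resp))
```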
== httptest2 == | |||
* [https://cran.r-project.org/web//packages/httptest2/index.html httptest2] - Test Helpers for 'httr2' | |||
* [https://books.ropensci.org/http-testing/ HTTP testing in R], [https://www.r-consortium.org/blog/2023/05/15/better-understanding-your-tools-choices-online-book-http-testing-r Better Understanding Your Tools Choices with Online Book HTTP Testing in R] | |||
= opencpu = | |||
* https://github.com/opencpu/opencpu | |||
* [https://github.com/kdpsingh/rjs rjs]: R in JavaScript | |||
= [http://rapache.net/ RApache] =
= Dropbox access =
[https://cran.r-project.org/web/packages/rdrop2/index.html rdrop2] package
= Javascript =
[https://paulvanderlaken.com/2020/12/01/javascript-for-r-ebook/ JavaScript for R — ebook] https://book.javascript-for-r.com/
== Sketch ==
[https://www.r-consortium.org/blog/2023/04/26/sketch-package-looks-to-add-javascript-to-r-packages Sketch Package looks to add JavaScript to R packages]
= Web page scraping =
rvest package depends on xml2.
== [https://cran.r-project.org/web/packages/rvest/index.html rvest], extract tables ==
<ul>
<li>[http://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/ Easy web scraping with R]
<li>[https://twitter.com/rlbarter/status/1646267698857541632 Blown away that web scraping in R with the rvest package is as easy as]
<pre>
library(rvest)
page <- read_html("https://en.m.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
tables <- html_table(page)
tables[[2]]
# A tibble: 243 × 7
Rank `Country / Dependency` Population Population Date Source (official or …¹ Notes | |||
<chr> <chr> <chr> <chr> <chr> <chr> <chr> | |||
1 Rank Country / Dependency Numbers % of the … Date Source (official or f… "Not… | |||
2 – World 8,026,177,… 100% 16 A… UN projection[3] "" | |||
3 1 China 1,411,750,… 17.6% 31 D… Official estimate[4] "[b]" | |||
4 2 India 1,392,329,… 17.3% 1 Ma… Official projection[5] "[c]… | |||
5 3 United States 334,631,000 4.17% 16 A… National population c… "[d]" | |||
6 4 Indonesia 275,773,800 3.44% 1 Ju… Official estimate[7] "" | |||
7 5 Pakistan 235,825,000 2.94% 1 Ju… UN projection[3] "[e]" | |||
8 6 Nigeria 218,541,000 2.72% 1 Ju… UN projection[3] "" | |||
9 7 Brazil 216,024,545 2.69% 16 A… National population c… "" | |||
10 8 Bangladesh 169,828,911 2.12% 15 J… 2022 final census res… "" | |||
# ℹ 233 more rows | |||
# ℹ abbreviated name: ¹`Source (official or from the United Nations)` | |||
# ℹ Use `print(n = ...)` to see more rows | |||
</pre> | |||
<li>On Ubuntu, we need to install two packages first!
<syntaxhighlight lang="sh">
sudo apt-get install libcurl4-openssl-dev # OR libcurl4-gnutls-dev
sudo apt-get install libxml2-dev
</syntaxhighlight>
<li>https://github.com/tidyverse/rvest, [https://github.com/tidyverse/rvest/blob/master/NEWS.md NEWS]
<li>[http://datascienceplus.com/visualizing-obesity-across-united-states-by-using-data-from-wikipedia/ Visualizing obesity across United States by using data from Wikipedia] | |||
<li>[https://stat4701.github.io/edav/2015/04/02/rvest_tutorial/ rvest tutorial: scraping the web using R] | |||
<li>https://renkun.me/pipeR-tutorial/Examples/rvest.html | |||
<li>http://zevross.com/blog/2015/05/19/scrape-website-data-with-the-new-r-package-rvest/ | |||
<li>[https://datascienceplus.com/google-scholar-scraping-with-rvest/ Google scholar scraping with rvest package] | |||
<li>[https://www.radmuzom.com/2020/05/03/an-update-to-an-adventure-in-downloading-books/ An update to "An adventure in downloading books"] | |||
<li>[https://www.r-bloggers.com/2024/09/how-to-webscrape-in-r/ How to webscrape in R?] | |||
</ul> | |||
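The table-extraction idea can be tried offline. This is a small sketch, assuming rvest is installed; the HTML snippet is made up for illustration and stands in for a downloaded page.

```r
library(rvest)

# A tiny in-memory page instead of a live URL (content is illustrative only).
page <- minimal_html('
  <table>
    <tr><th>Country</th><th>Population</th></tr>
    <tr><td>China</td><td>1411750000</td></tr>
    <tr><td>India</td><td>1392329000</td></tr>
  </table>')

tables <- html_table(page)  # a list with one tibble per <table> element
pop <- tables[[1]]
print(pop)
```

On a real page, the index of the table of interest (such as the second table in the Wikipedia example) has to be found by inspecting the returned list.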
== Reading Remote Data Files: rvest == | |||
[https://kieranhealy.org/blog/archives/2023/03/25/reading-remote-data-files/ Reading Remote Data Files] | |||
== [https://cran.r-project.org/web/packages/V8/index.html V8]: Embedded JavaScript Engine for R ==
== Get API data ==
[https://youtu.be/tlaJf0CHbFE How to get API data with R]. See how to write your own R code to pull data from an API using API key authentication.
== webshot2 - take a screenshot of web page ==
[https://rstudio.github.io/webshot2/ webshot2]. It uses headless Chrome via the Chromote package. You also need to have the Chrome browser installed on your system. You can also use other browsers based on Chromium, such as Chromium itself, Edge, Vivaldi, Brave, or Opera.
= These R packages import sports, weather, stock data and more =
* https://cran.r-project.org/web/packages/rnoaa/index.html. A personal API key (token) is required. 10,000 requests per day
* <strike>http://ram-n.github.io/weatherData/ </strike> (not working)
* [https://datawookie.dev/blog/2022/08/historical-weather-data/ Historical Weather Data]
= Diving Into Dynamic Website Content with splashr =
* [https://www.rdocumentation.org/packages/curl/versions/4.2/topics/send_mail curl::send_mail()].
* [https://petermeissner.de/blog/2020/09/07/web-send-mail-windows/ R Internet: Yet Another Way To Send Emails On Windows/Sending Emails with {curl} and Docker]
* [https://blog.edmdesigner.com/send-email-from-linux-command-line/ 16 Command Examples to Send Email From The Linux Command Line]
* [https://medium.com/geekculture/5-extra-uses-for-curl-that-dont-involve-web-requests-6780a345877f 5 Extra Uses For Curl That Don’t Involve Web Requests]. Send an email, check for open ports, upload files using TFTP, resume downloads, download files from an SMB share.
Note
* As the example snippet shows, the sender's and recipient's email addresses each appear twice.
* It seems the '''recipients''' and '''sender''' parts are the ones actually used.
* In the '''message''' string, we can skip the 'From' and 'To' lines.
** The sender's email entered in the message is not used.
** The recipient's email entered in the message is not used.
* Experiment: '''recipients''' and '''sender''' are correct, but the two email addresses in the message are wrong.
** Answer: I still receive the email.
** (In the recipient's inbox) the email address in the 'From' field is replaced with the correct one, but the name (e.g. "R (curl package)") is the one used in the message.
** (In the recipient's inbox) the email address and the name in the 'To' field are the ones used in the message (interesting).
* Question: does the sender's email have to be gmail.com?
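To make the notes above concrete, here is a minimal sketch of a curl::send_mail() call, assuming the curl package is installed and a Gmail SMTP account is used. The addresses are placeholders and send_example() is a hypothetical wrapper, so nothing is sent until it is called with a real password. The addresses passed as mail_from and mail_rcpt are the ones that route the mail, while the 'From' and 'To' lines inside the message are only header text.

```r
sender    <- "sender@example.com"     # placeholder address
recipient <- "recipient@example.com"  # placeholder address

# The message is a raw string: header lines, a blank line, then the body.
msg <- paste(
  sprintf("From: \"R (curl package)\" <%s>", sender),
  sprintf("To: \"Recipient\" <%s>", recipient),
  "Subject: Hello from R",
  "",
  "This is the message body.",
  sep = "\r\n")

# Hypothetical wrapper: call it with an app password to actually send.
send_example <- function(password) {
  curl::send_mail(mail_from = sender, mail_rcpt = recipient, message = msg,
                  smtp_server = "smtps://smtp.gmail.com",
                  username = sender, password = password)
}
```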
=== emayili ===
<ul> | |||
<li>https://cran.r-project.org/web/packages/emayili/index.html | |||
<li>[https://datawookie.netlify.com/blog/2019/05/emayili-sending-email-from-r/ emayili: Sending Email from R]. This shows how to send emails from the terminal command line tool '''curl''' if we don't want to use '''R''' to do the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part. | |||
<li>You can add Cc, Bcc and Reply-To header fields using the '''cc()''', '''bcc()''' and '''reply()''' methods. Files can be attached using the '''attachment()''' method. | |||
<li>Note that if the email in envelope(From) is different from the one in server(username), the email in envelope(From) is ignored. So the following simplified example skips envelope(From).
<li>If I change the 'to' email to a yahoo account, it works but the email went to the Trash folder. | |||
<li>[https://datawookie.dev/blog/2021/09/emayili-rendering-r-markdown/ Rendering R Markdown] | |||
<pre> | |||
library(emayili) | |||
email <- envelope( | |||
to = "[email protected]", | |||
subject = "This is a plain text message!", | |||
text = "Hello!\nHello2" | |||
) | |||
smtp <- server(host = "smtp.gmail.com", | |||
port = 465, | |||
username = "[email protected]", | |||
password = "bd40ef6d4a9413de9c1318a65cbae5d7") | |||
smtp(email, verbose = TRUE) | |||
</pre> | |||
<li>[https://www.r-bloggers.com/2024/06/creating-email-threads/ Creating Email Threads] | |||
</ul> | |||
== [https://cran.r-project.org/web/packages/blastula/index.html blastula] (RStudio) ==
== [https://cran.r-project.org/web/packages/sendmailR/index.html sendmailR] ==
sendmailR provides a simple SMTP client. It is not clear how to use the package (i.e. where to enter the password).
= json = | |||
== jsonlite == | |||
* [https://www.dataenq.com/2020/09/reading-json-file-web-data-frame.html Reading JSON file from web and preparing data for analysis] | |||
* [https://blog.ephorie.de/xkcd-comics-as-a-minimal-example-for-calling-apis-downloading-files-and-displaying-png-images-with-r xkcd Comics as a Minimal Example for Calling APIs, Downloading Files and Displaying PNG Images with R] | |||
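A short sketch of the basic jsonlite workflow, assuming the package is installed. The JSON string is made up for illustration; fromJSON() also accepts a URL or file path in place of the string.

```r
library(jsonlite)

# Illustrative JSON: an array of records, one with a nested array field.
json <- '[{"id": 1, "title": "first",  "tags": ["a", "b"]},
          {"id": 2, "title": "second", "tags": ["c"]}]'

df <- fromJSON(json)  # an array of objects becomes a data frame
str(df)               # "tags" is kept as a list column

# Convert back to JSON text.
cat(toJSON(df, pretty = TRUE), "\n")
```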
== [http://cran.r-project.org/web/packages/rjson/index.html rjson] == | |||
http://heuristically.wordpress.com/2013/05/20/geolocate-ip-addresses-in-r/ | |||
== [http://cran.r-project.org/web/packages/RJSONIO/index.html RJSONIO] == | |||
=== Accessing Bitcoin Data with R === | |||
http://blog.revolutionanalytics.com/2015/11/accessing-bitcoin-data-with-r.html | |||
=== Plot IP on google map === | |||
* http://thebiobucket.blogspot.com/2011/12/some-fun-with-googlevis-plotting-blog.html#more (RCurl, RJSONIO, plyr, googleVis)
* http://devblog.icans-gmbh.com/using-the-maxmind-geoip-api-with-r/ (RCurl, RJSONIO, maps)
* http://cran.r-project.org/web/packages/geoPlot/index.html (geoPlot package (deprecated as of 8/12/2013))
* http://archive09.linux.com/feature/135384 (Not R) ApacheMap
* http://batchgeo.com/features/geolocation-ip-lookup/ (Not R) (Enter a spreadsheet of address, city, zip or a column of IPs and it will show the location on Google Maps)
* http://code.google.com/p/apachegeomap/
The following example is modified from the first link in the list above.
{{Pre}} | |||
require(RJSONIO) # fromJSON | |||
require(RCurl) # getURL | |||
temp = getURL("https://gist.github.com/arraytools/6743826/raw/23c8b0bc4b8f0d1bfe1c2fad985ca2e091aeb916/ip.txt", | |||
ssl.verifypeer = FALSE) | |||
ip <- read.table(textConnection(temp), as.is=TRUE) | |||
names(ip) <- "IP" | |||
nr = nrow(ip) | |||
Lon <- as.numeric(rep(NA, nr)) | |||
Lat <- Lon | |||
Coords <- data.frame(Lon, Lat) | |||
ip2coordinates <- function(ip) { | |||
api <- "http://freegeoip.net/json/" | |||
get.ips <- getURL(paste(api, URLencode(ip), sep="")) | |||
# result <- ldply(fromJSON(get.ips), data.frame) | |||
result <- data.frame(fromJSON(get.ips)) | |||
names(result)[1] <- "ip.address" | |||
return(result) | |||
} | |||
for (i in 1:nr){ | |||
cat(i, "\n") | |||
try( | |||
Coords[i, 1:2] <- ip2coordinates(ip$IP[i])[c("longitude", "latitude")] | |||
) | |||
} | |||
# append to log-file: | |||
logfile <- data.frame(ip, Lat = Coords$Lat, Long = Coords$Lon, | |||
LatLong = paste(round(Coords$Lat, 1), round(Coords$Lon, 1), sep = ":")) | |||
log_gmap <- logfile[!is.na(logfile$Lat), ] | |||
require(googleVis) # gvisMap | |||
gmap <- gvisMap(log_gmap, "LatLong", | |||
options = list(showTip = TRUE, enableScrollWheel = TRUE, | |||
mapType = 'hybrid', useMapTypeControl = TRUE, | |||
width = 1024, height = 800)) | |||
plot(gmap) | |||
</pre> | |||
[[:File:GoogleVis.png]] | |||
The plot.gvis() method in the googleVis package also illustrates the startDynamicHelp() function in the tools package, which is used to launch an HTTP server. See
[http://jeffreyhorner.tumblr.com/page/3 Jeffrey Horner's note about deploying Rook App].
== Convert JSON to CSV using Linux shell == | |||
[https://www.cyberciti.biz/faq/how-to-convert-json-to-csv-using-linux-unix-shell/ How to convert JSON to CSV using Linux / Unix shell] | |||
= [http://www.ncbi.nlm.nih.gov/geo/ GEO (Gene Expression Omnibus)] =
= Interactive html output =
== webr == | |||
* [https://r.iresmi.net/posts/2024/webr/index.html Playing with webr] R in your browser. | |||
* https://github.com/coatless/quarto-webr?tab=readme-ov-file | |||
* [https://nrennie.rbind.io/blog/webr-shiny-tidytuesday/ A webR powered Shiny app for browsing TidyTuesday plots] | |||
== [http://cran.r-project.org/web/packages/sendplot/index.html sendplot] ==
== [http://cran.r-project.org/web/packages/RIGHT/index.html RIGHT] ==
The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.
* [http://deanattali.com/blog/htmlwidgets-tips/ How to write a useful htmlwidgets in R: tips and walk-through a real example]
== igraph ==
<ul>
<li>https://cran.r-project.org/web/packages/igraph/index.html | |||
* On Ubuntu, run <syntaxhighlight lang='sh' inline>apt install libglpk-dev </syntaxhighlight> | |||
<li>[https://www.geeksforgeeks.org/creating-an-igraph-object-in-r/ Creating an igraph object in R] | |||
<li>[https://robwiederstein.github.io/network_analysis/igraph.html Network Analysis in R] | |||
<li>[https://shiring.github.io/genome/2016/12/14/homologous_genes_part2_post creating directed networks with igraph] | |||
<li>Tips | |||
* [https://stackoverflow.com/a/14400780 Can we vary the text size along with node size in R-igraph?] | |||
* [https://www.reddit.com/r/Rlanguage/comments/yv8jjg/igraph_how_do_i_make_the_font_smaller_to_fit/ igraph: How do I make the font smaller to fit inside of the node so that it's readable?] | |||
* [https://stackoverflow.com/a/38452176 Add legend in igraph to annotate difference vertices size] | |||
* Compare two graphs: https://igraph.org/c/doc/igraph-Isomorphism.html. In simple terms, two graphs are isomorphic if they become indistinguishable from each other once their vertex labels are removed. | |||
<pre>
library(igraph)
g1 <- make_ring(10)
g2 <- make_ring(10) | |||
all.equal(g1, g2) # FALSE, checking nearly equal | |||
identical(g1, g2) # FALSE | |||
# Check if the graphs are isomorphic | |||
isomorphic(g1, g2) # TRUE | |||
</pre> | |||
<li>Extract coordinates: there are different layouts. The default is layout.auto(). | |||
<pre>
library(igraph)
data(karate, package="igraphdata")
G <- upgrade_graph(karate) | |||
plot(G) # same as | |||
plot(G, layout = layout_nicely(G)) | |||
plot(G, layout = layout.fruchterman.reingold(G)) | |||
plot(G, layout = layout.circle(G)) # not good if there are too many vertices | |||
plot(G, layout = layout.sphere(G)) # complicated | |||
plot(G, layout = layout.random(G)) # complicated | |||
L <- layout.fruchterman.reingold(G) | |||
dim(L) # 34 2 | |||
</pre> | |||
<li>Message when I use load() to load igraph objects created from igraph version 1.4.2 in R with igraph 2.0.3. | |||
<pre> | |||
This graph was created by an old(er) igraph version. | |||
Call upgrade_graph() on it to use with the current igraph version | |||
For now we convert it on the fly... | |||
</pre> | |||
</ul> | |||
=== visNetwork === | |||
* https://cran.r-project.org/web/packages/visNetwork/index.html | |||
* [https://www.youtube.com/watch?v=hgUJ-UFv4YY Create Interactive networks using R programming] | |||
* It is used by [https://bioconductor.org/packages/release/bioc/html/FELLA.html FELLA::launchApp()]. | |||
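A minimal sketch of a visNetwork graph, assuming the package is installed. The node and edge data are made up; the package expects a nodes data frame with an id column and an edges data frame with from and to columns.

```r
library(visNetwork)

# Illustrative data: four nodes and four edges.
nodes <- data.frame(id = 1:4, label = c("A", "B", "C", "D"))
edges <- data.frame(from = c(1, 1, 2, 3), to = c(2, 3, 4, 4))

net <- visNetwork(nodes, edges)
net <- visOptions(net, highlightNearest = TRUE)  # highlight neighbours on click

# In an interactive session, printing `net` shows the widget in the viewer;
# htmlwidgets::saveWidget(net, "network.html") would write it to a file.
```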
=== [http://cran.r-project.org/web/packages/networkD3/index.html networkD3] === | |||
* This is a port of Christopher Gandrud's [http://christophergandrud.github.io/d3Network/ d3Network] package to the htmlwidgets framework. | |||
* [https://datasandbox.netlify.app/posts/2022-07-12-network-graphs-in-r/index.en.html Network Graphs in R] from ''The Data Sandbox''. | |||
=== plotly === | |||
* [https://minimaxir.com/notebooks/interactive-network/ How to Create an Interactive WebGL Network Graph Using R and Plotly] | |||
* A working example | |||
<syntaxhighlight lang='r'> | |||
library(igraph) | |||
library(ggplot2) | |||
library(plotly) | |||
# graph_from_data_frame() treats the first two columns as edge endpoints,
# so this yields 3 edges (Alice-48, Bob-33, Cecil-45) and 6 vertices
dat <- data.frame(name=c("Alice", "Bob", "Cecil"), age=c(48,33,45))
# Step 1: Create an igraph object | |||
g <- graph_from_data_frame(dat, directed = FALSE) | |||
# Get the layout of the graph | |||
layout <- layout_nicely(g) | |||
# Create a data frame for the vertices | |||
vertices <- data.frame(id = V(g)$name, x = layout[,1], y = layout[,2], | |||
var1=V(g)$name, var2=LETTERS[1:6]) | |||
# Get the edges and convert vertex names to coordinates | |||
edges <- get.data.frame(g, what = "edges") | |||
edges <- merge(edges, vertices, by.x = "from", by.y = "id") | |||
edges <- merge(edges, vertices, by.x = "to", by.y = "id", suffixes = c(".from", ".to")) | |||
# Step 2: Create a ggplot object | |||
p <- ggplot(vertices, aes(x = x, y = y)) + | |||
geom_segment(data = edges, aes(x = x.from, y = y.from, xend = x.to, yend = y.to)) + | |||
geom_point(aes(text = paste("Var1:", var1, "\nVar2:", var2))) + | |||
geom_text(aes(x = x, y = y, label = id), vjust = 1, hjust = 0, nudge_x=.2) + | |||
expand_limits(x = c(-2, 1.5)) + | |||
theme( | |||
axis.line = element_blank(), # Hide axis lines | |||
axis.text = element_blank(), # Hide axis text | |||
axis.ticks = element_blank(), # Hide axis ticks | |||
axis.title = element_blank(), # Hide axis labels | |||
panel.grid.major = element_blank(), # Hide major grid | |||
panel.grid.minor = element_blank(), # Hide minor grid | |||
panel.background = element_rect(fill = "white") # Set background to white | |||
) | |||
# Print the plot | |||
print(p) | |||
# Step 3: Convert the ggplot object to a plotly object
# Tooltip works only on 'points', not on labels. | |||
ggplotly(p, tooltip = "text") | |||
</syntaxhighlight> | |||
=== ggiraph ===
[https://stackoverflow.com/a/64959301 Convert ggraph to interactive plot with plotly or Network3D]
== [http://cran.r-project.org/web/packages/scatterD3/index.html scatterD3] ==
== collapsibleTree ==
* https://github.com/adeelk93/collapsibletree
* https://cran.r-project.org/web/packages/jsTreeR/index.html
== [https://cran.r-project.org/web/packages/svgPanZoom/index.html svgPanZoom] ==
== plotly ==
<ul> | |||
<li>[https://plotly.com/r/3d-scatter-plots/ How to make interactive 3D scatter plots], https://plotly.com/r/reference/#scatter-mode | |||
{{Pre}}
library(plotly)
mtcars$am[which(mtcars$am == 0)] <- 'Automatic'
mtcars$am[which(mtcars$am == 1)] <- 'Manual' | |||
mtcars$am <- as.factor(mtcars$am) | |||
fig <- plot_ly(mtcars, x = ~wt, y = ~hp, z = ~qsec, color = ~am, colors = c('#BF382A', '#0C4B8E')) | |||
fig <- fig %>% add_markers() | |||
fig <- fig %>% layout(scene = list(xaxis = list(title = 'Weight'), | |||
yaxis = list(title = 'Gross horsepower'), | |||
zaxis = list(title = '1/4 mile time'))) | |||
fig | |||
</pre> | |||
[[File:Plotly3d.png|350px]] | |||
{{Pre}} | |||
x <- rnorm(100); y <- rnorm(100) ; z <- rnorm(100) | |||
labels <- paste("point", 1:100) | |||
w <- sample(scales::hue_pal()(3), 100, replace=T) # use ggplot2's default color scheme | |||
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers", marker = list(color = z, colorscale ="Viridis")) | |||
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers", marker = list(color = w)) | |||
</pre> | |||
<li>[https://community.rstudio.com/t/how-to-add-more-data-to-tooltip-in-r-plotly-besides-the-text-argument/91971/2 How to add more data to tooltip in R plotly (besides the `text` argument)] </li> | |||
<li>[https://plotly.com/r/line-and-scatter/ Data Labels on Hover] (Hover text), [https://plotly.com/r/hover-text-and-formatting/ Hover Text and Formatting in R] </li> | |||
<li>[https://plotly.com/ggplot2/hover-text-and-formatting/ Hover Text and Formatting in ggplot2] </li> | |||
<pre> | |||
# df is a data frame with columns var1, var2 and var3
gobj <- ggplot(df, aes(x=var1, y=var2, text = paste0("another: ", var3))) + geom_jitter()
ggplotly(gobj) # var3 will be shown on the tooltip/hover text
</pre> | |||
</ul> | |||
* [http://moderndata.plot.ly/power-curves-r-plotly-ggplot2/ Power curves] and ggplot2.
* [http://moderndata.plot.ly/time-series-charts-by-the-economist-in-r-using-plotly/ TIME SERIES CHARTS BY THE ECONOMIST IN R USING PLOTLY] & [https://moderndata.plot.ly/interactive-r-visualizations-with-d3-ggplot2-rstudio/ FIVE INTERACTIVE R VISUALIZATIONS WITH D3, GGPLOT2, & RSTUDIO]
* [https://www.displayr.com/how-to-add-trend-lines-in-r-using-plotly/?utm_medium=Feed&utm_source=Syndication How to add Trend Lines in R Using Plotly]
* [https://blog.methodsconsultants.com/posts/introduction-to-interactive-graphics-in-r-with-plotly/ Introduction to Interactive Graphics in R with plotly]
* [https://plotly-r.com/ Interactive web-based data visualization with R, plotly, and shiny] (ebook) by Carson Sievert
** [https://plotly-r.com/controlling-tooltips.html#tooltip-text-ggplotly Tooltip], [https://stackoverflow.com/a/43571726 Formatting mouse over labels in plotly when using ggplotly] | |||
**[https://plotly.com/r/3d-scatter-plots/ 3D Scatter Plots], [https://shirinsplayground.netlify.app/2021/03/kmeans_101/ k-Means 101: An introductory guide to k-Means clustering in R]. Note the 3D plot is displayed on a browser. | |||
* Save interactive plots as HTML
:<syntaxhighlight lang='rsplus'>
p <- plotly::ggplotly(b)  # b is an existing ggplot object
htmlwidgets::saveWidget(p, "index.html")
</syntaxhighlight>
* Use plotly in shiny: [https://rdrr.io/cran/plotly/man/plotly-shiny.html plotlyOutput()]
== highcharter: alternative to plotly ==
https://cran.r-project.org/web/packages/highcharter/ Good for time series plot.
= Amazon =
[https://github.com/56north/Rmazon Download product information and reviews from Amazon.com] (2016, not working in get_reviews() as of 6/7/2023)
{{Pre}}
sudo apt-get install libxml2-dev
# ... with 20 more rows, and 1 more variable: reviewText <chr>
reviews[1, 6] # 6-th column is the review text
</pre> | |||
[https://martinctc.github.io/blog/vignette-scraping-amazon-reviews-in-r/ Vignette: Scraping Amazon Reviews in R] 2019 | |||
<pre> | |||
library(rvest) | |||
library(dplyr) | |||
url <- "https://www.amazon.com/dp/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews" | |||
page <- read_html(url) | |||
reviews <- page %>% | |||
html_nodes(".review") %>% | |||
html_text() | |||
titles <- page %>% | |||
html_nodes(".review-title-content") %>% | |||
html_text() | |||
ratings <- page %>% | |||
html_nodes(".review-rating") %>% | |||
html_text() | |||
df <- data.frame(reviews, titles, ratings) | |||
</pre> | |||
[https://stackoverflow.com/a/42656204 Scraping Amazon Customer Reviews] (almost). The url is obtained by clicking "See all reviews", like [https://www.amazon.com/Art-Programming-Statistical-Software-Design/product-reviews/1593273843/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews this one]. | |||
Google Bard answers: first 10 reviews. | |||
<pre> | |||
library(rvest) | |||
# Get the product ASIN code | |||
asin <- "B07HSKPDBV" | |||
# Create a URL for the product reviews page | |||
url <- paste0("https://www.amazon.com/product-reviews/", asin) | |||
# Read the HTML content of the product reviews page | |||
html <- read_html(url) | |||
# Extract the review titles, bodies, and ratings | |||
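nodes <- html %>% html_nodes(".review") | |||
# One row per review; class names follow the examples above and may change on Amazon's side | |||
reviews <- data.frame( | |||
title = nodes %>% html_node(".review-title-content") %>% html_text(trim = TRUE), | |||
body = nodes %>% html_node(".review-text") %>% html_text(trim = TRUE), | |||
rating = nodes %>% html_node(".review-rating") %>% html_text(trim = TRUE) | |||
) | |||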
# Print the first 5 reviews | |||
head(reviews, 5) | |||
</pre> | |||
All reviews. | |||
<pre> | |||
library(rvest) | |||
library(purrr) # map(), map_chr() | |||
# Get the product ASIN code | |||
asin <- "B07HSKPDBV" | |||
# Create a function to scrape the reviews for a single page | |||
scrape_reviews <- function(url) { | |||
html <- read_html(url) | |||
nodes <- html %>% html_nodes(".review") | |||
# One row per review; class names follow the examples above and may change on Amazon's side | |||
data.frame( | |||
title = nodes %>% html_node(".review-title-content") %>% html_text(trim = TRUE), | |||
body = nodes %>% html_node(".review-text") %>% html_text(trim = TRUE), | |||
rating = nodes %>% html_node(".review-rating") %>% html_text(trim = TRUE) | |||
) | |||
} | |||
# Create a vector of URLs for the first 10 review pages (pageNumber counts pages, not reviews) | |||
urls <- seq_len(10) %>% | |||
map_chr(function(x) paste0("https://www.amazon.com/product-reviews/", asin, "?pageNumber=", x)) | |||
# Scrape the reviews for all of the pages | |||
reviews <- urls %>% | |||
map(scrape_reviews) %>% | |||
do.call(rbind, .) | |||
# Print the first 5 reviews | |||
head(reviews, 5) | |||
</pre> | </pre> | ||
Line 470: | Line 882: | ||
= Feed = | = Feed = | ||
[https://github.com/datawookie/feeder feedeR] - Feed Reader Package for R | [https://github.com/datawookie/feeder feedeR] - Feed Reader Package for R | ||
= File sharing = | |||
[https://datawookie.dev/blog/2021/11/filebin-quick-easy-file-sharing/ {filebin} Quick & Easy File Sharing] | |||
= Twitter = | = Twitter = | ||
Line 475: | Line 890: | ||
= OCR = | = OCR = | ||
* [http://ropensci.org/blog/blog/2016/11/16/tesseract Tesseract package: High Quality OCR in R], [https://www.r-bloggers.com/how-to-do-optical-character-recognition-ocr-of-non-english-documents-in-r-using-tesseract/ How to do Optical Character Recognition (OCR) of non-English documents in R using Tesseract?] | * [https://cran.r-project.org/web/packages/tesseract/vignettes/intro.html Using the Tesseract OCR engine in R] | ||
** [http://ropensci.org/blog/blog/2016/11/16/tesseract Tesseract package: High Quality OCR in R], [https://www.r-bloggers.com/how-to-do-optical-character-recognition-ocr-of-non-english-documents-in-r-using-tesseract/ How to do Optical Character Recognition (OCR) of non-English documents in R using Tesseract?] | |||
* https://cran.r-project.org/web/packages/abbyyR/index.html | * https://cran.r-project.org/web/packages/abbyyR/index.html | ||
== Online == | |||
* https://www.onlineocr.net/ (works) | |||
* https://ocr.space/ (not expected) | |||
* https://www.newocr.com/ (not expected) | |||
= Wikipedia = | = Wikipedia = | ||
[https://github.com/ironholds/wikipedir WikipediR]: R's MediaWiki API client library | [https://github.com/ironholds/wikipedir WikipediR]: R's MediaWiki API client library |
Latest revision as of 09:12, 10 September 2024
R Web Applications
See also CRAN Task View: Web Technologies and Services
Rmarkdown: create HTML5 web, slides and more
HTTP protocol
- http://en.wikipedia.org/wiki/File:Http_request_telnet_ubuntu.png
- Query string
- How to capture http header? Use curl -i en.wikipedia.org.
- Web Inspector. Built into Chrome. Right click on any page and choose 'Inspect Element'.
- Web server
- Simple TCP/IP web server
- HTTP Made Really Easy
- Illustrated Guide to HTTP
- nweb: a tiny, safe Web server with 200 lines
- Tiny HTTPd
An HTTP server is conceptually simple:
- Open port 80 for listening
- When contact is made, gather a little information (mainly the GET line; you can ignore the rest for now)
- Translate the request into a file request
- Open the file and spit it back at the client
It gets more difficult depending on how much of HTTP you want to support - POST is a little more complicated, scripts, handling multiple requests, etc.
Example in R
> co <- socketConnection(port=8080, server=TRUE, blocking=TRUE)
> # Now open a web browser and type http://localhost:8080/index.html
> readLines(co,1)
[1] "GET /index.html HTTP/1.1"
> readLines(co,1)
[1] "Host: localhost:8080"
> readLines(co,1)
[1] "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0"
> readLines(co,1)
[1] "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
> readLines(co,1)
[1] "Accept-Language: en-US,en;q=0.5"
> readLines(co,1)
[1] "Accept-Encoding: gzip, deflate"
> readLines(co,1)
[1] "Connection: keep-alive"
> readLines(co,1)
[1] ""
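Steps 3 and 4 of the outline above (translate the request into a file request, then send the file back) can be sketched in base R. This is only a minimal sketch: the helper names `request_to_path()` and `http_response()` are made up for illustration, and only a plain GET request line is handled.

```r
# Hypothetical helpers (not from any package) illustrating steps 3 and 4.

# Step 3: turn a request line such as "GET /index.html HTTP/1.1"
# into a file path under a document root.
request_to_path <- function(request_line, root = ".") {
  parts <- strsplit(request_line, " ", fixed = TRUE)[[1]]
  if (length(parts) != 3 || parts[1] != "GET") return(NA_character_)
  target <- sub("\\?.*$", "", parts[2])           # drop any query string
  if (target == "/") target <- "/index.html"      # default document
  if (grepl("..", target, fixed = TRUE)) return(NA_character_)  # stay inside root
  file.path(root, sub("^/", "", target))
}

# Step 4: wrap a body in a minimal HTTP/1.1 response.
http_response <- function(body, status = "200 OK", type = "text/html") {
  paste0("HTTP/1.1 ", status, "\r\n",
         "Content-Type: ", type, "\r\n",
         "Content-Length: ", nchar(body, type = "bytes"), "\r\n",
         "Connection: close\r\n",
         "\r\n",
         body)
}

path <- request_to_path("GET /index.html HTTP/1.1", root = "/tmp")
resp <- http_response("<h1>Hello</h1>")
```

With the socket connection `co` from the example above, the response could presumably be sent back with something like `cat(resp, file = co)`; that step is not exercised here.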
Example in C (Very simple http server written in C, 187 lines)
Create a simple hello world html page and save it as <index.html> in the current directory (/home/brb/Downloads/)
Launch the server program (assume we have done gcc http_server.c -o http_server)
$ ./http_server -p 50002
Server started at port no. 50002 with root directory as /home/brb/Downloads
Next, open a browser and type http://localhost:50002/index.html. The server will print the requests it receives
GET /index.html HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
file: /home/brb/Downloads/index.html
GET /favicon.ico HTTP/1.1
Host: localhost:50002
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
file: /home/brb/Downloads/favicon.ico
GET /favicon.ico HTTP/1.1
Host: localhost:50003
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
file: /home/brb/Downloads/favicon.ico
The browser will show the page from <index.html> in server.
The only bad thing is that the code does not close the port. For example, if I use Ctrl+C to close the program and try to re-launch it with the same port, it will complain socket() or bind(): Address already in use.
Another Example in C (55 lines)
http://mwaidyanatha.blogspot.com/2011/05/writing-simple-web-server-in-c.html
The response is embedded in the C code.
If we test the server program by opening a browser and typing "http://localhost:15000/", the server receives the following 7 lines
GET / HTTP/1.1
Host: localhost:15000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:23.0) Gecko/20100101 Firefox/23.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
If we include a non-executable file's name in the url, we will be able to download that file. Try "http://localhost:15000/client.c".
If we use the telnet program to test, we can type anything we want
$ telnet localhost 15000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
ThisCanBeAnything <=== This is what I typed in the client and it is also shown on server
HTTP/1.1 200 OK <=== From here is what I got from server
Content-length: 37Content-Type: text/html
HTML_DATA_HERE_AS_YOU_MENTIONED_ABOVE <=== The html tags are not passed from server, interesting!
Connection closed by foreign host.
$
See also more examples under C page.
Others
- http://rosettacode.org/wiki/Hello_world/ (Different languages)
- http://kperisetla.blogspot.com/2012/07/simple-http-web-server-in-c.html (Windows web server)
- http://css.dzone.com/articles/web-server-c (handling HTTP GET request, handling content types(txt, html, jpg, zip. rar, pdf, php etc.), sending proper HTTP error codes, serving the files from a web root, change in web root in a config file, zero copy optimization using sendfile method and php file handling.)
- https://github.com/gtungatkar/Simple-HTTP-server
- https://github.com/davidmoreno/onion
shiny
See Shiny.
plumber: Turning your R code into a RESTful Web API
1. Too heavy to install 2. Get an error with dependencies when I try it on Ubuntu 16.04 3. check out the servr package
- https://github.com/trestletech/plumber
- Creating a Web Service in R
- https://www.rstudio.com/resources/videos/plumber-turning-your-r-code-into-an-api/
- RStudio 1.2 Preview: Plumber Integration
- Using docker to deploy an R plumber API
Docker
- There are two major Docker images. They include gcc, gfortran, .... So it can be used to install Rcpp package for example.
- Official which supports version tags. The official Docker image's dockerfile still points to rocker/r-base.
- rocker project which only has the latest tag
- Using Docker as a Personal Productivity Tool – Running Command Line Apps Bundled in Docker Containers
- Dockerized RStudio server from Duke University. 110 containers were set up on a cloud server (4 cores, 28GB RAM, 400GB disk). Each container has its own port number. Each student is mapped to a single container. https://github.com/mccahill/docker-rstudio
- RStudio in the cloud with Amazon Lightsail and docker
- Mark McCahill (RStudio + Docker)
- BiocImageBuilder
- Why Use Docker with R? A DevOps Perspective
- Running your R script in Docker. Goal: containerizing an R script to eventually execute it automatically each time the container is started, without any user interaction. An enhanced version of the instruction is at this page.
httpuv and servr
httpuv is more low-level and flexible, while servr is higher-level and easier to use for specific tasks.
See also the servr package which can start an HTTP server in R to serve static files, or dynamic documents that can be converted to HTML files (e.g., R Markdown) under a given directory.
R built in Web server, servr::httw() to serve a local directory as a website
servr::httw("DIRECTORY")
Rscript -e "servr::httd()" # default port 4321
# If index.html was found it will be used
# Otherwise, it will serve files under the cur dir

# Open another terminal
Rscript -e "servr::httd('/tmp')" -p4000
Question: Why not just open an html file in a browser? Answer: While opening an HTML file directly in a browser can be fine for simple, static pages, using a local server like `servr` provides a more accurate and robust testing environment for more complex websites:
- Root-relative links: Links written relative to the current file (e.g. css/style.css) usually still work when an HTML file is opened directly, but links that start with / (e.g. /css/style.css) are resolved against the root of the file system and break. The `servr` package serves the entire directory as a website, so these links resolve against the site root, as they will after deployment.
- Content loaded by JavaScript: Browsers restrict what a page opened from file:// may do. For example, scripts that load data files with XMLHttpRequest or fetch() are typically blocked. Serving the directory over HTTP with `servr` avoids these restrictions, so such features can be tested locally before deploying the site.
- Mimic production environment: Using `servr` brings local testing closer to the environment in which the website will be deployed, which can help catch issues that are not apparent when simply opening an HTML file in a browser. Note that `servr` serves static files (and can rebuild dynamic documents such as R Markdown before serving the resulting HTML); it does not execute server-side scripts such as PHP or ASP.NET, so a site that depends on them still needs the corresponding server to be fully tested.
beakr
- What is beakr?
- Web Frameworks for R – A Brief Overview and hello world examples.
workflowr
httr2
httr2 package - Perform HTTP Requests and Process the Responses.
httptest2
- httptest2 - Test Helpers for 'httr2'
- HTTP testing in R, Better Understanding Your Tools Choices with Online Book HTTP Testing in R
opencpu
- https://github.com/opencpu/opencpu
- rjs: R in JavaScript
RApache
gWidgetsWWW
- http://www.jstatsoft.org/v49/i10/paper
- gWidgetsWWW2 gWidgetsWWW based on Rook
- Compare shiny with gWidgetsWWW2.rapache
Rook
See Rook.
sumo
Sumo is a fully-functional web application template that exposes an authenticated user's R session within java server pages. See the paper http://journal.r-project.org/archive/2012-1/RJournal_2012-1_Bergsma+Smith.pdf.
Stockplot
FastRWeb
http://cran.r-project.org/web/packages/FastRWeb/index.html
WebDriver
'WebDriver' Client for 'PhantomJS'
https://github.com/rstudio/webdriver
Rwui
CGHWithR and WebDevelopR
CGHwithR still works with old versions of R although it has been removed from CRAN. Its successor is WebDevelopR. Its vignette (2013) provides a review of several available methods.
manipulate from RStudio
This is not a web application, but the manipulate package can be used to create interactive plots within the R(Studio) environment easily. Its source is available here.
Mathematica also has manipulate function for plotting; see here.
RCloud
RCloud is an environment for collaboratively creating and sharing data analysis scripts. RCloud lets you mix analysis code in R, HTML5, Markdown, Python, and others. Much like Sage, iPython notebooks and Mathematica, RCloud provides a notebook interface that lets you easily record a session and annotate it with text, equations, and supporting images.
See also the Talk in UseR 2014.
cloudyr and flyio - Input Output Files in R from Cloud or Local
https://blog.socialcops.com/inside-sc/announcements/flyio-r-package-interact-data-cloud/ Announcing flyio, an R Package to Interact with Data in the Cloud
Dropbox access
rdrop2 package
Javascript
JavaScript for R — ebook https://book.javascript-for-r.com/
Sketch
Sketch Package looks to add JavaScript to R packages
Web page scraping
http://www.slideshare.net/schamber/web-data-from-r#btnNext
xml2 package
rvest package depends on xml2.
rvest, extract tables
- Easy web scraping with R
- Blown away that web scraping in R with the rvest package is as easy as
library(rvest)
page <- read_html("https://en.m.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
tables <- html_table(page)
tables[[2]]
# A tibble: 243 × 7
   Rank  `Country / Dependency` Population  Population Date  Source (official or …¹ Notes
   <chr> <chr>                  <chr>       <chr>      <chr> <chr>                  <chr>
 1 Rank  Country / Dependency   Numbers     % of the … Date  Source (official or f… "Not…
 2 –     World                  8,026,177,… 100%       16 A… UN projection[3]       ""
 3 1     China                  1,411,750,… 17.6%      31 D… Official estimate[4]   "[b]"
 4 2     India                  1,392,329,… 17.3%      1 Ma… Official projection[5] "[c]…
 5 3     United States          334,631,000 4.17%      16 A… National population c… "[d]"
 6 4     Indonesia              275,773,800 3.44%      1 Ju… Official estimate[7]   ""
 7 5     Pakistan               235,825,000 2.94%      1 Ju… UN projection[3]       "[e]"
 8 6     Nigeria                218,541,000 2.72%      1 Ju… UN projection[3]       ""
 9 7     Brazil                 216,024,545 2.69%      16 A… National population c… ""
10 8     Bangladesh             169,828,911 2.12%      15 J… 2022 final census res… ""
# ℹ 233 more rows
# ℹ abbreviated name: ¹`Source (official or from the United Nations)`
# ℹ Use `print(n = ...)` to see more rows
- On Ubuntu, we need to install two packages first!
sudo apt-get install libcurl4-openssl-dev # OR libcurl4-gnutls-dev
sudo apt-get install libxml2-dev
- https://github.com/tidyverse/rvest, NEWS
- Visualizing obesity across United States by using data from Wikipedia
- rvest tutorial: scraping the web using R
- https://renkun.me/pipeR-tutorial/Examples/rvest.html
- http://zevross.com/blog/2015/05/19/scrape-website-data-with-the-new-r-package-rvest/
- Google scholar scraping with rvest package
- An update to "An adventure in downloading books"
- How to webscrape in R?
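In the population table scraped above, every column comes back as character, the first row merely repeats the header, and the numbers contain thousands separators. A possible clean-up in base R is sketched below. The data frame is a toy stand-in with made-up values and simplified column names, not the real scraped result.

```r
# Toy stand-in for the scraped table; all values are made up for illustration.
tab <- data.frame(
  Rank       = c("Rank", "\u2013", "1", "2"),
  Country    = c("Country / Dependency", "World", "A", "B"),
  Population = c("Numbers", "8,000,000,000", "1,400,000,000", "1,390,000,000"),
  stringsAsFactors = FALSE
)

# Drop the first row, which only repeats the header
tab <- tab[-1, ]

# Strip the thousands separators and convert to numeric
tab$Population <- as.numeric(gsub(",", "", tab$Population, fixed = TRUE))

# Keep only rows with a numeric rank (this drops the "World" total)
tab <- tab[grepl("^[0-9]+$", tab$Rank), ]
rownames(tab) <- NULL
```

The same steps should carry over to a tibble returned by html_table(), although duplicated column names (two `Population` columns in the output above) would need to be made unique first.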
Reading Remote Data Files: rvest
V8: Embedded JavaScript Engine for R
R⁶ — General (Attys) Distributions: V8, rvest, ggbeeswarm, hrbrthemes and tidyverse packages are used.
pubmed.mineR
Text mining of PubMed Abstracts (http://www.ncbi.nlm.nih.gov/pubmed). The algorithms are designed for two formats (text and XML) from PubMed.
R code for scraping the P-values from pubmed, calculating the Science-wise False Discovery Rate, et al (Jeff Leek)
Get API data
How to get API data with R. See how to write your own R code to pull data from an API using API key authentication.
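As a rough illustration of key-based access, the sketch below builds a request URL whose query string carries an API key read from an environment variable. The endpoint, the parameter names and the variable name `MY_API_KEY` are all hypothetical; a real API documents its own, and many expect the key in a request header rather than in the URL.

```r
# Hypothetical helper: append percent-encoded query parameters to a base URL.
build_api_url <- function(base, params) {
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

# Read the key from the environment instead of hard-coding it in the script
api_key <- Sys.getenv("MY_API_KEY", unset = "demo-key")

url <- build_api_url("https://api.example.com/v1/weather",
                     list(city = "New York", key = api_key))
# The URL could then be fetched and parsed with, e.g., jsonlite::fromJSON(url)
```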
webshot2 - take a screenshot of web page
webshot2. It uses headless Chrome via the Chromote package. You also need to have the Chrome browser installed on your system. You can also use other browsers based on Chromium, such as Chromium itself, Edge, Vivaldi, Brave, or Opera.
These R packages import sports, weather, stock data and more
- https://www.computerworld.com/article/3109890/data-analytics/these-r-packages-import-sports-weather-stock-data-and-more.html
- https://github.com/ALShum/rwunderground
- Accessing APIs from R (and a little R programming). A personal key is required. 500 times per day for a free account.
- https://cran.r-project.org/web/packages/rnoaa/index.html. A personal API key (token) is required. 10,000 requests per day
- http://ram-n.github.io/weatherData/ (not working) - Historical Weather Data
Diving Into Dynamic Website Content with splashr
https://rud.is/b/2017/02/09/diving-into-dynamic-website-content-with-splashr/
Network
netstat
https://cran.r-project.org/web/packages/netstat/index.html
Send email
curl
- curl::send_mail().
- R Internet: Yet Another Way To Send Emails On Windows/Sending Emails with {curl} and Docker
- 16 Command Examples to Send Email From The Linux Command Line
- 5 Extra Uses For Curl That Don’t Involve Web Requests. Send an email, Check for open ports, Upload files using TFTP, Resume downloads, Download files from an SMB share.
Note
- As we can see in the example snippet, there is a duplicate of sender's and recipient's emails.
- It seems the recipients and sender parts are the real ones
- In the message string, we can skip 'From' and 'To' line.
- The sender's email entered in message is not used
- The recipient's email entered in message is not used
- Experiment: recipients & sender are correct but the two email addresses in message are wrong.
- Answer: I can still receive the email.
- (In the recipient's inbox) the email address in the 'From' part will be replaced with the correct one but the name (e.g. "R (curl package)") will be the one we use in the message
- (In the recipient's inbox) the email address & the name in the 'To' part will be the one as we use in the message (interesting)
- Question: does the sender's email have to be gmail.com?
library(curl)
recipients <- "[email protected]"
sender <- '[email protected]'

# Full email message in RFC2822 format
message <- 'From: "R (curl package)" <[email protected]>
To: "Roger Recipient" <[email protected]>
Subject: Hello R user!

Dear R user,

I am sending this email using curl.'

# Send the email
send_mail(sender, recipients, message, smtp_server = 'smtps://smtp.gmail.com',
  username = 'curlpackage', password = 'qyyjddvphjsrbnlm')
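Since the envelope addresses (`sender`, `recipients`) and the From/To lines inside `message` are entered separately, one way to keep them from drifting apart is to generate the message text from the same variables. The helper `compose_message()` below is a hypothetical sketch, not part of the curl package, and the addresses are placeholders.

```r
# Hypothetical helper: compose the RFC 2822 text from the envelope values
# so the header lines and the envelope cannot disagree.
compose_message <- function(sender, recipients, subject, body,
                            sender_name = "R (curl package)") {
  headers <- c(
    sprintf('From: "%s" <%s>', sender_name, sender),
    sprintf("To: %s", paste(recipients, collapse = ", ")),
    sprintf("Subject: %s", subject)
  )
  # A blank line separates the headers from the body
  paste(c(headers, "", body), collapse = "\r\n")
}

sender     <- "sender@example.com"      # placeholder address
recipients <- "recipient@example.com"   # placeholder address
body       <- "I am sending this email using curl."
msg <- compose_message(sender, recipients, "Hello R user!", body)
# msg can then be passed as the message argument of curl::send_mail()
```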
emayili
- https://cran.r-project.org/web/packages/emayili/index.html
- emayili: Sending Email from R. This shows how to send emails from the terminal command line tool curl if we don't want to use R to do the job. Compared to the 'curl' package, there is no need to repeat the sender's & recipient's email addresses in the message part.
- You can add Cc, Bcc and Reply-To header fields using the cc(), bcc() and reply() methods. Files can be attached using the attachment() method.
- Note that if the email in envelope(From) is different from the one in server(username), the email in envelope(From) is ignored. So in the following simplified example I skip envelope(From).
- If I change the 'to' email to a yahoo account, it works but the email went to the Trash folder.
- Rendering R Markdown
library(emayili)
email <- envelope(
  to = "[email protected]",
  subject = "This is a plain text message!",
  text = "Hello!\nHello2"
)
smtp <- server(host = "smtp.gmail.com",
               port = 465,
               username = "[email protected]",
               password = "bd40ef6d4a9413de9c1318a65cbae5d7")
smtp(email, verbose = TRUE)
- Creating Email Threads
blastula (RStudio)
mailR
Easiest. Require rJava package (not trivial to install, see rJava). mailR is an interface to Apache Commons Email to send emails from within R. See also send bulk email
Before using the mailR package, we followed the instructions here to turn Allow less secure apps: 'ON'; otherwise you might get the error Error: EmailException (Java): Sending the email to the following server failed : smtp.gmail.com:465. Once this option is turned on, we may get an email notifying us of the change. Note that the recipient does not have to be a Gmail address.
> send.mail(from = "[email protected]",
            to = c("[email protected]", "Recipient 2 <[email protected]>"),
            replyTo = c("Reply to someone else <[email protected]>"),
            subject = "Subject of the email",
            body = "Body of the email",
            smtp = list(host.name = "smtp.gmail.com", port = 465,
                        user.name = "gmail_username", passwd = "password", ssl = TRUE),
            attach.files = "./myattachment.txt",
            authenticate = TRUE,
            send = TRUE)
[1] "Java-Object{org.apache.commons.mail.SimpleEmail@7791a895}"
- Keep the password out of the script. First store your password in an environment variable, Sys.setenv(GMAIL_PWD = 'mypassword'), then read it in your script with passwd = Sys.getenv("GMAIL_PWD"). This hides the password from the source file but does not encrypt it.
- MailR SMTP Setup (Gmail, Outlook, Yahoo) | STARTTLS
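The environment-variable approach mentioned above can be sketched as follows. The variable name `GMAIL_PWD` comes from the note; the helper `get_secret()` is hypothetical and simply fails early when the variable is missing, instead of silently passing an empty password on.

```r
# Normally the variable is set outside the script, e.g. in ~/.Renviron:
#   GMAIL_PWD=mypassword
Sys.setenv(GMAIL_PWD = "mypassword")   # set here for demonstration only

# Hypothetical helper: read a required environment variable or stop
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

passwd <- get_secret("GMAIL_PWD")
# passwd can then be supplied wherever a password argument is expected
```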
gmailr
More complicated. gmailr provides access to Google's gmail.com RESTful API. Vignette and an example here. Note that it does not use a password; it uses a json file for oauth authentication downloaded from https://console.cloud.google.com/. See also https://github.com/jimhester/gmailr/issues/1.
library(gmailr)
gmail_auth('mysecret.json', scope = 'compose')
test_email <- mime() %>%
  to("[email protected]") %>%
  from("[email protected]") %>%
  subject("This is a subject") %>%
  html_body("<html><body>I wish this was bold</body></html>")
send_message(test_email)
sendmailR
sendmailR provides a simple SMTP client. It is not clear how to use the package (i.e. where to enter the password).
json
jsonlite
- Reading JSON file from web and preparing data for analysis
- xkcd Comics as a Minimal Example for Calling APIs, Downloading Files and Displaying PNG Images with R
rjson
http://heuristically.wordpress.com/2013/05/20/geolocate-ip-addresses-in-r/
RJSONIO
Accessing Bitcoin Data with R
http://blog.revolutionanalytics.com/2015/11/accessing-bitcoin-data-with-r.html
Plot IP on google map
- http://thebiobucket.blogspot.com/2011/12/some-fun-with-googlevis-plotting-blog.html#more (RCurl, RJSONIO, plyr, googleVis)
- http://devblog.icans-gmbh.com/using-the-maxmind-geoip-api-with-r/ (RCurl, RJSONIO, maps)
- http://cran.r-project.org/web/packages/geoPlot/index.html (geoPlot package (deprecated as of 8/12/2013))
- http://archive09.linux.com/feature/135384 (Not R) ApacheMap
- http://batchgeo.com/features/geolocation-ip-lookup/ (Not R) (Enter a spreadsheet of addresses, cities, zip codes or a column of IPs and it will show the locations on google map)
- http://code.google.com/p/apachegeomap/
The following example is modified from the first of above list.
require(RJSONIO) # fromJSON
require(RCurl)   # getURL

temp = getURL("https://gist.github.com/arraytools/6743826/raw/23c8b0bc4b8f0d1bfe1c2fad985ca2e091aeb916/ip.txt",
              ssl.verifypeer = FALSE)
ip <- read.table(textConnection(temp), as.is=TRUE)
names(ip) <- "IP"
nr = nrow(ip)
Lon <- as.numeric(rep(NA, nr))
Lat <- Lon
Coords <- data.frame(Lon, Lat)

ip2coordinates <- function(ip) {
  api <- "http://freegeoip.net/json/"
  get.ips <- getURL(paste(api, URLencode(ip), sep=""))
  # result <- ldply(fromJSON(get.ips), data.frame)
  result <- data.frame(fromJSON(get.ips))
  names(result)[1] <- "ip.address"
  return(result)
}

for (i in 1:nr){
  cat(i, "\n")
  try(
    Coords[i, 1:2] <- ip2coordinates(ip$IP[i])[c("longitude", "latitude")]
  )
}

# append to log-file:
logfile <- data.frame(ip, Lat = Coords$Lat, Long = Coords$Lon,
                      LatLong = paste(round(Coords$Lat, 1), round(Coords$Lon, 1), sep = ":"))
log_gmap <- logfile[!is.na(logfile$Lat), ]

require(googleVis) # gvisMap
gmap <- gvisMap(log_gmap, "LatLong",
                options = list(showTip = TRUE, enableScrollWheel = TRUE,
                               mapType = 'hybrid', useMapTypeControl = TRUE,
                               width = 1024, height = 800))
plot(gmap)
The plot.gvis() method in the googleVis package is also a good place to learn about the startDynamicHelp() function in the tools package, which is used to launch an HTTP server. See Jeffrey Horner's note about deploying Rook App.
Convert JSON to CSV using Linux shell
How to convert JSON to CSV using Linux / Unix shell
GEO (Gene Expression Omnibus)
See this internal link.
Interactive html output
webr
- Playing with webr R in your browser.
- https://github.com/coatless/quarto-webr?tab=readme-ov-file
- A webR powered Shiny app for browsing TidyTuesday plots
sendplot
RIGHT
The supported plot types include scatterplot, barplot, box plot, line plot and pie plot.
In addition to tooltip boxes, the package can create a table showing all information about selected nodes.
r2d3
r2d3 - R Interface to D3 Visualizations
d3Network
- http://christophergandrud.github.io/d3Network/ (old)
- https://christophergandrud.github.io/networkD3/ (new)
library(d3Network)
Source <- c("A", "A", "A", "A", "B", "B", "C", "C", "D")
Target <- c("B", "C", "D", "J", "E", "F", "G", "H", "I")
NetworkData <- data.frame(Source, Target)
d3SimpleNetwork(NetworkData, height = 800, width = 1024, file="tmp.html")
htmlwidgets for R
Embed widgets in R Markdown documents and Shiny web applications.
- Official website http://www.htmlwidgets.org/.
- How to write a useful htmlwidgets in R: tips and walk-through a real example
igraph
- https://cran.r-project.org/web/packages/igraph/index.html
- On Ubuntu, run
apt install libglpk-dev
- Creating an igraph object in R
- Network Analysis in R
- creating directed networks with igraph
- Tips
- Can we vary the text size along with node size in R-igraph?
- igraph: How do I make the font smaller to fit inside of the node so that it's readable?
- Add legend in igraph to annotate difference vertices size
- Compare two graphs: https://igraph.org/c/doc/igraph-Isomorphism.html. In simple terms, two graphs are isomorphic if they become indistinguishable from each other once their vertex labels are removed.
library(igraph)
g1 <- make_ring(10)
g2 <- make_ring(10)
all.equal(g1, g2) # FALSE, checking nearly equal
identical(g1, g2) # FALSE
# Check if the graphs are isomorphic
isomorphic(g1, g2) # TRUE
- Extract coordinates: there are different layouts. The default is layout.auto().
data(karate, package="igraphdata")
G <- upgrade_graph(karate)
plot(G) # same as plot(G, layout = layout_nicely(G))
plot(G, layout = layout.fruchterman.reingold(G))
plot(G, layout = layout.circle(G)) # not good if there are too many vertices
plot(G, layout = layout.sphere(G)) # complicated
plot(G, layout = layout.random(G)) # complicated
L <- layout.fruchterman.reingold(G)
dim(L) # 34 2
- Message when I use load() to load igraph objects created from igraph version 1.4.2 in R with igraph 2.0.3.
This graph was created by an old(er) igraph version.
  Call upgrade_graph() on it to use with the current igraph version
  For now we convert it on the fly...
visNetwork
- https://cran.r-project.org/web/packages/visNetwork/index.html
- Create Interactive networks using R programming
- It is used by FELLA::launchApp().
networkD3
- This is a port of Christopher Gandrud's d3Network package to the htmlwidgets framework.
- Network Graphs in R from The Data Sandbox.
plotly
- How to Create an Interactive WebGL Network Graph Using R and Plotly
- A working example
library(igraph)
library(ggplot2)
library(plotly)

# Create a data frame that represents edges
dat <- data.frame(name=c("Alice", "Bob", "Cecil"), age=c(48,33,45))

# Step 1: Create an igraph object
g <- graph_from_data_frame(dat, directed = FALSE)

# Get the layout of the graph
layout <- layout_nicely(g)

# Create a data frame for the vertices
vertices <- data.frame(id = V(g)$name, x = layout[,1], y = layout[,2],
                       var1=V(g)$name, var2=LETTERS[1:6])

# Get the edges and convert vertex names to coordinates
edges <- get.data.frame(g, what = "edges")
edges <- merge(edges, vertices, by.x = "from", by.y = "id")
edges <- merge(edges, vertices, by.x = "to", by.y = "id", suffixes = c(".from", ".to"))

# Step 2: Create a ggplot object
p <- ggplot(vertices, aes(x = x, y = y)) +
  geom_segment(data = edges, aes(x = x.from, y = y.from, xend = x.to, yend = y.to)) +
  geom_point(aes(text = paste("Var1:", var1, "\nVar2:", var2))) +
  geom_text(aes(x = x, y = y, label = id), vjust = 1, hjust = 0, nudge_x=.2) +
  expand_limits(x = c(-2, 1.5)) +
  theme(
    axis.line = element_blank(),        # Hide axis lines
    axis.text = element_blank(),        # Hide axis text
    axis.ticks = element_blank(),       # Hide axis ticks
    axis.title = element_blank(),       # Hide axis labels
    panel.grid.major = element_blank(), # Hide major grid
    panel.grid.minor = element_blank(), # Hide minor grid
    panel.background = element_rect(fill = "white") # Set background to white
  )

# Print the plot
print(p)

# Step 3:
# Convert the ggplot object to a plotly object
# Tooltip works only on 'points', not on labels.
ggplotly(p, tooltip = "text")
ggiraph
Convert ggraph to an interactive plot with plotly or networkD3
scatterD3
scatterD3 is an HTML R widget for interactive scatter plots visualization. It is based on the htmlwidgets R package and on the d3.js javascript library.
dygraphs
rthreejs - Create interactive 3D scatter plots, network plots, and globes
rayshader: 2D and 3D mapping and data visualization with shades
https://github.com/tylermorganwall/rayshader
On Rstudio server, we need options(rgl.printRglwidget = TRUE) ; see Why is my 3D plot not showing up in R Studio plot viewer?.
d3heatmap
See R
collapsibleTree
- https://github.com/adeelk93/collapsibletree
- https://cran.r-project.org/web/packages/jsTreeR/index.html
svgPanZoom
This 'htmlwidget' provides pan and zoom interactivity to R graphics, including 'base', 'lattice', and 'ggplot2'. The interactivity is provided through the 'svg-pan-zoom.js' library.
DT: An R interface to the DataTables library
reactable: interactive table with rows that expand when clicked
How to create tables in R with expandable rows
getable: creating a 'dynamic' HTML table
Getting Tabular Data Through JavaScript in Compiled R Markdown Documents.
The content stays static while the data could be updated independently without rewriting or recompiling the HTML document. This could be done by utilizing JavaScript’s ability to asynchronously fetch data from the web and generate DOM elements based on these data.
plotly
- How to make interactive 3D scatter plots, https://plotly.com/r/reference/#scatter-mode
library(plotly)
mtcars$am[which(mtcars$am == 0)] <- 'Automatic'
mtcars$am[which(mtcars$am == 1)] <- 'Manual'
mtcars$am <- as.factor(mtcars$am)
fig <- plot_ly(mtcars, x = ~wt, y = ~hp, z = ~qsec, color = ~am, colors = c('#BF382A', '#0C4B8E'))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(xaxis = list(title = 'Weight'),
                                   yaxis = list(title = 'Gross horsepower'),
                                   zaxis = list(title = '1/4 mile time')))
fig

x <- rnorm(100); y <- rnorm(100) ; z <- rnorm(100)
labels <- paste("point", 1:100)
w <- sample(scales::hue_pal()(3), 100, replace=T) # use ggplot2's default color scheme
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers",
        marker = list(color = z, colorscale ="Viridis"))
plot_ly(x=x, y=y, z=z, text=labels, type ="scatter3d", mode="markers",
        marker = list(color = w))
- How to add more data to tooltip in R plotly (besides the `text` argument)
- Data Labels on Hover (Hover text), Hover Text and Formatting in R
- Hover Text and Formatting in ggplot2
# df is assumed to be a data frame with columns var1, var2 and var3
gobj <- ggplot(df, aes(x=var1, y=var2, text = paste0("another :", var3))) + geom_jitter()
ggplotly(gobj) # var3 will be shown on the tooltip/hover text
- Power curves and ggplot2.
- TIME SERIES CHARTS BY THE ECONOMIST IN R USING PLOTLY & FIVE INTERACTIVE R VISUALIZATIONS WITH D3, GGPLOT2, & RSTUDIO
- Filled chord diagram
- DASHBOARDS IN R WITH SHINY & PLOTLY
- Plotly Graphs in Shiny
- How to plot basic charts with plotly
- How to add Trend Lines in R Using Plotly
- Introduction to Interactive Graphics in R with plotly
- Interactive web-based data visualization with R, plotly, and shiny (ebook) by Carson Sievert
- Tooltip, Formatting mouse over labels in plotly when using ggplotly
- 3D Scatter Plots, k-Means 101: An introductory guide to k-Means clustering in R. Note that the 3D plot is displayed in a browser.
- Save interactive plots as HTML
    # b is an existing ggplot object
    p <- plotly::ggplotly(b)
    htmlwidgets::saveWidget(p, "index.html")
- Using plotly in Shiny: plotlyOutput()
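A minimal sketch of a Shiny app that shows a ggplot2 chart through plotly, assuming shiny, ggplot2 and plotly are installed. The output id `scatter`, the use of `mtcars`, and the tooltip text are illustrative choices. `plotlyOutput()` goes in the UI and `renderPlotly()` in the server; the extra `text` aesthetic is shown on hover because `tooltip = "text"` is passed to `ggplotly()`.

```r
library(shiny)
library(ggplot2)
library(plotly)

ui <- fluidPage(
  plotlyOutput("scatter")   # placeholder for the interactive plot
)

server <- function(input, output, session) {
  output$scatter <- renderPlotly({
    # 'text' is not a geom_point aesthetic (ggplot2 warns about it),
    # but ggplotly() picks it up for the hover label.
    g <- ggplot(mtcars, aes(x = wt, y = mpg, text = paste0("cyl: ", cyl))) +
      geom_point()
    ggplotly(g, tooltip = "text")
  })
}

app <- shinyApp(ui, server)
# runApp(app)   # uncomment to launch the app in a browser
```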
highcharter: alternative to plotly
https://cran.r-project.org/web/packages/highcharter/ Good for time series plot.
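A small time series sketch, assuming highcharter is installed. `AirPassengers` is a monthly `ts` object from the datasets package that ships with R; the chart title is an arbitrary choice.

```r
library(highcharter)

# hchart() has a method for 'ts' objects and draws them on a date axis.
hc <- hchart(AirPassengers)
hc <- hc_title(hc, text = "Monthly airline passengers")
# Print 'hc' at the console to view the chart.
```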
Amazon
Download product information and reviews from Amazon.com (written in 2016; get_reviews() no longer works as of 6/7/2023)
    sudo apt-get install libxml2-dev
    sudo apt-get install libcurl4-openssl-dev
and in R
    install.packages("devtools")
    install.packages("XML")
    install.packages("pbapply")
    install.packages("dplyr")
    devtools::install_github("56north/Rmazon")

    product_info <- Rmazon::get_product_info("1593273843")
    reviews <- Rmazon::get_reviews("1593273843")
    reviews[1,6]                # only show partial characters from the 1st review
    nchar(reviews[1,6])
    as.character(reviews[1,6])  # show the complete text from the 1st review

    reviews <- Rmazon::get_reviews("B07BNGJXGS")
    # Fetching 30 reviews of 'BOOX Note Ereader,Android 6.0 32 GB 10.3" Dual Touch HD Display'
    #   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 02s
    reviews
    # A tibble: 30 x 6
       reviewRating reviewDate reviewFormat Verified_Purcha… reviewHeadline
              <dbl> <chr>      <lgl>        <lgl>            <chr>
     1            4 May 23, 2… NA           TRUE             Good for PDF …
     2            3 May 8, 20… NA           FALSE            The reading s…
     3            5 May 17, 2… NA           TRUE             E-reader and …
     4            3 May 24, 2… NA           TRUE             Good hardware…
     5            3 June 21, … NA           TRUE             Poor QC
     6            5 August 5,… NA           TRUE             Excellent for…
     7            5 May 31, 2… NA           TRUE             Especially li…
     8            5 July 4, 2… NA           TRUE             Android 6 rea…
     9            4 July 15, … NA           TRUE             Remember the …
    10            4 June 9, 2… NA           TRUE             Overall fanta…
    # ... with 20 more rows, and 1 more variable: reviewText <chr>
    reviews[1, 6]  # 6-th column is the review text
Vignette: Scraping Amazon Reviews in R 2019
    library(rvest)
    library(dplyr)

    url <- "https://www.amazon.com/dp/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
    page <- read_html(url)
    reviews <- page %>% html_nodes(".review") %>% html_text()
    titles <- page %>% html_nodes(".review-title-content") %>% html_text()
    ratings <- page %>% html_nodes(".review-rating") %>% html_text()
    df <- data.frame(reviews, titles, ratings)
Scraping Amazon Customer Reviews (almost). The url is obtained by clicking "See all reviews", like this one.
Google Bard answers: first 10 reviews.
    library(rvest)

    # Get the product ASIN code
    asin <- "B07HSKPDBV"

    # Create a URL for the product reviews page
    url <- paste0("https://www.amazon.com/product-reviews/", asin)

    # Read the HTML content of the product reviews page
    html <- read_html(url)

    # Extract the review titles, bodies, and ratings
    reviews <- html %>%
      html_nodes(".review-text") %>%
      html_text() %>%   # so far is OK
      data.frame(
        title = strsplit(., "\n")[[1]][1],
        body = strsplit(., "\n")[-1],
        rating = strsplit(., " ")[1][2]
      )

    # Print the first 5 reviews
    head(reviews, 5)
All reviews.
    library(rvest)
    library(purrr)  # provides map() and map_chr() used below

    # Get the product ASIN code
    asin <- "B07HSKPDBV"

    # Create a function to scrape the reviews for a single page
    scrape_reviews <- function(url) {
      html <- read_html(url)
      reviews <- html %>%
        html_nodes(".review-text") %>%
        html_text() %>%
        data.frame(
          title = strsplit(., "\n")[[1]][1],
          body = strsplit(., "\n")[-1],
          rating = strsplit(., " ")[1][2]
        )
      return(reviews)
    }

    # Create a vector of URLs for all of the review pages
    urls <- seq(from = 1, to = 100, by = 10) %>%
      map_chr(function(x) paste0("https://www.amazon.com/product-reviews/", asin, "?pageNumber=", x))

    # Scrape the reviews for all of the pages
    reviews <- urls %>%
      map(scrape_reviews) %>%
      do.call(rbind, .)

    # Print the first 5 reviews
    head(reviews, 5)
gutenbergr
Feed
feedeR - Feed Reader Package for R
File sharing
{filebin} Quick & Easy File Sharing
OCR
Online
- https://www.onlineocr.net/ (works)
- https://ocr.space/ (result not as expected)
- https://www.newocr.com/ (result not as expected)
Wikipedia
WikipediR: R's MediaWiki API client library