Ggplot2: Difference between revisions

From 太極
Revision as of 09:34, 1 May 2020

ggplot2

Books

The Grammar of Graphics

  • Data: Raw data that we'd like to visualize
  • Geometries: Shapes that we use to visualize data
  • Aesthetics: Properties of geometries (size, color, etc.)
  • Scales: Mappings from data values to aesthetics

Scatterplot aesthetics

  • x, y
  • shape
  • color
  • size
  • alpha
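
As a minimal sketch of how these aesthetics fit together, the example below maps several of them in one scatterplot. The choice of the built-in mtcars data and of which variable drives which aesthetic is an assumption made for illustration only.

```r
library(ggplot2)

# mtcars ships with R; cyl and am are converted to factors so that
# they can drive the discrete aesthetics (shape, colour).
cars <- transform(mtcars, cyl = factor(cyl), am = factor(am))

p_aes <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(aes(shape = am,     # discrete variable -> point shape
                 colour = cyl,   # discrete variable -> point colour
                 size = hp),     # continuous variable -> point size
             alpha = 0.6)        # constant transparency, set outside aes()
p_aes
```

Aesthetics placed inside aes() are mapped from the data and get a legend; values set outside aes() (alpha here) are constants.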

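Tutorials

  • The Complete ggplot2 Tutorial: http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html
  • A curated list of awesome ggplot2 tutorials, packages etc.: https://github.com/erikgahner/awesome-ggplot2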

Help

> library(ggplot2)
Need help? Try Stackoverflow: https://stackoverflow.com/tags/ggplot2

Extensions

http://www.ggplot2-exts.org/gallery/

Some examples

Examples from 'R for Data Science' book - Aesthetic mappings

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))
  # 'mapping' is the first argument of every geom_* function, so its name can be omitted.

# template
ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

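# add another variable through color, size, alpha or shape
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, color = class))

# size and alpha suit ordered or continuous variables; mapping the
# unordered 'class' to them works but triggers a warning
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, size = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, alpha = class))

# the default shape palette has only 6 shapes, so the 7th class is not plotted
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, shape = class))

# set a constant color for all points: put it outside aes()
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy), color = "blue")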

# add another variable through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_wrap(~ class, nrow = 2)

# add another 2 variables through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_grid(drv ~ cyl)

Examples from 'R for Data Science' book - Geometric objects, lines and smoothers

# Points
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy))

# Smoothed
ggplot(data = mpg) + 
  geom_smooth(aes(x = displ, y = hwy))

# Points + smoother, add transparency to points, remove se
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point(alpha=1/10) +
  geom_smooth(se=FALSE)    

# Colored points + smoother
ggplot(data = mpg, aes(x = displ, y = hwy)) + 
  geom_point(aes(color = class)) + 
  geom_smooth()

Examples from 'R for Data Science' book - Transformation

# y axis = counts
# bar plot
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut))
# Or
ggplot(data = diamonds) + 
  stat_count(aes(x = cut))

# y axis = proportion
# (after_stat(prop) replaces the older ..prop.. notation, deprecated since ggplot2 3.4.0)
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, y = after_stat(prop), group = 1))

# bar plot with 2 variables
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, fill = clarity))

facet_wrap and facet_grid to create a panel of plots

Color palette

Color picker

https://github.com/daattali/colourpicker

Colour related aesthetics: colour, fill and alpha

https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html

ggplot2::scale functions and scales packages

  • Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like size, colour, position or shape.
  • Scales also provide the tools that let you read the plot: the axes and legends.

ggplot2::scale

https://ggplot2-book.org/scales.html

Naming convention: scale_AestheticName_Datatype

Examples:
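
A minimal sketch of the naming convention, using the mpg data that ships with ggplot2. The axis titles and the hand-picked colors are assumptions made for illustration; each function name reads as scale, then the aesthetic, then the type of scale.

```r
library(ggplot2)

p_scales <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  # aesthetic = x, type = continuous
  scale_x_continuous(name = "Engine displacement (L)") +
  # aesthetic = y, type = continuous
  scale_y_continuous(name = "Highway miles per gallon") +
  # aesthetic = colour, type = manual (values supplied by hand)
  scale_colour_manual(name = "Drive",
                      values = c("4" = "black", f = "orange", r = "blue"))
p_scales
```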

Emulate ggplot2 default color palette

The default palette is just equally spaced hues around the color wheel. See Emulate ggplot2 default color palette.

Answer 1

gg_color_hue <- function(n) {
  hues = seq(15, 375, length = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

n = 4
cols = gg_color_hue(n)

dev.new(width = 4, height = 4)
plot(1:n, pch = 16, cex = 2, col = cols)

Answer 2 (better: it shows the color values in HEX). The swatches should be read from left to right and then top to bottom.

scales package

library(scales)
show_col(hue_pal()(4))
show_col(hue_pal()(2)) # (salmon, iris blue) 
           # see https://www.htmlcsscolor.com/ for color names

transform scales

How to make that crazy Fox News y axis chart with ggplot2 and scales

Class variables

"Set1" is a good choice. See RColorBrewer::display.brewer.all()
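
One way to apply "Set1" to a class (categorical) variable in a plot is scale_colour_brewer(). The sketch below assumes the mpg data and the drv variable purely as an example.

```r
library(ggplot2)

# drv is a categorical variable with three levels (4, f, r)
p_set1 <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  scale_colour_brewer(palette = "Set1")  # ColorBrewer qualitative palette
p_set1
```

For filled geoms such as bars, scale_fill_brewer(palette = "Set1") plays the same role.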

Heatmap for single channel

https://scales.r-lib.org/

# White <----> Blue
RColorBrewer::display.brewer.pal(n = 8, name = "Blues")

Heatmap for dual channels

http://www.sthda.com/english/wiki/colors-in-r

library(RColorBrewer)
library(scales)  # provides brewer_pal(), used below
# Red <----> Blue
display.brewer.pal(n = 8, name = 'RdBu')
# Hexadecimal color specification 
brewer.pal(n = 8, name = "RdBu")

plot(1:8, col=brewer_pal(palette = "RdBu")(8), pch=20, cex=4)

# Blue <----> Red
plot(1:8, col=rev(brewer_pal(palette = "RdBu")(8)), pch=20, cex=4)

Twopalette.svg

Themes and background for ggplot2

ggthemr

ggthemr package

ggsci

https://nanx.me/ggsci/

Font size

Change Font Size of ggplot2 Plot in R (5 Examples) | Axis Text, Main Title & Legend
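
A minimal sketch of changing font sizes through theme(). The specific sizes and the plot itself are arbitrary choices for illustration, not values taken from the linked tutorial.

```r
library(ggplot2)

# A reusable theme fragment; sizes are in points
big_text <- theme(
  text        = element_text(size = 16),  # inherited by text elements not overridden below
  axis.title  = element_text(size = 14),  # axis titles
  axis.text   = element_text(size = 12),  # axis tick labels
  plot.title  = element_text(size = 20),  # main title
  legend.text = element_text(size = 12)   # legend labels
)

p_font <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  ggtitle("Font size demo") +
  big_text
p_font
```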

ggthemes package

https://cran.r-project.org/web/packages/ggthemes/index.html

Common plots

https://ggplot2.tidyverse.org/reference/index.html

Line plots

Histogram

ggplot(data = txhousing, aes(x = median)) +
  geom_histogram()

Histogram vs barplot from deeply trivial.

Boxplot with jittering

# Wi is assumed to be a numeric vector
ggplot(data.frame(Wi), aes(y = Wi)) + 
  geom_boxplot()

# df2 is an n x 2 data frame; nboot is assumed to be the grouping factor and boot numeric
ggplot(df2, aes(x = nboot, y = boot)) +
  geom_boxplot(outlier.shape = NA) +  # avoid plotting outliers twice
  geom_jitter(aes(color = nboot), position = position_jitter(width = .2, height = 0, seed = 1)) +
  labs(title = "", y = "", x = "nboot")

If we omit the outlier.shape=NA option in geom_boxplot(), each outlier is drawn twice (once by geom_boxplot() and once by geom_jitter()), and we get the following plot.

Jitterboxplot.png

Violin plot

library(ggplot2)
ggplot(midwest, aes(state, area)) + geom_violin() + ggforce::geom_sina()

Violinplot.png

Kernel density plot

Back to back barplot

Bivariate analysis with ggpairs

Correlation in R: Pearson & Spearman with Matrix Example

barplot

How to basic: bar plots

Ordered barplot and facet

Special plots

Bump plot: plot ranking over time

https://github.com/davidsjoberg/ggbump

Aesthetics

group

https://ggplot2.tidyverse.org/reference/aes_group_order.html
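
A minimal sketch of what the group aesthetic does for line plots. The data frame below is made up for illustration: three subjects measured at four time points.

```r
library(ggplot2)

# Hypothetical long-format data: 3 subjects x 4 time points
long <- data.frame(
  subject = rep(1:3, each = 4),
  time    = rep(1:4, times = 3),
  value   = c(1, 2, 3, 4,  2, 2, 3, 5,  4, 3, 2, 1)
)

# No discrete variable is mapped, so all 12 points form a single group
# and are joined into one line.
p_one_line <- ggplot(long, aes(time, value)) + geom_line()

# group = subject draws one line per subject.
p_by_subject <- ggplot(long, aes(time, value, group = subject)) + geom_line()
p_by_subject
```

Mapping a discrete variable to colour or linetype also splits the data into groups; group is mainly needed when no such mapping is wanted.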

GUI

ggedit & ggplotgui – interactive ggplot aesthetic and theme editor

esquisse (French for 'sketch'): creating ggplot2 charts interactively

https://cran.rstudio.com/web/packages/esquisse/index.html

A 'shiny' gadget to create 'ggplot2' charts interactively with drag-and-drop to map your variables. You can quickly visualize your data accordingly to their type, export to 'PNG' or 'PowerPoint', and retrieve the code to reproduce the chart.

The interface introduces basic terms used in ggplot2:

  • x, y,
  • fill (useful for geom_bar, geom_rect, geom_boxplot, & geom_raster, not useful for scatterplot),
  • color (edges for geom_bar, geom_line, geom_point),
  • size,
  • facet, split up your data by one or more variables and plot the subsets of data together.

The interface does not cover all ggplot2 features. The panels at the bottom of the interface are:

  • Labels & title & caption.
  • Plot options. Palette, theme, legend position.
  • Data. Remove subset of data.
  • Export & code. Copy/save the R code. Export file as PNG or PowerPoint.

ggcharts

https://cran.r-project.org/web/packages/ggcharts/index.html

plotly

R → plotly

ggconf: Simpler Appearance Modification of 'ggplot2'

https://github.com/caprice-j/ggconf

Plotting individual observations and group means

https://drsimonj.svbtle.com/plotting-individual-observations-and-group-means-with-ggplot2

subplot

Easy way to mix multiple graphs on the same page

gridExtra

Force a regular plot object into a Grob for use in grid.arrange

gridGraphics package

make one panel blank/create a placeholder

https://stackoverflow.com/questions/20552226/make-one-panel-blank-in-ggplot2

labs

x and y labels

https://stackoverflow.com/questions/10438752/adding-x-and-y-axis-labels-in-ggplot2 or the Labels part of the cheatsheet

You can set the labels with xlab() and ylab(), with labs(), or through the name argument of a scale_*_*() call.

labs(x = "sample size", y = "ngenes (glmnet)")

name-value pairs

See several examples (color, fill, size, ...) from opioid prescribing habits in Texas.

Prevent sorting of x labels

See Change the order of a discrete x scale.

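The idea is to convert the x variable to a factor whose levels are in the desired plotting order.

# junk: an n x 2 data frame whose rows are already in the desired order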
colnames(junk) <- c("gset", "boot")
junk$gset <- factor(junk$gset, levels = as.character(junk$gset))
ggplot(data = junk, aes(x = gset, y = boot, group = 1)) + 
  geom_line() + 
  theme(axis.text.x=element_text(color = "black", angle=30, vjust=.8, hjust=0.8))

Legends

Legend title

scale_colour_manual("Treatment", values = c("black", "red"))

Hide legend

gg + theme(legend.position="none")

See Remove legend ggplot 2.2, How to remove legend from a ggplot.

guide_legend()

guide_legend() allows the modification of legends for scales, including fill, color, and shape.

This function can be used in scale_fill_manual(), scale_fill_continuous(), ... functions.

scale_fill_manual(values=c("orange", "blue"), 
                  guide=guide_legend(title = "My Legend Title",
                                     nrow=1,
                                     label.position = "top",
                                     keywidth=2.5))

Move the legend from right to top/bottom of the plot

theme(legend.position = "top")

ylim and xlim in ggplot2

https://stackoverflow.com/questions/3606697/how-to-set-limits-for-axes-in-ggplot2-r-plots or the Zooming part of the cheatsheet

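Use one of the following. Note that scale limits and xlim() discard data outside the range before any statistics are computed, whereas coord_cartesian() only zooms the view.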

  • + scale_x_continuous(limits = c(-5000, 5000))
  • + coord_cartesian(xlim = c(-5000, 5000))
  • + xlim(-5000, 5000)
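
The difference between limiting a scale and zooming the coordinate system shows up as soon as a statistic is involved. The sketch below uses a made-up five-point data set and the y-axis analogues (ylim() and coord_cartesian(ylim = ...)) of the calls listed above.

```r
library(ggplot2)

# Toy data with one extreme value
d <- data.frame(g = "a", y = c(1, 2, 3, 4, 100))
base <- ggplot(d, aes(x = g, y = y)) + geom_boxplot()

p_clip <- base + ylim(0, 10)                       # 100 is dropped before the boxplot stat runs
p_zoom <- base + coord_cartesian(ylim = c(0, 10))  # all five values are still used

# Median computed by the boxplot stat in each case
median_clip <- suppressWarnings(layer_data(p_clip)$middle)  # median of 1, 2, 3, 4
median_zoom <- layer_data(p_zoom)$middle                    # median of 1, 2, 3, 4, 100
c(clip = median_clip, zoom = median_zoom)
```

With scale limits the boxplot is computed on four values; with coord_cartesian() it is computed on all five and only the visible region changes.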

ggtitle()

Centered title

See the Legends part of the cheatsheet.

ggtitle("MY TITLE") +
  theme(plot.title = element_text(hjust = 0.5))

Subtitle

ggtitle("My title",
        subtitle = "My subtitle")

margins

https://stackoverflow.com/a/10840417

Time series plot

Multiple lines plot https://stackoverflow.com/questions/14860078/plot-multiple-lines-data-series-each-with-unique-color-in-r

set.seed(45)
nc <- 9
df <- data.frame(x=rep(1:5, nc), val=sample(1:100, 5*nc), 
                   variable=rep(paste0("category", 1:nc), each=5))
# plot
# http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=9
ggplot(data = df, aes(x=x, y=val)) + 
    geom_line(aes(colour=variable)) + 
    scale_colour_manual(values=c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6"))

Versus the old-fashioned base R approach

dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend

Github style calendar plot

geom_bar(), geom_col(), stat_count()

https://ggplot2.tidyverse.org/reference/geom_bar.html

geom_segment()

Line segments, arrows and curves

geom_errorbar(): error bars

set.seed(301)
x <- rnorm(10)
SE <- rnorm(10)
y <- 1:10

par(mfrow=c(2,1))
par(mar=c(0,4,4,4))
xlim <- c(-4, 4)
plot(x[1:5], 1:5, xlim=xlim, ylim=c(0+.1,6-.1), yaxs="i", xaxt = "n", ylab = "", pch = 16, las=1)
mtext("group 1", 4, las = 1, adj = 0, line = 1) # las=text rotation, adj=alignment, line=spacing
par(mar=c(5,4,0,4))
plot(x[6:10], 6:10, xlim=xlim, ylim=c(5+.1,11-.1), yaxs="i", ylab ="", pch = 16, las=1, xlab="")
arrows(x[6:10]-SE[6:10], 6:10, x[6:10]+SE[6:10], 6:10, code=3, angle=90, length=0)
mtext("group 2", 4, las = 1, adj = 0, line = 1)

Stklnpt.svg
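
The code above uses base graphics. A rough ggplot2 counterpart with geom_errorbar() might look like the sketch below. It is not a reproduction of the figure: the data frame layout is an assumption, the bars are drawn vertically rather than horizontally, and the standard errors are forced to be non-negative.

```r
library(ggplot2)

set.seed(301)
est <- data.frame(
  id    = factor(1:10),
  group = rep(c("group 1", "group 2"), each = 5),
  x     = rnorm(10),
  se    = abs(rnorm(10))   # simulated standard errors, made non-negative
)

p_err <- ggplot(est, aes(x = id, y = x)) +
  geom_point() +
  geom_errorbar(aes(ymin = x - se, ymax = x + se), width = 0.2) +  # estimate +/- 1 SE
  facet_wrap(~ group, scales = "free_x") +                         # one panel per group
  labs(x = NULL, y = "estimate")
p_err
```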

geom_rect()

Note that we can use scale_fill_manual() to change the 'fill' colors (scheme/palette). The 'fill' parameter in geom_rect() is only used to define the discrete variable.
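
A minimal sketch of this pattern, with a made-up data frame of three rectangles: fill inside aes() names the discrete variable, and scale_fill_manual() decides the actual colors.

```r
library(ggplot2)

# Hypothetical rectangles; 'phase' is the discrete variable mapped to fill
rects <- data.frame(
  xmin = c(0, 2, 4), xmax = c(2, 4, 6),
  ymin = 0, ymax = 1,
  phase = c("baseline", "treatment", "baseline")
)

p_rect <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = phase)) +
  # choose the color used for each level of 'phase'
  scale_fill_manual(values = c(baseline = "grey80", treatment = "steelblue"))
p_rect
```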

text annotations: ggrepel package

annotate("text", label="Toyota", x=3, y=100)

geom_text(aes(x, y, label), data, size, vjust, hjust, nudge_x)

Fonts

Adding Custom Fonts to ggplot in R

Save the plots

ggsave(): we can specify dpi to increase the resolution. For example:

g1 <- ggplot(data = mydf) 
g1
ggsave("myfile.png", g1, height = 7, width = 8, units = "in", dpi = 500)

graphics::smoothScatter

smoothScatter with ggplot2

BBC

Add your brand to ggplot graph

You Need to Start Branding Your Graphs. Here's How, with ggplot!

Python

plotnine: A Grammar of Graphics for Python.

plotnine is an implementation of a grammar of graphics in Python; it is based on ggplot2. The grammar allows users to compose plots by explicitly mapping data to the visual objects that make up the plot.

The Hitchhiker’s Guide to Plotnine