Ggplot2

ggplot2

Books

The Grammar of Graphics

  • Data: Raw data that we'd like to visualize
  • Geometries: Shapes that we use to visualize data
  • Aesthetics: Properties of geometries (size, color, etc.)
  • Scales: Mappings from data values to aesthetic values
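
As a rough illustration of how the four components above appear in code, here is a minimal sketch. It assumes the mpg data set that ships with ggplot2; the choice of variables and of the "Set1" palette is only for illustration.

```r
library(ggplot2)

# Data: the built-in mpg data set (ships with ggplot2)
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +  # aesthetics: x, y, colour
  geom_point(size = 2) +                                   # geometry: points
  scale_colour_brewer(palette = "Set1")                    # scale: drv values -> colours
p
```

Each component is a separate term in the sum, so any one of them can be swapped without touching the others.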

Scatterplot aesthetics

geom_point(). The available aesthetics depend on the geom.

  • x, y
  • shape
  • color
  • size. 'size' does not always belong inside aes(). See an example at Legend layout.
  • alpha
library(ggplot2)
library(tidyverse)
set.seed(1)
x1 <- rbinom(100, 1, .5) - .5
x2 <- c(rnorm(50, 3, .8)*.1, rnorm(50, 8, .8)*.1)
x3 <- x1*x2*2
# x=1:100, y=x1, x2, x3
tibble(x=1:length(x1), T=x1, S=x2, I=x3) %>% 
  tidyr::pivot_longer(-x) %>% 
  ggplot(aes(x=x, y=value)) + 
  geom_point(aes(color=name))

# Compare with base R graphics
matplot(1:length(x1), cbind(x1, x2, x3), pch=16, 
        col=c('cornflowerblue', 'springgreen3', 'salmon'))
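
The remark about 'size' in the aesthetics list can be sketched as follows: an aesthetic goes inside aes() when it should vary with a variable, and outside aes() when it is a fixed value. This is a minimal sketch using the mpg data set; the variable choices are assumptions made for illustration.

```r
library(ggplot2)

# Mapped: size varies with a variable, so it goes inside aes() and gets a legend
p_mapped <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(size = cyl))

# Set: every point gets the same size, so it goes outside aes() (no legend)
p_fixed <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(size = 3)
```

The same distinction applies to color, alpha and shape.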

Online tutorials

Help

> library(ggplot2)
Need help? Try Stackoverflow: https://stackoverflow.com/tags/ggplot2

Gallery

Some examples

Examples from 'R for Data Science' book - Aesthetic mappings

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))
  # 'mapping' is the 1st argument of every geom_* function, so the argument name can be omitted.
# template
ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

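# add another variable through color, size, alpha or shape
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, color = class))

# size and alpha imply an order, so ggplot2 warns when they are mapped
# to a discrete variable such as class
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, size = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, alpha = class))

# the default shape palette has only 6 shapes; class has 7 levels,
# so one class is left unplotted (with a warning)
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, shape = class))

# set (not map) a constant color by placing it outside aes()
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy), color = "blue")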

# add another variable through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_wrap(~ class, nrow = 2)

# add another 2 variables through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_grid(drv ~ cyl)

Examples from 'R for Data Science' book - Geometric objects, lines and smoothers

How to Add a Regression Line to a ggplot?

# Points
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) # we can add color to aes()

# Line plot (economics is a data set that ships with ggplot2)
ggplot(data = economics) +
  geom_line(aes(x = date, y = unemploy))  # we can add color to aes()

# Smoothed
# 'linewidth' controls the line width (use 'size' in ggplot2 < 3.4.0)
ggplot(data = mpg) + 
  geom_smooth(aes(x = displ, y = hwy), linewidth = 1) 

# Points + smoother, with transparent points and no confidence band (se)
# Transparency makes the smoothed line stand out
#   and the points less prominent
# Here aes() is moved to the 'mapping' argument of ggplot()
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point(alpha=1/10) +
  geom_smooth(se=FALSE)    

# Colored points + smoother
ggplot(data = mpg, aes(x = displ, y = hwy)) + 
  geom_point(aes(color = class)) + 
  geom_smooth()

Examples from 'R for Data Science' book - Transformation, bar plot

# y axis = counts
# bar plot
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut))
# Or
ggplot(data = diamonds) + 
  stat_count(aes(x = cut))

# y axis = proportion
# after_stat(prop) replaces the older ..prop.. notation (ggplot2 >= 3.3.0)
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, y = after_stat(prop), group = 1))

# bar plot with 2 variables
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, fill = clarity))

facet_wrap and facet_grid to create a panel of plots

  • A facet_grid() specification can be created before any data is supplied. For example
    mylayout <- list(ggplot2::facet_grid(cat_y ~ cat_x))
    mytheme <- c(mylayout, 
                 list(ggplot2::theme_bw(), ggplot2::ylim(NA, 1)))
    # we haven't defined cat_y, cat_x variables
    ggplot() + geom_line() + 
      mylayout 
    
  • Multiclass predictive modeling for #TidyTuesday NBER papers
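
The idea in the first bullet above, storing a facet specification and theme in a list and adding it to a plot later, can be sketched in runnable form. This sketch assumes the mpg data set and uses its drv and cyl columns in place of the undefined cat_y and cat_x.

```r
library(ggplot2)

# A list of plot components; no data is needed at this point
shared <- list(
  facet_grid(drv ~ cyl),   # variable names are resolved only when a plot is built
  theme_bw()
)

# The same list can be added to any plot whose data contains drv and cyl
p <- ggplot(mpg, aes(displ, hwy)) + geom_point() + shared
p
```

Adding a list with + applies each element in turn, which makes it a convenient way to share a layout across several plots.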

Color palette

Color blind

colorblindcheck: Check Color Palettes for Problems with Color Vision Deficiency

Color picker

https://github.com/daattali/colourpicker

> library(colourpicker)
> plotHelper(colours=5)

Listening on http://127.0.0.1:6023

Color names

  • ColorNameR - A tool for transforming coordinates in a color space to common color names using data from the Royal Horticultural Society and the International Union for the Protection of New Varieties of Plants.
  • ColorHexa

colorspace package

colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes

scale_fill_discrete_qualitative(palette) and an example. The palette selections are different from scale_fill_XXX(). Note that the number of classes can be arbitrary in scale_fill_discrete_qualitative().

paletteer package

library(paletteer)
paletteer_d("RColorBrewer::RdBu")
#67001FFF #B2182BFF #D6604DFF #F4A582FF #FDDBC7FF #F7F7F7FF 
#D1E5F0FF #92C5DEFF #4393C3FF #2166ACFF #053061FF 

paletteer_d("ggsci::uniform_startrek")
#CC0C00FF #5C88DAFF #84BD00FF #FFCD00FF #7C878EFF #00B5E2FF #00AF66FF 

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
      geom_point() +
      scale_color_paletteer_d("ggsci::uniform_startrek")
# the next is the same as above
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
     geom_point() +
     scale_color_manual(values = c("setosa" = "#CC0C00FF", 
                                   "versicolor" = "#5C88DAFF", 
                                   "virginica" = "#84BD00FF"))

ggokabeito

ggokabeito: Colorblind-friendly, qualitative 'Okabe-Ito' Scales for ggplot2 and ggraph.

# Bad
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8)

# Bad (single color)
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_brewer(name = "Class") +
     scale_color_brewer(name = "Class")

# Bad
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_brewer(name = "Class", palette ="Set1") +
     scale_color_brewer(name = "Class", palette ="Set1")

# Nice
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_okabe_ito(name = "Class") +
     scale_color_okabe_ito(name = "Class")

Colour related aesthetics: colour, fill and alpha

https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html

Scatterplot with large number of points: alpha

smoothScatter with ggplot2

# df is a data frame with columns x and y
ggplot(df, aes(x, y)) +
    geom_point(alpha=.1) 

Combine colors and shapes in legend

  • https://ggplot2-book.org/scales.html#scale-details In order for legends to be merged, they must have the same name.
    df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
    ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=4)
    
  • How to Work with Scales in a ggplot2 in R. This solution is better since it lets us change the legend title. Just make sure the same title is used in both scale_* functions.
    ggplot(mtcars, aes(x=hp, y=mpg)) +
       geom_point(aes(shape=factor(cyl), colour=factor(cyl))) +
       scale_shape_discrete("Cylinders") +
       scale_colour_discrete("Cylinders")
    

ggplot2::scale functions and scales packages

  • Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like size, colour, position or shape.
  • Scales also provide the tools that let you read the plot: the axes and legends.

ggplot2::scale - axes/axis, legend

https://ggplot2-book.org/scales.html

Naming convention: scale_AestheticName_NameDataType where

  • AestheticName can be x, y, color, fill, size, shape, ...
  • NameDataType can be continuous, discrete, manual or gradient.

Examples:

  • See Figure 12.1: Axis and legend components on the book ggplot2: Elegant Graphics for Data Analysis
    # Set x-axis label
    scale_x_discrete("Car type")   # or a shortcut xlab() or labs()
    scale_x_continuous("Displacement")
    
    # Set legend title
    scale_colour_discrete("Drive\ntrain")    # or a shortcut labs()
    
    # Change the default color
    scale_color_brewer()
    
    # Change the axis scale
    scale_x_sqrt()
    
    # Change breaks and their labels
    scale_x_continuous(breaks = c(2000, 4000), labels = c("2k", "4k"))
    
    # Relabel the breaks in a categorical scale
    scale_y_discrete(labels = c(a = "apple", b = "banana", c = "carrot"))
    
  • How to change the color in geom_point or lines in ggplot
    ggplot() + 
      geom_point(data = data, aes(x = time, y = y, color = sample),size=4) +
      scale_color_manual(values = c("A" = "black", "B" = "red"))
    
    ggplot(data = data, aes(x = time, y = y, color = sample)) + 
      geom_point(size=4) + 
      geom_line(aes(group = sample)) + 
      scale_color_manual(values = c("A" = "black", "B" = "red"))
    
  • See an example at geom_linerange where we have to specify the limits parameter in order to make "8" < "16" < "20"; otherwise it is 16 < 20 < 8.
    Browse[2]> order(coordinates$chr)
    [1] 3 4 1 2
    Browse[2]> coordinates$chr 
    [1] "20" "8"  "16" "16"
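
The ordering problem in the last bullet can be reproduced with a small example. This is only a sketch: the data frame below is a made-up stand-in for the coordinates object in the debugger output, and the pos column is hypothetical.

```r
library(ggplot2)

# Toy stand-in for the coordinates data shown above (values are made up)
coordinates <- data.frame(chr = c("20", "8", "16", "16"),
                          pos = c(5, 1, 3, 4))

# Character sorting gives "16" "20" "8"; numeric sorting gives the order we want
chr_levels <- unique(coordinates$chr)
chr_levels <- chr_levels[order(as.numeric(chr_levels))]

p <- ggplot(coordinates, aes(x = chr, y = pos)) +
  geom_point(size = 3) +
  scale_x_discrete(limits = chr_levels)   # "8", "16", "20"
p
```

Passing the sorted labels to limits fixes the axis order without converting the column to a factor.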
    

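ylim and xlim in ggplot2 in axes

https://stackoverflow.com/questions/3606697/how-to-set-limits-for-axes-in-ggplot2-r-plots or the Zooming part of the cheatsheet

Use one of the following. The first and third remove observations outside the limits before any statistics are computed; coord_cartesian() only zooms the view and keeps all the data.

  • + scale_x_continuous(limits = c(-5000, 5000))
  • + coord_cartesian(xlim = c(-5000, 5000))
  • + xlim(-5000, 5000)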

Emulate ggplot2 default color palette

It is just equally spaced hues around the color wheel. Emulate ggplot2 default color palette

Answer 1

gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

n <- 4
cols <- gg_color_hue(n)

dev.new(width = 4, height = 4)
plot(1:n, pch = 16, cex = 2, col = cols)

Answer 2 (better, it shows the color values in HEX). The output should be read from left to right and then top to bottom.

scales package

library(scales)
show_col(hue_pal()(4)) # ("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
                       # (Salmon, Christi, Iris Blue, Heliotrope)
show_col(hue_pal()(2)) # ("#F8766D", "#00BFC4") = (salmon, iris blue) 
           # see https://www.htmlcsscolor.com/ for color names

See also the last example in ggsurv() where the KM plots have 4 strata. The colors can be obtained by scales::hue_pal()(4) with hue_pal()'s default arguments.

The roloc package has a function called colorName() to convert a hex code to a color name.

transform scales

How to make that crazy Fox News y axis chart with ggplot2 and scales

Class variables

"Set1" is a good choice. See RColorBrewer::display.brewer.all()

Heatmap for single channel

How to Make a Heatmap of Customers in R, source code on github. geom_tile() and geom_text() were used. Heatmap in ggplot2 from https://r-charts.com/.

https://scales.r-lib.org/

# White <----> Blue
RColorBrewer::display.brewer.pal(n = 8, name = "Blues")

Heatmap for dual channels

http://www.sthda.com/english/wiki/colors-in-r

library(RColorBrewer)
library(scales)  # for brewer_pal() used below
# Red <----> Blue
display.brewer.pal(n = 8, name = 'RdBu')
# Hexadecimal color specification 
brewer.pal(n = 8, name = "RdBu")

plot(1:8, col=brewer_pal(palette = "RdBu")(8), pch=20, cex=4)

# Blue <----> Red
plot(1:8, col=rev(brewer_pal(palette = "RdBu")(8)), pch=20, cex=4)

[Figure: Twopalette.svg]

Themes and background for ggplot2

Background

  • Export plot in .png with transparent background in base R plot.
    x = c(1, 2, 3)
    op <- par(bg=NA)
    plot (x)
    
    dev.copy(png,'myplot.png')
    dev.off()
    par(op)
    
  • Transparent background with ggplot2
    library(ggplot2)
    data("airquality")
    
    p <- ggplot(airquality, aes(Solar.R, Temp)) +
         geom_point() +
         geom_smooth() +
         # set transparency
         theme(
            panel.grid.major = element_blank(), 
            panel.grid.minor = element_blank(),
            panel.background = element_rect(fill = "transparent",colour = NA),
            plot.background = element_rect(fill = "transparent",colour = NA)
            )
    p
    ggsave("airquality.png", p, bg = "transparent")
    
  • ggplot2 theme background color and grids
    library(ggplot2)
    # base plot reused below (mpg ships with ggplot2; class and drv are illustrative choices)
    p <- ggplot(mpg) + geom_bar(aes(x = class, fill = drv))
    
    p + theme(panel.background=element_rect(fill='purple')) + 
        theme(plot.background=element_blank())
    
    p + theme(panel.background=element_blank()) + 
        theme(plot.background=element_blank()) # minimal background like base R
        # the grid lines are not gone; they are white, the same as the background
    
    p + theme(panel.background=element_blank()) + 
        theme(plot.background=element_blank()) +
        theme(panel.grid.major.y = element_line(color="grey"))
        # draw grid lines for the y-axis only
    
    p + theme_bw()
    
    p + theme_minimal()
    
    p + theme_void()
    
    p + theme_dark()
    

ggthemr

ggthemr package

ggsci

Font size

Change Font Size of ggplot2 Plot in R (5 Examples) | Axis Text, Main Title & Legend

Rotate x-axis labels

theme(axis.text.x = element_text(angle = 90))

Add axis on top or right hand side

Remove labels

Plotting with ggplot: adding titles and axis names

ggthemes package

https://cran.r-project.org/web/packages/ggthemes/index.html

ggplot() + geom_bar() +
           theme_solarized()   # sun color in the background

theme_excel()
theme_wsj()
theme_economist()
theme_fivethirtyeight()

rsthemes

rsthemes

thematic

thematic, Top R tips and news from RStudio Global 2021

Common plots

Scatterplot

Handling overlapping points (slides) and the ebook Fundamentals of Data Visualization by Claus O. Wilke.

Scatterplot with histograms

Bubble Chart

Ellipse

Line plots

Ridgeline plots, mountain diagram

Histogram

A histogram is a special case of a bar plot. Instead of drawing each unique value as a bar, a histogram groups nearby data points into bins.

ggplot(data = txhousing, aes(x = median)) +
  geom_histogram()  # add 'boundary = 0' if we don't expect negative values
                    #   ('origin' in older ggplot2 versions)
                    # add 'bins = 10' to adjust the number of bins
                    # add 'binwidth = 10' to adjust the bin width

Histogram vs barplot from deeply trivial.

Boxplot

Be careful: adding scale_y_continuous(expand = c(0,0), limits = c(0,1)) changes the boxplot if some data fall outside the range (0, 1), because those observations are removed before the box statistics are computed. The console gives a warning message in this case.
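
The effect of scale limits on a boxplot can be checked directly by comparing the computed medians. This is a minimal sketch with made-up data; layer_data() is used here only to read back the statistics that ggplot2 computed, and coord_cartesian() is shown as the alternative that zooms without dropping data.

```r
library(ggplot2)

# Toy data: one group with a single large value
df <- data.frame(g = "a", y = c(1:9, 100))
p  <- ggplot(df, aes(g, y)) + geom_boxplot()

# Scale limits: 100 is dropped (with a warning) before the box statistics are computed
med_scale <- suppressWarnings(
  layer_data(p + scale_y_continuous(limits = c(0, 10)))$middle
)

# coord_cartesian(): all data are used; only the visible window changes
med_zoom <- layer_data(p + coord_cartesian(ylim = c(0, 10)))$middle

c(scale_limits = med_scale, coord_zoom = med_zoom)
```

For this toy data the two medians differ (5 versus 5.5), which is why the two approaches can draw different boxes for the same visible range.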

Base R method

Box Plots - R Base Graphs

dim(df) # 112436 x 2
mycol <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
# mycol defines colors of 4 levels in df$Method (a factor)
boxplot(df$value ~ df$Method, col = mycol, xlab="Method")

Color fill/scale_fill_XXX

n <- 100
k <- 12
set.seed(1234)
cond <- factor(rep(LETTERS[1:k], each=n))
rating <- rnorm(n*k)
dat <- data.frame(cond = cond, rating = rating)

p <- ggplot(dat, aes(x=cond, y=rating, fill=cond)) + 
     geom_boxplot() 

p + scale_fill_hue() + labs(title="hue default") # Same as only p 
p + scale_fill_hue(l=40, c=35) + labs(title="hue options")
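# Note: ColorBrewer palettes hold 8 to 12 colors (Dark2: 8, Paired: 12); with k = 12
#   levels, levels beyond the palette size are not assigned a palette color (with a warning)
p + scale_fill_brewer(palette="Dark2") + labs(title="Dark2")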
p + colorspace::scale_fill_discrete_qualitative(palette = "Dark 3") + labs(title="Dark 3")
p + scale_fill_brewer(palette="Accent") + labs(title="Accent")
p + scale_fill_brewer(palette="Pastel1") + labs(title="Pastel1")
p + scale_fill_brewer(palette="Set1") + labs(title="Set1")
p + scale_fill_brewer(palette="Spectral") + labs(title ="Spectral") 
p + scale_fill_brewer(palette="Paired") + labs(title="Paired")
# cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# p + scale_fill_manual(values=cbbPalette)

[Figure: Scalefill.png]

ColorBrewer palettes: use RColorBrewer::display.brewer.all() to display all brewer palettes.

Reference from ggplot2. scale_fill_binned, scale_fill_brewer, scale_fill_continuous, scale_fill_date, scale_fill_datetime, scale_fill_discrete, scale_fill_distiller, scale_fill_gradient, scale_fill_gradient2, scale_fill_gradientn, scale_fill_grey, scale_fill_hue, scale_fill_identity, scale_fill_manual, scale_fill_ordinal, scale_fill_steps, scale_fill_steps2, scale_fill_stepsn, scale_fill_viridis_b, scale_fill_viridis_c, scale_fill_viridis_d

Jittering - plot the data on top of the boxplot

  • What is a boxplot
  • Quick look
    # Only 1 variable
    ggplot(data.frame(Wi), aes(y = Wi)) + 
      geom_boxplot()
    
    # Two variables, one of them is a factor
    ggplot() + geom_jitter(mapping = aes(x, y))
    
    # Box plot
    ggplot() + geom_boxplot(mapping = aes(x, y))
    
  • geom_jitter()
  • geom_jitter can affect both X and Y values.
    tibble(x=1:4, y=1:4) %>% ggplot(aes(x, y)) + geom_jitter()
    
  • https://stackoverflow.com/a/17560113
  • How to make scatterplot with geom_jitter plot reproducible?
    set.seed(1); data %>%
      ggplot() +
      geom_jitter(aes(T.categ, sex, colour = status))
    
  • https://www.tutorialgateway.org/r-ggplot2-jitter/
  • # df2 is n x 2 
    ggplot(df2, aes(x=nboot, y=boot)) +
      geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
      geom_jitter(aes(color=nboot), position=position_jitter(width=.2, height=0, seed=1)) +
      labs(title="", y = "", x = "nboot")
    

    If we omit the outlier.shape=NA option in geom_boxplot(), we will get the following plot. (Another option is outlier.color = NA).

    [Figure: Jitterboxplot.png]

Groups of boxplots

mydata %>%
  ggplot(aes(x=Factor1, y=Response, fill=factor(Factor2))) +   
  geom_boxplot() 

Another method is to use ggpubr::ggboxplot().

library(ggpubr)
ggboxplot(df, "dose", "len",
           fill = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"), add.params=list(size=0.1),
           notch=TRUE, add = "jitter", outlier.shape = NA, shape=16,
           size = 1/.pt, x.text.angle = 30, 
           ylab = "Silhouette Values", legend="right",
           ggtheme = theme_pubr(base_size = 8)) +
     theme(plot.title = element_text(size=8,hjust = 0.5), 
           text = element_text(size=8), 
           title = element_text(size=8),
           rect = element_rect(size = 0.75/.pt),
           line = element_line(size = 0.75/.pt),
           axis.text.x = element_text(size = 7),
           axis.line = element_line(colour = 'black', size = 0.75/.pt),
           legend.title = element_blank(),
           legend.position = c(0,1), 
           legend.justification = c(0,1),
           legend.key.size = unit(4,"mm"))

Violin plot and sina plot

sina plot from the ggforce package.

library(ggplot2)
ggplot(midwest, aes(state, area)) + geom_violin() + ggforce::geom_sina()

[Figure: Violinplot.png]

Kernel density plot

  • Overlay histograms with density plots
    library(ggplot2); library(tidyr)
    x <- data.frame(v1=rnorm(100), v2=rnorm(100,1,1), 
                    v3=rnorm(100, 0,2))
    data <- pivot_longer(x, cols=1:3)
    ggplot(data, aes(x=value, fill=name)) +
      geom_histogram(aes(y=after_stat(density)), alpha=.25) + 
      stat_density(geom="line", aes(color=name, linetype=name))
    ggplot(data, aes(x=value, fill=name, col =name)) +
      geom_density(alpha = .4)
    

Bivariate analysis with ggpairs

Correlation in R: Pearson & Spearman with Matrix Example

GGally::ggpairs

barplot

Ordered barplot and facet

  • ?reorder. This, as relevel(), is a special case of simply calling factor(x, levels = levels(x)[....]).
    R> bymedian <- with(InsectSprays, reorder(spray, count, median))
    # bymedian will replace spray (a factor) 
    # The data is not changed except the order of levels (a factor) 
    # In this case, the order is determined by the median of count from each spray level
    #   from small to large.
    
    R> InsectSprays[1:3, ]
      count spray
    1    10     A
    2     7     A
    3    20     A
    R> bymedian
     [1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
    [44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
    attr(,"scores")
       A    B    C    D    E    F 
    14.0 16.5  1.5  5.0  3.0 15.0 
    Levels: C E D A F B
    R> InsectSprays$spray
     [1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
    [44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
    Levels: A B C D E F
    R> boxplot(count ~ bymedian, data = InsectSprays,
             xlab = "Type of spray", ylab = "Insect count",
             main = "InsectSprays data", varwidth = TRUE,
             col = "lightgray")
    

    Scatterplot

    tibble(y=sample(6), x=letters[1:6]) %>% 
      ggplot(aes(reorder(x, -y), y)) + geom_point(size=4)
    
  • Sorting the x-axis in bargraphs using ggplot2 or this one from Deeply Trivial. reorder(fac, value) was used.
    ggplot(df, aes(x=reorder(x, -y), y=y)) + geom_bar(stat = 'identity')
    
    df$order <- 1:nrow(df)
    # Assume df$y is a continuous variable and df$fac is a character/factor variable,
    #   and we want to show the categories in the order they appear in the data
    #   (not in R's sorted order, even if the variable is a character rather than a factor)
    # We plot df$fac on the y-axis and df$y on the x-axis. Fortunately,
    #   ggplot2 draws the bars vertically or horizontally depending on the two variables' types
    # "-order" is used so that the first name appears at the top
    ggplot(df, aes(x=y, y=reorder(fac, -order))) + geom_col()
    
    # desc() comes from dplyr
    ggplot(df, aes(x=reorder(x, desc(y)), y=y)) + geom_col()
    
  • Predict #TidyTuesday giant pumpkin weights with workflowsets. fct_reorder()
  • Reordering and facetting for ggplot2. tidytext::reorder_within() was used.
  • Chapter2 of data.table cookbook. reorder(fac, value) was used.
  • PCA and UMAP with tidymodels

Back to back barplot

Pyramid Chart

ggcharts::pyramid_chart()

Flip x and y axes

coord_flip()
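
A minimal sketch of coord_flip() in use, assuming the mpg data set; the bar plot is only an illustrative choice.

```r
library(ggplot2)

# Vertical bars: one bar per vehicle class
p <- ggplot(mpg, aes(x = class)) + geom_bar()

# Same plot with the axes swapped, so the class labels read horizontally
p_flipped <- p + coord_flip()
p_flipped
```

In recent ggplot2 versions a similar result can often be obtained by mapping the categorical variable to y directly, e.g. aes(y = class).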

Rotate x-axis labels

ggplot(mydf) + geom_col(aes(x = model, y=value, fill = method), position="dodge")+
  theme(axis.text.x = element_text(angle = 45, hjust=1))

Starts at zero

Starting bars and histograms at zero in ggplot2

scale_y_continuous(expand = c(0,0), limits = c(0, YourLimit))
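
When the upper limit is not known in advance, one possible variant is to remove only the padding below zero with expansion(). This is a sketch, assuming ggplot2 >= 3.3.0 (where expansion() is available) and the mpg data set.

```r
library(ggplot2)

# Default: ggplot2 pads both ends of the y axis, so bars float above the axis line
p <- ggplot(mpg, aes(x = class)) + geom_bar()

# Remove the padding below zero but keep 5% headroom above the tallest bar
p_zero <- p + scale_y_continuous(expand = expansion(mult = c(0, 0.05)))
p_zero
```

Because no limits are set here, no data can be dropped by the scale.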

Add patterns

Waterfall plot

Polygon and map plot

https://ggplot2.tidyverse.org/reference/geom_polygon.html

geom_step: Step function

Connect observations: geom_path(), geom_step()

Example: KM curves (without legend)

library(survival)
sf <- survfit(Surv(time, status) ~ x, data = aml)
sf
str(sf) # the first 10 time points form one stratum and the remaining 10 form the other
ggplot() + 
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10])), 
            col='red') + 
  scale_x_continuous('Time', limits = c(0, 161)) + 
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20])), 
            col='black') 
# cf:  plot(sf, col = c('red', 'black'), mark.time=FALSE)

Same example but with legend (see Construct a manual legend for a complicated plot)

cols <- c("NEW"="#f04546","STD"="#3591d1")
ggplot() + 
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10]), col='NEW')) +
  scale_x_continuous('Time', limits = c(0, 161)) + 
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20]), col='STD')) + 
  scale_colour_manual(name="Treatment", values = cols)

To control the line width, use the linewidth parameter (called size before ggplot2 3.4.0); e.g. geom_step(aes(x, y), linewidth=.5). The default is .5; a geom's defaults are listed in its default_aes field, e.g. GeomStep$default_aes.

To allow different line types, use the linetype parameter. The first level is solid line, the 2nd level is dashed, ... We can change the default line types by using the scale_linetype_manual() function. See Line Types in R: The Ultimate Guide for R Base Plot and GGPLOT.
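
The line width and line type options described above can be sketched together. The data below are made up for illustration (they are not the aml survival estimates), and the linewidth argument assumes ggplot2 >= 3.4.0.

```r
library(ggplot2)

# Toy step data for two groups (values are made up for illustration)
km <- data.frame(
  time  = c(0, 5, 10, 20, 0, 8, 15, 30),
  surv  = c(1, 0.8, 0.5, 0.2, 1, 0.9, 0.7, 0.4),
  group = rep(c("NEW", "STD"), each = 4)
)

p <- ggplot(km, aes(time, surv, colour = group, linetype = group)) +
  geom_step(linewidth = 0.8) +   # use size = 0.8 on ggplot2 < 3.4.0
  scale_linetype_manual(values = c(NEW = "solid", STD = "dotdash"))
p

# Default aesthetics of a geom, including the default line width
GeomStep$default_aes
```

Mapping colour and linetype to the same variable lets ggplot2 merge them into a single legend.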

Coefficients, intervals, errorbars

Comparing similarities / differences between groups

comparing similarities / differences between groups

Special plots

Dot plot & forest plot

Correlation Analysis Different

Bump plot: plot ranking over time

https://github.com/davidsjoberg/ggbump

Gauge plots

Sankey diagrams

Aesthetics

  • We can create a new aesthetic name in the aes(aesthetic = variable) function; for example, "text2" below. In this case the name "text2" is not shown; only the original variable is used.
    library(plotly)
    g <- ggplot(tail(iris), aes(Petal.Length, Sepal.Length, text2=Species)) + geom_point()
    ggplotly(g, tooltip = c("Petal.Length", "text2"))
    

aes_string()

group

https://ggplot2.tidyverse.org/reference/aes_group_order.html

GUI/Helper packages

ggedit & ggplotgui – interactive ggplot aesthetic and theme editor

esquisse (French, means 'sketch'): creating ggplot2 interactively

https://cran.rstudio.com/web/packages/esquisse/index.html

A 'shiny' gadget to create 'ggplot2' charts interactively with drag-and-drop to map your variables. You can quickly visualize your data according to their type, export to 'PNG' or 'PowerPoint', and retrieve the code to reproduce the chart.

The interface introduces basic terms used in ggplot2:

  • x, y,
  • fill (useful for geom_bar, geom_rect, geom_boxplot, & geom_raster, not useful for scatterplot),
  • color (edges for geom_bar, geom_line, geom_point),
  • size,
  • facet, split up your data by one or more variables and plot the subsets of data together.

It does not include all features in ggplot2. At the bottom of the interface,

  • Labels & title & caption.
  • Plot options. Palette, theme, legend position.
  • Data. Remove subset of data.
  • Export & code. Copy/save the R code. Export file as PNG or PowerPoint.

ggcharts

https://cran.r-project.org/web/packages/ggcharts/index.html

ggeasy

ggx

https://github.com/brandmaier/ggx Create ggplot in natural language

Interactive

plotly

R web → plotly

ggiraph

ggiraph: Make 'ggplot2' Graphics Interactive

ggconf: Simpler Appearance Modification of 'ggplot2'

https://github.com/caprice-j/ggconf

Plotting individual observations and group means

https://drsimonj.svbtle.com/plotting-individual-observations-and-group-means-with-ggplot2

subplot

Adding/Inserting an image to ggplot2

Inserting an image to ggplot2: See annotation_custom.

See also ggbernie, which takes a different approach: ggplot2::layer() and a self-defined geom (geometric object).

Easy way to mix/combine multiple graphs on the same page

Common legend

Add a common Legend for combined ggplots

library(ggplot2)
library(patchwork)

p1 <- ggplot(df1, aes(x = x, y = y, colour = group)) + 
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)
p2 <- ggplot(df2, aes(x = x, y = y, colour = group)) + 
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)

# Method 1:
p1 + p2 + theme(legend.position = "bottom") + plot_layout(guides = "collect")
                                          # two legends on the RHS
# Method 2:
p1 + p2 + plot_layout(guides = "collect") # two legends on the RHS
# Method 3:
p1 + theme(legend.position="none") + p2  # legend (based on p2) is on the RHS
# Method 4:
p1 + p2 + theme(legend.position="none")  # legend (based on p1) is in the middle!!
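
Another patchwork idiom that may give a single shared legend is to collect the guides and apply the theme to every panel with the & operator. The sketch below uses made-up data in place of df1 and df2; whether the legends actually merge depends on the two legends being identical.

```r
library(ggplot2)
library(patchwork)

# Toy data standing in for df1 and df2 above
set.seed(1)
df1 <- data.frame(x = rnorm(30), y = rnorm(30), group = rep(c("a", "b", "c"), 10))
df2 <- data.frame(x = rnorm(30), y = rnorm(30), group = rep(c("a", "b", "c"), 10))

p1 <- ggplot(df1, aes(x, y, colour = group)) + geom_point()
p2 <- ggplot(df2, aes(x, y, colour = group)) + geom_point()

# guides = "collect" merges identical legends; `&` applies the theme to every panel
combined <- (p1 + p2) +
  plot_layout(guides = "collect") &
  theme(legend.position = "bottom")
combined
```

With +, a theme() call only modifies the last plot added, which is consistent with the differing legend positions noted in the methods above.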

annotation_custom

  • predcurvePlot.R from TreatmentSelection. One issue is the font size is large for the text & labels at the bottom. The 2nd issue is the bottom part of the graph/annotation (marker value scale) can be truncated if the window size is too large. If the window is too small, the bottom part can overlap with the top part.
    p <- p + theme(plot.margin = unit(c(1,1,4,1), "lines"))  # hard coding
    p <- p + annotation_custom() # axis for marker value scale
    p <- p + annotation_custom() # label only
    
    • Similar plot using base R graphics instead of ggplot2. One issue is that the text is not below the scale (this can be fixed by par(mar) & mtext(text, side=1, line=4)); the 2nd issue is the same as with the ggplot2 approach.
      axis(1,at= breaks, label = round(quantile(x1, prob = breaks/100), 1),pos=-0.26) # hard coding
      
    • Another common problem is that the plot saved by pdf() or png() can be truncated too. I have had better luck with png() though.

grid

gridExtra

Force a regular plot object into a Grob for use in grid.arrange

gridGraphics package

make one panel blank/create a placeholder

https://stackoverflow.com/questions/20552226/make-one-panel-blank-in-ggplot2

labs for x and y axes

x and y labels

https://stackoverflow.com/questions/10438752/adding-x-and-y-axis-labels-in-ggplot2 or the Labels part of the cheatsheet

You can set the labels with xlab() and ylab(), or make it part of the scale_*.* call.

labs(x = "sample size", y = "ngenes (glmnet)")

scale_x_discrete(name="sample size")
scale_y_continuous(name="ngenes (glmnet)", limits=c(100, 500))

Change tick mark labels

ggplot2 axis ticks : A guide to customize tick marks and labels

name-value pairs

See several examples (color, fill, size, ...) from opioid prescribing habits in texas.

Prevent sorting of x labels

See Change the order of a discrete x scale.

The idea is to set the levels of x variable.

junk   # n x 2 table
colnames(junk) <- c("gset", "boot")
junk$gset <- factor(junk$gset, levels = as.character(junk$gset))
ggplot(data = junk, aes(x = gset, y = boot, group = 1)) + 
  geom_line() + 
  theme(axis.text.x=element_text(color = "black", angle=30, vjust=.8, hjust=0.8))
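
The factor() call above uses as.character(junk$gset) as the levels, which fails when a label occurs more than once because factor levels must be unique. A small sketch of a variant that tolerates repeats, using made-up data in place of junk:

```r
# Toy data with a repeated label (values are made up)
junk <- data.frame(gset = c("set10", "set2", "set10", "set1"),
                   boot = c(0.4, 0.7, 0.5, 0.9))

# unique() keeps first-appearance order and avoids duplicated factor levels
junk$gset <- factor(junk$gset, levels = unique(junk$gset))
levels(junk$gset)
```

The resulting factor can be passed to ggplot() exactly as in the example above.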

Legends

Legend title

  • labs() function
    p <- ggplot(df, aes(x, y)) + geom_point(aes(colour = z))
    p + labs(x = "X axis", y = "Y axis", colour = "Colour\nlegend")
    
  • scale_colour_manual()
    scale_colour_manual("Treatment", values = c("black", "red"))
    
  • scale_color_discrete() and scale_shape_discrete(). See Combine colors and shapes in legend.
    df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
    ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=5) + 
      scale_color_discrete('new title') + scale_shape_discrete('new title')
    

Layout: move the legend from right to top/bottom of the plot or hide it

gg + theme(legend.position = "top")

# Useful in the boxplot case
gg + theme(legend.position="none")

Guide functions for finer control

https://ggplot2-book.org/scales.html#guide-functions The guide functions, guide_colourbar() and guide_legend(), offer additional control over the fine details of the legend.

guide_legend() allows the modification of legends for scales, including fill, color, and shape.

This function can be used in scale_fill_manual(), scale_fill_continuous(), ... functions.

scale_fill_manual(values=c("orange", "blue"), 
                  guide=guide_legend(title = "My Legend Title",
                                     nrow=1,  # multiple items in one row
                                     label.position = "top", # move the texts on top of the color key
                                     keywidth=2.5)) # increase the color key width

The problem with the default setting is it leaves a lot of white space above and below the legend. To change the position of the entire legend to the bottom of the plot, we use theme().

theme(legend.position = 'bottom')

Legend symbol background

# x, y, z and w are placeholder column names in a data frame df
ggplot(df) + geom_point(aes(x, y, color = z, size = w)) +
           theme(legend.key = element_blank())
           # remove the symbol background in legend

Construct a manual legend for a complicated plot

https://stackoverflow.com/a/17149021

Legend size

How to Change Legend Size in ggplot2 (With Examples)

ggtitle()

Centered title

See the Legends part of the cheatsheet.

ggtitle("MY TITLE") +
  theme(plot.title = element_text(hjust = 0.5))

Subtitle[edit]

ggtitle("My title",
        subtitle = "My subtitle")

margins[edit]

https://stackoverflow.com/a/10840417

Aspect ratio[edit]

?coord_fixed

p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + coord_fixed() # one unit on x is as long as one unit on y, so the panel becomes a wide, short strip
p  # fill up plot region

Time series plot[edit]

Multiple lines plot https://stackoverflow.com/questions/14860078/plot-multiple-lines-data-series-each-with-unique-color-in-r

set.seed(45)
nc <- 9
df <- data.frame(x=rep(1:5, nc), val=sample(1:100, 5*nc), 
                   variable=rep(paste0("category", 1:nc), each=5))
# plot
# http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=9
ggplot(data = df, aes(x=x, y=val)) + 
    geom_line(aes(colour=variable)) + 
    scale_colour_manual(values=c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6"))

Versus the old-fashioned base R approach

dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend

calendR[edit]

Calendar plot in R using ggplot2

Github style calendar plot[edit]

geom_point()[edit]

df <- data.frame(x=1:3, y=1:3, color=c("red", "green", "blue"))
# Use I() to set aes values to the identity of a value from your data table
ggplot(df, aes(x,y, color=I(color))) + geom_point(size=10)
# VS
ggplot(df, aes(x,y, color=color)) + geom_point(size=10) # color is like a class label

geom_bar(), geom_col(), stat_count()[edit]

https://ggplot2.tidyverse.org/reference/geom_bar.html

geom_col(position = 'dodge')  # same as 
geom_bar(stat = 'identity', position = 'dodge')

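geom_bar() uses stat_count() by default, so the bar height is the number of rows in each group and the y aesthetic cannot be specified. To supply the y values yourself, use geom_col().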

ggplot() + geom_col(mapping = aes(x, y))
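A small sketch of the difference, assuming the mpg data set that ships with ggplot2. The object names (class_counts, p_bar, p_col) are made up for this example. Both plots show the same bar heights: geom_bar() counts the rows itself, while geom_col() is handed counts that were computed beforehand.

```r
library(ggplot2)

# geom_bar(): only x is mapped; stat_count() computes the bar heights
p_bar <- ggplot(mpg, aes(x = class)) + geom_bar()

# geom_col(): the heights are precomputed and mapped to y
class_counts <- as.data.frame(table(class = mpg$class))  # columns: class, Freq
p_col <- ggplot(class_counts, aes(x = class, y = Freq)) + geom_col()

p_bar
p_col
```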

Add numbers to the plot[edit]

An example

Ordered barplot and reorder()[edit]

Ordered barplot and facet

stat_function()[edit]

geom_area()[edit]

The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think

geom_segment()[edit]

Line segments, arrows and curves

Cf annotate("segment", ...)

Square shaped plot[edit]

ggplot() + theme(aspect.ratio=1) # do not adjust xlim, ylim

xylim <- range(c(x, y))
ggplot() + coord_fixed(xlim=xylim, ylim=xylim) 
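The two approaches differ in what they hold fixed. A runnable sketch, assuming mtcars with wt and qsec as arbitrary example variables:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = qsec)) + geom_point()

# Option 1: square panel; each axis keeps its own data range
p_square <- p + theme(aspect.ratio = 1)

# Option 2: identical limits on both axes plus equal unit length,
# which also gives a square panel
xylim <- range(c(mtcars$wt, mtcars$qsec))
p_equal <- p + coord_fixed(xlim = xylim, ylim = xylim)

p_square
p_equal
```

Option 1 is enough when only the panel shape matters. Option 2 is the better fit when x and y are on the same scale and should be compared directly, for example observed versus predicted values.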

geom_line()[edit]

See also aes(..., group, ...).

Connect Paired Points with Lines in Scatterplot[edit]

Use geom_line() to create a square bracket to annotate the plot[edit]

Barchart with Significance Tests

geom_errorbar(): error bars[edit]

set.seed(301)
x <- rnorm(10)
SE <- rnorm(10)
y <- 1:10

par(mfrow=c(2,1))
par(mar=c(0,4,4,4))
xlim <- c(-4, 4)
plot(x[1:5], 1:5, xlim=xlim, ylim=c(0+.1,6-.1), yaxs="i", xaxt = "n", ylab = "", pch = 16, las=1)
mtext("group 1", 4, las = 1, adj = 0, line = 1) # las=text rotation, adj=alignment, line=spacing
par(mar=c(5,4,0,4))
plot(x[6:10], 6:10, xlim=xlim, ylim=c(5+.1,11-.1), yaxs="i", ylab ="", pch = 16, las=1, xlab="")
arrows(x[6:10]-SE[6:10], 6:10, x[6:10]+SE[6:10], 6:10, code=3, angle=90, length=0)
mtext("group 2", 4, las = 1, adj = 0, line = 1)
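The code above uses base R graphics. A rough ggplot2 counterpart with geom_errorbar() could look like the sketch below. It is not a line-by-line translation: the bars are drawn vertically, the two groups become facets, and abs() is applied to the simulated standard errors (an assumption of this sketch) so the half-widths are non-negative. The column names id, est, se and group are hypothetical.

```r
library(ggplot2)

set.seed(301)
df <- data.frame(
  id    = 1:10,
  est   = rnorm(10),
  se    = abs(rnorm(10)),  # assumption: abs() keeps the half-widths non-negative
  group = rep(c("group 1", "group 2"), each = 5)
)

p <- ggplot(df, aes(x = id, y = est)) +
  geom_errorbar(aes(ymin = est - se, ymax = est + se), width = 0.2) +  # est +/- se
  geom_point() +
  facet_wrap(~ group, ncol = 1, scales = "free_x")  # one panel per group
p
```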


geom_rect(), geom_bar()[edit]

Note that we can use scale_fill_manual() to change the 'fill' colors (scheme/palette). The 'fill' aesthetic in geom_rect() or geom_bar() only names the discrete variable that defines the groups; the actual colors come from the fill scale.

# template
ggplot(data = <DATA>) +
  geom_bar(aes(x = <X>, fill = <GROUP>)) +
  scale_fill_manual(values = c("orange", "blue"))
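A filled-in version of this pattern, as a sketch that assumes mtcars with gear on the x-axis and vs as the grouping variable. Naming the elements of values ties each color to a specific level rather than relying on level order.

```r
library(ggplot2)

p <- ggplot(data = mtcars) +
  geom_bar(aes(x = factor(gear), fill = factor(vs))) +        # fill names the grouping variable
  scale_fill_manual(values = c("0" = "orange", "1" = "blue"))  # the scale picks the colors
p
```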

geom_raster() and geom_tile()[edit]

geom_linerange[edit]

Circle[edit]

Circle in ggplot2

ggplot(data.frame(x = 0, y = 0), aes(x, y)) + geom_point(size = 25, pch = 1)

Annotation[edit]

geom_hline(), geom_vline()[edit]

geom_hline(yintercept=1000)
geom_vline(xintercept=99)

text annotations, annotate() and geom_text(): ggrepel package[edit]

  • ggrepel package. Found on Some datasets for teaching data science by Rafael Irizarry.
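    library(ggplot2)
    library(ggrepel)
    # Assumed data for illustration: mtcars with the row names in a 'car' column
    dat <- transform(mtcars, car = rownames(mtcars))
    p <- ggplot(dat, aes(wt, mpg, label = car)) +
      geom_point(color = "red")
    
    p1 <- p + geom_text() + labs(title = "geom_text()") # Bad: labels overlap
    
    p2 <- p + geom_text_repel() + labs(title = "geom_text_repel()") # Good: labels repel each other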
    

Text wrap[edit]

ggplot2 is there an easy way to wrap annotation text?

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Solution 1: Does not work with Chinese characters
wrapper <- function(x, ...) paste(strwrap(x, ...), collapse = "\n")
# The label
my_label <- "Some arbitrarily larger text"
# and finally your plot with the label
p + annotate("text", x = 4, y = 25, label = wrapper(my_label, width = 5))

# Solution 2: Does not work with Chinese characters
library(RGraphics)
library(ggplot2)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
grob1 <-  splitTextGrob("Some arbitrarily larger text")
p + annotation_custom(grob = grob1,  xmin = 3, xmax = 4, ymin = 25, ymax = 25) 

# Solution 3: stringr::str_wrap()
my_label <- "太極者無極而生。陰陽之母也。動之則分。靜之則合。無過不及。隨曲就伸。人剛我柔謂之走。我順人背謂之黏。"
p <- ggplot() + geom_point() + xlim(0, 400) + ylim(0, 300) # 400x300 e-paper
p + annotate("text", x = 0, y = 200, hjust=0, size=5,
             label = stringr::str_wrap(my_label, width = 30)) +
    theme_bw() + 
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(), 
          panel.border = element_blank(),
          axis.title = element_blank(), 
          axis.text = element_blank(),
          axis.ticks = element_blank()) 

ggtext[edit]

ggtext: Improved text rendering support for ggplot2

ggforce - Annotate areas with ellipses[edit]

geom_mark_ellipse()

Other geoms[edit]

Exploring other {ggplot2} geoms

geomtextpath[edit]

geomtextpath- Create curved text in ggplot2

Fonts[edit]

Lines of best fit[edit]

Lines of best fit

Save the plots[edit]

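In my experience, ggsave() is better than png(). ggsave() takes the size in inches plus a dpi, so the text grows in pixels as the resolution increases. With png(), enlarging only the pixel dimensions (at the default resolution of 72 ppi) keeps the text at the same pixel size, so it looks smaller in the larger image.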

...
ggsave("filename.png", object, width=8, height=4)
# vs
png("filename.png", width=1200, height=600)
...
dev.off()
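To get comparable output from png(), the physical size and resolution can be given explicitly through its units and res arguments. The sketch below assumes a simple mtcars scatterplot and writes both files to tempdir(); the file names are arbitrary.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

out_ggsave <- file.path(tempdir(), "via-ggsave.png")
out_png    <- file.path(tempdir(), "via-png.png")

# ggsave(): size in inches, resolution in dpi
ggsave(out_ggsave, p, width = 8, height = 4, units = "in", dpi = 300)

# png(): request the same physical size and resolution
png(out_png, width = 8, height = 4, units = "in", res = 300)
print(p)
dev.off()
```

Both calls ask for an 8 x 4 inch image at 300 pixels per inch, so the text should appear at a similar relative size in the two files.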

With ggsave(), we can specify dpi to increase the resolution. For example,

g1 <- ggplot(data = mydf) 
g1
ggsave("myfile.png", g1, height = 7, width = 8, units = "in", dpi = 500)

When saving to an .svg file, I got the error "Error in loadNamespace(name) : there is no package called ‘svglite’". After I installed the svglite package, everything worked fine.

ggsave("raw-output.bmp", p, width=4, height=3, dpi = 100)
# Will generate 4*100 x 3*100 pixel plot

Multiple pages in pdf[edit]

https://stackoverflow.com/a/53698682. The key is to save the plot in an object and use the print() function.

pdf("FileName", onefile = TRUE)
for(i in 1:I) {
  p <- ggplot()
  print(p)
}
dev.off()
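A concrete version of this loop, as a sketch that assumes one page per value of cyl in mtcars and writes to a file in tempdir() (the file name is arbitrary):

```r
library(ggplot2)

out <- file.path(tempdir(), "by-cyl.pdf")

pdf(out, onefile = TRUE)
for (k in sort(unique(mtcars$cyl))) {
  p <- ggplot(subset(mtcars, cyl == k), aes(wt, mpg)) +
    geom_point() +
    ggtitle(paste("cyl =", k))
  print(p)  # inside a loop the plot is only drawn when printed; each print() adds a page
}
dev.off()
```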

graphics::smoothScatter[edit]

smoothScatter with ggplot2

Other tips/FAQs[edit]

Tips and tricks for working with images and figures in R Markdown documents

Ten Simple Rules for Better Figures[edit]

Ten Simple Rules for Better Figures

ggplot2 does not appear to work when inside a function[edit]

https://stackoverflow.com/a/17126172. Use print() or ggsave(). When you call ggplot() interactively at the command line, the result is printed automatically, but in source() or inside your own functions you need an explicit print() statement.
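A minimal sketch of the pattern. The function name plot_scatter and its arguments are hypothetical, and the .data pronoun is used only so that column names can be passed as strings.

```r
library(ggplot2)

plot_scatter <- function(data, xvar, yvar) {
  p <- ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point()
  print(p)      # explicit print(): without it nothing is drawn at this point
  invisible(p)  # return the plot so the caller can modify or save it
}

p <- plot_scatter(mtcars, "wt", "mpg")
```

Returning the plot invisibly keeps the function usable both ways: it draws as a side effect, and the result can still be passed to ggsave().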

BBC[edit]

Add your brand to ggplot graph[edit]

You Need to Start Branding Your Graphs. Here's How, with ggplot!

Animation and gganimate[edit]

ggstatsplot[edit]

ggstatsplot: ggplot2 Based Plots with Statistical Details

Write your own ggplot2 function: rlang[edit]

Some packages depend on ggplot2[edit]

dittoSeq from Bioconductor

Meme[edit]

Python[edit]

plotnine: A Grammar of Graphics for Python.

plotnine is an implementation of a grammar of graphics in Python, based on ggplot2. The grammar allows users to compose plots by explicitly mapping data to the visual objects that make up the plot.

The Hitchhiker’s Guide to Plotnine