Ggplot2: Difference between revisions

From 太極
 
(199 intermediate revisions by the same user not shown)
Line 57: Line 57:
* [https://cedricscherer.netlify.app/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ A ggplot2 Tutorial for Beautiful Plotting in R]
* [https://rafalab.github.io/dsbook/ggplot2.html Chapter 7 ggplot2] from Introduction to Data Science Data Analysis and Prediction Algorithms with R, Rafael A. Irizarry
* [https://youtu.be/h29g21z0a68 Plotting anything with ggplot2] - ggplot2 workshop part 1 (youtube) by Thomas Lin Pedersen


== Help ==
Line 67: Line 68:
* https://www.r-graph-gallery.com/ggplot2-package.html
* http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html
* [https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ A ggplot2 Tutorial for Beautiful Plotting in R]


= Some examples =
Line 160: Line 162:


== facet_wrap and facet_grid to create a panel of plots ==
* '''facet_wrap'''(, nrow=4, ncol=3) in ggplot2 provides a solution similar to par(mfrow=c(4, 3)) in base R.
* http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/
* Another example [http://freerangestats.info/blog/2019/05/19/polls-v-results Polls v results]
Line 177: Line 180:
[https://juliasilge.com/blog/nber-papers/ Multiclass predictive modeling for #TidyTuesday NBER papers]
</li>
</li>
<li>[https://stackoverflow.com/a/63858007 changing the facet_wrap labels using labeller in ggplot2]. The solution is to create a '''labeller''' function as a function of a variable x (or any other name as long as it's not the faceting variables' names) and then coerce to labeller with '''as_labeller'''.
</ul>
</ul>
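The labeller approach described above can be sketched as follows. This is only a minimal illustration, assuming the built-in mtcars data; the helper name cyl_label is hypothetical and does not come from the linked answer.

```r
library(ggplot2)

# Hypothetical labelling function: it receives the facet values as a
# character vector and returns the strip labels to display.
cyl_label <- function(x) paste(x, "cylinders")

p <- ggplot(mtcars, aes(hp, mpg)) +
  geom_point() +
  # as_labeller() coerces the plain function into a labeller
  facet_wrap(~ cyl, labeller = as_labeller(cyl_label))
print(p)
```

The function argument is named x here, which is fine because no faceting variable is called x.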
== lattice::xyplot ==
<pre>
library(lattice)
df <- data.frame(x = rnorm(100), y = rnorm(100), group = sample(c("A", "B"), 100, replace = TRUE))
# Use the xyplot() function to create the plot
# with each group represented by a different color
# result is 1 plot only
# no annotation
xyplot(y ~ x, data = df, groups = group)
</pre>
<pre>
df <- data.frame(x = rnorm(100), y = rnorm(100),
                group = sample(c("A", "B"), 100, replace = TRUE),
                time = sample(c("T1", "T2"), 100, replace = TRUE))
# 2 plots grouped by time
# two colors (defined by group) were used in each plot
# no annotation
xyplot(y ~ x | time, groups = group, data = df)
</pre>
For more complicated plots, we can use the '''panel''' argument.
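As a rough sketch of the '''panel''' argument, the example below draws the points and adds a least-squares line in each panel. The simulated data and the choice of panel.lmline() are assumptions made for illustration only.

```r
library(lattice)

set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100),
                 time = sample(c("T1", "T2"), 100, replace = TRUE))

# The panel function is called once per panel with that panel's data
p <- xyplot(y ~ x | time, data = df,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)          # the usual scatterplot
              panel.lmline(x, y, col = "red")  # add a regression line
            })
print(p)
```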


= Color palette =
Line 187: Line 214:
* [https://twitter.com/moriah_taylor58/status/1395431000977649665?s=20 a MEGA thread about all the ways you can choose a palette] May 2021


== Color blind ==
== Top color palettes ==
[https://cran.r-project.org/web/packages/colorblindcheck/index.html colorblindcheck]: Check Color Palettes for Problems with Color Vision Deficiency
* [https://www.datanovia.com/en/blog/top-r-color-palettes-to-know-for-great-data-visualization/ Top R Color Palettes to Know for Great Data Visualization]
 
== Color picker ==
https://github.com/daattali/colourpicker


== Display color palettes ==
<ul>
<li>Use barplot()
<pre>
<pre>
> library(colourpicker)
pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
> plotHelper(colours=5)
# pal <- sample(colors(), 10) # randomly pick 10 colors


Listening on http://127.0.0.1:6023
barplot(rep(1, length(pal)), col = pal, space = 0,
        axes = FALSE, border = NA)
par()$usr
# [1] -0.20  5.20 -0.01 1.00
</pre>
</pre>
[[File:Palettebarplot.png|250px]]


== Color names ==
<li>Use heatmap()
* [https://github.com/msanchez-beeckman/colornamer ColorNameR] - A tool for transforming coordinates in a color space to common color names using data from the Royal Horticultural Society and the International Union for the Protection of New Varieties of Plants.
<pre>
* [https://www.colorhexa.com/color-names ColorHexa]
pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
pal <- matrix(pal, nrow = 2) # expect a warning: 5 colors do not fill a 2 x 3 matrix, so the first color is recycled
#      [,1]      [,2]      [,3]   
# [1,] "#E41A1C" "#4DAF4A" "#FF7F00"
# [2,] "#377EB8" "#984EA3" "#E41A1C"
pal_matrix <- matrix(seq_along(pal), nr=nrow(pal), nc=ncol(pal))
heatmap(pal_matrix, col = pal, Rowv = NA, Colv = NA, scale = "none",
        ylab = "", xlab = "", main = "", margins = c(5, 5))
# 2 rows, 3 columns with labeling on two axes
par()$usr
# [1] 0 1 0 1
</pre>
[[File:Paletteheatmap.png|250px]]


== colorspace package ==
<li>Use image()
[https://cran.r-project.org/web/packages/colorspace/index.html colorspace]: A Toolbox for Manipulating and Assessing Colors and Palettes
<pre>
pal <- palette() # R 4.0 has a new default palette
                # The old colors are highly saturated and vary enormously
                # in terms of luminance
# [1] "black"  "#DF536B" "#61D04F" "#2297E6" "#28E2E5" "#CD0BBC" "#F5C710"
# [8] "gray62"
pal_matrix <- matrix(seq_along(pal), nr=1)
image(pal_matrix, col = pal, axes = FALSE)
# 8 rows, 1 column, but no labeling
# Starting from bottom, left.


[http://colorspace.r-forge.r-project.org/reference/scale_colour_discrete_qualitative.html scale_fill_discrete_qualitative(palette)] and an [https://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ example]. The palette selections are different from scale_fill_XXX(). Note that the number of classes can be arbitrary in scale_fill_discrete_qualitative().
par()$usr  # change with the data dim
text(0, (par()$usr[4]-par()$usr[3])/8*c(0:7),
    labels = pal)
</pre>
[[File:Rpalette.png|250px]]


== *paletteer package ==
<li>Use [https://scales.r-lib.org/reference/show_col.html scales::show_col()]
* [https://paulvanderlaken.com/2020/03/17/paletteer-hundreds-of-color-palettes-in-r/ The paletteer package offers direct access to 1759 color palettes, from 50 different packages!]
<pre>
* [https://emilhvitfeldt.github.io/paletteer/index.html paletteer], [https://emilhvitfeldt.github.io/paletteer/reference/paletteer_d.html paletteer_d()] function for getting discrete palette by package and name.
scales::show_col(palette())
* [https://awesomeopensource.com/project/EmilHvitfeldt/r-color-palettes *More examples with a gallery]
</pre>
[[File:Paletteshowcol.png|250px]]
</ul>
 
== colors() ==
In R, colors() is a function that returns a character vector of color names available in R.


To obtain the hexadecimal codes for all colors obtained by colors()
<pre>
<pre>
paletteer_d("RColorBrewer::RdBu")
rgb_values <- col2rgb(colors())
#67001FFF #B2182BFF #D6604DFF #F4A582FF #FDDBC7FF #F7F7F7FF
#D1E5F0FF #92C5DEFF #4393C3FF #2166ACFF #053061FF


paletteer_d("ggsci::uniform_startrek")
# Convert the RGB values to hexadecimal codes
#CC0C00FF #5C88DAFF #84BD00FF #FFCD00FF #7C878EFF #00B5E2FF #00AF66FF
hex_codes <- apply(rgb_values, 2,
                  function(x) rgb(x[1], x[2], x[3],
                  maxColorValue = 255))


ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
# View the first few hexadecimal codes
      geom_point() +
head(hex_codes)
      scale_color_paletteer_d("ggsci::uniform_startrek")
# the next is the same as above
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
    geom_point() +
    scale_color_manual(values = c("setosa" = "#CC0C00FF",
                                  "versicolor" = "#5C88DAFF",
                                  "virginica" = "#84BD00FF"))
</pre>
</pre>


== ggokabeito ==
== palette() ==
[https://cran.r-project.org/web/packages/ggokabeito/index.html ggokabeito]: Colorblind-friendly, qualitative 'Okabe-Ito' Scales for ggplot2 and ggraph.
* [https://developer.r-project.org/Blog/public/2019/11/21/a-new-palette-for-r/ A New palette() for R 4.0]
<pre>
* [https://rdrr.io/r/grDevices/palette.html ?palette] and [https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/palette.html the dev version]
# Bad
* [https://detroitdatalab.com/2020/04/28/4-for-4-0-0-four-useful-new-features-in-r-4-0-0/ 4 for 4.0.0 – Four Useful New Features in R 4.0.0]
ggplot(mpg, aes(hwy, color = class, fill = class)) +
* [https://flowingdata.com/2023/05/10/improved-color-palettes-in-r/ Improved color palettes in R]
    geom_density(alpha = .8)
 
== rainbow ==
* [https://www.rdocumentation.org/packages/grDevices/versions/3.6.2/topics/Palettes ?rainbow]
* The figures below compare the effects of the 's' and 'v' parameters. '''s (saturation)''' and '''v (value)''' control the color intensity and brightness, respectively. See also [https://en.wikipedia.org/wiki/HSL_and_HSV HSL and HSV] from Wikipedia.
** '''Saturation (s)''': Determines how '''vivid''' or muted the colors are. A value of 1 (default) means fully saturated colors, while lower values reduce the intensity.
** '''Value (v)''': Controls the '''brightness'''. A value of 1 (default) results in full brightness, while lower values make the colors darker.


# Bad (single color)
[[File:Rainbow default.png|250px]] [[File:Rainbow s05.png|250px]] [[File:Rainbow v05.png|250px]]
ggplot(mpg, aes(hwy, color = class, fill = class)) +
    geom_density(alpha = .8) +
    scale_fill_brewer(name = "Class") +
    scale_color_brewer(name = "Class")


# Bad
== Color blind ==
ggplot(mpg, aes(hwy, color = class, fill = class)) +
[https://cran.r-project.org/web/packages/colorblindcheck/index.html colorblindcheck]: Check Color Palettes for Problems with Color Vision Deficiency
    geom_density(alpha = .8) +
    scale_fill_brewer(name = "Class", palette ="Set1") +
    scale_color_brewer(name = "Class", palette ="Set1")


# Nice
== Color picker ==
ggplot(mpg, aes(hwy, color = class, fill = class)) +
https://github.com/daattali/colourpicker
    geom_density(alpha = .8) +
    scale_fill_okabe_ito(name = "Class") +
    scale_color_okabe_ito(name = "Class")
</pre>


== Colour related aesthetics: colour, fill and alpha ==
<pre>
https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html
> library(colourpicker)
> plotHelper(colours=5)


=== Scatterplot with large number of points: alpha ===
Listening on http://127.0.0.1:6023
[https://wahani.github.io/2015/12/smoothScatter-with-ggplot2/ smoothScatter with ggplot2]
<pre>
ggplot(df, aes(x, y)) +   # df: a data frame with columns x and y
    geom_point(alpha=.1)
</pre>
</pre>


== Combine colors and shapes in legend ==
== Color names, Complementary/Inverted colors ==
<ul>
* [https://github.com/msanchez-beeckman/colornamer ColorNameR] - A tool for transforming coordinates in a color space to common color names using data from the Royal Horticultural Society and the International Union for the Protection of New Varieties of Plants.  
<li>https://ggplot2-book.org/scales.html#scale-details In order for legends to be merged, they must have the same name.  
* [https://www.colorhexa.com/color-names ColorHexa]
<pre>
* https://pinetools.com/invert-color
df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=4)
</pre>
</li>
<li>[https://www.dummies.com/programming/r/how-to-work-with-scales-in-a-ggplot2-in-r/ How to Work with Scales in a ggplot2 in R]. This solution is better since it allows changing the legend title. Just make sure the titles passed to both scale_* functions are the same.
<pre>
ggplot(mtcars, aes(x=hp, y=mpg)) +
  geom_point(aes(shape=factor(cyl), colour=factor(cyl))) +
  scale_shape_discrete("Cylinders") +
  scale_colour_discrete("Cylinders")
</pre>
</li>
</ul>


== ggplot2::scale functions and scales packages ==
== colorspace package ==
* Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like '''size, colour, position''' or '''shape'''.  
* https://colorspace.r-forge.r-project.org/ has more vignettes than the CRAN page.
* Scales also provide the tools that let you read the plot: the axes and legends.
** [http://colorspace.r-forge.r-project.org/articles/approximations.html Approximating Palettes from Other Packages]
** it supports R's base graphics and also ggplot2 (e.g. [http://colorspace.r-forge.r-project.org/reference/scale_colour_discrete_qualitative.html scale_fill_discrete_qualitative(palette)]; note that the '''discrete_qualitative''' part is specific to the colorspace package). See my [[Ggplot2#Color_fill.2Fscale_fill_XXX|ggplot2]] page.
* CRAN [https://cran.r-project.org/web/packages/colorspace/index.html colorspace]: A Toolbox for Manipulating and Assessing Colors and Palettes
* Some [https://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ examples]. The palette selections are different from scale_fill_XXX(). Note that the number of classes can be arbitrary in scale_fill_discrete_qualitative().
* Note
** why is there no "Set 1" palette?
** the "Dark 2" colors are not the same as in [https://www.datanovia.com/en/blog/the-a-z-of-rcolorbrewer-palette/ RColorBrewer].


=== ggplot2::scale - axes/axis, legend ===
== cols4all ==
https://ggplot2-book.org/scales.html
* https://github.com/mtennekes/cols4all. You can use '''cols4all''' palettes in ggplot2.
<syntaxhighlight lang='rsplus'>
c4a_gui() # opens a shiny interface (the R session cannot be used while it is running)


Naming convention: <span style="color: red">'''scale_AestheticName_NameDataType'''</span> where
c4a_types() # understand abbreviation
* AestheticName can be '''x, y, color, fill, size, shape, ...'''
* NameDataType can be '''continuous, discrete''', '''manual''' or '''gradient'''.


Examples:
c4a_series() # 16 series like brewer, hcl, tableau, viridis, etc
* [https://ggplot2.tidyverse.org/reference/scale_discrete.html scale_x_discrete], [https://ggplot2.tidyverse.org/reference/scale_continuous.html scale_y_continuous]
* [https://ggplot2.tidyverse.org/reference/scale_manual.html Create your own discrete scale]:
** scale_colour_manual(),
** scale_fill_manual(values),
** scale_size_manual(),  
** scale_shape_manual(),  
** scale_linetype_manual(),  
** scale_alpha_manual(),  
** scale_discrete_manual()
<ul>
<li> See Figure 12.1: '''Axis''' and '''legend''' components on the book [https://ggplot2-book.org/scales.html#guides ggplot2: Elegant Graphics for Data Analysis]
<pre>
# Set x-axis label
scale_x_discrete("Car type")  # or a shortcut xlab() or labs()
scale_x_continuous("Displacement")


# Set legend title
c4a_overview() # how many palettes per series x types
scale_colour_discrete("Drive\ntrain")   # or a shortcut labs()


# Change the default color
c4a_palettes(type = "div", series = "hcl") # What palettes are available
scale_color_brewer()


# Change the axis scale
# Give me the colors
scale_x_sqrt()
c4a("hcl.purple_green", 11)
c4a("brewer.accent", 2)    # the 1st one on the website


# Change breaks and their labels
# Plot the colors
scale_x_continuous(breaks = c(2000, 4000), labels = c("2k", "4k"))
c4a_plot("hcl.purple_green", 11, include.na = TRUE)
</syntaxhighlight>
 
== *paletteer package ==
* [https://paulvanderlaken.com/2020/03/17/paletteer-hundreds-of-color-palettes-in-r/ The paletteer package offers direct access to 1759 color palettes, from 50 different packages!]
* [https://emilhvitfeldt.github.io/paletteer/index.html paletteer], [https://emilhvitfeldt.github.io/paletteer/reference/paletteer_d.html paletteer_d()] function for getting discrete palette by package and name.
* Interactive https://emilhvitfeldt.github.io/r-color-palettes/discrete.html and choose 'sort by length'
* [https://github.com/EmilHvitfeldt/r-color-palettes/blob/master/type-sorted-palettes.md#diverging-color-palettes Palettes sorted by type (Sequential/Diverging/Qualitative)]
* [https://awesomeopensource.com/project/EmilHvitfeldt/r-color-palettes *More examples with a gallery]


# Relabel the breaks in a categorical scale
<syntaxhighlight lang='rsplus'>
scale_y_discrete(labels = c(a = "apple", b = "banana", c = "carrot"))
paletteer_d("RColorBrewer::RdBu")
</pre>
#67001FFF #B2182BFF #D6604DFF #F4A582FF #FDDBC7FF #F7F7F7FF
</li>
#D1E5F0FF #92C5DEFF #4393C3FF #2166ACFF #053061FF
<li>[https://stackoverflow.com/a/43770608 How to change the color in geom_point or lines in ggplot]
<pre>
ggplot() +
  geom_point(data = data, aes(x = time, y = y, color = sample),size=4) +
  scale_color_manual(values = c("A" = "black", "B" = "red"))


ggplot(data = data, aes(x = time, y = y, color = sample)) +
paletteer_d("ggsci::uniform_startrek")
  geom_point(size=4) +
#CC0C00FF #5C88DAFF #84BD00FF #FFCD00FF #7C878EFF #00B5E2FF #00AF66FF
  geom_line(aes(group = sample)) +
  scale_color_manual(values = c("A" = "black", "B" = "red"))
</pre>
</li>
<li>See an example at [[#geom_linerange|geom_linerange]] where we have to specify the ''limits'' parameter in order to make "8" < "16" < "20"; otherwise it is 16 < 20 < 8.
<pre>
Browse[2]> order(coordinates$chr)
[1] 3 4 1 2
Browse[2]> coordinates$chr
[1] "20" "8"  "16" "16"
</pre>
</li>
</ul>


=== ylim and xlim in ggplot2 in axes ===
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
https://stackoverflow.com/questions/3606697/how-to-set-limits-for-axes-in-ggplot2-r-plots or the '''Zooming''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet]
      geom_point() +
      scale_color_paletteer_d("ggsci::uniform_startrek")
# the next is the same as above
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
    geom_point() +
    scale_color_manual(values = c("setosa" = "#CC0C00FF",
                                  "versicolor" = "#5C88DAFF",
                                  "virginica" = "#84BD00FF"))
</syntaxhighlight>


Use one of the following
== ggsci ==
* + scale_x_continuous(limits = c(-5000, 5000))
* https://cran.r-project.org/web/packages/ggsci/index.html
* + coord_cartesian(xlim = c(-5000, 5000))
* https://nanx.me/ggsci/
* + xlim(-5000, 5000)


=== Emulate ggplot2 default color palette ===
== ggokabeito ==
It is just equally spaced hues around the color wheel.
[https://cran.r-project.org/web/packages/ggokabeito/index.html ggokabeito]: Colorblind-friendly, qualitative 'Okabe-Ito' Scales for ggplot2 and ggraph. It seems to support only up to 9 classes/colors and gives an error if there are more classes, e.g. "Error: Insufficient values in manual scale. 15 needed but only 9 provided."
[https://stackoverflow.com/questions/8197559/emulate-ggplot2-default-color-palette Emulate ggplot2 default color palette]
<pre>
# Bad
ggplot(mpg, aes(hwy, color = class, fill = class)) +
    geom_density(alpha = .8)


'''Answer 1'''
# Bad (single color)
<syntaxhighlight lang='rsplus'>
ggplot(mpg, aes(hwy, color = class, fill = class)) +
gg_color_hue <- function(n) {
    geom_density(alpha = .8) +
  hues = seq(15, 375, length = n + 1)
    scale_fill_brewer(name = "Class") +
  hcl(h = hues, l = 65, c = 100)[1:n]
    scale_color_brewer(name = "Class")
}


n = 4
# Bad
cols = gg_color_hue(n)
ggplot(mpg, aes(hwy, color = class, fill = class)) +
    geom_density(alpha = .8) +
    scale_fill_brewer(name = "Class", palette ="Set1") +
    scale_color_brewer(name = "Class", palette ="Set1")


dev.new(width = 4, height = 4)
# Nice
plot(1:n, pch = 16, cex = 2, col = cols)
ggplot(mpg, aes(hwy, color = class, fill = class)) +
</syntaxhighlight>
    geom_density(alpha = .8) +
    scale_fill_okabe_ito(name = "Class") +
    scale_color_okabe_ito(name = "Class")
</pre>


'''Answer 2''' (better, it shows the color values in HEX). It should be read from left to right and then top to bottom.
== Pride palette ==
[https://turtletopia.github.io/2022/08/12/show-pride-on-your-plots/ Show Pride on Your Plots]. [https://github.com/turtletopia/gglgbtq gglgbtq] package


[https://scales.r-lib.org/ scales] package
== unikn ==
<syntaxhighlight lang='rsplus'>
* [https://github.com/hneth/unikn unikn]: Enabling corporate design elements in R (with colors and color-related functions). The curve plot is interesting.
library(scales)
* [https://www.infoworld.com/article/3667496/12-ggplot-extensions-for-snazzier-r-graphics.html?page=2 12 ggplot extensions for snazzier R graphics]
show_col(hue_pal()(4)) # ("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
                      # (Salmon, Christi, Iris Blue, Heliotrope)
show_col(hue_pal()(2)) # ("#F8766D", "#00BFC4") = (salmon, iris blue)
          # see https://www.htmlcsscolor.com/ for color names
</syntaxhighlight>
See also the last example in [https://ggobi.github.io/ggally/reference/ggsurv.html ggsurv()] where the KM plots have 4 strata. The colors can be obtained by '''scales::hue_pal()(4)''' with hue_pal()'s default arguments.


The [https://www.stat.auckland.ac.nz/~paul/Reports/roloc/intro/roloc.html roloc] package provides a function called colorName() to convert a hex code to a color name.
== Colour related aesthetics: colour, fill and alpha ==
https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html


=== transform scales ===
=== Scatterplot with large number of points: alpha ===
[http://freerangestats.info/blog/2020/04/06/crazy-fox-y-axis How to make that crazy Fox News y axis chart with ggplot2 and scales]
[https://wahani.github.io/2015/12/smoothScatter-with-ggplot2/ smoothScatter with ggplot2]
<pre>
ggplot(df, aes(x, y)) +   # df: a data frame with columns x and y
    geom_point(alpha=.1)
</pre>
For base R, we can use the '''alpha''' argument of [https://www.rdocumentation.org/packages/grDevices/versions/3.6.2/topics/rgb rgb(,,,alpha)] or the '''alpha.f''' argument of adjustcolor():
<pre>
plot(x, y, col=rgb(0,0,0, alpha=.1))
polygon(df, col=adjustcolor(c("red", "blue"), alpha.f=.3))
</pre>
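A self-contained sketch of the base R approach above, using simulated data (the sample size and the alpha values are arbitrary choices for illustration):

```r
set.seed(1)
n <- 5000
x <- rnorm(n)
y <- x + rnorm(n)

# rgb() with an alpha channel returns an 8-digit hex colour (#RRGGBBAA)
pt_col <- rgb(0, 0, 0, alpha = 0.1)
plot(x, y, pch = 16, col = pt_col)

# adjustcolor() adds transparency to existing (named) colours
fill_cols <- adjustcolor(c("red", "blue"), alpha.f = 0.3)
rect(-1, -1, 1, 1, col = fill_cols[1], border = NA)
```

With semi-transparent points, dense regions appear darker, which gives a rough impression of the point density.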


== Class variables ==
== Combine colors and shapes in legend ==
"Set1" is a good choice. See [http://www.sthda.com/english/wiki/colors-in-r RColorBrewer::display.brewer.all()]
<ul>
<li>https://ggplot2-book.org/scales.html#scale-details In order for legends to be merged, they must have the same name.
<pre>
df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=4)
</pre>
</li>
<li>[https://www.dummies.com/programming/r/how-to-work-with-scales-in-a-ggplot2-in-r/ How to Work with Scales in a ggplot2 in R]. This solution is better since it allows changing the legend title. Just make sure the titles passed to both scale_* functions are the same.
<pre>
ggplot(mtcars, aes(x=hp, y=mpg)) +
  geom_point(aes(shape=factor(cyl), colour=factor(cyl))) +
  scale_shape_discrete("Cylinders") + # change the legend title from 'factor(cyl)' to 'Cylinders'
  scale_colour_discrete("Cylinders")  # combine shape and colour in one legend; avoid another legend for colour
</pre>
</li>
<li>[https://www.datanovia.com/en/blog/ggplot-point-shapes-best-tips/ GGPLOT Point Shapes Best Tips] </li>
<li>Simulated data
<pre>
df <- data.frame(x = rnorm(100), y = rnorm(100),
                Treatment = rep(c("Before", "After"), each = 50),
                Response = rep(c("Sensitive", "Resistant"), each = 50),
                Subject = rep(1:50, times = 2))


== Heatmap for single channel ==
ggplot(df, aes(x = x, y = y, shape = Treatment, color = Response)) +
[https://youtu.be/TP8vjWiIpgI How to Make a Heatmap of Customers in R], [https://github.com/business-science/free_r_tips source code] on github. geom_tile() and geom_text() were used. [https://r-charts.com/correlation/heat-map-ggplot2/ Heatmap in ggplot2] from https://r-charts.com/.
  geom_point() +
  geom_line(aes(group = Subject), alpha = 0.5) +  # Add lines connecting the same subject
  scale_shape_manual(values = c(16, 17)) +  # You can choose different shapes
  scale_color_manual(values = c("blue", "red")) +  # You can choose different colors
  theme_minimal() +
  labs(title = "Scatterplot with Different Shapes and Colors",
      x = "X-axis label",
      y = "Y-axis label",
      shape = "Treatment",
      color = "Response")
</pre>
</ul>


https://scales.r-lib.org/
== ggplot2::scale functions and scales packages ==
<syntaxhighlight lang='rsplus'>
* Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like '''size, colour, position''' or '''shape'''.
# White <----> Blue
* Scales also provide the tools that let you read the plot: the axes and legends.
RColorBrewer::display.brewer.pal(n = 8, name = "Blues")
* [https://www.tidyverse.org/blog/2022/04/scales-1-2-0/ scales 1.2.0]
</syntaxhighlight>


== Heatmap for dual channels ==
=== ggplot2::scale_* - axes/axis, legend ===
http://www.sthda.com/english/wiki/colors-in-r <syntaxhighlight lang='rsplus'>
https://ggplot2-book.org/scales.html and [https://ggplot2.tidyverse.org/reference/index.html#scales reference of all scale_* functions]. Modifies the scales of the axes, such as the x- and y-axes, color, size, etc.
library(RColorBrewer)
# Red <----> Blue
display.brewer.pal(n = 8, name = 'RdBu')
# Hexadecimal color specification
brewer.pal(n = 8, name = "RdBu")


plot(1:8, col=brewer_pal(palette = "RdBu")(8), pch=20, cex=4)
Naming convention: <span style="color: red">'''scale_AestheticName_NameDataType'''</span> where
* AestheticName can be '''x, y, color, fill, size, shape, ...'''
* NameDataType can be '''continuous, discrete''', '''manual''' or '''gradient'''.
* Table of common functions
{| class="wikitable"  
|-
!
! scale_AestheticName_NameDataType
|-
|
| scale_x_continuous<br />scale_x_discrete
|-
|
| scale_x_log10
|-
|
| scale_color_continuous, <br />scale_color_gradient<br />scale_color_discrete<br />scale_color_brewer<br />scale_color_manual<br />scale_color_paletteer_d
|-
|
| scale_shape_discrete
|-
|
| scale_fill_brewer, <br />scale_fill_continuous,<br />scale_fill_discrete, <br />scale_fill_gradient<br />scale_fill_grey, <br />scale_fill_hue<br />scale_fill_manual,<br />scale_colour_viridis_d
|}


# Blue <----> Red
plot(1:8, col=rev(brewer_pal(palette = "RdBu")(8)), pch=20, cex=4)
</syntaxhighlight>


[[File:Twopalette.svg|300px]]
Examples:
 
* [https://ggplot2.tidyverse.org/reference/scale_discrete.html scale_x_discrete], [https://ggplot2.tidyverse.org/reference/scale_continuous.html scale_y_continuous]
= Themes and background for ggplot2 =
* [https://ggplot2.tidyverse.org/reference/scale_manual.html Create your own discrete scale]:
* [https://henrywang.nl/ggplot2-theme-elements-demonstration/ ggplot2 Theme Elements Demonstration]
** scale_colour_manual(),
 
** scale_fill_manual(values),
== Background ==
** scale_size_manual(),
** scale_shape_manual(),
** scale_linetype_manual(),
** scale_alpha_manual(),
** scale_discrete_manual()
<ul>
<ul>
<li>[https://stackoverflow.com/a/43614963 Export plot in .png with transparent background] in base R plot.
<li> See Figure 12.1: '''Axis''' and '''legend''' components on the book [https://ggplot2-book.org/scales.html#guides ggplot2: Elegant Graphics for Data Analysis]
<pre>
<pre>
x = c(1, 2, 3)
# Set x-axis label
op <- par(bg=NA)
scale_x_discrete("Car type")  # or a shortcut xlab() or labs()
plot (x)
scale_x_continuous("Displacement")


dev.copy(png,'myplot.png')
# Set legend title
dev.off()
scale_colour_discrete("Drive\ntrain")    # or a shortcut labs()
par(op)
 
# Change the default color
scale_color_brewer()
 
# Change the axis scale
scale_x_sqrt()
 
# Change breaks and their labels
scale_x_continuous(breaks = c(2000, 4000), labels = c("2k", "4k"))
 
# Relabel the breaks in a categorical scale
scale_y_discrete(labels = c(a = "apple", b = "banana", c = "carrot"))
</pre>
</pre>
</li>
</li>
<li>[https://stackoverflow.com/a/41878833 Transparent background with ggplot2]
<li>[https://stackoverflow.com/a/43770608 How to change the color in geom_point or lines in ggplot]
<pre>
<pre>
library(ggplot2)
ggplot() +
data("airquality")
  geom_point(data = data, aes(x = time, y = y, color = sample),size=4) +
  scale_color_manual(values = c("A" = "black", "B" = "red"))


p <- ggplot(airquality, aes(Solar.R, Temp)) +
ggplot(data = data, aes(x = time, y = y, color = sample)) +  
    geom_point() +
  geom_point(size=4) +  
    geom_smooth() +
  geom_line(aes(group = sample)) +
    # set transparency
  scale_color_manual(values = c("A" = "black", "B" = "red"))
    theme(
</pre>
        panel.grid.major = element_blank(),
</li>
        panel.grid.minor = element_blank(),
<li>See an example at [[#geom_linerange|geom_linerange]] where we have to specify the ''limits'' parameter in order to make "8" < "16" < "20"; otherwise it is 16 < 20 < 8.
        panel.background = element_rect(fill = "transparent",colour = NA),
<pre>
        plot.background = element_rect(fill = "transparent",colour = NA)
Browse[2]> order(coordinates$chr)
        )
[1] 3 4 1 2
p
Browse[2]> coordinates$chr
ggsave("airquality.png", p, bg = "transparent")
[1] "20" "8"  "16" "16"
</pre>
</pre>
</li>
</li>
<li>[https://www.datanovia.com/en/blog/ggplot-theme-background-color-and-grids/ ggplot2 theme background color and grids]
<li>Differences between scale_color_gradient() and scale_color_continuous()
* '''scale_color_gradient()''' (more common than scale_color_continuous) is used to map a continuous variable to a color gradient. It takes two arguments: low and high, which specify the colors for the minimum and maximum values of the variable, respectively. The gradient is automatically generated between these two colors.
<pre>
<pre>
ggplot() + geom_bar(aes(x=, fill=y)) +
ggplot(data = diamonds, aes(x = carat, y = price, color = depth)) +
          theme(panel.background=element_rect(fill='purple')) +  
  geom_point() +
          theme(plot.background=element_blank())
  scale_color_gradient(low = "blue", high = "red")
 
</pre>
ggplot() + geom_bar(aes(x=, fill=y)) +  
* '''scale_color_continuous()''' is the default colour scale for continuous variables. Its '''type''' argument selects the underlying scale ("gradient" or "viridis"), and the remaining arguments are passed on to that scale. The limits argument sets the minimum and maximum values for the variable, the breaks argument specifies the values at which breaks occur, and the labels argument sets the text displayed on the legend.
          theme(panel.background=element_blank()) +
<pre>
          theme(plot.background=element_blank()) # minimal background like base R
ggplot(data = diamonds, aes(x = carat, y = price, color = depth)) +
          # the grid lines are not gone; they are white so it is the same as the background
    geom_point() +
    scale_color_continuous(name = "Depth",
                            limits = c(40, 80),
                            breaks = c(40, 60, 80),
                            labels = c("Shallow", "Moderate", "Deep"), # display on legend
                            type = "gradient")
</pre>
</li>
</ul>


ggplot() + geom_bar(aes(x=, fill=y)) +
=== ylim and xlim in ggplot2 in axes ===
          theme(panel.background=element_blank()) +
https://stackoverflow.com/questions/3606697/how-to-set-limits-for-axes-in-ggplot2-r-plots or the '''Zooming''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet]
          theme(plot.background=element_blank()) +
          theme(panel.grid.major.y = element_line(color="grey"))
          # draw grid line on y-axis only


ggplot() + geom_bar() +
Use one of the following
          theme_bw()
* + scale_x_continuous(limits = c(-5000, 5000))
* + coord_cartesian(xlim = c(-5000, 5000))  
* + xlim(-5000, 5000)


=== Emulate ggplot2 default color palette ===
[[File:Paletteggplot2.png|250px]]

The above can be created in R >= 4.0.0 with the command '''scales::show_col(palette.colors(palette = "ggplot2"))'''. The first color (black) should be ignored. Note that for n >= 5 these colors do not match the result of '''show_col(hue_pal()(5))'''.


'''Answer 1''' It is just equally spaced hues around the color wheel.
[https://stackoverflow.com/questions/8197559/emulate-ggplot2-default-color-palette Emulate ggplot2 default color palette]
</pre>
<syntaxhighlight lang='rsplus'>
</li>
gg_color_hue <- function(n) {
</ul>
  hues = seq(15, 375, length = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}


== ggthmr ==
n = 4
[http://www.shanelynn.ie/themes-and-colours-for-r-ggplots-with-ggthemr/ ggthmr] package
cols = gg_color_hue(n)


== ggsci ==
dev.new(width = 4, height = 4)
* https://cran.r-project.org/web/packages/ggsci/index.html
plot(1:n, pch = 16, cex = 2, col = cols)
* https://nanx.me/ggsci/
</syntaxhighlight>
* [https://www.datanovia.com/en/blog/top-r-color-palettes-to-know-for-great-data-visualization/ Top R Color Palettes to Know for Great Data Visualization]


== Font size ==
'''Answer 2''' (better, it shows the color values in HEX). It should be read from left to right and then top to bottom.
[https://statisticsglobe.com/change-font-size-of-ggplot2-plot-in-r-axis-text-main-title-legend Change Font Size of ggplot2 Plot in R (5 Examples) | Axis Text, Main Title & Legend]


== Rotate x-axis labels ==
[https://scales.r-lib.org/ scales] package
<pre>
<syntaxhighlight lang='rsplus'>
theme(axis.text.x = element_text(angle = 90))
library(scales)
</pre>
show_col(hue_pal()(4)) # ("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
                      # (Salmon, Christi, Iris Blue, Heliotrope)
show_col(hue_pal()(3)) # ("#F8766D", "#00BA38", "#619CFF")
                      # (Salmon, Dark Pastel Green, Cornflower Blue)
show_col(hue_pal()(2)) # ("#F8766D", "#00BFC4") = (salmon, iris blue)
          # see https://www.htmlcsscolor.com/ for color names
</syntaxhighlight>
See also the last example in [https://ggobi.github.io/ggally/reference/ggsurv.html ggsurv()] where the KM plots have 4 strata. The colors can be obtained by '''scales::hue_pal()(4)''' with hue_pal()'s default arguments.
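
As a quick check that the two answers agree, the sketch below restates the helper from Answer 1 using base R only and prints its first four colors. The comparison with scales::hue_pal() is optional and only runs if the scales package is installed.
<syntaxhighlight lang='rsplus'>
# Restatement of the Answer 1 helper: n equally spaced hues on the HCL color wheel
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  grDevices::hcl(h = hues, l = 65, c = 100)[1:n]
}

cols4 <- gg_color_hue(4)
print(cols4)

# Optional cross-check against the scales package (Answer 2)
if (requireNamespace("scales", quietly = TRUE)) {
  print(identical(toupper(cols4), toupper(scales::hue_pal()(4))))
}
</syntaxhighlight>
The resulting vector can be passed directly to base R functions, e.g. as the '''col''' argument of boxplot().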


== Add axis on top or right hand side ==
The [https://www.stat.auckland.ac.nz/~paul/Reports/roloc/intro/roloc.html roloc] package on [https://cran.case.edu/web/packages/roloc/index.html CRAN] provides a function called colorName() to convert a hex code to a color name.
 
=== transform scales ===
[http://freerangestats.info/blog/2020/04/06/crazy-fox-y-axis How to make that crazy Fox News y axis chart with ggplot2 and scales]
 
== Class variables ==
<ul>
<ul>
<li>Specify a secondary axis, [https://ggplot2.tidyverse.org/reference/sec_axis.html sec_axis()]. This new function was added in ggplot2 2.2.0; see [https://stackoverflow.com/a/39805869 here].</li>
<li>"Set1" is a good choice. See [http://www.sthda.com/english/wiki/colors-in-r RColorBrewer::display.brewer.all()]
<li>[https://stackoverflow.com/q/51898027 Create secondary x-axis in ggplot2]. '''dup_axis(name, breaks, labels)'''. Note that ggplot2 uses '''breaks''' while base R plot uses '''at'''. See [[R#Include_labels_on_the_top_axis.2Fmargin:_axis.28.29|R &rarr; Include labels on the top axis/margin: axis()]].
<li>For an ordinal variable, brewer.pal(n, "Spectral") is a good choice, but its middle color is too light, so I replace the middle color:
<pre>
<pre>
# Bottom x-axis is the quantiles and the top x-axis is the original values
cols <- RColorBrewer::brewer.pal(5, "Spectral")
cols[3] <- "#D4C683" # midpoint of "#FDAE61" and "#ABDDA4"
</pre>
</ul>
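
A minimal sketch of how the adjusted Spectral palette might be applied to an ordinal variable, assuming ggplot2 and RColorBrewer are installed. The data frame df and its columns grade and count are made up for illustration.
<syntaxhighlight lang='rsplus'>
library(ggplot2)
library(RColorBrewer)

cols <- brewer.pal(5, "Spectral")
cols[3] <- "#D4C683"  # darker replacement for the pale middle color

# Hypothetical ordinal data with 5 ordered levels
lev <- c("very low", "low", "medium", "high", "very high")
df <- data.frame(
  grade = factor(lev, levels = lev, ordered = TRUE),
  count = c(3, 7, 12, 8, 4)
)

# scale_fill_manual() assigns the colors to the levels in order
p_ord <- ggplot(df, aes(x = grade, y = count, fill = grade)) +
  geom_col() +
  scale_fill_manual(values = cols)
p_ord
</syntaxhighlight>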
 
== Red, Green, Blue alternatives ==
* Red: "maroon"
 
== Heatmap for single channel ==
[https://youtu.be/TP8vjWiIpgI How to Make a Heatmap of Customers in R], [https://github.com/business-science/free_r_tips source code] on github. geom_tile() and geom_text() were used. [https://r-charts.com/correlation/heat-map-ggplot2/ Heatmap in ggplot2] from https://r-charts.com/.


Fn <- ecdf(mtcars$mpg)
https://scales.r-lib.org/
mtcars %>% dplyr::mutate(quantile = Fn(mpg)) %>%
<syntaxhighlight lang='rsplus'>
  ggplot(aes(x= quantile, y= disp)) +
# White <----> Blue
  geom_point() +
RColorBrewer::display.brewer.pal(n = 8, name = "Blues")
  scale_x_continuous(name = "quantile of mpg",
</syntaxhighlight>
                    breaks=c(.25, .5, .75, 1.0),
                    labels = c("0.25", "0.50", "0.75", "1.00"),
                    sec.axis = dup_axis(name = "mpg",
                                        breaks = c(.25, .5, .75, 1.0),
                                        labels = quantile(mtcars$mpg, c(.25, .5, .75, 1.0))))
</pre>
</li>
<li>[https://stackoverflow.com/a/46257098 How to add line at top panel border of ggplot2]
<pre>
mtcars %>%
  ggplot(aes(x= mpg, y= disp)) +
  geom_point() +
  annotate(geom = 'segment', y = Inf, yend = Inf, color = 'green',
          x = -Inf, xend = Inf, size = 4)
</pre>
</li>
<li>[https://whatalnk.github.io/r-tips/ggplot2-secondary-y-axis.nb.html ggplot2: Secondary Y axis] </li>
<li>[https://www.r-graph-gallery.com/line-chart-dual-Y-axis-ggplot2.html Dual Y axis with R and ggplot2] </li>
</ul>
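
The examples above use dup_axis(), which repeats the same scale. As a complementary sketch, sec_axis() takes a one-to-one transformation of the primary axis. The example below assumes the built-in airquality data, whose Temp column is in degrees Fahrenheit, and shows degrees Celsius on the right-hand side.
<syntaxhighlight lang='rsplus'>
library(ggplot2)

# Primary axis: Fahrenheit; secondary axis: Celsius via a formula transformation
p_sec <- ggplot(airquality, aes(x = Day, y = Temp)) +
  geom_point() +
  scale_y_continuous(
    name = "Temperature (F)",
    sec.axis = sec_axis(~ (. - 32) * 5 / 9, name = "Temperature (C)")
  )
p_sec
</syntaxhighlight>
The secondary axis is only a relabeling of the primary one, so it cannot show a second, unrelated variable unless that variable is first rescaled onto the primary axis.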


== Remove labels ==
== Heatmap for dual channels ==
[http://environmentalcomputing.net/plotting-with-ggplot-adding-titles-and-axis-names/ Plotting with ggplot: : adding titles and axis names]
http://www.sthda.com/english/wiki/colors-in-r <syntaxhighlight lang='rsplus'>
library(RColorBrewer)
# Red <----> Blue
display.brewer.pal(n = 8, name = 'RdBu')
# Hexadecimal color specification
brewer.pal(n = 8, name = "RdBu")


== ggthemes package ==
plot(1:8, col=scales::brewer_pal(palette = "RdBu")(8), pch=20, cex=4)
https://cran.r-project.org/web/packages/ggthemes/index.html
<pre>
ggplot() + geom_bar() +
          theme_solarized()   # sun color in the background


theme_excel()
# Blue <----> Red
theme_wsj()
plot(1:8, col=rev(scales::brewer_pal(palette = "RdBu")(8)), pch=20, cex=4)
theme_economist()
</syntaxhighlight>
theme_fivethirtyeight()
</pre>


== rsthemes ==
[[File:Twopalette.svg|300px]]
[https://www.garrickadenbuie.com/project/rsthemes/ rsthemes]


== thematic ==
== Don't rely on color to explain the data ==
[https://rstudio.github.io/thematic/ thematic], [https://www.infoworld.com/article/3604688/top-r-tips-and-news-from-rstudio-global-2021.amp.html Top R tips and news from RStudio Global 2021]
[https://cran.r-project.org/web/packages/ggpattern/index.html ggpattern]


= Common plots =
== Don't use very bright or low-contrast colors, accessibility ==
* https://ggplot2.tidyverse.org/reference/index.html
* [https://color.a11y.com/ Color Contrast Accessibility Validator]
* [https://github.com/WinVector/WVPlots WVPlots], [https://win-vector.com/2020/10/26/your-lopsided-model-is-out-to-get-you/ Your Lopsided Model is Out to Get You]
* [https://developers.google.com/web/tools/lighthouse/ Google Lighthouse]


== Scatterplot ==
== Create your own scale_fill_FOO and scale_color_FOO ==
[https://wilkelab.org/SDS375/slides/overplotting.html?s=09#1 Handling overlapping points] (slides) and the ebook [https://clauswilke.com/dataviz/overlapping-points.html Fundamentals of Data Visualization] by Claus O. Wilke.
[https://www.jumpingrivers.com/blog/custom-colour-palettes-for-ggplot2/ Custom colour palettes for {ggplot2}]


=== Scatterplot with histograms ===
= Themes and background for ggplot2 =
* [https://datavizpyr.com/how-to-make-scatterplot-with-marginal-histograms-in-r/ How To Make Scatterplot with Marginal Histograms in R?]
* [https://www.r-bloggers.com/2023/11/getting-started-with-theme/ Getting started with theme()] 2023/11/23
* [https://rpkgs.datanovia.com/ggpubr/reference/ggscatterhist.html ggpubr::ggscatterhist()]
* [https://henrywang.nl/ggplot2-theme-elements-demonstration/ ggplot2 Theme Elements Demonstration]
* [http://www.sthda.com/english/wiki/scatter-plot-matrices-r-base-graphs Scatter Plot Matrices]
* [https://www.r-bloggers.com/2011/06/example-8-41-scatterplot-with-marginal-histograms/ Example 8.41: Scatterplot with marginal histograms] (old-fashioned, based on ''layout()'')


=== Bubble Chart ===
== Background ==
* [https://www.data-to-viz.com/graph/bubble.html BUBBLE PLOT]
<ul>
* [https://finnstats.com/index.php/2021/06/18/how-to-create-a-bubble-chart-in-r/ Bubble Chart in R-ggplot & Plotly]
<li>[https://stackoverflow.com/a/43614963 Export plot in .png with transparent background] in base R plot.
<pre>
x = c(1, 2, 3)
op <- par(bg=NA)
plot (x)


=== Ellipse ===
dev.copy(png,'myplot.png')
* [https://ggplot2.tidyverse.org/reference/stat_ellipse.html ggplot2::stat_ellipse()]
dev.off()
* [https://stackoverflow.com/a/5262141 How can a data ellipse be superimposed on a ggplot2 scatterplot?]. Hint: use the [https://cran.r-project.org/web/packages/ellipse/index.html ellipse] package.
par(op)
</pre>
</li>
<li>[https://stackoverflow.com/a/41878833 Transparent background with ggplot2]
<pre>
library(ggplot2)
data("airquality")


== Line plots ==
p <- ggplot(airquality, aes(Solar.R, Temp)) +
* http://www.sthda.com/english/wiki/ggplot2-line-plot-quick-start-guide-r-software-and-data-visualization
    geom_point() +
* [https://observablehq.com/@d3/multi-line-chart Multi-Line Chart] by D3. Download the tarball. The index.html shows the interactive plot in Firefox but not in Chrome or Safari. See [https://stackoverflow.com/a/46992592 ES6 module support in Chrome 62/Chrome Canary 64, does not work locally]. Chrome blocks it because local files cannot make cross-origin requests; it should work in Chrome if the files are put on a server.
    geom_smooth() +
** [https://observablehq.com/@bencf/multi-line-chart This] and [https://observablehq.com/@shaswat-du/d3-multi-line-chart this] are examples where  X is a continuous variable.
    # set transparency
** Click "..." and compare code.
    theme(
* [https://www.r-bloggers.com/2020/12/how-to-make-stunning-line-charts-in-r-a-complete-guide-with-ggplot2/ How to Make Stunning Line Charts in R: A Complete Guide with ggplot2]
        panel.grid.major = element_blank(),
 
        panel.grid.minor = element_blank(),
=== Ridgeline plots, mountain diagram ===
        panel.background = element_rect(fill = "transparent",colour = NA),
* [https://github.com/wilkelab/ggridges?s=09 ggridges]: Ridgeline plots in ggplot2
        plot.background = element_rect(fill = "transparent",colour = NA)
* [https://www.datanovia.com/en/blog/elegant-visualization-of-density-distribution-in-r-using-ridgeline Elegant Visualization of Density Distribution in R Using Ridgeline]
        )
* [https://www.nature.com/articles/s41598-021-03432-3/figures/1 An example] from ''Scientific Reports''.
p
ggsave("airquality.png", p, bg = "transparent")
</pre>
</li>
<li>[https://www.datanovia.com/en/blog/ggplot-theme-background-color-and-grids/ ggplot2 theme background color and grids]
<pre>
ggplot() + geom_bar(aes(x=, fill=y)) +
          theme(panel.background=element_rect(fill='purple')) +
          theme(plot.background=element_blank())


== Histogram ==
ggplot() + geom_bar(aes(x=, fill=y)) +
A histogram is a special case of a bar plot. Instead of drawing each unique value as a bar, a histogram groups nearby data points into bins.
          theme(panel.background=element_blank()) +
          theme(plot.background=element_blank()) # minimal background like base R
          # the grid lines are not gone; they are white so it is the same as the background


<syntaxhighlight lang='rsplus'>
ggplot() + geom_bar(aes(x=, fill=y)) +  
ggplot(data = txhousing, aes(x = median)) +
          theme(panel.background=element_blank()) +
  geom_histogram() # add 'boundary = 0' (formerly 'origin = 0') if we don't expect negative values.
          theme(plot.background=element_blank()) +
                    # adding 'bins=10' to adjust the number of bins
          theme(panel.grid.major.y = element_line(color="grey"))
                    # adding 'binwidth=10' to adjust the bin width
          # draw grid line on y-axis only
</syntaxhighlight>


[http://www.deeplytrivial.com/2020/04/p-is-for-percent.html Histogram vs barplot] from deeply trivial.
ggplot() + geom_bar() +
          theme_bw()  # very similar to theme_light()
                      # have grid lines
ggplot() + geom_bar() +
          theme_classic() # similar to base R graphic
                      # no borders on top and right
ggplot() + geom_bar() +
          theme_minimal() # no edge


== Boxplot ==
ggplot() + geom_bar() +
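Be careful: adding '''scale_y_continuous(expand = c(0,0), limits = c(0,1))''' changes the boxplot if some data fall outside the range (0, 1), because the out-of-range values are removed before the box statistics are computed. The console gives a warning message in this case. To zoom without dropping data, use '''coord_cartesian(ylim = c(0, 1))''' instead.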
          theme_void() # no grid, no edge


=== Base R method ===
ggplot() + geom_bar() +
[http://www.sthda.com/english/wiki/box-plots-r-base-graphs Box Plots - R Base Graphs]
          theme_dark()
<pre>
dim(df) # 112436 x 2
mycol <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
# mycol defines colors of 4 levels in df$Method (a factor)
boxplot(df$value ~ df$Method, col = mycol, xlab="Method")
</pre>
</pre>
</li>
</ul>
== ggthmr ==
[http://www.shanelynn.ie/themes-and-colours-for-r-ggplots-with-ggthemr/ ggthmr] package


=== Color fill/scale_fill_XXX ===
== Font size ==
{{Pre}}
* https://ggplot2.tidyverse.org/reference/theme.html
n <- 100
* [https://statisticsglobe.com/change-font-size-of-ggplot2-plot-in-r-axis-text-main-title-legend Change Font Size of ggplot2 Plot in R (5 Examples) | Axis Text, Main Title & Legend]
k <- 12
* [https://stackoverflow.com/a/34610941 What is the default font for ggplot2]
set.seed(1234)
* [http://www.cookbook-r.com/Graphs/Fonts/ Fonts] from Cookbook for R
cond <- factor(rep(LETTERS[1:k], each=n))
 
rating <- rnorm(n*k)
For example to make the subtitle font size smaller
dat <- data.frame(cond = cond, rating = rating)
<pre>
my_ggp + theme(plot.subtitle = element_text(size = 8))
# Default font size seems to be 11 for title/subtitle
</pre>
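
A slightly fuller sketch of the same idea, setting several text sizes at once. The object my_ggp is rebuilt here from the built-in mtcars data purely for illustration; the sizes are arbitrary.
<syntaxhighlight lang='rsplus'>
library(ggplot2)

# Stand-in for the plot object used above
my_ggp <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy", subtitle = "mtcars data", x = "Weight", y = "MPG")

# Set individual elements explicitly (sizes are in points)
my_ggp + theme(
  plot.title    = element_text(size = 14),
  plot.subtitle = element_text(size = 8),
  axis.title    = element_text(size = 10),
  axis.text     = element_text(size = 9)
)

# Or scale all text together by changing the base size of a complete theme
my_ggp + theme_gray(base_size = 9)
</syntaxhighlight>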


p <- ggplot(dat, aes(x=cond, y=rating, fill=cond)) +
== Remove x and y axis titles ==
    geom_boxplot()
[http://www.sthda.com/english/wiki/ggplot2-title-main-axis-and-legend-titles#remove-x-and-y-axis-labels ggplot2 title : main, axis and legend titles]


p + scale_fill_hue() + labs(title="hue default") # Same as only p
== Rotate x-axis labels, change colors ==
p + scale_fill_hue(l=40, c=35) + labs(title="hue options")
Counter-clockwise
p + scale_fill_brewer(palette="Dark2") + labs(title="Dark2")
<pre>
p + colorspace::scale_fill_discrete_qualitative(palette = "Dark 3") + labs(title="Dark 3")
theme(axis.text.x = element_text(angle = 90, size=5, hjust=1))
p + scale_fill_brewer(palette="Accent") + labs(title="Accent")
p + scale_fill_brewer(palette="Pastel1") + labs(title="Pastel1")
p + scale_fill_brewer(palette="Set1") + labs(title="Set1")
p + scale_fill_brewer(palette="Spectral") + labs(title ="Spectral")
p + scale_fill_brewer(palette="Paired") + labs(title="Paired")
# cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# p + scale_fill_manual(values=cbbPalette)
</pre>
</pre>
[[File:Scalefill.png|250px]]


[https://www.datanovia.com/en/blog/the-a-z-of-rcolorbrewer-palette/ ColorBrewer palettes] RColorBrewer::display.brewer.all() to display all brewer palettes.
[https://stackoverflow.com/a/38862452 customize ggplot2 axis labels with different colors]


[https://ggplot2.tidyverse.org/reference/index.html Reference from ggplot2]. scale_fill_binned, '''scale_fill_brewer''', scale_fill_continuous, scale_fill_date, scale_fill_datetime, scale_fill_discrete, scale_fill_distiller, scale_fill_gradient, scale_fill_gradient2, scale_fill_gradientn, scale_fill_grey, '''scale_fill_hue''', scale_fill_identity, '''scale_fill_manual''', scale_fill_ordinal, scale_fill_steps, scale_fill_steps2, scale_fill_stepsn, scale_fill_viridis_b, scale_fill_viridis_c, scale_fill_viridis_d
== Add axis on top or right hand side ==
 
=== Jittering - plot the data on top of the boxplot ===
<ul>
<ul>
<li>[[Statistics#Box.28Box_and_whisker.29_plot_in_R|What is a boxplot]]</li>
<li>Specify a secondary axis, [https://ggplot2.tidyverse.org/reference/sec_axis.html sec_axis()]. This new function was added in ggplot2 2.2.0; see [https://stackoverflow.com/a/39805869 here].</li>
<li>Quick look
<li>[https://stackoverflow.com/q/51898027 Create secondary x-axis in ggplot2]. '''dup_axis(name, breaks, labels)'''. Note that ggplot2 uses '''breaks''' while base R plot uses '''at'''. See [[R#Include_labels_on_the_top_axis.2Fmargin:_axis.28.29|R &rarr; Include labels on the top axis/margin: axis()]].
<syntaxhighlight lang='rsplus'>
<pre>
# Only 1 variable
# Bottom x-axis is the quantiles and the top x-axis is the original values
ggplot(data.frame(Wi), aes(y = Wi)) +
  geom_boxplot()


# Two variable, one of them is a factor
Fn <- ecdf(mtcars$mpg)
ggplot() + geom_jitter(mapping = aes(x, y))
mtcars %>% dplyr::mutate(quantile = Fn(mpg)) %>%
 
  ggplot(aes(x= quantile, y= disp)) +
# Box plot
  geom_point() +
ggplot() + geom_boxplot(mapping = aes(x, y))
  scale_x_continuous(name = "quantile of mpg",
</syntaxhighlight>
                    breaks=c(.25, .5, .75, 1.0),
</li>
                    labels = c("0.25", "0.50", "0.75", "1.00"),
<li>[https://ggplot2.tidyverse.org/reference/geom_jitter.html geom_jitter()]</li>
                    sec.axis = dup_axis(name = "mpg",
<li>geom_jitter can affect both X and Y values.
                                        breaks = c(.25, .5, .75, 1.0),
                                        labels = quantile(mtcars$mpg, c(.25, .5, .75, 1.0))))
</pre>
</li>
<li>[https://stackoverflow.com/a/46257098 How to add line at top panel border of ggplot2]
<pre>
<pre>
tibble(x=1:4, y=1:4) %>% ggplot(aes(x, y)) + geom_jitter()
mtcars %>%  
  ggplot(aes(x= mpg, y= disp)) +
  geom_point() +
  annotate(geom = 'segment', y = Inf, yend = Inf, color = 'green',
          x = -Inf, xend = Inf, size = 4)
</pre>
</pre>
</li>
</li>
<li>https://stackoverflow.com/a/17560113  </li>
<li>[https://whatalnk.github.io/r-tips/ggplot2-secondary-y-axis.nb.html ggplot2: Secondary Y axis] </li>
<li>https://www.tutorialgateway.org/r-ggplot2-jitter/  </li>
<li>[https://www.r-graph-gallery.com/line-chart-dual-Y-axis-ggplot2.html Dual Y axis with R and ggplot2] </li>
<syntaxhighlight lang='rsplus'>
</ul>
# df2 is n x 2
 
ggplot(df2, aes(x=nboot, y=boot)) +
== Remove labels ==
  geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
[http://environmentalcomputing.net/plotting-with-ggplot-adding-titles-and-axis-names/ Plotting with ggplot: : adding titles and axis names]
  geom_jitter(aes(color=nboot), position=position_jitter(width=.2, height=0, seed=1)) +
  labs(title="", y = "", x = "nboot")
</syntaxhighlight>
If we omit the '''outlier.shape=NA''' option in geom_boxplot(), we will get the following plot. (Another option is '''outlier.color = NA''').


[[File:Jitterboxplot.png|300px]]
== ggthemes package ==
</li>
https://cran.r-project.org/web/packages/ggthemes/index.html
</ul>
<pre>
ggplot() + geom_bar() +
          theme_solarized()  # sun color in the background


=== Groups of boxplots ===
theme_excel()
[http://cmdlinetips.com/2019/02/how-to-make-grouped-boxplots-with-ggplot2/ How To Make Grouped Boxplots with ggplot2?]. Use the '''fill''' parameter such as
theme_wsj()
<pre>
theme_economist()
mydata %>%
theme_fivethirtyeight()
  ggplot(aes(x=Factor1, y=Response, fill=factor(Factor2))) +
  geom_boxplot()  
</pre>
</pre>


Another method is to use [https://rpkgs.datanovia.com/ggpubr/reference/ggboxplot.html ggpubr::ggboxplot()].
== rsthemes ==
<pre>
[https://www.garrickadenbuie.com/project/rsthemes/ rsthemes]
ggboxplot(df, "dose", "len",
 
          fill = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"), add.params=list(size=0.1),
== thematic ==
          notch=T, add = "jitter", outlier.shape = NA, shape=16,
[https://rstudio.github.io/thematic/ thematic], [https://www.infoworld.com/article/3604688/top-r-tips-and-news-from-rstudio-global-2021.amp.html Top R tips and news from RStudio Global 2021]
          size = 1/.pt, x.text.angle = 30,
 
          ylab = "Silhouette Values", legend="right",
= Common plots =
          ggtheme = theme_pubr(base_size = 8)) +
* https://ggplot2.tidyverse.org/reference/index.html
    theme(plot.title = element_text(size=8,hjust = 0.5),
* [https://github.com/WinVector/WVPlots WVPlots], [https://win-vector.com/2020/10/26/your-lopsided-model-is-out-to-get-you/ Your Lopsided Model is Out to Get You]
          text = element_text(size=8),
          title = element_text(size=8),
          rect = element_rect(size = 0.75/.pt),
          line = element_line(size = 0.75/.pt),
          axis.text.x = element_text(size = 7),
          axis.line = element_line(colour = 'black', size = 0.75/.pt),
          legend.title = element_blank(),
          legend.position = c(0,1),
          legend.justification = c(0,1),
          legend.key.size = unit(4,"mm"))
</pre>


== Violin plot and sina plot ==
== Scatterplot ==
[https://ggforce.data-imaginist.com/reference/geom_sina.html sina plot] from the [https://cran.r-project.org/web/packages/ggforce/index.html ggforce] package.
[https://wilkelab.org/SDS375/slides/overplotting.html?s=09#1 Handling overlapping points] (slides) and the ebook [https://clauswilke.com/dataviz/overlapping-points.html Fundamentals of Data Visualization] by Claus O. Wilke.
<syntaxhighlight lang='rsplus'>
library(ggplot2)
ggplot(midwest, aes(state, area)) + geom_violin() + ggforce::geom_sina()
</syntaxhighlight>


[[File:Violinplot.png|250px]]
=== Scatterplot with histograms ===
* [https://datavizpyr.com/how-to-make-scatterplot-with-marginal-histograms-in-r/ How To Make Scatterplot with Marginal Histograms in R?]
* [https://rpkgs.datanovia.com/ggpubr/reference/ggscatterhist.html ggpubr::ggscatterhist()]
* [http://www.sthda.com/english/wiki/scatter-plot-matrices-r-base-graphs Scatter Plot Matrices]
* [https://www.r-bloggers.com/2011/06/example-8-41-scatterplot-with-marginal-histograms/ Example 8.41: Scatterplot with marginal histograms] (old-fashioned, based on ''layout()'')


== Kernel density plot ==
=== aes(color) ===
<ul>
<ul>
<li>https://ggplot2.tidyverse.org/reference/geom_density.html
<li><span style="color: blue">Discrete colors</span>. [https://tidyverse.github.io/ggplot2-docs/reference/scale_brewer.html ?scale_colour_brewer]. [https://stackoverflow.com/a/67375729 How to fix 'continuous value supplied to discrete scale' in with scale_color_brewer]. [https://statisticsglobe.com/scale-colour-fill-brewer-rcolorbrewer-package-r Change ggplot2 Color & Fill Using scale_brewer Functions & RColorBrewer Package in R]
<pre>
ggplot(mpg, aes(x = hwy, y = cty)) +
  geom_point(aes(color = class)) +
  scale_colour_brewer(palette = "Set2") # palette is an argument of the scale, not of geom_point()
 
ggplot(mpg, aes(x = displ, y = hwy, colour = manufacturer)) +
  geom_point() +
  scale_colour_brewer(palette = "Set3")
</pre>
<li><span style="color: blue">Continuous colors</span>. The default color scale is [https://tidyverse.github.io/ggplot2-docs/reference/scale_gradient.html ?scale_colour_gradient] with prespecified 'low' and 'high' colors. [https://ggplot2.tidyverse.org/reference/scale_colour_continuous.html ?scale_colour_continuous].
<pre>
<pre>
ggplot(iris, aes(x = Sepal.Length, fill = Species, col = Species)) +
ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +  
      geom_density(alpha = 0.4)
  geom_point(size = 2) +
  scale_color_continuous("City Miles Per Gallon")
# scale_color_continuous("City MPG Rating", low = "springgreen3", high = "red")
</pre>
</pre>
<li>[http://www.sthda.com/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually ggplot2 colors : How to change colors automatically and manually?] (mainly the scatterplot and box plots)
<li>[https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html Colour related aesthetics: colour, fill, and alpha]
</li>
</li>
<li>As you can see the default colors are so terrible. A better choice is [[#ggokabeito|ggokabeito]] color scales. </li>
<li>[https://stackoverflow.com/a/43770608 how to change the color in geom_point or lines in ggplot].
</ul>
* color is used outside '''aes()''': the ''color'' parameter can be used to specify the color name (eg 'red')
* https://learnr.wordpress.com/2009/03/16/ggplot2-plotting-two-or-more-overlapping-density-plots-on-the-same-graph/
* color is used inside '''aes()''': it is used to specify the category/level of colors. It does not work as expected if we try to specify colors explicitly; e.g. ''aes(color=c("red", "red", "green"))''. In this case, the color names become a factor (add '''scale_color_identity()''' if the values should be used as literal colors).
* [https://win-vector.com/2020/10/26/your-lopsided-model-is-out-to-get-you/ Your Lopsided Model is Out to Get You]
* http://www.cookbook-r.com/Graphs/Plotting_distributions_(ggplot2)/
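
The difference between setting color outside and inside '''aes()''' can be sketched as follows. The data frame df and its column col are made up for illustration; scale_color_identity() is the ggplot2 scale that uses the mapped values as literal colors.
<syntaxhighlight lang='rsplus'>
library(ggplot2)

# Outside aes(): a constant color applied to every point
p_fixed <- ggplot(mtcars, aes(wt, mpg)) + geom_point(color = "red")

# Inside aes(): the variable is mapped to colors chosen by the scale
p_mapped <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) + geom_point()

# Literal color names inside aes() are treated as categories,
# unless scale_color_identity() is added
df <- data.frame(x = 1:3, y = 1:3, col = c("red", "red", "green"))
p_identity <- ggplot(df, aes(x, y, color = col)) +
  geom_point(size = 4) +
  scale_color_identity()

# Colors actually used by the identity-scaled layer
unique(layer_data(p_identity)$colour)
</syntaxhighlight>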
<ul>
<li>Overlay histograms with density plots
<pre>
<pre>
library(ggplot2); library(tidyr)
ggplot() +
x <- data.frame(v1=rnorm(100), v2=rnorm(100,1,1),
  geom_point(data = data, aes(x = time, y = y, color = sample),size=4) +
                v3=rnorm(100, 0,2))
   scale_color_manual(values = c("A" = "black", "B" = "red"))
data <- pivot_longer(x, cols=1:3)
</pre>
ggplot(data, aes(x=value, fill=name)) +
  geom_histogram(aes(y = after_stat(density)), alpha=.25) +
   stat_density(geom="line", aes(color=name, linetype=name))
ggplot(data, aes(x=value, fill=name, col =name)) +
  geom_density(alpha = .4)
</pre>
</li>
</li>
<li>[https://www.sharpsightlabs.com/blog/highlight-data-in-ggplot2/ How to highlight data in ggplot2] </li>
</ul>
</ul>


== Bivariate analysis with ggpair ==
=== groups ===
[https://www.guru99.com/r-pearson-spearman-correlation.html Correlation in R: Pearson & Spearman with Matrix Example ]
* [https://datavizpyr.com/add-regression-line-per-group-to-scatterplot-in-r/ How To Add Regression Line per Group to Scatterplot in ggplot2?] '''geom_smooth()'''
* Multiple fitted lines in one plot
[[File:Geom smooth ex.png|250px]]
 
=== Bubble Chart ===
* [https://www.data-to-viz.com/graph/bubble.html BUBBLE PLOT]
* [https://finnstats.com/index.php/2021/06/18/how-to-create-a-bubble-chart-in-r/ Bubble Chart in R-ggplot & Plotly]


== GGally::ggpairs ==
=== Ellipse ===
* [https://ggobi.github.io/ggally/articles/ All vignettes] launched by GGally::vig_ggally()  
* [https://ggplot2.tidyverse.org/reference/stat_ellipse.html ggplot2::stat_ellipse()]
* [https://soroosj.netlify.app/2020/09/26/penguins-cluster/ Kmeans Clustering of Penguins]
* [https://stackoverflow.com/a/5262141 How can a data ellipse be superimposed on a ggplot2 scatterplot?]. Hint: use the [https://cran.r-project.org/web/packages/ellipse/index.html ellipse] package.
* [http://padamson.github.io/r/ggally/ggplot2/ggpairs/2016/02/16/multiple-regression-lines-with-ggpairs.html Multiple regression lines in ggpairs]
* [https://www.blopig.com/blog/2019/06/a-brief-introduction-to-ggpairs/ A Brief Introduction to ggpairs]
* [https://stackoverflow.com/a/42656454 How to show only the lower triangle in ggpairs?]


== barplot ==
=== ggside: scatterplot + marginal density plot ===
* [http://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ How to basic: bar plots]
* https://cran.r-project.org/web/packages/ggside/index.html
* [https://appsilon.com/ggplot2-bar-charts/ How to Make Stunning Bar Charts in R]
* [https://www.business-science.io/code-tools/2021/05/18/marginal_distributions.html ggside] package


=== Ordered barplot and facet ===
=== ggextra: scatterplot + marginal histogram/density ===
* [https://www.r-graph-gallery.com/267-reorder-a-variable-in-ggplot2.html Reorder a variable with ggplot2]
https://github.com/daattali/ggExtra
* [https://bugs.r-project.org/show_bug.cgi?id=18243 ‘reorder()’ gets an argument ‘decreasing’ which it passes to ‘sort()’ for level creation]. 2021-11-23
 
<ul>
== Line plots ==
<li>[https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/reorder.default ?reorder]. This, as '''relevel()''', is a special case of simply calling factor(x, levels = levels(x)[....]).
* http://www.sthda.com/english/wiki/ggplot2-line-plot-quick-start-guide-r-software-and-data-visualization
<syntaxhighlight lang='rsplus'>
* [https://observablehq.com/@d3/multi-line-chart Multi-Line Chart] by D3. Download the tarball. The index.html shows the interactive plot in Firefox but not in Chrome or Safari. See [https://stackoverflow.com/a/46992592 ES6 module support in Chrome 62/Chrome Canary 64, does not work locally]. Chrome blocks it because local files cannot make cross-origin requests; it should work in Chrome if the files are put on a server.
R> bymedian <- with(InsectSprays, reorder(spray, count, median))
** [https://observablehq.com/@bencf/multi-line-chart This] and [https://observablehq.com/@shaswat-du/d3-multi-line-chart this] are examples where  X is a continuous variable.
# bymedian will replace spray (a factor)
** Click "..." and compare code.
# The data is not changed except the order of levels (a factor)
* [https://www.r-bloggers.com/2020/12/how-to-make-stunning-line-charts-in-r-a-complete-guide-with-ggplot2/ How to Make Stunning Line Charts in R: A Complete Guide with ggplot2]
# In this case, the order is determined by the median of count from each spray level
#  from small to large.


R> InsectSprays[1:3, ]
=== Ridgeline plots, mountain diagram ===
  count spray
* [https://github.com/wilkelab/ggridges?s=09 ggridges]: Ridgeline plots in ggplot2
1    10    A
* [https://www.datanovia.com/en/blog/elegant-visualization-of-density-distribution-in-r-using-ridgeline Elegant Visualization of Density Distribution in R Using Ridgeline]
2    7    A
* [https://www.nature.com/articles/s41598-021-03432-3/figures/1 An example] from ''Scientific Reports''.
3    20    A
 
R> bymedian
== Histogram ==
[1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
A histogram is a special case of a bar plot. Instead of drawing each unique value as a bar, a histogram groups nearby data points into bins.
[44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
 
attr(,"scores")
<syntaxhighlight lang='rsplus'>
  A    B    C    D    E    F
ggplot(data = txhousing, aes(x = median)) +
14.0 16.5  1.5  5.0  3.0 15.0
  geom_histogram()  # add 'boundary = 0' (formerly 'origin = 0') if we don't expect negative values.
Levels: C E D A F B
                    # adding 'bins=10' to adjust the number of bins
R> InsectSprays$spray
                    # adding 'binwidth=10' to adjust the bin width
[1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
[44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
Levels: A B C D E F
R> boxplot(count ~ bymedian, data = InsectSprays,
        xlab = "Type of spray", ylab = "Insect count",
        main = "InsectSprays data", varwidth = TRUE,
        col = "lightgray")
</syntaxhighlight>
</syntaxhighlight>
Scatterplot
 
[http://www.deeplytrivial.com/2020/04/p-is-for-percent.html Histogram vs barplot] from deeply trivial.
 
== Boxplot ==
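Be careful: adding '''scale_y_continuous(expand = c(0,0), limits = c(0,1))''' changes the boxplot if some data fall outside the range (0, 1), because the out-of-range values are removed before the box statistics are computed. The console gives a warning message in this case. To zoom without dropping data, use '''coord_cartesian(ylim = c(0, 1))''' instead.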
 
=== Base R method ===
[http://www.sthda.com/english/wiki/box-plots-r-base-graphs Box Plots - R Base Graphs]
<pre>
<pre>
tibble(y=sample(6), x=letters[1:6]) %>%
dim(df) # 112436 x 2
  ggplot(aes(reorder(x, -y), y)) + geom_point(size=4)
mycol <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
# mycol defines colors of 4 levels in df$Method (a factor)
boxplot(df$value ~ df$Method, col = mycol, xlab="Method")
</pre>
</pre>
</li>
<li>[https://sebastiansauer.github.io/ordering-bars/ Sorting the x-axis in bargraphs using ggplot2] or [http://www.deeplytrivial.com/2020/05/statistics-sunday-my-2019-reading.html this one] from Deeply Trivial. reorder(fac, value) was used.
<syntaxhighlight lang='rsplus'>
ggplot(df, aes(x=reorder(x, -y), y=y)) + geom_bar(stat = 'identity')


df$order <- 1:nrow(df)
=== Color fill/scale_fill_XXX ===
# Assume df$y is a continuous variable and df$fac is a character/factor variable
{{Pre}}
#  and we want to show factor according to the way they appear in the data
n <- 100
#  (not R's alphabetical order, even though the variable is of type "character", not "factor")
k <- 12
# We would like to plot df$fac on the y-axis and df$y on the x-axis. Fortunately,
set.seed(1234)
#  ggplot2 will draw the barplot vertically or horizontally depending on the 2 variables' types
cond <- factor(rep(LETTERS[1:k], each=n))
# The reason for using "-order" is to make the 1st name appear at the top
rating <- rnorm(n*k)
ggplot(df, aes(x=y, y=reorder(fac, -order))) + geom_col()
dat <- data.frame(cond = cond, rating = rating)


ggplot(df, aes(x=reorder(x, desc(y)), y=y)) + geom_col()  # desc() is from dplyr
p <- ggplot(dat, aes(x=cond, y=rating, fill=cond)) +
</syntaxhighlight>
    geom_boxplot()  
</li>
<li>[https://juliasilge.com/blog/giant-pumpkins/ Predict #TidyTuesday giant pumpkin weights with workflowsets]. [https://forcats.tidyverse.org/reference/fct_reorder.html fct_reorder()]  </li>
<li>[https://juliasilge.com/blog/reorder-within/ Reordering and facetting for ggplot2]. tidytext::reorder_within() was used. </li>
<li>Chapter2 of [https://github.com/chuvanan/rdatatable-cookbook data.table cookbook]. reorder(fac, value) was used. </li>
<li>[https://juliasilge.com/blog/cocktail-recipes-umap/ PCA and UMAP with tidymodels] </li>
</ul>
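A small self-contained sketch of the two reorder() idioms used in this list. The data frame below is a hypothetical toy example, not data from any of the linked posts.

```r
library(ggplot2)

# Hypothetical toy data: 'fac' is a character label, 'y' a continuous value
df <- data.frame(fac = c("b", "a", "d", "c"), y = c(3, 10, 1, 7))

# Vertical bars sorted from the largest to the smallest y
p1 <- ggplot(df, aes(x = reorder(fac, -y), y = y)) +
  geom_col() + labs(x = "fac")

# Horizontal bars in the order the rows appear in the data, first row on top
df$order <- seq_len(nrow(df))
p2 <- ggplot(df, aes(x = y, y = reorder(fac, -order))) +
  geom_col() + labs(y = "fac")

# reorder() only changes the level order, not the data
lev_by_y     <- levels(reorder(df$fac, -df$y))      # "a" "c" "b" "d"
lev_by_order <- levels(reorder(df$fac, -df$order))  # "c" "d" "a" "b"
```

On a discrete y-axis the first level is drawn at the bottom, which is why the negated order puts the first row of the data at the top.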


=== Back to back barplot ===
p + scale_fill_hue() + labs(title="hue default") # Same as only p
* https://community.rstudio.com/t/back-to-back-barplot/17106. Comment: the colors should be opposite, but they are not.
p + scale_fill_hue(l=40, c=35) + labs(title="hue options")
* https://stackoverflow.com/a/55015174 (different scale on positive/negative sides. Cool!)
p + scale_fill_brewer(palette="Dark2") + labs(title="Dark2")
* https://learnr.wordpress.com/2009/09/24/ggplot2-back-to-back-bar-charts/  (change negative values to positive values, slow to load the page)
p + colorspace::scale_fill_discrete_qualitative(palette = "Dark 3") + labs(title="Dark 3")
* [https://stackoverflow.com/a/33837922 Pyramid plot in R]
p + scale_fill_brewer(palette="Accent") + labs(title="Accent")
* [https://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ How to basic: bar plots]. Hint: use '''geom_col()''' twice.
p + scale_fill_brewer(palette="Pastel1") + labs(title="Pastel1")
p + scale_fill_brewer(palette="Set1") + labs(title="Set1")
p + scale_fill_brewer(palette="Spectral") + labs(title ="Spectral")
p + scale_fill_brewer(palette="Paired") + labs(title="Paired")
# cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# p + scale_fill_manual(values=cbbPalette)
</pre>
[[File:Scalefill.png|250px]]


=== Pyramid Chart ===
[https://www.datanovia.com/en/blog/the-a-z-of-rcolorbrewer-palette/ ColorBrewer palettes] RColorBrewer::display.brewer.all() to display all brewer palettes.
[https://thomas-neitmann.github.io/ggcharts/reference/pyramid_chart.html ggcharts::pyramid_chart()]


=== Flip x and y axes ===
[https://ggplot2.tidyverse.org/reference/index.html Reference from ggplot2]. scale_fill_binned, '''scale_fill_brewer''', scale_fill_continuous, scale_fill_date, scale_fill_datetime, scale_fill_discrete, scale_fill_distiller, scale_fill_gradient, scale_fill_gradientc, scale_fill_gradientn, scale_fill_grey, '''scale_fill_hue''', scale_fill_identity, '''scale_fill_manual''', scale_fill_ordinal, scale_fill_steps, scale_fill_steps2, scale_fill_stepsn, scale_fill_viridis_b, scale_fill_viridis_c, scale_fill_viridis_d
coord_flip()
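A minimal sketch with the built-in mpg data (the choice of data and geom is only for illustration). coord_flip() swaps the axes of an existing plot; since ggplot2 3.3.0 many geoms, including geom_boxplot(), also accept the aesthetics already swapped.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, y = hwy)) + geom_boxplot()

# Flip an existing plot: class is now on the vertical axis
p_flipped <- p + coord_flip()

# Alternative: map the discrete variable to y directly
p_swapped <- ggplot(mpg, aes(x = hwy, y = class)) + geom_boxplot()
```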
 
=== Rotate x-axis labels ===
* [https://datavizpyr.com/rotate-x-axis-text-labels-in-ggplot2/ How To Rotate x-axis Text Labels in ggplot2?]
* [https://stackoverflow.com/a/7267364 What do hjust and vjust do when making a plot using ggplot?] 0 means left-justified, 1 means right-justified.


=== Jittering - plot the data on top of the boxplot ===
<ul>
<li>[[Statistics#Box.28Box_and_whisker.29_plot_in_R|What is a boxplot]]  </li>
<li>Quick look
<syntaxhighlight lang='rsplus'>
# Only 1 variable
ggplot(data.frame(Wi), aes(y = Wi)) +
  geom_boxplot()
# Two variable, one of them is a factor
ggplot() + geom_jitter(mapping = aes(x, y))
# Box plot
ggplot() + geom_boxplot(mapping = aes(x, y))
</syntaxhighlight>
</li>
<li>[https://ggplot2.tidyverse.org/reference/geom_jitter.html geom_jitter()]</li>
<li>geom_jitter can affect both X and Y values.
<pre>
<pre>
ggplot(mydf) + geom_col(aes(x = model, y=value, fill = method), position="dodge")+
tibble(x=1:4, y=1:4) %>% ggplot(aes(x, y)) + geom_jitter()
  theme(axis.text.x = element_text(angle = 45, hjust=1))
</pre>
</pre>
 
</li>
=== Starts at zero ===
<li>https://stackoverflow.com/a/17560113  </li>
[http://malditobarbudo.xyz/blog/r/starting-bars-and-histograms-at-zero-in-ggplot2/ Starting bars and histograms at zero in ggplot2]
<li>[https://stackoverflow.com/a/48822620 How to make scatterplot with geom_jitter plot reproducible?]
<pre>
<pre>
scale_y_continuous(expand = c(0,0), limits = c(0, YourLimit))
set.seed(1); data %>%
  ggplot() +
  geom_jitter(aes(T.categ, sex, colour = status))
</pre>
</pre>
</li>
<li>[https://r-charts.com/distribution/box-plot-jitter-ggplot2/ Boxplot with jittered data points in ggplot2]  </li>
<syntaxhighlight lang='rsplus'>
# df2 is n x 2
ggplot(df2, aes(x=nboot, y=boot)) +
  geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
  geom_jitter(aes(color=nboot), position=position_jitter(width=.2, height=0, seed=1)) +
  labs(title="", y = "", x = "nboot")
</syntaxhighlight>
If we omit the <span style="color: red">outlier.shape=NA</span> option in '''geom_boxplot()''', we will get the following plot where some outliers will appear twice. (Another option is '''outlier.color = NA'''; see [https://stackoverflow.com/a/63785060 extra point at boxplot with jittered points (ggplot2)]).


=== Add patterns ===
[[File:Jitterboxplot.png|300px]]
* [https://coolbutuseless.github.io/package/ggpattern/ ggpattern] package
</li>
* [https://www.jianshu.com/p/6d889a80d229 Filling bar charts with ggpattern] (in Chinese)
<li>Base plot approach
 
[http://jtleek.com/genstats/inst/doc/02_13_batch-effects.html Batch effects and confounders]
== Waterfall plot ==
</li>
* [https://r-charts.com/flow/waterfall-chart/ Waterfall charts in ggplot2 with waterfalls package]
<li>Another base plot approach. boxplot() + stripchart(). See [https://r-coder.com/stripchart-r/ Stripchart in R], [https://www.statology.org/strip-chart-r/ How to Create a Strip Chart in R]. Consider adding '''outline = FALSE''' to boxplot() so that outliers are not drawn twice once stripchart() has been added.
* [https://www.r-bloggers.com/2010/05/ggplot2-waterfall-charts/ ggplot2: Waterfall Charts] geom_rect()
<syntaxhighlight lang='rsplus'>
ylim <- range(df$estimate, na.rm = TRUE)
boxplot(estimate~type, data=df, xlab=NULL, ylab=NULL, ylim=ylim, outline=F)
set.seed(1)
stripchart(estimate~type, data=df, method = "jitter",
pch=19, col=c("salmon", "orange", "yellowgreen", "green"),
vertical=TRUE, add=TRUE)
</syntaxhighlight>
</li>
</ul>
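A reproducible sketch of the boxplot + jitter combination, using the built-in mpg data in place of the unnamed data frames above (an assumption made only so the code runs as is).

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot(outlier.shape = NA) +            # do not draw outliers twice
  geom_jitter(aes(color = class),
              position = position_jitter(width = 0.2, height = 0, seed = 1))

# Layer 2 is the jitter layer: it draws every observation exactly once
jit <- layer_data(p, 2)

# height = 0 means only x is jittered; the y values are left untouched
same_y <- isTRUE(all.equal(sort(jit$y), sort(as.numeric(mpg$hwy))))
```

Setting '''seed''' inside position_jitter() makes the plot reproducible without calling set.seed() beforehand.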


== Polygon and map plot ==
=== Groups of boxplots ===
https://ggplot2.tidyverse.org/reference/geom_polygon.html
<ul>
 
<li>[https://datavizpyr.com/how-to-make-grouped-boxplot-with-jittered-data-points-in-ggplot2/ How to Make Grouped Boxplot with Jittered Data Points in ggplot2]. Use the '''color''' parameter in ggplot(aes()).  
== geom_step: Step function ==
<li>[https://www.bioinfo-scrounger.com/archives/jittered_boxplot/ Boxplot With Jittered Points in R]
Connect observations: [https://ggplot2.tidyverse.org/reference/geom_path.html geom_path(), geom_step()]
<li>[http://cmdlinetips.com/2019/02/how-to-make-grouped-boxplots-with-ggplot2/ How To Make Grouped Boxplots with ggplot2?], [https://rpubs.com/alecri/review_longitudinal A review of Longitudinal Data Analysis in R]. Use the '''fill''' parameter such as
 
Example: KM curves (without legend)
<pre>
<pre>
library(survival)
mydata %>%
sf <- survfit(Surv(time, status) ~ x, data = aml)
  ggplot(aes(x=Factor1, y=Response, fill=factor(Factor2))) +   
sf
   geom_boxplot()  
str(sf) # the first 10 values form one stratum and the remaining 10 form the other
ggplot() +
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10])),
            col='red') +
  scale_x_continuous('Time', limits = c(0, 161)) +  
   scale_y_continuous('Survival probability', limits = c(0, 1)) +
   geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20])),
            col='black')
# cf:  plot(sf, col = c('red', 'black'), mark.time=FALSE)
</pre>
</pre>
 
<li>Another method is to use [https://rpkgs.datanovia.com/ggpubr/reference/ggboxplot.html ggpubr::ggboxplot()]. Papers [https://github.com/guosheng437/TumorPurity/tree/main/Fig1/Fig1A TumorPurity].
Same example but with legend (see [https://stackoverflow.com/a/17149021 Construct a manual legend for a complicated plot])
<pre>
<pre>
cols <- c("NEW"="#f04546","STD"="#3591d1")
ggboxplot(df, "dose", "len",
ggplot() +  
          fill = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"), add.params=list(size=0.1),
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10]), col='NEW')) +
          notch=T, add = "jitter", outlier.shape = NA, shape=16,
  scale_x_continuous('Time', limits = c(0, 161)) +
          size = 1/.pt, x.text.angle = 30,
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
          ylab = "Silhouette Values", legend="right",
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20]), col='STD')) +
          ggtheme = theme_pubr(base_size = 8)) +
  scale_colour_manual(name="Treatment", values = cols)
    theme(plot.title = element_text(size=8,hjust = 0.5),  
          text = element_text(size=8),  
          title = element_text(size=8),
          rect = element_rect(size = 0.75/.pt),
          line = element_line(size = 0.75/.pt),
          axis.text.x = element_text(size = 7),
          axis.line = element_line(colour = 'black', size = 0.75/.pt),
          legend.title = element_blank(),
          legend.position = c(0,1),  
          legend.justification = c(0,1),
          legend.key.size = unit(4,"mm"))
</pre>
</pre>
</ul>


To control the line width, use the '''size''' parameter ('''linewidth''' in ggplot2 3.4.0 or later); e.g. geom_step(aes(x, y), size=.5). The default is .5; it can be looked up with GeomStep$default_aes.
=== p-values on top of boxplots ===
<ul>
<li>[https://www.r-bloggers.com/2017/06/add-p-values-and-significance-levels-to-ggplots/ Add P-values and Significance Levels to ggplots]
* ggpubr::stat_compare_means()
:<syntaxhighlight lang='rsplus'>
library(ggpubr)
my_comparisons <- list( c("6", "8"), c("4", "6"), c("4", "8") )
ggboxplot(mtcars, x = "cyl", y = "mpg",
          color = "cyl", add = "jitter", palette = "jco") +
    stat_compare_means(comparisons = my_comparisons)+ # method="t.test", default is "wilcox.test"
    stat_compare_means(label.y = 45) # y-axis loc of overall p-value
</syntaxhighlight>
<li>[https://www.datanovia.com/en/blog/how-to-perform-multiple-paired-t-tests-in-r/ How to Perform Multiple Paired T-tests in R]
* ggpubr::stat_pvalue_manual()
<li>[https://datasciencetut.com/add-significance-level-and-stars-to-plot-in-r/ Add Significance Level and Stars to Plot in R]
* ggsignif::geom_signif()
:<syntaxhighlight lang='rsplus'>
library(ggsignif)
ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(
      c("6","8"),
      c("4","6"), c("4","8")
    ),
    map_signif_level=TRUE,
    y_position = c(34, 35, 36)
  )
</syntaxhighlight>
<li>[https://stackoverflow.com/a/29263992 How to draw the boxplot with significant level?]
* ggsignif package or geom_line() function.
<li>Paper examples
* [https://www.future-science.com/doi/10.2144/btn-2018-0179 Fig 5A,B]
* [https://ovarianresearch.biomedcentral.com/articles/10.1186/s13048-023-01129-x/figures/2 Fig 2B]
<li>Manually do it - [https://cran.r-project.org/web/packages/signibox/index.html signibox] package (small).
</ul>


To allow different line types, use the '''linetype''' parameter. The first level is solid line, the 2nd level is dashed, ... We can change the default line types by using the '''scale_linetype_manual()''' function. See [https://www.datanovia.com/en/blog/line-types-in-r-the-ultimate-guide-for-r-base-plot-and-ggplot/ Line Types in R: The Ultimate Guide for R Base Plot and GGPLOT].
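A minimal sketch of overriding the default line types, using the economics_long data that ships with ggplot2 (the two series and the line types chosen here are arbitrary).

```r
library(ggplot2)

# Keep two of the series in economics_long for illustration
d <- subset(economics_long, variable %in% c("psavert", "uempmed"))

p <- ggplot(d, aes(x = date, y = value, linetype = variable)) +
  geom_line() +
  # A named vector ties each level to a line type explicitly
  scale_linetype_manual(values = c(psavert = "dotdash", uempmed = "solid"))

# Line types actually used by the layer
used_linetypes <- unique(layer_data(p)$linetype)
```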
== Violin plot and sina plot ==
 
<ul>
== Coefficients, intervals, errorbars ==
<li>https://en.wikipedia.org/wiki/Violin_plot. It is similar to a box plot, with the addition of a rotated kernel '''density plot''' on each side.
* [https://stackoverflow.com/a/42560960 Plotting two models with regression coefficients] with [https://ggplot2.tidyverse.org/reference/geom_linerange.html geom_pointrange()] - Vertical intervals: lines, crossbars & errorbars.
<li>[https://ggplot2.tidyverse.org/reference/geom_violin.html geom_violin()]
* [https://stackoverflow.com/q/49483128 Grouping and staggering estimates with geom_point]
<li>[https://r-charts.com/distribution/violin-plot-mean-ggplot2/ Violin plot with mean/median in ggplot2], [https://ggplot2.tidyverse.org/reference/stat_summary.html stat_summary()]
<li>[https://ggforce.data-imaginist.com/reference/geom_sina.html sina plot] from the [https://cran.r-project.org/web/packages/ggforce/index.html ggforce] package.
<syntaxhighlight lang='rsplus'>
library(ggplot2)
ggplot(midwest, aes(state, area)) + geom_violin() + ggforce::geom_sina()
</syntaxhighlight>


== Comparing similarities / differences between groups ==
[[File:Violinplot.png|250px]]
[https://www.business-science.io/code-tools/2021/02/09/stat-plots-in-R.html comparing similarities / differences between groups]
<li>[https://bmcimmunol.biomedcentral.com/articles/10.1186/s12865-018-0285-5/figures/6 An example]
</ul>


= Special plots =
== geom_density: Kernel density plot ==
== Dot plot & forest plot ==
<ul>
* https://en.wikipedia.org/wiki/Dot_plot_(statistics), https://en.wikipedia.org/wiki/Forest_plot
<li>https://ggplot2.tidyverse.org/reference/geom_density.html
* [https://ikashnitsky.github.io/2019/dotplot/ Dotplot – the single most useful yet largely neglected dataviz type]
<pre>
* [http://sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization ggplot2 dot plot : Quick start guide - R software and data visualization]
ggplot(iris, aes(x = Sepal.Length, fill = Species, col = Species)) +
* [https://cran.r-project.org/web/packages/forestplot/ forestplot] package
      geom_density(alpha = 0.4)
</pre>
And two densities (black & red colors)
<pre>
mydata <- data.frame(var1 = rnorm(100), var2 = rnorm(100, mean = 2))


== Correlation Analysis Different ==
# Create the plot
* [https://github.com/r-link/corrmorant corrmorant]: Flexible Correlation Matrices Based on ggplot2
ggplot(data = mydata, aes(x = var1)) +
* [https://finnstats.com/index.php/2021/05/13/correlation-analysis-plot/ Correlation Analysis Different Types of Plots in R]
  geom_density() +
  geom_density(aes(x = var2), color = "red")
</pre>
</li>
<li>As you can see, the default colors are not easy to tell apart. A better choice is the [[#ggokabeito|ggokabeito]] color scales. </li>
<li>[https://stackoverflow.com/a/61548764 Density plot + histogram]
<li>https://learnr.wordpress.com/2009/03/16/ggplot2-plotting-two-or-more-overlapping-density-plots-on-the-same-graph/
<li>[https://win-vector.com/2020/10/26/your-lopsided-model-is-out-to-get-you/ Your Lopsided Model is Out to Get You] & [https://cran.r-project.org/web/packages/WVPlots/index.html WVPlots] package
<li>http://www.cookbook-r.com/Graphs/Plotting_distributions_(ggplot2)/
<li>Overlay histograms with density plots
<pre>
library(ggplot2); library(tidyr)
x <- data.frame(v1=rnorm(100), v2=rnorm(100,1,1),
                v3=rnorm(100, 0,2))
data <- pivot_longer(x, cols=1:3)
ggplot(data, aes(x=value, fill=name)) +
  geom_histogram(aes(y=after_stat(density)), alpha=.25) +  # after_stat() replaces the deprecated ..density..
  stat_density(geom="line", aes(color=name, linetype=name))
ggplot(data, aes(x=value, fill=name, col =name)) +
  geom_density(alpha = .4)
</pre>
</li>
</ul>


== Bump plot: plot ranking over time ==
=== A panel of density plots ===
https://github.com/davidsjoberg/ggbump
<ul>
<li>Common xlim for all subplots
<pre>
ggplot(data = mpg, aes(x = hwy)) +
    geom_density() +
    facet_wrap(~ class)
</pre>
<li>Each subplot has its own xlim
<pre>
ggplot(data = mpg, aes(x = hwy)) +
    geom_density() +
    facet_wrap(~ class, scales = "free_x")
</pre>
</ul>


== Gauge plots ==
== Bivariate analysis with ggpair ==
* [https://pomvlad.blog/2018/05/03/gauges-ggplot2/ Generating gauge plots in ggplot2]
[https://www.guru99.com/r-pearson-spearman-correlation.html Correlation in R: Pearson & Spearman with Matrix Example ]
* [https://www.stomperusa.com/2020/10/18/multiple-gauge-plots-with-facet-wrap/ Multiple Gauge Plots with Facet Wrap]
 
== GGally::ggpairs ==
* graphics::pairs()
** [https://www.statology.org/pairs-plots-r/ How to Create and Interpret Pairs Plots in R]. [https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/pairs pairs()]
** [https://www.spsanderson.com/steveondata/posts/2023-09-25/index.html Mastering Data Visualization with Pairs Plots in Base R]. Adding colors and regression lines.
* [https://ggobi.github.io/ggally/articles/ All vignettes] launched by GGally::vig_ggally()
* [https://soroosj.netlify.app/2020/09/26/penguins-cluster/ Kmeans Clustering of Penguins]
* [http://padamson.github.io/r/ggally/ggplot2/ggpairs/2016/02/16/multiple-regression-lines-with-ggpairs.html Multiple regression lines in ggpairs]
* [https://www.blopig.com/blog/2019/06/a-brief-introduction-to-ggpairs/ A Brief Introduction to ggpairs]
* [https://stackoverflow.com/a/42656454 How to show only the lower triangle in ggpairs?]


== Sankey diagrams ==
== barplot/bar plot ==
* [https://en.wikipedia.org/wiki/Sankey_diagram Wikipedia]
* [http://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ How to basic: bar plots]
* [https://www.r-graph-gallery.com/sankey-diagram.html Some examples] by the [https://cran.r-project.org/web/packages/networkD3/index.html networkD3] package
* [https://appsilon.com/ggplot2-bar-charts/ How to Make Stunning Bar Charts in R]
* [http://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization ggplot2 barplots : Quick start guide - R software and data visualization]


= Aesthetics =
=== Ordered barplot and facet ===
* https://ggplot2.tidyverse.org/reference/aes.html
* [https://www.r-graph-gallery.com/267-reorder-a-variable-in-ggplot2.html Reorder a variable with ggplot2]
* https://ggplot2.tidyverse.org/articles/ggplot2-specs.html
* [https://bugs.r-project.org/show_bug.cgi?id=18243 ‘reorder()’ gets an argument ‘decreasing’ which it passes to ‘sort()’ for level creation]. 2021-11-23
* [https://datavizpyr.com/re-ordering-bars-in-barplot-in-r/#How_To_Sort_Bars_in_Barplot_with_reorder_function_in_base_R How to Reorder bars in barplot with ggplot2 in R]. '''fct_reorder()''' and '''reorder()'''.
<ul>
<ul>
<li>We can create a new aesthetic name in '''aes(aesthetic = variable)''' function; for example, the "text2" below. In this case "text2" name will not be shown; only the original variable will be used.
<li>[https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/reorder.default ?reorder]. This, as '''relevel()''', is a special case of simply calling factor(x, levels = levels(x)[....]).
<pre>
<syntaxhighlight lang='rsplus'>
library(plotly)
R> bymedian <- with(InsectSprays, reorder(spray, count, median))
g <- ggplot(tail(iris), aes(Petal.Length, Sepal.Length, text2=Species)) + geom_point()
# bymedian will replace spray (a factor)  
ggplotly(g, tooltip = c("Petal.Length", "text2"))
# The data is not changed except the order of levels (a factor)  
</pre>
# In this case, the order is determined by the median of count from each spray level
</li>
#  from small to large.
</ul>


== aes_string() ==
R> InsectSprays[1:3, ]
* [https://ggplot2.tidyverse.org/reference/aes_.html aes_()]. Define aesthetic mappings programmatically.
  count spray
* [https://www.tutorialspoint.com/how-to-create-a-boxplot-using-ggplot2-with-aes-string-in-r How to create a boxplot using ggplot2 with aes_string in R?]
1    10    A
 
2    7    A
== group ==
3    20    A
https://ggplot2.tidyverse.org/reference/aes_group_order.html
R> bymedian
 
[1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
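* The '''group''' parameter in aes() defines which observations belong together (e.g. which points are connected into one line); to color the lines, map '''color''' to a variable. See [https://stackoverflow.com/a/43770608 How to change the color in geom_point or lines in ggplot].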
[44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
* [https://plotly.com/ggplot2/geom_line/ geom_line in ggplot2].
attr(,"scores")
* [https://stackoverflow.com/a/26195631 ggplot2 manually specifying colour with geom_line]
  A    B    C    D    E    F
* [http://www.sthda.com/english/wiki/ggplot2-line-types-how-to-change-line-types-of-a-graph-in-r-software ggplot2 line types : How to change line types of a graph in R software?]
14.0 16.5  1.5  5.0  3.0 15.0
Levels: C E D A F B
R> InsectSprays$spray
[1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
[44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
Levels: A B C D E F
R> boxplot(count ~ bymedian, data = InsectSprays,
        xlab = "Type of spray", ylab = "Insect count",
        main = "InsectSprays data", varwidth = TRUE,
        col = "lightgray")
</syntaxhighlight>
Scatterplot
<pre>
tibble(y=sample(6), x=letters[1:6]) %>%
  ggplot(aes(reorder(x, -y), y)) + geom_point(size=4)
</pre>
</li>
<li>[https://sebastiansauer.github.io/ordering-bars/ Sorting the x-axis in bargraphs using ggplot2] or [http://www.deeplytrivial.com/2020/05/statistics-sunday-my-2019-reading.html this one] from Deeply Trivial. reorder(fac, value) was used.
<syntaxhighlight lang='rsplus'>
ggplot(df, aes(x=reorder(x, -y), y=y)) + geom_bar(stat = 'identity')


= GUI/Helper packages =
df$order <- 1:nrow(df)
== ggedit & ggplotgui – interactive ggplot aesthetic and theme editor ==
# Assume df$y is a continuous variable and df$fac is a character/factor variable
* https://www.r-statistics.com/2016/11/ggedit-interactive-ggplot-aesthetic-and-theme-editor/
#  and we want to show the factor levels in the order they appear in the data
* https://github.com/gertstulp/ggplotgui/. It allows changing the text (axis, title, font size), themes, legend, etc. A docker website was set up for the online version.
#  (not following R's default order, even if the variable is of type "character" rather than "factor")
# We would like to plot df$fac on the y-axis and df$y on the x-axis. Fortunately,
#  ggplot2 will draw the barplot vertically or horizontally depending on the 2 variables' types
# The reason for using "-order" is to make the 1st name appear at the top
ggplot(df, aes(x=y, y=reorder(fac, -order))) + geom_col()
 
ggplot(df, aes(x=reorder(x, desc(y)), y=y)) + geom_col()  # desc() is from dplyr
</syntaxhighlight>
</li>
<li>[https://juliasilge.com/blog/giant-pumpkins/ Predict #TidyTuesday giant pumpkin weights with workflowsets]. [https://forcats.tidyverse.org/reference/fct_reorder.html fct_reorder()]  </li>
<li>[https://juliasilge.com/blog/reorder-within/ Reordering and facetting for ggplot2]. tidytext::reorder_within() was used. </li>
<li>Chapter2 of [https://github.com/chuvanan/rdatatable-cookbook data.table cookbook]. reorder(fac, value) was used. </li>
<li>[https://juliasilge.com/blog/cocktail-recipes-umap/ PCA and UMAP with tidymodels] </li>
<li>A simple example
<pre>
dat <- structure(list(gene = c("CAPN9", "CSF3R", "HPN", "KCNA5", "MTMR7",
"NRG3", "SMTNL2", "TMPRSS6"), coef = c(-1.238, -0.892, -0.224,
-0.057, 0.133, 0.377, 0.436, 0.804)), row.names = c("4976", "6467",
"12355", "13373", "18143", "19010", "23805", "25602"), class = "data.frame")


== esquisse (French, means 'sketch'): creating ggplot2 interactively ==
# Base R plot
https://cran.rstudio.com/web/packages/esquisse/index.html
par(mar=c(4,6,4,1))
barplot(dat$coef, names = dat$gene, horiz = T, las=1,
        main='base R', xlab = "Coefficients")


A 'shiny' gadget to create 'ggplot2' charts interactively with drag-and-drop to map your variables. You can quickly visualize your data accordingly to their type, export to 'PNG' or 'PowerPoint', and retrieve the code to reproduce the chart.
# GGplot2
dat %>% ggplot(aes(y=gene, x=coef)) + geom_col(fill = 'gray') +
    theme(axis.ticks.y = element_blank()) +
    theme(panel.background = element_blank(),  
          axis.line.x = element_line(colour = 'black')) +
    labs(x="Coefficients", y = '', title = "ggplot2")
</pre>
[[File:Barplot base.png|300px]], [[File:Barplot ggplot2.png|300px]]
</ul>


The interface introduces basic terms used in ggplot2:
=== Proportion barplot ===
* x, y,
* [https://www.geeksforgeeks.org/grouped-stacked-and-percent-stacked-barplot-in-ggplot2/ Grouped, stacked and percent stacked barplot in ggplot2] '''geom_bar(position = "fill", stat = "identity")'''
* fill (useful for geom_bar, geom_rect, geom_boxplot, & geom_raster, not useful for scatterplot),
* [https://thatdatatho.com/my-favourite-ggplot-plot-bar-chart-presentations/ Powerful Bar Plot for Presentations]
* color (edges for geom_bar, geom_line, geom_point),
* size,
* [http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/ facet], split up your data by one or more variables and plot the subsets of data together.


It does not include all features in ggplot2. At the bottom of the interface,
=== Back to back barplot ===
* Labels & title & caption.
* https://community.rstudio.com/t/back-to-back-barplot/17106. Comment: the colors should be opposite, but they are not.
* Plot options. Palette, theme, legend position.
* https://stackoverflow.com/a/55015174 (different scale on positive/negative sides. Cool!)
* Data. Remove subset of data.
* https://learnr.wordpress.com/2009/09/24/ggplot2-back-to-back-bar-charts/  (change negative values to positive values, slow to load the page)
* Export & code. Copy/save the R code. Export file as PNG or PowerPoint.
* [https://stackoverflow.com/a/33837922 Pyramid plot in R]
* [https://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ How to basic: bar plots]. Hint: use '''geom_col()''' twice.


== ggcharts ==
=== Pyramid Chart ===
https://cran.r-project.org/web/packages/ggcharts/index.html
[https://thomas-neitmann.github.io/ggcharts/reference/pyramid_chart.html ggcharts::pyramid_chart()]


== ggeasy ==
=== Flip x and y axes ===
* [https://cran.r-project.org/web/packages/ggeasy/index.html ggeasy]
coord_flip()
* [https://youtu.be/-2ZvQQ583pI How to simplify ggplot2 with ggeasy]


== ggx ==
=== Rotate x-axis labels ===
https://github.com/brandmaier/ggx Create ggplot in natural language
* [https://datavizpyr.com/rotate-x-axis-text-labels-in-ggplot2/ How To Rotate x-axis Text Labels in ggplot2?]
* [https://stackoverflow.com/a/7267364 What do hjust and vjust do when making a plot using ggplot?] 0 means left-justified, 1 means right-justified.


= Interactive =
<pre>
== plotly ==
ggplot(mydf) + geom_col(aes(x = model, y=value, fill = method), position="dodge")+
[[R_web#plotly|R web &rarr; plotly]]
  theme(axis.text.x = element_text(angle = 45, hjust=1, size= 8))
</pre>


== ggiraph ==
=== Starts at zero ===
[https://cran.r-project.org/web/packages/ggiraph/index.html ggiraph]: Make 'ggplot2' Graphics Interactive
[http://malditobarbudo.xyz/blog/r/starting-bars-and-histograms-at-zero-in-ggplot2/ Starting bars and histograms at zero in ggplot2]
<pre>
scale_y_continuous(expand = c(0,0), limits = c(0, YourLimit))
</pre>
* [https://stackoverflow.com/a/44170954 How does ggplot scale_continuous expand argument work?]
* https://ggplot2.tidyverse.org/reference/scale_continuous.html
* https://ggplot2.tidyverse.org/reference/scale_discrete.html
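A short sketch of making bars start at zero with the expansion() helper (available since ggplot2 3.3.0), as an alternative to hard-coding an upper limit. The mpg data and the 5% padding are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class)) +
  geom_bar() +
  # no padding below the bars, 5% padding above the tallest bar
  scale_y_continuous(expand = expansion(mult = c(0, 0.05)))

# The y range that will actually be drawn
y_range <- ggplot_build(p)$layout$panel_params[[1]]$y.range
```

Unlike '''limits = c(0, YourLimit)''', this does not require knowing the maximum of the data in advance.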


= ggconf: Simpler Appearance Modification of 'ggplot2' =
=== Add patterns ===
https://github.com/caprice-j/ggconf
* [https://coolbutuseless.github.io/package/ggpattern/ ggpattern] package
* [https://www.jianshu.com/p/6d889a80d229 Filling bar charts with ggpattern] (in Chinese)


= Plotting individual observations and group means =
=== Barplot with colors for a 2nd variable ===
https://drsimonj.svbtle.com/plotting-individual-observations-and-group-means-with-ggplot2
[https://www.brodrigues.co/blog/2020-04-12-basic_ggplot2/ How to basic: bar plots]


= subplot =
By default, the barplots are stacked on top of each other. Use '''geom_col(position = "dodge")''' if we want the barplots to be side-by-side.
* https://ikashnitsky.github.io/2017/subplots-in-maps/
<pre>
* [https://stackoverflow.com/a/20721231 Embedding a subplot]
df <- data.frame(group = c("A", "A", "B", "B", "C", "C"),
      count = c(3, 4, 5, 6, 7, 8),
      fill = c("red", "blue", "red", "blue", "red", "blue"))
ggplot(df, aes(x = group, y = count, fill = fill)) +
      geom_col(position = "dodge")
</pre>
[[File:ggplotbarplot.png|250px]]
 
[https://stats.stackexchange.com/a/3843 Base R approach].
 
=== Barplot with color gradient ===
* [https://stackoverflow.com/a/52026622 horizontal barplot with color gradient from top to bottom of the graphic]
* [https://r-graph-gallery.com/79-levelplot-with-ggplot2.html ggplot2 heatmap]
* [https://ggplot2.tidyverse.org/reference/scale_gradient.html scale_fill_gradient()], [https://ggplot2.tidyverse.org/reference/scale_brewer.html scale_colour_brewer()/scale_fill_distiller()], [https://ggplot2.tidyverse.org/reference/scale_viridis.html scale_fill_viridis()]. To reverse the colors, use the '''direction''' parameter; see [https://statisticsglobe.com/scale-colour-fill-brewer-rcolorbrewer-package-r#example-3-reverse-order-of-color-brewer-palette here].
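A minimal sketch of the '''direction''' parameter on a barplot filled by a continuous variable. The data frame is a hypothetical toy example and "Blues" is an arbitrary palette choice.

```r
library(ggplot2)

# Hypothetical toy data
df <- data.frame(x = letters[1:5], y = c(2, 5, 3, 8, 6))
p <- ggplot(df, aes(x = x, y = y, fill = y)) + geom_col()

# scale_fill_distiller() uses direction = -1 by default
p_default  <- p + scale_fill_distiller(palette = "Blues")
# direction = 1 reverses the mapping between values and colors
p_reversed <- p + scale_fill_distiller(palette = "Blues", direction = 1)
# The viridis scales default to direction = 1; -1 reverses them
p_viridis  <- p + scale_fill_viridis_c(direction = -1)

fill_default  <- layer_data(p_default)$fill
fill_reversed <- layer_data(p_reversed)$fill
```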


== Adding/Inserting an image to ggplot2 ==
[[File:Geomcolviridis.png|300px]]
[https://stackoverflow.com/a/9917684 Inserting an image to ggplot2]: See [[#annotation_custom|annotation_custom]].


See also [https://github.com/R-CoderDotCom/ggbernie/ ggbernie] which uses a different way [https://ggplot2.tidyverse.org/reference/layer.html ggplot2::layer()] and a self-defined geom (geometric object).
=== Barplot with only horizontal gridlines ===
[[File:Geom bar3.png|250px]] [[File:Geom bar4.png|250px]]


= Easy way to mix multiple graphs on the same page =
=== Barplot with text at the end ===
* http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/. '''grid''' package is used.
* [https://r-graph-gallery.com/37-barplot-with-number-of-observation.html Barplot with number of observation]
* [https://cran.r-project.org/web/packages/gridExtra/index.html gridExtra]::grid.arrange() which has lots of reverse imports.
* [https://www.cedricscherer.com/2021/07/05/a-quick-how-to-on-labelling-bar-graphs-in-ggplot2/ A Quick How-to on Labelling Bar Graphs in ggplot2]
** [https://datascienceplus.com/machine-learning-results-one-plot-to-rule-them-all/ Machine Learning Results in R: one plot to rule them all!]
* [https://stackoverflow.com/a/11939678 How to label a barplot bar with positive and negative bars with ggplot2] (Looks good but 2012)
** It is used by the book [https://bioconductor.org/books/release/OSCA/dimensionality-reduction.html#visualizing-with-pca Orchestrating Single-Cell Analysis with Bioconductor] to visualize dimension reduction result among cells from the t-SNE algorithm.
* [https://twitter.com/rappa753/status/1604144466033405953 Splitting a stacked bar plot simple]
* [https://cran.rstudio.com/web/packages/egg/ egg] (ggarrange()): Extensions for 'ggplot2', to Align Plots, Plot insets, and Set Panel Sizes. Same author of gridExtra package. egg depends on gridExtra.
* Examples from publications
** [https://onunicornsandgenes.blog/2019/01/13/showing-a-difference-in-means-between-two-groups/ Showing a difference in means between two groups]
** https://twitter.com/simocristea/status/1603055034081505280/photo/1. Draw a panel of barplots with common labels?
** [https://stackoverflow.com/a/16258375 How can I make consistent-width plots in ggplot (with legends)?]
* [http://www.sthda.com/english/wiki/ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page Easy Way to Mix Multiple Graphs on The Same Page]. Four packages are included: '''ggpubr''' (ggarrange()), '''cowplot''' (plot_grid()), '''gridExtra''' and '''grid'''.
** cowplot can mix ggplot2 and base graphics (require the '''gridGraphics''' package). It can also add 'A', 'B' to each subplot for easy annotation.
** [https://www.rdocumentation.org/packages/cowplot/versions/1.0.0/topics/draw_image draw_image()] from cowplot can embed an image to a plot. See [https://evamaerey.github.io/ggplot2_grammar_guide/ensembles.html#76 this example].
** [https://datascienceplus.com/how-to-combine-multiple-ggplot-plots-to-make-publication-ready-plots/ How to combine Multiple ggplot Plots to make Publication-ready Plots]
** ''Cannot convert object of class ggsurvplotggsurvlist into a grob'' [https://stackoverflow.com/a/58124480 ggpubr::ggarrange is just a wrapper around cowplot::plot_grid()]. This does not solve the problem. Using [https://rpkgs.datanovia.com/survminer/reference/arrange_ggsurvplots.html '''survminer::arrange_ggsurvplots()'''] does work.
** [https://stackoverflow.com/a/58945564 unable to use survfit when called from a function]. Use '''surv_fit()''' instead of survfit() with ggsurvplot() when ggsurvplot() is used within another function.
* [https://cran.r-project.org/web/packages/patchwork/index.html patchwork]. [https://www.rdocumentation.org/packages/patchwork/versions/1.0.0/topics/plot_spacer plot_spacer()] to create an empty plot.
* [http://www.sharpsightlabs.com/blog/master-small-multiple/ Why you should master small multiple chart] (facet_wrap(), facet_grid())
* [https://hadley.shinyapps.io/cran-downloads/ Download statistics] and enter "gridExtra, cowplot, ggpubr, egg, grid" (the number of downloads is in this order).
* [https://stackoverflow.com/a/39009374 how to add common x and y labels to a grid of plots]. Another solution is on the egg package's [https://cran.rstudio.com/web/packages/egg/vignettes/Ecosystem.html vignette].


== annotation_custom ==
[[File:Geom bar1.png|250px]] [[File:Geom bar2.png|250px]]
* https://ggplot2.tidyverse.org/reference/annotation_custom.html
 
* [http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/ ggplot2 - Easy Way to Mix Multiple Graphs on The Same Page]
== Polygon and map plot ==
<ul>
* https://ggplot2.tidyverse.org/reference/geom_polygon.html
<li>[https://github.com/cran/TreatmentSelection/blob/master/R/predcurvePLOT.R#L89-L94 predcurvePlot.R] from TreatmentSelection. One issue is the font size is large for the text & labels at the bottom. The 2nd issue is the bottom part of the graph/annotation (marker value scale) can be truncated if the window size is too large. If the window is too small, the bottom part can overlap with the top part.
* Base R method. ?polygon.
[[File:Polygon.png|200px]]
 
== geom_step: Step function ==
Connect observations: [https://ggplot2.tidyverse.org/reference/geom_path.html geom_path(), geom_step()]
 
Example: KM curves (without legend)
<pre>
library(ggplot2)
library(survival)
sf <- survfit(Surv(time, status) ~ x, data = aml)
sf
str(sf) # the first 10 time points form one stratum and the remaining 10 form the other
ggplot() + 
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10])),
            col='red') +
  scale_x_continuous('Time', limits = c(0, 161)) +
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20])),
            col='black') 
# cf:  plot(sf, col = c('red', 'black'), mark.time=FALSE)
</pre>
<pre>
p <- p + theme(plot.margin = unit(c(1,1,4,1), "lines")) # hard coding
p <- p + annotation_custom() # axis for marker value scale
p <- p + annotation_custom() # label only
</pre>
<ul>
 
<li>[https://github.com/BMBOUP/Optimal_threshold/blob/master/plot_time_dependent_predictiveness.R Similar plot but without using base R graphic]. One issue is the text is not below the scale (this can be fixed by par(mar) & [https://stackoverflow.com/a/25907359 mtext(text, side=1, line=4)]) and the 2nd issue is the same as ggplot2's approach.
Same example but with legend (see [https://stackoverflow.com/a/17149021 Construct a manual legend for a complicated plot])
<pre>
cols <- c("NEW"="#f04546","STD"="#3591d1")
ggplot() +
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10]), col='NEW')) +
  scale_x_continuous('Time', limits = c(0, 161)) +
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20]), col='STD')) +
  scale_colour_manual(name="Treatment", values = cols)
</pre>
<pre>
axis(1,at= breaks, label = round(quantile(x1, prob = breaks/100), 1),pos=-0.26) # hard coding
</pre>
</li>
<li>Another common problem is that a plot saved by pdf() or png() can be truncated too. I have had better luck with png(), though. </li>
</ul>
</ul>


== grid ==
To control the line width, use the '''size''' parameter; e.g. geom_step(aes(x, y), size=.5). The default size is .5 (where to find this info?).
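The line width and line type controls of geom_step() can be sketched on the same KM curves as follows. This is only an illustration: it assumes ggplot2 3.4.0 or later, where the line-width argument is named linewidth (older versions use size), and the data frame km and the two dash patterns are hypothetical choices, not part of the original example.

<syntaxhighlight lang="rsplus">
library(ggplot2)
library(survival)

sf <- survfit(Surv(time, status) ~ x, data = aml)

# One row per step; the stratum sizes are taken from the fit itself
km <- data.frame(time = sf$time, surv = sf$surv,
                 arm  = rep(names(sf$strata), sf$strata))
# Start each curve at (0, 1)
km <- rbind(data.frame(time = 0, surv = 1, arm = names(sf$strata)), km)

p <- ggplot(km, aes(time, surv, linetype = arm)) +
  geom_step(linewidth = 0.8) +                               # thicker than the default
  scale_linetype_manual(values = c("solid", "dotdash")) +    # override default line types
  labs(x = "Time", y = "Survival probability")
p
</syntaxhighlight>

Mapping linetype inside aes() also produces a legend, which the manual two-layer version above does not.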
<ul>
 
<li>Create a gradient image [https://www.rdocumentation.org/packages/grid/versions/3.6.2/topics/grid.raster grid.raster() or rasterGrob()]
To allow different line types, use the '''linetype''' parameter. The first level is solid line, the 2nd level is dashed, ... We can change the default line types by using the '''scale_linetype_manual()''' function. See [https://www.datanovia.com/en/blog/line-types-in-r-the-ultimate-guide-for-r-base-plot-and-ggplot/ Line Types in R: The Ultimate Guide for R Base Plot and GGPLOT].
== Coefficients, intervals, errorbars ==
* [https://stackoverflow.com/a/42560960 Plotting two models with regression coefficients] with [https://ggplot2.tidyverse.org/reference/geom_linerange.html geom_pointrange()] - Vertical intervals: lines, crossbars & errorbars.
* [https://stackoverflow.com/q/49483128 Grouping and staggering estimates with geom_point]
<pre>
library(grid)
redGradient <- matrix(hcl(0, 80, seq(50, 80, 10)), nrow=4, ncol=5)
# interpolated
grid.newpage()
grid.raster(redGradient)
</pre>
</li>
<li>
[https://nandeshwar.info/data-visualization/how-to-create-infographics-in-r/ Recipe for Infographics in R]. See example of using rasterGrob() and annotation_custom() to place more images using a custom function.
</li>
<li>
[https://datascienceplus.com/how-to-add-a-background-image-to-ggplot2-graphs/ How to add a background image to ggplot2 graphs]
</li>
<li>
[https://www.engineeringbigdata.com/how-to-add-a-background-image-in-ggplot2-with-r/ How to Add a Background Image in ggplot2 with R]
</li>
</ul>


== gridExtra ==
== Comparing similarities / differences between groups ==
=== Force a regular plot object into a Grob for use in grid.arrange ===
[https://www.business-science.io/code-tools/2021/02/09/stat-plots-in-R.html comparing similarities / differences between groups]
[https://stackoverflow.com/a/33848995 gridGraphics] package


=== make one panel blank/create a placeholder ===
= Special plots =
https://stackoverflow.com/questions/20552226/make-one-panel-blank-in-ggplot2
* [https://readmedium.com/5-extremely-useful-plots-for-data-scientists-that-you-never-knew-existed-5b92498a878f 5 Extremely Useful Plots For Data Scientists That You Never Knew Existed].
** Chord Diagram
** Sunburst Chart
** Hexbin Plot
** Sankey Diagram
** Stream Graph/ Theme River


= labs for x and y axes =
== Dot plot & forest plot ==
== x and y labels ==
* Wikipedia
https://stackoverflow.com/questions/10438752/adding-x-and-y-axis-labels-in-ggplot2 or the '''Labels''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet]
** https://en.wikipedia.org/wiki/Dot_plot_(statistics),
** https://en.wikipedia.org/wiki/Forest_plot
* [https://s4be.cochrane.org/blog/2016/07/11/tutorial-read-forest-plot/ Tutorial: How to read a forest plot]
* [https://ikashnitsky.github.io/2019/dotplot/ Dotplot – the single most useful yet largely neglected dataviz type]
* [http://sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization ggplot2 dot plot : Quick start guide - R software and data visualization]
* [https://cran.r-project.org/web/packages/forestplot/ forestplot] package
* [https://stackoverflow.com/a/63945806 Forest Plot, ordering and summarizing multiple variables]
* [https://www.statology.org/forest-plot-in-r/ How to Create a Forest Plot in R]. A forest plot (sometimes called a “blobbogram”) is used in a meta-analysis to visualize the results of several studies in one plot.
* [https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/forest.html Doing Meta-Analysis with R: A Hands-On Guide] ebook where the [https://cran.r-project.org/web/packages/meta/index.html meta] package was used.
* [https://rpkgs.datanovia.com/survminer/reference/ggforest.html survminer::ggforest()*]: Draws forest plot for CoxPH model. See [https://www.r-bloggers.com/2017/03/survminer-cheatsheet-to-create-easily-survival-plots/ Survminer Cheatsheet to Create Easily Survival Plots] & [https://www.datacamp.com/community/tutorials/survival-analysis-R#fifth Hazard ratio forest plot: ggforest() from survminer]
** [https://rdrr.io/cran/survivalAnalysis/man/forest_plot.html survivalAnalysis::forest_plot()]. Builds upon the 'survminer' package for Kaplan-Meier plots and provides a customizable implementation for forest plots.
** [https://www.nature.com/articles/s41467-021-26502-6#ref-CR74 Multi-omics analysis identifies therapeutic vulnerabilities in triple-negative breast cancer subtypes] 2021
* [https://cran.r-project.org/web/packages/forestmodel/index.html forestmodel*]: Forest Plots from Regression Models. [https://stackoverflow.com/a/58350503 ggforest (survminer) only selected covariates]
* [https://github.com/adayim/forestploter forestploter]
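A basic forest plot can also be drawn with plain ggplot2. The sketch below is not taken from any of the packages listed above; the hazard ratios and confidence limits are made-up numbers, and it assumes ggplot2 3.3.0 or later so that geom_pointrange() accepts xmin/xmax for horizontal intervals.

<syntaxhighlight lang="rsplus">
library(ggplot2)

# Hypothetical hazard ratios with 95% confidence limits
est <- data.frame(
  term  = c("age", "sex", "stage II", "stage III"),
  hr    = c(1.10, 0.75, 1.60, 2.40),
  lower = c(0.95, 0.55, 1.05, 1.50),
  upper = c(1.27, 1.02, 2.44, 3.84)
)
# Keep the table order from top to bottom instead of alphabetical order
est$term <- factor(est$term, levels = rev(est$term))

p <- ggplot(est, aes(x = hr, y = term)) +
  geom_vline(xintercept = 1, linetype = "dashed") +   # line of no effect
  geom_pointrange(aes(xmin = lower, xmax = upper)) +  # estimate and interval
  scale_x_log10() +                                   # ratios are symmetric on the log scale
  labs(x = "Hazard ratio (95% CI, log scale)", y = NULL)
p
</syntaxhighlight>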


You can set the labels with xlab() and ylab(), or make them part of the scale_*_*() call.
<pre>
labs(x = "sample size", y = "ngenes (glmnet)")

scale_x_discrete(name="sample size")
scale_y_continuous(name="ngenes (glmnet)", limits=c(100, 500))
</pre>

== Change tick mark labels ==
[http://www.sthda.com/english/wiki/ggplot2-axis-ticks-a-guide-to-customize-tick-marks-and-labels ggplot2 axis ticks : A guide to customize tick marks and labels]

== Lollipop plot ==
'''geom_segment()''' + '''geom_point()'''

<ul>
<li>[https://www.data-to-viz.com/graph/lollipop.html A lollipop plot is basically a barplot, where the bar is transformed in a line and a dot.]</li>
<li>[https://r-charts.com/ranking/lollipop-chart-ggplot2/ r-charts.com/] </li>
<li>[https://www.r-graph-gallery.com/lollipop-plot.html r-graph-gallery.com], [https://www.r-graph-gallery.com/300-basic-lollipop-plot.html Most basic lollipop plot], [https://www.r-graph-gallery.com/302-lollipop-chart-with-conditional-color.html Lollipop chart with conditional color]
<syntaxhighlight lang="rsplus">
library(ggplot2)

# Create data
data <- data.frame(
  x=LETTERS[1:26],
  y=abs(rnorm(26))
)

# Horizontal version
ggplot(data, aes(x=x, y=y)) +
  geom_segment( aes(x=x, xend=x, y=0, yend=y), color="skyblue") +
  geom_point( color="blue", size=4, alpha=0.6) +
  theme_light() +
  coord_flip() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.border = element_blank(),
    axis.ticks.y = element_blank()
  )
</syntaxhighlight>
Note that if the ''color'' aesthetic is mapped in geom_segment(), the legend key becomes a solid circle with a line through it, which looks odd. So it is better not to use multiple colors for the segment part of a lollipop plot.
</li>
<li>[https://hutsons-hacks.info/diverging-dot-plot-and-lollipop-charts-plotting-variance-with-ggplot2 Diverging Dot Plot and Lollipop Charts – Plotting Variance with ggplot2] </li>
<li>[https://datavizpyr.com/lollipop-plot-in-r-with-ggplot2/ How To Make Lollipop Plot in R with ggplot2?] </li>
<li>[https://www.statology.org/lollipop-chart-r/ Color annotation] </li>
<li>[http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html Top 50 ggplot2 Visualizations - The Master List (With Full R Code)] from r-statistics.co </li>
</ul>


== name-value pairs ==
'''ggpubr:: ggdotchart()'''
See several examples (color, fill, size, ...) from [https://juliasilge.com/blog/texas-opioids/ opioid prescribing habits in texas].
* [http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/ Dot charts, Lollipop chart]


= Prevent sorting of x labels =
See [https://stackoverflow.com/a/3255448 Change the order of a discrete x scale].

The idea is to set the levels of the x variable.

<pre>
junk  # n x 2 table
colnames(junk) <- c("gset", "boot")
junk$gset <- factor(junk$gset, levels = as.character(junk$gset))

ggplot(data = junk, aes(x = gset, y = boot, group = 1)) +
  geom_line() +
  theme(axis.text.x=element_text(color = "black", angle=30, vjust=.8, hjust=0.8))
</pre>

== Correlation Analysis Different ==
* [https://github.com/r-link/corrmorant corrmorant]: Flexible Correlation Matrices Based on ggplot2
* [https://finnstats.com/index.php/2021/05/13/correlation-analysis-plot/ Correlation Analysis Different Types of Plots in R]

== Bump plot: plot ranking over time ==
https://github.com/davidsjoberg/ggbump

== Gauge plots ==
* [https://pomvlad.blog/2018/05/03/gauges-ggplot2/ Generating gauge plots in ggplot2]
* [https://www.stomperusa.com/2020/10/18/multiple-gauge-plots-with-facet-wrap/ Multiple Gauge Plots with Facet Wrap]

== Sankey diagrams ==
* [https://en.wikipedia.org/wiki/Sankey_diagram Wikipedia]
* [https://www.r-graph-gallery.com/sankey-diagram.html Some examples] by the [https://cran.r-project.org/web/packages/networkD3/index.html networkD3] package
 
== Horizon chart ==
* [https://statisticaloddsandends.wordpress.com/2022/03/31/what-is-a-horizon-chart/ What is a horizon chart?]
* [https://chenyuzuoo.github.io/posts/7349/ How to Draw a Horizon Chart with R]
 
== Circos plots ==
* [https://cloud.r-project.org/web/packages/circlize/index.html circlize] (does not depend on ggplot2)
* [[NGS#Circos_Plot|NGS -> Circos plot]]
* [https://r-charts.com/flow/chord-diagram/ Chord diagram in R with circlize]
* [https://www.royfrancis.com/beautiful-circos-plots-in-r/ Beautiful circos plots in R]
* [https://r-graph-gallery.com/224-basic-circular-plot.html Introduction to the circlize package]
* [https://r-graph-gallery.com/122-a-circular-plot-with-the-circlize-package.html Chord diagram from adjacency matrix]
* [https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html ComplexHeatmap] imports it.


= Aesthetics =
* https://ggplot2.tidyverse.org/reference/aes.html
* https://ggplot2.tidyverse.org/articles/ggplot2-specs.html
<ul>
<li>We can create a new aesthetic name in the '''aes(aesthetic = variable)''' call; for example, "text2" below. In this case the name "text2" will not be shown; only the original variable will be used.
<pre>
library(plotly)
g <- ggplot(tail(iris), aes(Petal.Length, Sepal.Length, text2=Species)) + geom_point()
ggplotly(g, tooltip = c("Petal.Length", "text2"))
</pre>
</li>
</ul>

= Legends =
== Legend title ==
<ul>
<li>[https://ggplot2-book.org/scales.html#scale-title labs() function]
<pre>
p <- ggplot(df, aes(x, y)) + geom_point(aes(colour = z))
p + labs(x = "X axis", y = "Y axis", colour = "Colour\nlegend")
</pre>
</li>
<li>scale_colour_manual()
<pre>
scale_colour_manual("Treatment", values = c("black", "red"))
</pre>
</li>
<li>scale_color_discrete() and scale_shape_discrete(). See [[#Combine_colors_and_shapes_in_legend|Combine colors and shapes in legend]].
<pre>
df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=5) +
  scale_color_discrete('new title') + scale_shape_discrete('new title')
</pre>
</li>
</ul>


== Layout: move the legend from right to top/bottom of the plot or hide it ==
<pre>
gg + theme(legend.position = "top")

# Useful in the boxplot case
gg + theme(legend.position="none")
</pre>

== Aesthetics finder ==
https://ggplot2tor.com/aesthetics/, [https://twitter.com/ChBurkhart/status/1650523994548731911?s=20 video]

== aes_string() ==
* [https://ggplot2.tidyverse.org/reference/aes_.html aes_()]. Define aesthetic mappings programmatically.
* [https://www.tutorialspoint.com/how-to-create-a-boxplot-using-ggplot2-with-aes-string-in-r How to create a boxplot using ggplot2 with aes_string in R?]
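aes_string() and aes_() have been soft-deprecated since ggplot2 3.0.0. A possible alternative for programmatic mappings is the '''.data''' pronoun inside aes(). The sketch below assumes ggplot2 3.0.0 or later; the helper name box_by() is hypothetical and only serves the illustration.

<syntaxhighlight lang="rsplus">
library(ggplot2)

# Hypothetical helper: column names arrive as character strings
box_by <- function(data, xvar, yvar) {
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_boxplot() +
    labs(x = xvar, y = yvar)   # otherwise the axis titles show the .data[[...]] expression
}

p <- box_by(iris, "Species", "Sepal.Length")
p
</syntaxhighlight>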


== Guide functions for finer control ==
== group ==
https://ggplot2-book.org/scales.html#guide-functions The guide functions, guide_colourbar() and guide_legend(), offer additional control over the fine details of the legend.
https://ggplot2.tidyverse.org/reference/aes_group_order.html


[https://ggplot2.tidyverse.org/reference/guide_legend.html guide_legend()] allows the modification of legends for scales, including fill, color, and shape.
* It seems the group parameter in aes() is used for coloring of lines. See [https://stackoverflow.com/a/43770608 How to change the color in geom_point or lines in ggplot].
* [https://plotly.com/ggplot2/geom_line/ geom_line in ggplot2].
* [https://stackoverflow.com/a/26195631 ggplot2 manually specifying colour with geom_line]
* [http://www.sthda.com/english/wiki/ggplot2-line-types-how-to-change-line-types-of-a-graph-in-r-software ggplot2 line types : How to change line types of a graph in R software?]
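As a small illustration of what ''group'' does: it decides which observations are joined into one line, while the colour is mapped separately. The data frame below is made up for this sketch.

<syntaxhighlight lang="rsplus">
library(ggplot2)

# Hypothetical long-format data: 3 subjects measured at 4 visits
d <- data.frame(
  id    = rep(c("a", "b", "c"), each = 4),
  visit = rep(1:4, times = 3),
  value = c(1, 2, 3, 4,  2, 2, 3, 5,  4, 3, 2, 1),
  arm   = rep(c("trt", "trt", "ctl"), each = 4)
)

# group = id gives one line per subject; colour = arm gives two colours
p <- ggplot(d, aes(visit, value, group = id, colour = arm)) +
  geom_line()
p
</syntaxhighlight>

Without group = id, subjects "a" and "b" share the same arm and would be connected into a single zigzag line.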


This function can be used in scale_fill_manual(), scale_fill_continuous(), ... functions.
= GUI/Helper packages =
== ggedit & ggplotgui – interactive ggplot aesthetic and theme editor ==
* https://www.r-statistics.com/2016/11/ggedit-interactive-ggplot-aesthetic-and-theme-editor/
* https://github.com/gertstulp/ggplotgui/. It allows changing the text (axis, title, font size), themes, legend, etc. A docker website was set up for the online version.
 
== esquisse (French, means 'sketch'): creating ggplot2 interactively ==
https://cran.rstudio.com/web/packages/esquisse/index.html


A 'shiny' gadget to create 'ggplot2' charts interactively with drag-and-drop to map your variables. You can quickly visualize your data accordingly to their type, export to 'PNG' or 'PowerPoint', and retrieve the code to reproduce the chart.
<pre>
scale_fill_manual(values=c("orange", "blue"),
                  guide=guide_legend(title = "My Legend Title",
                                    nrow=1,  # multiple items in one row
                                    label.position = "top", # move the texts on top of the color key
                                    keywidth=2.5)) # increase the color key width
</pre>
The problem with the default setting is it leaves a lot of white space above and below the legend.
To change the position of the entire legend to the bottom of the plot, we use theme().
<pre>
theme(legend.position = 'bottom')
</pre>
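Putting the two pieces together, a self-contained sketch of a one-row legend at the bottom of a bar chart could look like this. The mtcars data, the legend title and the labels are illustrative choices (in mtcars, am = 0 is automatic and 1 is manual).

<syntaxhighlight lang="rsplus">
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), fill = factor(am))) +
  geom_bar(position = "dodge") +
  scale_fill_manual(values = c("orange", "blue"),
                    labels = c("automatic", "manual"),
                    guide  = guide_legend(title = "Transmission",
                                          nrow = 1)) +   # all keys in one row
  theme(legend.position = "bottom")                      # move the whole legend
p
</syntaxhighlight>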


== Legend symbol background ==
<pre>
ggplot() + geom_point(aes(x, y, color, size)) +
          theme(legend.key = element_blank())
          # remove the symbol background in legend
</pre>

The interface introduces basic terms used in ggplot2:
* x, y,
* fill (useful for geom_bar, geom_rect, geom_boxplot, & geom_raster, not useful for scatterplot),
* color (edges for geom_bar, geom_line, geom_point), 
* size,
* [http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/ facet], split up your data by one or more variables and plot the subsets of data together.


== Construct a manual legend for a complicated plot ==
It does not include all features in ggplot2. At the bottom of the interface,
https://stackoverflow.com/a/17149021
* Labels & title & caption.
* Plot options. Palette, theme, legend position.
* Data. Remove subset of data.
* Export & code. Copy/save the R code. Export file as PNG or PowerPoint.


== Legend size ==
== ggcharts ==
[https://www.statology.org/ggplot2-legend-size/ How to Change Legend Size in ggplot2 (With Examples)]
https://cran.r-project.org/web/packages/ggcharts/index.html


= ggtitle() =
== ggeasy ==
== Centered title ==
* [https://cran.r-project.org/web/packages/ggeasy/index.html ggeasy]
See the '''Legends''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet].
* [https://youtu.be/-2ZvQQ583pI How to simplify ggplot2 with ggeasy]
<pre>
ggtitle("MY TITLE") +
  theme(plot.title = element_text(hjust = 0.5))
</pre>


=== Subtitle ===
<pre>
ggtitle("My title",
        subtitle = "My subtitle")
</pre>

== ggx ==
https://github.com/brandmaier/ggx Create ggplot in natural language


= margins =
= Interactive =
https://stackoverflow.com/a/10840417
== plotly ==
[[R_web#plotly|R web &rarr; plotly]]


= Aspect ratio =
== ggiraph ==
?coord_fixed
[https://cran.r-project.org/web/packages/ggiraph/index.html ggiraph]: Make 'ggplot2' Graphics Interactive
<pre>
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + coord_fixed() # plot is compressed horizontally
p  # fill up plot region
</pre>


= Time series plot =
= ggconf: Simpler Appearance Modification of 'ggplot2' =
* [http://sharpsightlabs.com/blog/line-chart-ggplot2-amzn/ How to make a line chart with ggplot2]
https://github.com/caprice-j/ggconf
* [http://ggplot2.tidyverse.org/reference/scale_brewer.html#palettes Colour palettes]. Note some palette options like ''Accent'' from the Qualitative category will give a warning message In RColorBrewer::brewer.pal(n, pal) :  n too large, allowed maximum for palette Accent is 8.


Multiple lines plot https://stackoverflow.com/questions/14860078/plot-multiple-lines-data-series-each-with-unique-color-in-r
{{Pre}}
library(ggplot2)
set.seed(45)
nc <- 9
df <- data.frame(x=rep(1:5, nc), val=sample(1:100, 5*nc),
                  variable=rep(paste0("category", 1:nc), each=5))
# plot
# http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=9
ggplot(data = df, aes(x=x, y=val)) +
    geom_line(aes(colour=variable)) +
    scale_colour_manual(values=c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6"))
</pre>
Versus the old-fashioned way
<syntaxhighlight lang='rsplus'>
dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend
</syntaxhighlight>

= Plotting individual observations and group means =
https://drsimonj.svbtle.com/plotting-individual-observations-and-group-means-with-ggplot2


= calendR =
= subplot =
[https://r-coder.com/calendar-plot-r/ Calendar plot in R using ggplot2]
* https://ikashnitsky.github.io/2017/subplots-in-maps/
* [https://stackoverflow.com/a/20721231 Embedding a subplot]


= Github style calendar plot =
== Adding/Inserting an image to ggplot2 ==
* https://mvuorre.github.io/post/2016/2016-03-24-github-waffle-plot/
[https://stackoverflow.com/a/9917684 Inserting an image to ggplot2]: See [[#annotation_custom|annotation_custom]].
* https://gist.github.com/marcusvolz/84d69befef8b912a3781478836db9a75 from [https://github.com/marcusvolz/strava Create artistic visualisations with your exercise data]


= geom_point() =
See also [https://github.com/R-CoderDotCom/ggbernie/ ggbernie] which uses a different way [https://ggplot2.tidyverse.org/reference/layer.html ggplot2::layer()] and a self-defined geom (geometric object).
<pre>
df <- data.frame(x=1:3, y=1:3, color=c("red", "green", "blue"))
# Use I() to set aes values to the identity of a value from your data table
ggplot(df, aes(x,y, color=I(color))) + geom_point(size=10)
# VS
ggplot(df, aes(x,y, color=color)) + geom_point(size=10) # color is like a class label
</pre>


= geom_bar(), geom_col(), stat_count() =
https://ggplot2.tidyverse.org/reference/geom_bar.html

<pre>
geom_col(position = 'dodge') # same as
geom_bar(stat = 'identity', position = 'dodge')
</pre>

= Easy way to mix/combine multiple graphs on the same page =
* http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/. '''grid''' package is used.
* [https://cran.r-project.org/web/packages/gridExtra/index.html gridExtra]::grid.arrange() which has lots of reverse imports.
** [https://datascienceplus.com/machine-learning-results-one-plot-to-rule-them-all/ Machine Learning Results in R: one plot to rule them all!]
** It is used by the book [https://bioconductor.org/books/release/OSCA/dimensionality-reduction.html#visualizing-with-pca Orchestrating Single-Cell Analysis with Bioconductor] to visualize dimension reduction result among cells from the t-SNE algorithm.
* [http://www.sthda.com/english/wiki/ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page Easy Way to Mix Multiple Graphs on The Same Page]. Four packages are included: '''ggpubr''' (ggarrange()), '''cowplot''' (plot_grid()), '''gridExtra''' and '''grid'''.
** cowplot can mix ggplot2 and base graphics (require the '''gridGraphics''' package). It can also add 'A', 'B' to each subplot for easy annotation.
** [https://www.rdocumentation.org/packages/cowplot/versions/1.0.0/topics/draw_image draw_image()] from cowplot can embed an image to a plot. See [https://evamaerey.github.io/ggplot2_grammar_guide/ensembles.html#76 this example].
** [https://datascienceplus.com/how-to-combine-multiple-ggplot-plots-to-make-publication-ready-plots/ How to combine Multiple ggplot Plots to make Publication-ready Plots]
** ''Cannot convert object of class ggsurvplotggsurvlist into a grob'' [https://stackoverflow.com/a/58124480 ggpubr::ggarrange is just a wrapper around cowplot::plot_grid()]. This does not solve the problem. Using [https://rpkgs.datanovia.com/survminer/reference/arrange_ggsurvplots.html '''survminer::arrange_ggsurvplots()'''] does work.
** [https://stackoverflow.com/a/58945564 unable to use survfit when called from a function]. Use '''surv_fit()''' instead of survfit() with ggsurvplot() when ggsurvplot() is used within another function.
* [https://cran.r-project.org/web/packages/patchwork/index.html patchwork]. [https://www.rdocumentation.org/packages/patchwork/versions/1.0.0/topics/plot_spacer plot_spacer()] to create an empty plot.
* [http://www.sharpsightlabs.com/blog/master-small-multiple/ Why you should master small multiple chart] (facet_wrap(), facet_grid())
* [https://hadley.shinyapps.io/cran-downloads/ Download statistics] and enter "gridExtra, cowplot, ggpubr, egg, grid" (the number of downloads is in this order).


With its default stat (count), geom_bar() does not take a y aesthetic; the bar heights are the numbers of rows. To specify the y-axis values directly, use geom_col().
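A short comparison of geom_bar() and geom_col(), using mtcars as a stand-in data set. The object names p_bar, p_col and cyl_n are only for this sketch.

<syntaxhighlight lang="rsplus">
library(ggplot2)

# geom_bar(): one row per observation; the bar heights are the counts per group
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()

# geom_col(): the heights are supplied directly, so summarise first
cyl_n <- as.data.frame(table(cyl = mtcars$cyl))   # columns: cyl, Freq
p_col <- ggplot(cyl_n, aes(x = cyl, y = Freq)) + geom_col()

# Both plots contain the same bars
layer_data(p_bar)$y
layer_data(p_col)$y
</syntaxhighlight>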
== annotation_custom ==
* https://ggplot2.tidyverse.org/reference/annotation_custom.html
* [http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/ ggplot2 - Easy Way to Mix Multiple Graphs on The Same Page]
<ul>
<li>[https://github.com/cran/TreatmentSelection/blob/master/R/predcurvePLOT.R#L89-L94 predcurvePlot.R] from TreatmentSelection. One issue is the font size is large for the text & labels at the bottom. The 2nd issue is the bottom part of the graph/annotation (marker value scale) can be truncated if the window size is too large. If the window is too small, the bottom part can overlap with the top part.
<pre>
p <- p + theme(plot.margin = unit(c(1,1,4,1), "lines"))  # hard coding
p <- p + annotation_custom() # axis for marker value scale
p <- p + annotation_custom() # label only
</pre>
<pre>
ggplot() + geom_col(mapping = aes(x, y))
</pre>
<ul>
<li>[https://github.com/BMBOUP/Optimal_threshold/blob/master/plot_time_dependent_predictiveness.R Similar plot but without using base R graphic]. One issue is the text is not below the scale (this can be fixed by par(mar) & [https://stackoverflow.com/a/25907359 mtext(text, side=1, line=4)]) and the 2nd issue is the same as ggplot2's approach.
<pre>
axis(1,at= breaks, label = round(quantile(x1, prob = breaks/100), 1),pos=-0.26) # hard coding
</pre>
</li>
<li>Another common problem is that a plot saved by pdf() or png() can be truncated too. I have had better luck with png(), though. </li>
</ul>
</ul>


== Add numbers to the plot ==
[https://www.infoworld.com/article/3410295/how-to-write-your-own-ggplot2-functions-in-r.html An example]

== Ordered barplot and reorder() ==
[[#Ordered_barplot_and_facet|Ordered barplot and facet]]

= stat_function() =
* [https://ggplot2.tidyverse.org/reference/stat_function.html stat_function()]
* [http://skranz.github.io//r/2020/11/11/CovidVaccineBayesian.html A look at biontech/pfizer's bayesian analysis of their covid-19 vaccine trial]

= geom_area() =
[http://blog.fellstat.com/?p=440 The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think]

= geom_segment() =
[https://ggplot2.tidyverse.org/reference/geom_segment.html Line segments, arrows and curves]

Cf annotate("segment", ...)

= Square shaped plot =
<pre>
ggplot() + theme(aspect.ratio=1) # do not adjust xlim, ylim

xylim <- range(c(x, y))
ggplot() + coord_fixed(xlim=xylim, ylim=xylim) 
</pre>

== grid ==
<ul>
<li>Create a gradient image [https://www.rdocumentation.org/packages/grid/versions/3.6.2/topics/grid.raster grid.raster() or rasterGrob()]
<pre>
library(grid)
redGradient <- matrix(hcl(0, 80, seq(50, 80, 10)), nrow=4, ncol=5)
# interpolated
grid.newpage()
grid.raster(redGradient)
</pre>
</li>
<li>
[https://nandeshwar.info/data-visualization/how-to-create-infographics-in-r/ Recipe for Infographics in R]. See example of using rasterGrob() and annotation_custom() to place more images using a custom function.
</li>
<li>
[https://datascienceplus.com/how-to-add-a-background-image-to-ggplot2-graphs/ How to add a background image to ggplot2 graphs]
</li>
<li>
[https://www.engineeringbigdata.com/how-to-add-a-background-image-in-ggplot2-with-r/ How to Add a Background Image in ggplot2 with R]
</li>
</ul>


= geom_line() =
== gridExtra ==
See also [[#group|aes(..., group, ...)]].
=== Force a regular plot object into a Grob for use in grid.arrange ===
[https://stackoverflow.com/a/33848995 gridGraphics] package


== Connect Paired Points with Lines in Scatterplot ==
=== make one panel blank/create a placeholder ===
* [https://datavizpyr.com/connect-paired-points-with-lines-in-scatterplot-in-ggplot2/ Connect Paired Points with Lines in Scatterplot in ggplot2?] '''geom_line(aes(group = patient))''', where the 'patient' variable takes the same value for the two observations of each patient; e.g. patient=0,0,1,1,2,2,3,3.
* https://stackoverflow.com/questions/20552226/make-one-panel-blank-in-ggplot2
* [https://www.geeksforgeeks.org/how-to-connect-paired-points-with-lines-in-scatterplot-in-ggplot2-in-r/ How to Connect Paired Points with Lines in Scatterplot in ggplot2 in R?]
* [https://patchwork.data-imaginist.com/reference/plot_spacer.html patchwork::plot_spacer()]
* [https://stackoverflow.com/a/55722553 Can I create an empty ggplot2 plot in R?]
<pre>
# Method 1: Blank
ggplot() + theme_void()
# Method 2: Display N/A
ggplot() +
    theme_void() +
    geom_text(aes(0,0,label='N/A'))
</pre>
 
=== Overall title ===
[https://stackoverflow.com/a/12722422 multiple ggplots overall title]
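With gridExtra, one way to get an overall title is the '''top''' argument of arrangeGrob() / grid.arrange(). The sketch below uses mtcars and a made-up title; it is one possible approach, not necessarily the one in the linked answer.

<syntaxhighlight lang="rsplus">
library(ggplot2)
library(gridExtra)

p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

# arrangeGrob() builds the layout without drawing it; top adds the shared title
g <- arrangeGrob(p1, p2, ncol = 2, top = "Overall title")

grid::grid.newpage()
grid::grid.draw(g)   # grid.arrange(p1, p2, ncol = 2, top = "Overall title") draws directly
</syntaxhighlight>

The same functions also accept '''bottom''', '''left''' and '''right''', which can serve as common axis labels.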


== Use geom_line() to create a square bracket to annotate the plot ==
=== Remove vertical/horizontal grids but keep ticks ===
[https://ggplot2tutor.com/simple_barchart_with_p_values/barchart_simple/ Barchart with Significance Tests]
[https://rdrr.io/cran/ggExtra/man/removeGrid.html removeGrid()]


= geom_errorbar(): error bars =
* Can ggplot2 do this? https://www.nature.com/articles/nature25173/figures/1
* [https://stackoverflow.com/questions/14069629/plotting-confidence-intervals plotCI() from the plotrix package or geom_errorbar() from ggplot2 package]
* http://sape.inf.usi.ch/quick-reference/ggplot2/geom_errorbar
* [http://ggplot2.tidyverse.org/reference/geom_linerange.html Vertical error bars]
* [http://ggplot2.tidyverse.org/reference/geom_errorbarh.html Horizontal error bars]
* [http://timelyportfolio.blogspot.com/2012/08/horizon-on-ggplot2.html Horizontal panel plot] example and [http://timelyportfolio.blogspot.com/2012/08/plotxts-with-moving-average-panel.html more]
* [https://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars R does not draw error bars out of the box]. Base R can draw them with arrows(); e.g. arrows(x0, y0, x1, y1, code=3, angle=90, length=.05, col). See
** [https://datascienceplus.com/building-barplots-with-error-bars/ Building Barplots with Error Bars]. Note that the segments() statement is not necessary.
** https://www.rdocumentation.org/packages/graphics/versions/3.4.3/topics/arrows
* Toy example (see this [https://www.nature.com/articles/nature25173/figures/1 nature paper])
<syntaxhighlight lang='rsplus'>
set.seed(301)
x <- rnorm(10)
SE <- rnorm(10)
y <- 1:10

par(mfrow=c(2,1))
par(mar=c(0,4,4,4))
xlim <- c(-4, 4)
plot(x[1:5], 1:5, xlim=xlim, ylim=c(0+.1,6-.1), yaxs="i", xaxt = "n", ylab = "", pch = 16, las=1)
mtext("group 1", 4, las = 1, adj = 0, line = 1) # las=text rotation, adj=alignment, line=spacing
par(mar=c(5,4,0,4))
plot(x[6:10], 6:10, xlim=xlim, ylim=c(5+.1,11-.1), yaxs="i", ylab ="", pch = 16, las=1, xlab="")
arrows(x[6:10]-SE[6:10], 6:10, x[6:10]+SE[6:10], 6:10, code=3, angle=90, length=0)
mtext("group 2", 4, las = 1, adj = 0, line = 1)
</syntaxhighlight>

[[File:Stklnpt.svg|350px]]

== patchwork ==
* [https://datavizpyr.com/combine-multiple-plots-using-patchwork-in-r/ How to Combine Multiple ggplot2 Plots? Use Patchwork]
* [https://onezero.blog/combining-multiple-ggplot2-plots-for-scientific-publications/ Combining Multiple ggplot2 Plots for Scientific Publications]

=== Common legend ===
[https://stackoverflow.com/a/59324590 Add a common Legend for combined ggplots]
<pre>
library(ggplot2)
library(patchwork)

p1 <- ggplot(df1, aes(x = x, y = y, colour = group)) +
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)
p2 <- ggplot(df2, aes(x = x, y = y, colour = group)) +
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)

# Method 1: use & so that the theme applies to every plot in the patchwork
p1 + p2 + plot_layout(guides = "collect") & theme(legend.position = "bottom")
                                          # one legend on the bottom
# Method 2:
p1 + p2 + plot_layout(guides = "collect") # one legend on the RHS
# Method 3:
p1 + theme(legend.position="none") + p2   # legend (based on p2) is on the RHS
# Method 4:
p1 + p2 + theme(legend.position="none")   # legend (based on p1) is in the middle!!
</pre>

=== Overall title ===

[https://statisticsglobe.com/common-main-title-for-multiple-plots-in-r Common Main Title for Multiple Plots in Base R & ggplot2 (2 Examples)]
 
 
== egg ==
= geom_rect(), geom_bar() =
* [https://cran.rstudio.com/web/packages/egg/ egg] (ggarrange()): Extensions for 'ggplot2', to Align Plots, Plot insets, and Set Panel Sizes. Same author of gridExtra package. egg depends on gridExtra.
* https://ggplot2.tidyverse.org/reference/geom_tile.html
** [https://onunicornsandgenes.blog/2019/01/13/showing-a-difference-in-means-between-two-groups/ Showing a difference in means between two groups]
* https://plotly.com/ggplot2/geom_rect/, https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html
** [https://stackoverflow.com/a/16258375 How can I make consistent-width plots in ggplot (with legends)?]
 
 
Note that we can use '''scale_fill_manual'''() to change the 'fill' colors (scheme/palette). The 'fill' parameter in geom_rect() is only used to define the discrete variable.
=== Common x or y labels ===
 
* [https://stackoverflow.com/a/39009374 how to add common x and y labels to a grid of plots]. Another solution is on the egg package's [https://cran.rstudio.com/web/packages/egg/vignettes/Ecosystem.html vignette].
<pre>
 
ggplot(data=) +
= Base R plot vs ggplot2 =
  geom_bar(aes(x=, fill=)) +
* My summary
  scale_fill_manual(values = c("orange", "blue"))
:{| class="wikitable"  
</pre>
|- style="background-color:#ffffc7;"
 
! base-R
== geom_raster() and geom_tile() ==
! ggplot2
* [https://ggplot2.tidyverse.org/reference/geom_tile.html Rectangles]. This is useful for creating heatmaps; .e.g [https://github.com/satijalab/seurat/blob/master/R/visualization.R#L7445 DoHeatmap()] & [https://satijalab.org/seurat/reference/doheatmap an example] in Seurat.
|-
* [https://jacobsimmering.com/post/wordle/ Wordle Words and Expected Value]
| plot(x, y, col)
 
| geom_point(aes(x, y, color, shape))
= geom_linerange =
|-
* https://ggplot2.tidyverse.org/reference/geom_linerange.html
| xlim
* [https://onunicornsandgenes.blog/2021/07/25/a-plot-of-genes-on-chromosomes/ A plot of genes on chromosomes]. Since ggplot() is inside a function, we need to add ''print()'' in order to show the plot.  
| scale_x_continuous(limits)
** See also [https://bioconductor.org/packages/release/bioc/vignettes/biomaRt/inst/doc/accessing_ensembl.html#given-the-human-gene-tp53-retrieve-the-human-chromosomal-location-of-this-gene-and-also-retrieve-the-chromosomal-location-and-refseq-id-of-its-homolog-in-mouse. Given the human gene TP53, retrieve the human chromosomal location of this gene and also retrieve the chromosomal location and RefSeq id of its homolog in mouse] from the biomaRt package's vignette.
|-
** [https://stackoverflow.com/a/45928905 Get gene location from gene symbol and ID]
| log="x"
** [https://medium.com/intothegenomics/annotate-genes-and-genomic-coordinates-ecdad47d0c8e Genomic coordinates to gene lists and vice versa — Annotating gene coordinates and gene lists]
| scale_x_continuous(trans="log10")
** [https://stackoverflow.com/a/52252962 Genomic coordinates of HGNC gene names] where '''org.Hs.eg.db''' and '''TxDb.Hsapiens.UCSC.hg19.knownGene''' are used
|-
** [https://seandavi.github.io/ITR/transcriptdb.html TxDb: Genes, Transcripts, and Genomic Locations] which uses a gtf file and the '''GenomicFeatures''' package
| xlab<br />mtext("Var", cex, line, adj, las, side)
 
| scale_x_discrete(name="sample size")<br />labs(x)<br />xlab()
= Circle =
|-
[https://community.rstudio.com/t/circle-in-ggplot2/8543 Circle in ggplot2] '''ggplot(data.frame(x = 0, y = 0), aes(x, y)) + geom_point(size = 25, pch = 1)'''
| main
| labs(x, y, title, colour)<br />ggtitle()
|-
| axis(2, labels)
| scale_y_continuous(labels, breaks)<br />scale_x_discrete(labels)
|-
| ?
| scale_color_discrete('new color title')
|-
| ?
| scale_shape_discrete('new shape title')
|-
| col
| scale_color_manual(name, <br />  values = NamedVector)
|-
| pch, cex
| geom_point(pch, size)
|-
| plot(mpg, disp, col=factor(cyl))<br />legend("topleft", <br />    legend = sort(unique(cyl)), <br />   col=1:3, pch=1)<br /># discrete case
| ggplot(mtcars, <br />    aes(mpg, disp, color = factor(cyl))) +<br />   geom_point() +<br />    labs(color = "Number of Cylinders")
|-
| text()
| geom_text()
|-
| ?
| theme(title = element_text(size=8),<br />  legend.title = element_blank(),<br />  legend.position = "none", <br />  legend.key = element_blank(),<br />  plot.title = element_text(hjust = 0.5),<br />  plot.subtitle = element_text(size = 8))
|-
| las in plot(), barplot()<br />text(x, y, labs, srt=45)
| theme(axis.text.x = element_text(angle = 90))
|-
| matplot()
| geom_line() + geom_point()
|-
| plot(type = 'l'), points()
| geom_line() + geom_point()
|-
| barplot()
| geom_bar()
|-
| par(mfrow)
| facet_grid()
|}
 
* [https://flowingdata.com/2016/03/22/comparing-ggplot2-and-r-base-graphics/ Comparing ggplot2 and R Base Graphics]
 
= labs for x and y axes =
== x and y labels ==
https://stackoverflow.com/questions/10438752/adding-x-and-y-axis-labels-in-ggplot2 or the '''Labels''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet]
 
You can set the labels with labs(), or with xlab() and ylab(), or pass them as the ''name'' argument of a scale_*_*() call.
 
<pre>
labs(x = "sample size", y = "ngenes (glmnet)")
 
scale_x_discrete(name="sample size")
scale_y_continuous(name="ngenes (glmnet)", limits=c(100, 500))
</pre>
 
== Change tick mark labels ==
[http://www.sthda.com/english/wiki/ggplot2-axis-ticks-a-guide-to-customize-tick-marks-and-labels ggplot2 axis ticks : A guide to customize tick marks and labels]
 
== name-value pairs ==
See several examples (color, fill, size, ...) from [https://juliasilge.com/blog/texas-opioids/ opioid prescribing habits in texas].
 
= Prevent sorting of x labels =
See [https://stackoverflow.com/a/3255448 Change the order of a discrete x scale].
 
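The idea is to convert the x variable to a factor whose levels follow the order of the rows. ggplot2 orders a discrete axis by factor levels, not by the order in which the values appear in the data.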
 
<pre>
junk  # n x 2 table
colnames(junk) <- c("gset", "boot")
junk$gset <- factor(junk$gset, levels = as.character(junk$gset))
ggplot(data = junk, aes(x = gset, y = boot, group = 1)) +
  geom_line() +
  theme(axis.text.x=element_text(color = "black", angle=30, vjust=.8, hjust=0.8))
</pre>
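A self-contained sketch of the same idea with made-up data (the labels and values below are illustrative only). It uses '''unique()''' for the levels, which keeps the order of first appearance and also works when a label occurs more than once; factor() throws an error when the levels contain duplicates.
<pre>
library(ggplot2)

# Hypothetical data: labels in the order we want to see on the x axis
junk <- data.frame(gset = c("z-set", "a-set", "m-set"),
                   boot = c(0.2, 0.9, 0.5))

# Without this step the x axis would be sorted alphabetically (a, m, z)
junk$gset <- factor(junk$gset, levels = unique(junk$gset))

p <- ggplot(junk, aes(x = gset, y = boot, group = 1)) +
  geom_line() +
  geom_point()
p
</pre>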
 
= Legends =
== Legend title ==
<ul>
<li>[https://ggplot2-book.org/scales.html#scale-title labs() function]
<pre>
p <- ggplot(df, aes(x, y)) + geom_point(aes(colour = z))
p + labs(x = "X axis", y = "Y axis", colour = "Colour\nlegend")
      # Use color to represent the legend title
 
p <- ggplot(df) + geom_col(aes(x=x, y=y, fill=cat), position = "dodge")
p + labs(x = "X", y = "Y", fill = "Category")
      # Use fill to represent the legend title
</pre>
</li>
<li>scale_colour_manual()
<pre>
scale_colour_manual("Treatment", values = c("black", "red"))
</pre>
</li>
<li>scale_color_discrete() and scale_shape_discrete(). See [[#Combine_colors_and_shapes_in_legend|Combine colors and shapes in legend]].
<pre>
df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=5) +
  scale_color_discrete('new title') + scale_shape_discrete('new title')
</pre>
</li>
</ul>
 
== Layout: move the legend from right to top/bottom of the plot or inside the plot or hide it ==
<pre>
gg + theme(legend.position = "top")
 
# Useful in the boxplot case
gg + theme(legend.position="none")
 
gg + theme(legend.position = c(0.87, 0.25))
 
# Customize the edge color and background color
gapminder %>%
  ggplot(aes(gdpPercap,lifeExp, color=continent)) +
  geom_point() +
  scale_x_log10()+
  theme(legend.position = c(0.87, 0.25),
        legend.background = element_rect(fill = "white", color = "black"))
</pre>
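Since ggplot2 3.5.0, a numeric '''legend.position''' is deprecated in favour of '''legend.position = "inside"''' together with '''legend.position.inside'''. The sketch below picks the right form from the installed version; the mtcars data and the coordinates are only for illustration.
<pre>
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point()

if (utils::packageVersion("ggplot2") >= "3.5.0") {
  # newer syntax: the coordinates go to legend.position.inside
  p <- p + theme(legend.position = "inside",
                 legend.position.inside = c(0.87, 0.75))
} else {
  # older syntax: the coordinates go directly to legend.position
  p <- p + theme(legend.position = c(0.87, 0.75))
}
p <- p + theme(legend.background = element_rect(fill = "white", color = "black"))
p
</pre>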
 
== Guide functions for finer control (legend, axis, color scales) ==
<ul>
<li>https://ggplot2-book.org/scales.html#guide-functions The guide functions, guide_colourbar() and guide_legend(), offer additional control over the fine details of the legend.
<li>[https://ggplot2.tidyverse.org/reference/guide_legend.html guide_legend()] allows the modification of legends for scales, including fill, color, and shape. This function can be used in scale_fill_manual(), scale_fill_continuous(), ... functions.
<pre>
scale_fill_manual(values=c("orange", "blue"),
                  guide=guide_legend(title = "My Legend Title",
                                    nrow=1,  # multiple items in one row
                                    label.position = "top", # move the texts on top of the color key
                                    keywidth=2.5)) # increase the color key width
</pre>
The problem with the default setting is it leaves a lot of white space above and below the legend.
To change the position of the entire legend to the bottom of the plot, we use theme().
<pre>
theme(legend.position = 'bottom')
</pre>
<li>[https://ggplot2.tidyverse.org/reference/guides.html guides()]
* Legend. For example, to remove the legend title:
<pre>
ggplot(mtcars, aes(x = mpg, y = disp, color = factor(cyl))) +
  geom_point() +
  guides(color = guide_legend(title = NULL))
</pre>
* Axis. For example, to change the angle of the x-axis labels:
<pre>
ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point() +
  guides(x = guide_axis(angle = 45))
</pre>
* Color scales. For example, to draw the color bar with 10 bins (nbin controls how finely the bar is drawn, not the number of breaks):
<pre>
ggplot(mtcars, aes(x = mpg, y = disp, color = hp)) +
  geom_point() +
  guides(color = guide_colorbar(nbin = 10))
</pre>
</ul>
 
== Legend symbol background ==
<pre>
ggplot() + geom_point(aes(x, y, color, size)) +
          theme(legend.key = element_blank())
          # remove the symbol background in legend
</pre>
 
== Construct a manual legend for a complicated plot ==
https://stackoverflow.com/a/17149021
 
== Legend size ==
[https://www.statology.org/ggplot2-legend-size/ How to Change Legend Size in ggplot2 (With Examples)]
<pre>
data <- data.frame(x = 1:5, y = 1:5, label = c("A", "B", "C", "D", "E"))
ggplot(data, aes(x, y, color = as.factor(label))) +
  geom_point() +
  labs(title = "Legend Size Example with Theme Modification",
      color = "Label") +
  theme(
    legend.text = element_text(size = 12),
    legend.title = element_text(size = 14)
    )
</pre>
 
= ggtitle() =
== Centered title ==
See the '''Legends''' part of the [https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf cheatsheet].
<pre>
ggtitle("MY TITLE") +
  theme(plot.title = element_text(hjust = 0.5))
</pre>
 
=== Subtitle ===
<pre>
ggtitle("My title",
        subtitle = "My subtitle")
</pre>
 
= margins =
https://stackoverflow.com/a/10840417
 
= Aspect ratio =
?coord_fixed
<pre>
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + coord_fixed() # plot is compressed horizontally
p  # fill up plot region
</pre>
 
= Time series plot =
* [http://sharpsightlabs.com/blog/line-chart-ggplot2-amzn/ How to make a line chart with ggplot2]
* [http://ggplot2.tidyverse.org/reference/scale_brewer.html#palettes Colour palettes]. Note that the palettes have a limited number of colors; e.g. ''Accent'' from the Qualitative category gives the warning message ''In RColorBrewer::brewer.pal(n, pal) :  n too large, allowed maximum for palette Accent is 8'' when more than 8 colors are requested.
 
Multiple lines plot https://stackoverflow.com/questions/14860078/plot-multiple-lines-data-series-each-with-unique-color-in-r
{{Pre}}
set.seed(45)
nc <- 9
df <- data.frame(x=rep(1:5, nc), val=sample(1:100, 5*nc),
                  variable=rep(paste0("category", 1:nc), each=5))
# plot
# http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=9
ggplot(data = df, aes(x=x, y=val)) +
    geom_line(aes(colour=variable)) +
    scale_colour_manual(values=c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6"))
</pre>
Versus the old-fashioned base R way:
<syntaxhighlight lang='rsplus'>
dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend
</syntaxhighlight>
 
= calendR =
[https://r-coder.com/calendar-plot-r/ Calendar plot in R using ggplot2]
 
= Github style calendar plot =
* https://mvuorre.github.io/post/2016/2016-03-24-github-waffle-plot/
* https://gist.github.com/marcusvolz/84d69befef8b912a3781478836db9a75 from [https://github.com/marcusvolz/strava Create artistic visualisations with your exercise data]
 
= geom_point() =
See [[Ggplot2#Scatterplot|Scatterplot]].
 
<pre>
df <- data.frame(x=1:3, y=1:3, color=c("red", "green", "blue"))
# Use I() to take the aes values as-is (identity) from your data table, e.g. literal color names
ggplot(df, aes(x,y, color=I(color))) + geom_point(size=10) # no color legend
# VS
ggplot(df, aes(x,y, color=color)) + geom_point(size=10) # color is like a class label
</pre>
 
= geom_bar(), geom_col(), stat_count() =
https://ggplot2.tidyverse.org/reference/geom_bar.html
* geom_bar: Counts the number of cases at each x position and makes the height of the bar proportional to the count (or sum of weights if supplied)
* geom_col: Leaves the data as is and makes the height of the bar proportional to the value in the data
{| class="wikitable"
|-
! Function !! Default Statistic !! Example
|-
| geom_bar() || stat_count() || <pre>
df2 <- data.frame(cat = c("A", "A", "A", "B", "B",
  "B", "B", "B", "C", "C", "C", "C", "C", "C"))
ggplot(df2, aes(x = cat)) + geom_bar()
# Same as
# barplot(table(df2$cat))
</pre>
|-
| geom_col() || stat_identity() || <pre>
df <- data.frame(group = c("A", "B", "C"),
                count = c(3, 5, 6))
ggplot(df, aes(x = group, y = count)) + geom_col()
# Same as
# barplot(df$count, names.arg = df$group)
</pre>
|}
 
<pre>
geom_col(position = 'dodge')  # same as
geom_bar(stat = 'identity', position = 'dodge')
</pre>
 
geom_bar() uses stat_count() by default, so it computes the bar heights itself and does not take a y aesthetic. To supply the heights through y, use geom_col().
<pre>
ggplot() + geom_col(mapping = aes(x, y))
</pre>
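The difference between the two functions can be checked with '''layer_data()''', which returns the data computed for a layer. This is a small sketch with made-up counts; it also shows the "sum of weights" case of geom_bar() mentioned above.
<pre>
library(ggplot2)

# Hypothetical data: one row per case, and the same information pre-aggregated
raw <- data.frame(cat = rep(c("A", "B", "C"), times = c(3, 5, 6)))
agg <- data.frame(cat = c("A", "B", "C"), n = c(3, 5, 6))

p_count  <- ggplot(raw, aes(x = cat)) + geom_bar()              # counts the rows
p_weight <- ggplot(agg, aes(x = cat, weight = n)) + geom_bar()  # sums the weights
p_col    <- ggplot(agg, aes(x = cat, y = n)) + geom_col()       # uses y as is

# The three plots should have the same bar heights
layer_data(p_count)$count
layer_data(p_weight)$count
layer_data(p_col)$y
</pre>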
 
== Add colors to the plot ==
<pre>
df <- data.frame(group = c("A", "B", "C"),
                count = c(3, 5, 6),
                fill = c("red", "green", "blue"))
ggplot(df, aes(x = group, y = count, fill = fill)) +
  geom_col() +
  scale_fill_identity()  # use the color names stored in 'fill' as they are
</pre>
 
== Add numbers to the plot ==
[https://www.infoworld.com/article/3410295/how-to-write-your-own-ggplot2-functions-in-r.html An example]
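
A minimal sketch (not taken from the linked article) that prints the value on top of each bar with geom_text(); the data frame is the toy example used above.
<pre>
library(ggplot2)

df <- data.frame(group = c("A", "B", "C"),
                 count = c(3, 5, 6))

p <- ggplot(df, aes(x = group, y = count)) +
  geom_col() +
  geom_text(aes(label = count), vjust = -0.5) +  # negative vjust moves the text above the bar top
  expand_limits(y = 6.5)                         # leave room for the top label
p
</pre>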
 
== Ordered barplot and reorder() ==
[[#Ordered_barplot_and_facet|Ordered barplot and facet]]
 
= stat_function() =
* [https://ggplot2.tidyverse.org/reference/stat_function.html stat_function()]
* [http://skranz.github.io//r/2020/11/11/CovidVaccineBayesian.html A look at biontech/pfizer's bayesian analysis of their covid-19 vaccine trial]
 
= stat_summary() =
https://ggplot2.tidyverse.org/reference/stat_summary.html
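
A small sketch of a typical use, assuming ggplot2 >= 3.3.0 (where the argument is called '''fun'''): group means as points plus mean +/- standard error as error bars, using mtcars for illustration.
<pre>
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(fun = mean, geom = "point", size = 3) +               # one point per group
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) + # mean +/- 1 SE
  labs(x = "Number of cylinders", y = "Miles per gallon")
p
</pre>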
 
= stat_smooth(), geom_smooth() =
[https://ggplot2.tidyverse.org/reference/geom_smooth.html ?geom_smooth, ?stat_smooth]
<pre>
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "glm", formula = "y ~ x",
              method.args = list(family = poisson(link = "log")),
              se = FALSE, color = "red") +
  labs(x = "Weight", y = "Miles per gallon")
</pre>
To control the smoothness of a loess fit, use the "span" parameter (smaller values give a wigglier curve). To disable the confidence interval, use "se = FALSE".
<pre>
geom_smooth(method = 'loess', se = FALSE, span = 0.3)
</pre>
 
= geom_area() =
[http://blog.fellstat.com/?p=440 The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think]
 
= Square shaped plot =
<pre>
ggplot() + theme(aspect.ratio=1) # do not adjust xlim, ylim
 
xylim <- range(c(x, y))
ggplot() + coord_fixed(xlim=xylim, ylim=xylim)
</pre>
 
= geom_line() =
See also [[#group|aes(..., group, ...)]].
 
== Connect Paired Points with Lines in Scatterplot ==
* [https://datavizpyr.com/connect-paired-points-with-lines-in-scatterplot-in-ggplot2/ Connect Paired Points with Lines in Scatterplot in ggplot2?] '''geom_line(aes(group = patient))''', where each value of the 'patient' variable appears twice (once per paired observation); e.g. patient=0,0,1,1,2,2,3,3.
* [https://www.geeksforgeeks.org/how-to-connect-paired-points-with-lines-in-scatterplot-in-ggplot2-in-r/ How to Connect Paired Points with Lines in Scatterplot in ggplot2 in R?]
 
== Use geom_line() to create a square bracket to annotate the plot ==
[https://ggplot2tutor.com/simple_barchart_with_p_values/barchart_simple/ Barchart with Significance Tests]
 
== Interaction plot ==
[[T-test#Randomized_block_design|Randomized block design]]
 
= geom_segment() =
[https://ggplot2.tidyverse.org/reference/geom_segment.html Line segments, arrows and curves]. See an example in ''geom_errorbar'' section below.
 
Cf annotate("segment", ...)
 
= geom_errorbar(): error bars =
<ul>
<li>[http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/ Plotting means and error bars (ggplot2)] from Cookbook for R.
<li>[https://www.datanovia.com/en/lessons/ggplot-error-bars/ GGPlot Error Bars] using geom_errorbar() and geom_segment()
<br />
[[File:Rerrorbars.png|250px]]
</li>
</ul>
* Can ggplot2 do this? https://www.nature.com/articles/nature25173/figures/1
* [https://stackoverflow.com/questions/14069629/plotting-confidence-intervals plotCI() from the plotrix package or geom_errorbar() from ggplot2 package]
* [http://ggplot2.tidyverse.org/reference/geom_linerange.html Vertical error bars]
* [http://ggplot2.tidyverse.org/reference/geom_errorbarh.html Horizontal error bars]
* [http://timelyportfolio.blogspot.com/2012/08/horizon-on-ggplot2.html Horizontal panel plot] example and [http://timelyportfolio.blogspot.com/2012/08/plotxts-with-moving-average-panel.html more]
* [https://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars R does not draw error bars out of the box]. In base R, arrows() can draw the error bars; a single call arrows(x0, y0, x1, y1, code=3, angle=90, length=.05, col) is enough. See
** [https://datascienceplus.com/building-barplots-with-error-bars/ Building Barplots with Error Bars]. Note that the segments() statement is not necessary.
** https://www.rdocumentation.org/packages/graphics/versions/3.4.3/topics/arrows
* Toy example (see this [https://www.nature.com/articles/nature25173/figures/1 nature paper])
<syntaxhighlight lang='rsplus'>
set.seed(301)
x <- rnorm(10)
SE <- rnorm(10)
y <- 1:10
 
par(mfrow=c(2,1))
par(mar=c(0,4,4,4))
xlim <- c(-4, 4)
plot(x[1:5], 1:5, xlim=xlim, ylim=c(0+.1,6-.1), yaxs="i", xaxt = "n", ylab = "", pch = 16, las=1)
mtext("group 1", 4, las = 1, adj = 0, line = 1) # las=text rotation, adj=alignment, line=spacing
par(mar=c(5,4,0,4))
plot(x[6:10], 6:10, xlim=xlim, ylim=c(5+.1,11-.1), yaxs="i", ylab ="", pch = 16, las=1, xlab="")
arrows(x[6:10]-SE[6:10], 6:10, x[6:10]+SE[6:10], 6:10, code=3, angle=90, length=0)
mtext("group 2", 4, las = 1, adj = 0, line = 1)
</syntaxhighlight>
 
[[File:Stklnpt.svg|350px]]
 
* Forest plot example using geom_errorbarh()
[[File:Geomerrorbarh.png|350px]]
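
As for the "Can ggplot2 do this?" question, here is a rough ggplot2 sketch of the base R toy example above. It is only one possible way: geom_linerange() draws the horizontal bars (xmin/xmax need ggplot2 >= 3.3.0) and facet_grid() stacks the two groups. The data are simulated as in the toy example, except that abs() is used so that SE is not negative.
<pre>
library(ggplot2)

set.seed(301)
df <- data.frame(x     = rnorm(10),
                 SE    = abs(rnorm(10)),   # assumption: standard errors are non-negative
                 y     = factor(1:10),
                 group = rep(c("group 1", "group 2"), each = 5))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_linerange(aes(xmin = x - SE, xmax = x + SE), orientation = "y") +
  geom_point() +
  facet_grid(group ~ ., scales = "free_y") +  # one panel per group, stacked vertically
  coord_cartesian(xlim = c(-4, 4)) +
  labs(y = NULL)
p
</pre>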
 
= geom_rect(), geom_bar() =
* https://ggplot2.tidyverse.org/reference/geom_tile.html
* https://plotly.com/ggplot2/geom_rect/, https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html
 
Note that we can use '''scale_fill_manual'''() to change the 'fill' colors (scheme/palette). The 'fill' aesthetic in geom_rect()/geom_bar() only names the discrete variable that defines the groups; it does not set the colors themselves.
 
<pre>
ggplot(data = mtcars) +
  geom_bar(aes(x = factor(cyl), fill = factor(am))) +
  scale_fill_manual(values = c("orange", "blue"))
</pre>
 
== geom_raster() and geom_tile() ==
* [https://ggplot2.tidyverse.org/reference/geom_tile.html Rectangles]. This is useful for creating heatmaps; e.g. [https://github.com/satijalab/seurat/blob/master/R/visualization.R#L7445 DoHeatmap()] & [https://satijalab.org/seurat/reference/doheatmap an example] in Seurat.
* [https://jacobsimmering.com/post/wordle/ Wordle Words and Expected Value]
 
== Waterfall plot ==
* https://en.wikipedia.org/wiki/Waterfall_chart. A waterfall chart is a type of chart that represents how an '''initial value''' is affected by a series of intermediate positive or negative values.
* [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4093310/ Understanding Waterfall Plots]
* [https://r-charts.com/flow/waterfall-chart/ Waterfall charts in ggplot2 with waterfalls package]
* [https://www.r-bloggers.com/2010/05/ggplot2-waterfall-charts/ ggplot2: Waterfall Charts] geom_rect()
* [https://www.pharmasug.org/proceedings/2012/DG/PharmaSUG-2012-DG13.pdf Waterfall Charts in Oncology Trials - Ride the Wave]. '''Drug response'''
** Collected data is compared to the data taken at '''baseline''' to determine whether the drug has some activity or not. Each patient is also assigned to a category based on overall response.
** Y-axis = % change from baseline in the '''tumor size''' for each patient
** We want to create this plot by grouping patients by their overall response category (e.g. 'Early Death' or 'Complete Response') and filling the bars of each group with a different color, so the groups are easy to identify.
* A waterfall plot for drug BYL719 and color it based on the mutation status of the CDK13 gene, see [https://bioconductor.org/packages/release/bioc/vignettes/Xeva/inst/doc/Xeva.pdf#page=10 Xeva] vignette.
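
A drug-response waterfall plot along these lines can be sketched with geom_col(). Everything in the data below is simulated, and the category cutoffs (-30 and 20) and labels are placeholders for illustration, not the definitions used in the references above.
<pre>
library(ggplot2)

set.seed(1)
# Hypothetical data: % change from baseline in tumor size for 20 patients
resp <- data.frame(patient = paste0("P", 1:20),
                   change  = round(runif(20, min = -100, max = 60)))

# Illustrative response categories based on arbitrary cutoffs
resp$category <- cut(resp$change, breaks = c(-Inf, -30, 20, Inf),
                     labels = c("Response", "Stable", "Progression"))

# Order the patients from the largest increase to the largest decrease
resp$patient <- reorder(resp$patient, -resp$change)

p <- ggplot(resp, aes(x = patient, y = change, fill = category)) +
  geom_col() +
  geom_hline(yintercept = 0) +
  labs(x = NULL, y = "% change from baseline in tumor size",
       fill = "Overall response") +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())
p
</pre>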
 
= geom_linerange =
* https://ggplot2.tidyverse.org/reference/geom_linerange.html
* [https://onunicornsandgenes.blog/2021/07/25/a-plot-of-genes-on-chromosomes/ A plot of genes on chromosomes]. Since ggplot() is inside a function, we need to add ''print()'' in order to show the plot.  
** See also [https://bioconductor.org/packages/release/bioc/vignettes/biomaRt/inst/doc/accessing_ensembl.html#given-the-human-gene-tp53-retrieve-the-human-chromosomal-location-of-this-gene-and-also-retrieve-the-chromosomal-location-and-refseq-id-of-its-homolog-in-mouse. Given the human gene TP53, retrieve the human chromosomal location of this gene and also retrieve the chromosomal location and RefSeq id of its homolog in mouse] from the biomaRt package's vignette.
** [https://stackoverflow.com/a/45928905 Get gene location from gene symbol and ID]
** [https://medium.com/intothegenomics/annotate-genes-and-genomic-coordinates-ecdad47d0c8e Genomic coordinates to gene lists and vice versa — Annotating gene coordinates and gene lists]
** [https://stackoverflow.com/a/52252962 Genomic coordinates of HGNC gene names] where '''org.Hs.eg.db''' and '''TxDb.Hsapiens.UCSC.hg19.knownGene''' are used
** [https://seandavi.github.io/ITR/transcriptdb.html TxDb: Genes, Transcripts, and Genomic Locations] which uses a gtf file and the '''GenomicFeatures''' package
 
= Circle =
[https://community.rstudio.com/t/circle-in-ggplot2/8543 Circle in ggplot2] '''ggplot(data.frame(x = 0, y = 0), aes(x, y)) + geom_point(size = 25, pch = 1)'''


= Annotation =

== Add a horizontal/vertical line ==
[https://ggplot2.tidyverse.org/reference/geom_abline.html geom_hline(), geom_vline()]
<pre>
geom_hline(yintercept=1000)
</pre>

== text annotations, annotate() and geom_text(): '''ggrepel''' package ==
<ul>
<li>[https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html ggrepel] package, [https://ggrepel.slowkow.com/reference/geom_text_repel.html ?geom_text_repel]. Found on [https://simplystatistics.org/2018/01/22/the-dslabs-package-provides-datasets-for-teaching-data-science/ Some datasets for teaching data science] by Rafael Irizarry.
<pre>
p <- ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(color = "red")

p1 <- p + geom_text() + labs(title = "geom_text()") # Bad

p2 <- p + geom_text_repel(seed=1) + labs(title = "geom_text_repel()") # Good
                                          # Use 'seed' to fix the location of text
</pre>
Note that we may need to add '''show.legend = FALSE''' in geom_text_repel() to get rid of the "a" character in the legend. See [https://stackoverflow.com/questions/18337653/remove-a-from-legend-when-using-aesthetics-and-geom-text Remove 'a' from legend when using aesthetics and geom_text]
</li>
</ul>

<ul>
<li>
<pre>
geom_text(aes(x, y, label), data, size, vjust, hjust, nudge_x)
</pre>
</li>
<li>[https://r-charts.com/ggplot2/text-annotations/ Text annotations in ggplot2]
<syntaxhighlight lang='rsplus'>
p + geom_text(aes(x = -115, y = 25,
                  label = "Map of the United States"),
              stat = "unique")
p + geom_label(aes(x = -115, y = 25,
                  label = "Map of the United States"),
              stat = "unique") # include border around the text
</syntaxhighlight>
</li>
<li>Use the '''nudge_y''' parameter to avoid overlap between the point and its text label.
</li>
<li>
[https://stackoverflow.com/a/7267364 What do '''hjust''' and vjust do when making a plot using ggplot?] 0 means left-justified and 1 means right-justified. This matters when the text has multiple lines; by default, the text is center-justified.
</li>
<li>[https://biocorecrg.github.io/CRG_RIntroduction/volcano-plots.html Volcano plots], [https://bioconductor.org/packages/release/bioc/vignettes/EnhancedVolcano/inst/doc/EnhancedVolcano.html EnhancedVolcano] package </li>
<li>[https://samdsblog.netlify.app/post/visualizing-volcano-plots-in-r/ Visualization of Volcano Plots in R]</li>
</ul>
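A minimal sketch of the '''nudge_y''' idea with made-up data: the label is shifted upwards by a fixed amount (in data units) so it does not sit on top of the point. The offset 0.2 is arbitrary.
<pre>
library(ggplot2)

df <- data.frame(x = 1:3, y = c(2, 4, 3), label = c("a", "b", "c"))

p <- ggplot(df, aes(x, y, label = label)) +
  geom_point() +
  geom_text(nudge_y = 0.2)   # move each label 0.2 units above its point
p
</pre>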


[https://github.com/AllanCameron/geomtextpath geomtextpath]- Create curved text in ggplot2

== Build your own geom ==
* https://ggplot2-book.org/extensions.html#new-geoms
* [https://youtu.be/ZMHJdW6a20I Building a new geom in ggplot2] (video)
 
= Fonts, icons =
* [http://gradientdescending.com/adding-custom-fonts-to-ggplot-in-r/ Adding Custom Fonts to ggplot in R]
* [https://twitter.com/rfunctionaday/status/1412985327812288513 The {showtext_auto} function from {showtext} supports a large collection of font formats and graphics devices! ]
* [https://statisticaloddsandends.wordpress.com/2021/07/08/using-different-fonts-with-ggplot2 Using different fonts with ggplot2]
* [https://albert-rapp.de/post/2022-03-04-fonts-and-icons/ How to use Fonts and Icons in ggplot]

= Lines of best fit =
[http://freerangestats.info/blog/2020/08/23/highered-ols Lines of best fit]

= Save the plots -- ggsave() =
[https://ggplot2.tidyverse.org/reference/ggsave.html ggsave()]. Note that the '''svglite''' package is required for svg output; see [https://r-graphics.org/recipe-output-vector-svg R Graphics Cookbook]. ''The svglite package provides more standards-compliant output.''
 
By default, '''width''' & '''height''' are in inches, no matter which output format we choose.
 
(3/24/2022) If I save the plot in the svg format using the RStudio GUI (Export -> Save as Image...) or the '''svg()''' function, the svg file can't be converted to a png file by ImageMagick. But if I save the plot with the '''ggsave()''' command, the svg file can be converted to a png file.
<pre>
$ convert -resize 100% Rerrorbar.svg tmp.png
convert-im6.q16: non-conforming drawing primitive definition `path' @ error/draw.c/RenderMVGContent/4300.
$ convert -resize 100% Rerrorbar2.svg tmp.png # Works
</pre>
 
(1/31/2022) For some reason, the legend text in svg files generated by ggsave() looks fine in browsers, but when I insert the file into ppt, the word "Sensitive" becomes "Sensitiv e". However, svg files generated by the '''svg()''' command look fine in browsers AND in ppt.
 
If we don't specify '''width/height''', ggsave() uses the size of the current graphics device. That's why ggsave() reports the image size (in inches) after it runs. So in order to have a fixed width/height, we need to specify them explicitly.
See
* [https://sscc.wisc.edu/sscc/pubs/using-r-plots/saving-plots.html Saving ggplot Plots]
* [https://stackoverflow.com/a/44711767 Set the size of ggsave exactly]
 
My experience is ggsave() is better than png() because ggsave() makes the text larger when we save a file with a higher resolution.
My experience is ggsave() is better than png() because ggsave() makes the text larger when we save a file with a higher resolution.
<pre>
<pre>
Line 1,451: Line 2,252:
</pre>
</pre>


[https://ggplot2.tidyverse.org/reference/ggsave.html ggsave()] We can specify dpi to increase the resolution. For example,
We can specify dpi to increase the resolution if we use the '''png''' format ('''svg''' is not affected); see Chapter 14.5 [https://r-graphics.org/recipe-output-bitmap Outputting to Bitmap (PNG/TIFF) Files] from R Graphics Cookbook.
<syntaxhighlight lang='rsplus'>
<syntaxhighlight lang='rsplus'>
g1 <- ggplot(data = mydf)  
g1 <- ggplot(data = mydf)  
g1
g1
ggsave("myfile.png", g1, height = 7, width = 8, units = "in", dpi = 500)
ggsave("myfile.png", g1, height = 7, width = 8, units = "in", dpi = 300)
</syntaxhighlight>
</syntaxhighlight>
I got an error: Error in loadNamespace(name) : there is no package called ‘svglite’. After I installed the package, everything worked fine.
# Will generate 4*100 x 3*100 pixel plot
</pre>
</pre>
Note:
* When saving to a "png" file, increasing dpi (from 72 to 300) will increase the font & point size. '''dpi/ppi''' is not an inherent property of an image.
* If we don't specify any parameters and don't resize the graphics device, the "png" file created by ggsave() will contain many more pixels than the "svg" file (e.g. 1200 vs 360).
* How does ggsave() decide the width/height when an svg file is created in an Rmd file? A: 7x7 from my experiment. So the font/point size will be smaller compared to a 4x4 inch output.
* When I created a 4x4 inch (width x height) svg file in Linux, the file properties (right click on the file) showed 360 x 360 pixels. But macOS does not report this number, nor can I find it in the svg file itself.


== Multiple pages in pdf ==
</pre>
</pre>


= graphics::smoothScatter: scatter plots with lots of points =
* [https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/smoothScatter ?smoothScatter]
* [https://r-charts.com/correlation/smooth-scatter-plot/ Smooth scatter plot in R]
* [https://www.inwt-statistics.com/read-blog/smoothscatter-with-ggplot2-513.html smoothScatter with ggplot2]
* [https://htmlpreview.github.io/?https://github.com/wwylab/DeMixTallmaterials/blob/master/online_methods.html#Figure%203b%20and%203c An example] from DeMixT. As we can see, we can use '''lines()''' or '''abline()''' to add lines.
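A minimal sketch of a smoothScatter() plot with lines added by base graphics functions. The simulated data and the fitted model are assumptions for illustration; they are not taken from the DeMixT example.

```r
set.seed(1)
n <- 10000
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Density-based scatter plot; darker regions contain more points
smoothScatter(x, y)

# Overlay lines with the usual base graphics functions
fit <- lm(y ~ x)
abline(fit, col = "red", lwd = 2)            # least-squares line
lines(lowess(x, y), col = "blue", lty = 2)   # lowess smoother
```

smoothScatter() draws on the current base graphics device, so any low-level function (lines(), abline(), points(), legend()) can be called afterwards.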


= Other tips/FAQs =
== Ten Simple Rules for Better Figures ==
[https://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1003833&s=09 Ten Simple Rules for Better Figures]

== Recreating the Storytelling with Data look with ggplot ==
[https://albert-rapp.de/post/2022-03-29-recreating-the-swd-look/ Recreating the Storytelling with Data look with ggplot]

== ggplot2 does not appear to work when inside a function ==

Latest revision as of 17:35, 23 April 2024

ggplot2

Books

The Grammar of Graphics

  • Data: Raw data that we'd like to visualize
  • Geometries: shapes that we use to visualize data
  • Aesthetics: Properties of geometries (size, color, etc)
  • Scales: Mapping between data values and aesthetics

Scatterplot aesthetics

geom_point(). The available aesthetics depend on the geom.

  • x, y
  • shape
  • color
  • size. It is not always appropriate to put 'size' inside aes(). See an example at Legend layout.
  • alpha
library(ggplot2)
library(tidyverse)
set.seed(1)
x1 <- rbinom(100, 1, .5) - .5
x2 <- c(rnorm(50, 3, .8)*.1, rnorm(50, 8, .8)*.1)
x3 <- x1*x2*2
# x=1:100, y=x1, x2, x3
tibble(x=1:length(x1), T=x1, S=x2, I=x3) %>% 
  tidyr::pivot_longer(-x) %>% 
  ggplot(aes(x=x, y=value)) + 
  geom_point(aes(color=name))

# Cf
matplot(1:length(x1), cbind(x1, x2, x3), pch=16, 
        col=c('cornflowerblue', 'springgreen3', 'salmon'))
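
The note on 'size' in the aesthetics list can be illustrated with a small sketch. The choice of the mpg data and the cyl variable is an assumption for illustration. A constant size goes outside aes(), while a data-driven size goes inside aes():

```r
library(ggplot2)

# Constant size: set outside aes(); no legend is created
p_const <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(size = 4)

# Data-driven size: map a variable inside aes(); a size legend is created
p_mapped <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(size = cyl))

# Inspect the sizes that are actually drawn
unique(layer_data(p_const)$size)
range(layer_data(p_mapped)$size)
```

Putting a constant such as size = 4 inside aes() would treat it as data and add an unwanted legend, which is usually not what we want.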

Online tutorials

Help

> library(ggplot2)
Need help? Try Stackoverflow: https://stackoverflow.com/tags/ggplot2

Gallery

Some examples

Examples from 'R for Data Science' book - Aesthetic mappings

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy))
  # the 'mapping' is the 1st argument for all geom_* functions, so we can safely skip it.
# template
ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

# add another variable through color, size, alpha or shape
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, color = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, size = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, alpha = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy, shape = class))

ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy), color = "blue")

# add another variable through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_wrap(~ class, nrow = 2)

# add another 2 variables through facets
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) + 
  facet_grid(drv ~ cyl)

Examples from 'R for Data Science' book - Geometric objects, lines and smoothers

How to Add a Regression Line to a ggplot?

# Points
ggplot(data = mpg) + 
  geom_point(aes(x = displ, y = hwy)) # we can add color to aes()

# Line plot
ggplot() +
  geom_line(aes(x, y))  # we can add color to aes()

# Smoothed
# 'linewidth' controls the line width (ggplot2 >= 3.4.0; older versions use 'size')
ggplot(data = mpg) + 
  geom_smooth(aes(x = displ, y = hwy), linewidth = 1) 

# Points + smoother, add transparency to points, remove se
# We add transparency to make the smoothed line stand out
#                    and the points less prominent
# We move aes() to the 'mapping' argument of ggplot()
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point(alpha=1/10) +
  geom_smooth(se=FALSE)    

# Colored points + smoother
ggplot(data = mpg, aes(x = displ, y = hwy)) + 
  geom_point(aes(color = class)) + 
  geom_smooth()

Examples from 'R for Data Science' book - Transformation, bar plot

# y axis = counts
# bar plot
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut))
# Or
ggplot(data = diamonds) + 
  stat_count(aes(x = cut))

# y axis = proportion
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, y = after_stat(prop), group = 1))  # '..prop..' is the older, deprecated spelling

# bar plot with 2 variables
ggplot(data = diamonds) + 
  geom_bar(aes(x = cut, fill = clarity))

facet_wrap and facet_grid to create a panel of plots

  • facet_grid() can be defined without any data. For example
    mylayout <- list(ggplot2::facet_grid(cat_y ~ cat_x))
    mytheme <- c(mylayout, 
                 list(ggplot2::theme_bw(), ggplot2::ylim(NA, 1)))
    # we haven't defined cat_y, cat_x variables
    ggplot() + geom_line() + 
      mylayout 
    
  • Multiclass predictive modeling for #TidyTuesday NBER papers
  • changing the facet_wrap labels using labeller in ggplot2. The solution is to create a labeller function as a function of a variable x (or any other name as long as it's not the faceting variables' names) and then coerce to labeller with as_labeller.
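
A minimal sketch of the labeller approach described in the last item. The function name drive_label and the use of the mpg data are assumptions for illustration:

```r
library(ggplot2)

# Hypothetical labelling function; its argument is named 'x',
# not 'drv' (the faceting variable)
drive_label <- function(x) paste("Drive:", x)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~ drv, labeller = as_labeller(drive_label))
p
```

as_labeller() also accepts a named character vector (e.g. c(f = "Front", r = "Rear")) when a fixed lookup table is enough.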

lattice::xyplot

library(lattice)
df <- data.frame(x = rnorm(100), y = rnorm(100), group = sample(c("A", "B"), 100, replace = TRUE))

# Use the xyplot() function to create the plot
# with each group represented by a different color
# result is 1 plot only
# no annotation
xyplot(y ~ x, data = df, groups = group)
df <- data.frame(x = rnorm(100), y = rnorm(100), 
                 group = sample(c("A", "B"), 100, replace = TRUE), 
                 time = sample(c("T1", "T2"), 100, replace = TRUE))

# 2 plots grouped by time
# two colors (defined by group) are used in each plot
# no annotation
xyplot(y ~ x | time, groups = group, data = df)

For more complicated plots, we can use the panel argument.
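
A minimal sketch of a custom panel function. The simulated data and the choice of a regression line plus a reference line are assumptions for illustration:

```r
library(lattice)

set.seed(1)
df <- data.frame(x = rnorm(100),
                 time = sample(c("T1", "T2"), 100, replace = TRUE))
df$y <- df$x + rnorm(100)

# Custom panel function: draw the points, then add a regression
# line and a horizontal reference line in every panel
p <- xyplot(y ~ x | time, data = df,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lmline(x, y, col = "red")
              panel.abline(h = 0, lty = 2)
            })
print(p)
```

The panel function is called once per conditioning level (here T1 and T2) with only that panel's x and y values.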

Color palette

Top color palettes

Display color palettes

  • Use barplot()
    pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
    # pal <- sample(colors(), 10) # randomly pick 10 colors 
    
    barplot(rep(1, length(pal)), col = pal, space = 0, 
            axes = FALSE, border = NA)
    par()$usr
    # [1] -0.20  5.20 -0.01  1.00
    

    Palettebarplot.png

  • Use heatmap()
    pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
    pal <- matrix(pal, nr=2) # acknowledge a nice warning message
    #      [,1]      [,2]      [,3]     
    # [1,] "#E41A1C" "#4DAF4A" "#FF7F00"
    # [2,] "#377EB8" "#984EA3" "#E41A1C"
    pal_matrix <- matrix(seq_along(pal), nr=nrow(pal), nc=ncol(pal))
    heatmap(pal_matrix, col = pal, Rowv = NA, Colv = NA, scale = "none", 
             ylab = "", xlab = "", main = "", margins = c(5, 5))
    # 2 rows, 3 columns with labeling on two axes
    par()$usr
    # [1] 0 1 0 1
    

    Paletteheatmap.png

  • Use image()
    pal <- palette() # R 4.0 has a new default palette
                     # The old colors are highly saturated and vary enormously
                     # in terms of luminance
    # [1] "black"   "#DF536B" "#61D04F" "#2297E6" "#28E2E5" "#CD0BBC" "#F5C710"
    # [8] "gray62"
    pal_matrix <- matrix(seq_along(pal), nr=1)
    image(pal_matrix, col = pal, axes = FALSE)
    # 8 rows, 1 column, but no labeling
    # Starting from bottom, left.
    
    par()$usr  # change with the data dim
    text(0, (par()$usr[4]-par()$usr[3])/8*c(0:7), 
         labels = pal)
    

    Rpalette.png

  • Use scales::show_col()
    scales::show_col(palette())
    

    Paletteshowcol.png

colors()

In R, colors() is a function that returns a character vector of color names available in R.

To obtain the hexadecimal codes for all colors obtained by colors()

rgb_values <- col2rgb(colors())

# Convert the RGB values to hexadecimal codes
hex_codes <- apply(rgb_values, 2, 
                   function(x) rgb(x[1], x[2], x[3], 
                   maxColorValue = 255))

# View the first few hexadecimal codes
head(hex_codes)

palette()

rainbow

  • ?rainbow
  • The following compares the effects of the 's' (saturation) and 'v' (value) parameters, which control the color intensity and brightness, respectively. See also HSL and HSV from wikipedia.
    • Saturation (s): Determines how vivid or muted the colors are. A value of 1 (default) means fully saturated colors, while lower values reduce the intensity.
    • Value (v): Controls the brightness. A value of 1 (default) results in full brightness, while lower values make the colors darker.
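
A minimal sketch comparing the default rainbow palette with reduced saturation and reduced value. The number of colors and the strip layout are assumptions for illustration:

```r
n <- 7
pals <- list(
  default = rainbow(n),           # s = 1, v = 1
  s05     = rainbow(n, s = 0.5),  # less saturated (paler) colors
  v05     = rainbow(n, v = 0.5)   # darker colors
)

# Show the three palettes as stacked color strips
op <- par(mfrow = c(3, 1), mar = c(1, 1, 2, 1))
for (nm in names(pals)) {
  barplot(rep(1, n), col = pals[[nm]], space = 0, axes = FALSE,
          border = NA, main = nm)
}
par(op)
```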

Rainbow default.png Rainbow s05.png Rainbow v05.png

Color blind

colorblindcheck: Check Color Palettes for Problems with Color Vision Deficiency

Color picker

https://github.com/daattali/colourpicker

> library(colourpicker)
> plotHelper(colours=5)

Listening on http://127.0.0.1:6023

Color names, Complementary/Inverted colors

colorspace package

cols4all

c4a_gui() # it will launch a shiny interface (the R console is blocked while it is running)

c4a_types() # understand abbreviation

c4a_series() # 16 series like brewer, hcl, tableau, viridis, etc

c4a_overview() # how many palettes per series x types

c4a_palettes(type = "div", series = "hcl") # What palettes are available

# Give me the colors
c4a("hcl.purple_green", 11)
c4a("brewer.accent", 2)    # the 1st one on the website

# Plot the colors
c4a_plot("hcl.purple_green", 11, include.na = TRUE)

paletteer package

paletteer_d("RColorBrewer::RdBu")
#67001FFF #B2182BFF #D6604DFF #F4A582FF #FDDBC7FF #F7F7F7FF 
#D1E5F0FF #92C5DEFF #4393C3FF #2166ACFF #053061FF 

paletteer_d("ggsci::uniform_startrek")
#CC0C00FF #5C88DAFF #84BD00FF #FFCD00FF #7C878EFF #00B5E2FF #00AF66FF 

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
      geom_point() +
      scale_color_paletteer_d("ggsci::uniform_startrek")
# the next is the same as above
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
     geom_point() +
     scale_color_manual(values = c("setosa" = "#CC0C00FF", 
                                   "versicolor" = "#5C88DAFF", 
                                   "virginica" = "#84BD00FF"))

ggsci

ggokabeito

ggokabeito: Colorblind-friendly, qualitative 'Okabe-Ito' Scales for ggplot2 and ggraph. It seems to support only up to 9 classes/colors and gives an error message if we have too many classes; e.g. Error: Insufficient values in manual scale. 15 needed but only 9 provided.

# Bad
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8)

# Bad (single color)
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_brewer(name = "Class") +
     scale_color_brewer(name = "Class")

# Bad
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_brewer(name = "Class", palette ="Set1") +
     scale_color_brewer(name = "Class", palette ="Set1")

# Nice
ggplot(mpg, aes(hwy, color = class, fill = class)) +
     geom_density(alpha = .8) +
     scale_fill_okabe_ito(name = "Class") +
     scale_color_okabe_ito(name = "Class")

Pride palette

Show Pride on Your Plots. gglgbtq package

unikn

Colour related aesthetics: colour, fill and alpha

https://ggplot2.tidyverse.org/reference/aes_colour_fill_alpha.html

Scatterplot with large number of points: alpha

smoothScatter with ggplot2

ggplot(df, aes(x, y)) +   # df: a data frame with columns x and y
    geom_point(alpha=.1) 

For base R, we can use the alpha argument of rgb(), i.e. rgb(,,,alpha), or adjustcolor(),

plot(x, y, col=rgb(0,0,0, alpha=.1))
polygon(df, col=adjustcolor(c("red", "blue"), alpha.f=.3))
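
A minimal self-contained sketch of the two base R approaches. The simulated data and the highlighted subset are assumptions for illustration:

```r
set.seed(1)
x <- rnorm(5000)
y <- x + rnorm(5000)

# rgb(): the 4th component is the opacity (0 = transparent, 1 = opaque)
col_rgb <- rgb(0, 0, 0, alpha = 0.1)

# adjustcolor(): add transparency to an existing named color
col_adj <- adjustcolor("red", alpha.f = 0.3)

plot(x, y, pch = 16, col = col_rgb)
points(x[1:200], y[1:200], pch = 16, col = col_adj)
```

Both functions return a hex string with an alpha channel, so the result can be passed to any col argument.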

Combine colors and shapes in legend

  • https://ggplot2-book.org/scales.html#scale-details In order for legends to be merged, they must have the same name.
    df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
    ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=4)
    
  • How to Work with Scales in a ggplot2 in R. This solution is better since it allows us to change the legend title. Just make sure the title we put in both scale_* functions is the same.
    ggplot(mtcars, aes(x=hp, y=mpg)) +
       geom_point(aes(shape=factor(cyl), colour=factor(cyl))) +
       scale_shape_discrete("Cylinders") + # change the legend title from 'factor(cyl)' to 'Cylinders'
       scale_colour_discrete("Cylinders")  # combine shape and colour in one legend; avoid another legend for colour
    
  • GGPLOT Point Shapes Best Tips
  • Simulated data
    df <- data.frame(x = rnorm(100), y = rnorm(100),
                     Treatment = rep(c("Before", "After"), each = 50),
                     Response = rep(c("Sensitive", "Resistant"), each = 50),
                     Subject = rep(1:50, times = 2))
    
    ggplot(df, aes(x = x, y = y, shape = Treatment, color = Response)) +
      geom_point() +
      geom_line(aes(group = Subject), alpha = 0.5) +  # Add lines connecting the same subject
      scale_shape_manual(values = c(16, 17)) +  # You can choose different shapes
      scale_color_manual(values = c("blue", "red")) +  # You can choose different colors
      theme_minimal() +
      labs(title = "Scatterplot with Different Shapes and Colors",
           x = "X-axis label",
           y = "Y-axis label",
           shape = "Treatment",
           color = "Response")
    

ggplot2::scale functions and scales packages

  • Scales control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like size, colour, position or shape.
  • Scales also provide the tools that let you read the plot: the axes and legends.
  • scales 1.2.0

ggplot2::scale_* - axes/axis, legend

https://ggplot2-book.org/scales.html and reference of all scale_* functions. Modifies the scales of the axes, such as the x- and y-axes, color, size, etc.

Naming convention: scale_AestheticName_NameDataType where

  • AestheticName can be x, y, color, fill, size, shape, ...
  • NameDataType can be continuous, discrete, manual or gradient.
  • Table of common functions (scale_AestheticName_NameDataType)
    • x: scale_x_continuous, scale_x_discrete, scale_x_log10
    • color: scale_color_continuous, scale_color_gradient, scale_color_discrete, scale_color_brewer, scale_color_manual, scale_color_paletteer_d, scale_colour_viridis_d
    • shape: scale_shape_discrete
    • fill: scale_fill_brewer, scale_fill_continuous, scale_fill_discrete, scale_fill_gradient, scale_fill_grey, scale_fill_hue, scale_fill_manual


Examples:

  • See Figure 12.1: Axis and legend components on the book ggplot2: Elegant Graphics for Data Analysis
    # Set x-axis label
    scale_x_discrete("Car type")   # or a shortcut xlab() or labs()
    scale_x_continuous("Displacement")
    
    # Set legend title
    scale_colour_discrete("Drive\ntrain")    # or a shortcut labs()
    
    # Change the default color
    scale_color_brewer()
    
    # Change the axis scale
    scale_x_sqrt()
    
    # Change breaks and their labels
    scale_x_continuous(breaks = c(2000, 4000), labels = c("2k", "4k"))
    
    # Relabel the breaks in a categorical scale
    scale_y_discrete(labels = c(a = "apple", b = "banana", c = "carrot"))
    
  • How to change the color in geom_point or lines in ggplot
    ggplot() + 
      geom_point(data = data, aes(x = time, y = y, color = sample),size=4) +
      scale_color_manual(values = c("A" = "black", "B" = "red"))
    
    ggplot(data = data, aes(x = time, y = y, color = sample)) + 
      geom_point(size=4) + 
      geom_line(aes(group = sample)) + 
      scale_color_manual(values = c("A" = "black", "B" = "red"))
    
  • See an example at geom_linerange where we have to specify the limits parameter in order to make "8" < "16" < "20"; otherwise it is 16 < 20 < 8.
    Browse[2]> order(coordinates$chr)
    [1] 3 4 1 2
    Browse[2]> coordinates$chr 
    [1] "20" "8"  "16" "16"
    
  • Differences between scale_color_gradient() and scale_color_continuous()
    • scale_color_gradient() (more common than scale_color_continuous) maps a continuous variable to a two-color gradient. Its low and high arguments specify the colors for the minimum and maximum values of the variable; the gradient between these two colors is generated automatically.
    ggplot(data = diamonds, aes(x = carat, y = price, color = depth)) +
      geom_point() +
      scale_color_gradient(low = "blue", high = "red")
    
    • scale_color_continuous() is the default colour scale for continuous variables. Its type argument selects the underlying scale ("gradient" or "viridis"), so the colors are also generated automatically. As with other continuous scales, including scale_color_gradient(), the limits argument sets the minimum and maximum values for the variable, the breaks argument specifies the values at which breaks occur, and the labels argument sets the text displayed on the legend.
    ggplot(data = diamonds, aes(x = carat, y = price, color = depth)) +
         geom_point() +
         scale_color_continuous(name = "Depth", 
                                limits = c(40, 80), 
                                breaks = c(40, 60, 80),
                                labels = c("Shallow", "Moderate", "Deep"), # display on legend
                                type = "gradient")
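
The limits remark in the geom_linerange item above can be sketched as follows. The toy coordinates data frame (with start and end columns) is an assumption for illustration; only the chr values come from the debugging output shown there:

```r
library(ggplot2)

# Chromosome labels stored as character strings (toy data)
coordinates <- data.frame(chr   = c("20", "8", "16", "16"),
                          start = c(10, 20, 30, 60),
                          end   = c(40, 50, 55, 90))

# Character sorting puts "16" before "20" before "8"
sort(unique(coordinates$chr))

# Numeric order for the axis
chr_levels <- as.character(sort(unique(as.numeric(coordinates$chr))))

p <- ggplot(coordinates, aes(x = chr, ymin = start, ymax = end)) +
  geom_linerange() +
  scale_x_discrete(limits = chr_levels)
p
```

An alternative with the same effect is to convert chr to a factor with levels = chr_levels before plotting.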
    

ylim and xlim in ggplot2 in axes

https://stackoverflow.com/questions/3606697/how-to-set-limits-for-axes-in-ggplot2-r-plots or the Zooming part of the cheatsheet

Use one of the following

  • + scale_x_continuous(limits = c(-5000, 5000))
  • + coord_cartesian(xlim = c(-5000, 5000))
  • + xlim(-5000, 5000)
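
The three options are not equivalent: scale limits (and the xlim()/ylim() shortcuts) drop out-of-range data before statistics are computed, whereas coord_cartesian() only zooms. A minimal sketch on a toy data frame (an assumption for illustration), using the y axis and a boxplot so that the effect shows up in the median:

```r
library(ggplot2)

# Toy data with one extreme value
df <- data.frame(g = "a", y = c(1:9, 100))
p <- ggplot(df, aes(g, y)) + geom_boxplot()

# Scale limits: y = 100 is removed before the boxplot statistics are computed
p_scale <- p + scale_y_continuous(limits = c(0, 10))

# Coordinate limits: all data are kept, the plot is only zoomed
p_coord <- p + coord_cartesian(ylim = c(0, 10))

# Compare the medians that are actually drawn
med_scale <- suppressWarnings(layer_data(p_scale)$middle)
med_coord <- layer_data(p_coord)$middle
c(scale = med_scale, coord = med_coord)
```

The same distinction matters for geom_smooth(), geom_density() and other layers that compute summaries from the data.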

Emulate ggplot2 default color palette

Paletteggplot2.png

The above can be created by R >= 4.0.0 using the command scales::show_col(palette.colors(palette = "ggplot2")). We should ignore the 1st color (black). Also if n>=5, the colors do not match with the result of show_col(hue_pal()(5)) .

Answer 1 It is just equally spaced hues around the color wheel. Emulate ggplot2 default color palette

gg_color_hue <- function(n) {
  hues = seq(15, 375, length = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

n = 4
cols = gg_color_hue(n)

dev.new(width = 4, height = 4)
plot(1:n, pch = 16, cex = 2, col = cols)

Answer 2 (better, it shows the color values in HEX). It should be read from left to right and then top to bottom.

scales package

library(scales)
show_col(hue_pal()(4)) # ("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
                       # (Salmon, Christi, Iris Blue, Heliotrope)
show_col(hue_pal()(3)) # ("#F8766D", "#00BA38", "#619CFF")
                       # (Salmon, Dark Pastel Green, Cornflower Blue)
show_col(hue_pal()(2)) # ("#F8766D", "#00BFC4") = (salmon, iris blue) 
           # see https://www.htmlcsscolor.com/ for color names

See also the last example in ggsurv() where the KM plots have 4 strata. The colors can be obtained by scales::hue_pal()(4) with hue_pal()'s default arguments.

R has a function called colorName() to convert a hex code to color name; see roloc package on CRAN.

transform scales

How to make that crazy Fox News y axis chart with ggplot2 and scales

Class variables

  • "Set1" is a good choice. See RColorBrewer::display.brewer.all()
  • For an ordinal variable, brewer.pal(n, "Spectral") is good. But the middle color is too light, so I modify the middle color
    cols <- RColorBrewer::brewer.pal(5, "Spectral")
    cols[3] <- "#D4C683" # middle of "#FDAE61" and "#ABDDA4"
    

Red, Green, Blue alternatives

  • Red: "maroon"

Heatmap for single channel

How to Make a Heatmap of Customers in R, source code on github. geom_tile() and geom_text() were used. Heatmap in ggplot2 from https://r-charts.com/.

https://scales.r-lib.org/

# White <----> Blue
RColorBrewer::display.brewer.pal(n = 8, name = "Blues")

Heatmap for dual channels

http://www.sthda.com/english/wiki/colors-in-r

library(RColorBrewer)
library(scales)  # for brewer_pal() used in the plot() calls below
# Red <----> Blue
display.brewer.pal(n = 8, name = 'RdBu')
# Hexadecimal color specification 
brewer.pal(n = 8, name = "RdBu")

plot(1:8, col=brewer_pal(palette = "RdBu")(8), pch=20, cex=4)

# Blue <----> Red
plot(1:8, col=rev(brewer_pal(palette = "RdBu")(8)), pch=20, cex=4)

Twopalette.svg

Don't rely on color to explain the data

ggpattern

Don't use very bright or low-contrast colors, accessibility

Create your own scale_fill_FOO and scale_color_FOO

Custom colour palettes for {ggplot2}

Themes and background for ggplot2

Background

  • Export plot in .png with transparent background in base R plot.
    x = c(1, 2, 3)
    op <- par(bg=NA)
    plot (x)
    
    dev.copy(png,'myplot.png')
    dev.off()
    par(op)
    
  • Transparent background with ggplot2
    library(ggplot2)
    data("airquality")
    
    p <- ggplot(airquality, aes(Solar.R, Temp)) +
         geom_point() +
         geom_smooth() +
         # set transparency
         theme(
            panel.grid.major = element_blank(), 
            panel.grid.minor = element_blank(),
            panel.background = element_rect(fill = "transparent",colour = NA),
            plot.background = element_rect(fill = "transparent",colour = NA)
            )
    p
    ggsave("airquality.png", p, bg = "transparent")
    
  • ggplot2 theme background color and grids
    ggplot() + geom_bar(aes(x=, fill=y)) +
               theme(panel.background=element_rect(fill='purple')) + 
               theme(plot.background=element_blank())
    
    ggplot() + geom_bar(aes(x=, fill=y)) + 
               theme(panel.background=element_blank()) + 
               theme(plot.background=element_blank()) # minimal background like base R
               # the grid lines are not gone; they are white so it is the same as the background
    
    ggplot() + geom_bar(aes(x=, fill=y)) + 
               theme(panel.background=element_blank()) + 
               theme(plot.background=element_blank()) +
               theme(panel.grid.major.y = element_line(color="grey"))
               # draw grid line on y-axis only
    
    ggplot() + geom_bar() +
               theme_bw()  # very similar to theme_light()
                           # have grid lines
    ggplot() + geom_bar() +
               theme_classic() # similar to base R graphic
                           # no borders on top and right
     
    ggplot() + geom_bar() +
               theme_minimal() # no edge
    
    ggplot() + geom_bar() +
               theme_void() # no grid, no edge
    
    ggplot() + geom_bar() +
               theme_dark()
    

ggthemr

ggthemr package

Font size

For example to make the subtitle font size smaller

my_ggp + theme(plot.subtitle = element_text(size = 8)) 
# Default font size seems to be 11 for title/subtitle

Remove x and y axis titles

ggplot2 title : main, axis and legend titles

Rotate x-axis labels, change colors

Counter-clockwise

theme(axis.text.x = element_text(angle = 90, size=5, hjust=1))

customize ggplot2 axis labels with different colors

Add axis on top or right hand side

Remove labels

Plotting with ggplot: adding titles and axis names

ggthemes package

https://cran.r-project.org/web/packages/ggthemes/index.html

ggplot() + geom_bar() +
           theme_solarized()   # sun color in the background

theme_excel()
theme_wsj()
theme_economist()
theme_fivethirtyeight()

rsthemes

rsthemes

thematic

thematic, Top R tips and news from RStudio Global 2021

Common plots

Scatterplot

Handling overlapping points (slides) and the ebook Fundamentals of Data Visualization by Claus O. Wilke.

Scatterplot with histograms

aes(color)

groups

Geom smooth ex.png

Bubble Chart

Ellipse

ggside: scatterplot + marginal density plot

ggExtra: scatterplot + marginal histogram/density

https://github.com/daattali/ggExtra

Line plots

Ridgeline plots, mountain diagram

Histogram

A histogram is a special case of a bar plot. Instead of drawing each unique value as a bar, a histogram groups close data points into bins.

ggplot(data = txhousing, aes(x = median)) +
  geom_histogram()  # adding 'boundary = 0' if we don't expect negative values ('origin' is the older, deprecated name).
                    # adding 'bins=10' to adjust the number of bins
                    # adding 'binwidth=10' to adjust the bin width
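
A minimal sketch of the bins, binwidth and boundary options. The mpg data and the chosen values are assumptions for illustration. layer_data() returns the bins that are actually drawn:

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = hwy))

# Fix the number of bins
p_bins <- p + geom_histogram(bins = 10)

# Fix the bin width and align the bin edges at multiples of 5
p_width <- p + geom_histogram(binwidth = 5, boundary = 0)

# layer_data() returns one row per bin
d_bins  <- layer_data(p_bins)
d_width <- layer_data(p_width)
d_width[, c("xmin", "xmax", "count")]
```

Only one of bins and binwidth should be given; binwidth takes precedence when both are supplied.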

Histogram vs barplot from deeply trivial.

Boxplot

Be careful: if we add scale_y_continuous(expand = c(0,0), limits = c(0,1)) to the code, it will change the boxplot when some data fall outside the range (0, 1), because the out-of-range values are dropped before the boxplot statistics are computed. The console gives a warning message in this case.

Base R method

Box Plots - R Base Graphs

dim(df) # 112436 x 2
mycol <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
# mycol defines colors of 4 levels in df$Method (a factor)
boxplot(df$value ~ df$Method, col = mycol, xlab="Method")

Color fill/scale_fill_XXX

n <- 100
k <- 12
set.seed(1234)
cond <- factor(rep(LETTERS[1:k], each=n))
rating <- rnorm(n*k)
dat <- data.frame(cond = cond, rating = rating)

p <- ggplot(dat, aes(x=cond, y=rating, fill=cond)) + 
     geom_boxplot() 

p + scale_fill_hue() + labs(title="hue default") # Same as only p 
p + scale_fill_hue(l=40, c=35) + labs(title="hue options")
p + scale_fill_brewer(palette="Dark2") + labs(title="Dark2")
p + colorspace::scale_fill_discrete_qualitative(palette = "Dark 3") + labs(title="Dark 3")
p + scale_fill_brewer(palette="Accent") + labs(title="Accent")
p + scale_fill_brewer(palette="Pastel1") + labs(title="Pastel1")
p + scale_fill_brewer(palette="Set1") + labs(title="Set1")
p + scale_fill_brewer(palette="Spectral") + labs(title ="Spectral") 
p + scale_fill_brewer(palette="Paired") + labs(title="Paired")
# cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# p + scale_fill_manual(values=cbbPalette)

Scalefill.png

ColorBrewer palettes RColorBrewer::display.brewer.all() to display all brewer palettes.

Reference from ggplot2. scale_fill_binned, scale_fill_brewer, scale_fill_continuous, scale_fill_date, scale_fill_datetime, scale_fill_discrete, scale_fill_distiller, scale_fill_gradient, scale_fill_gradient2, scale_fill_gradientn, scale_fill_grey, scale_fill_hue, scale_fill_identity, scale_fill_manual, scale_fill_ordinal, scale_fill_steps, scale_fill_steps2, scale_fill_stepsn, scale_fill_viridis_b, scale_fill_viridis_c, scale_fill_viridis_d

Jittering - plot the data on top of the boxplot

  • What is a boxplot
  • Quick look
    # Only 1 variable
    ggplot(data.frame(Wi), aes(y = Wi)) + 
      geom_boxplot()
    
    # Two variable, one of them is a factor
    ggplot() + geom_jitter(mapping = aes(x, y))
    
    # Box plot
    ggplot() + geom_boxplot(mapping = aes(x, y))
  • geom_jitter()
  • geom_jitter can affect both X and Y values.
    tibble(x=1:4, y=1:4) %>% ggplot(aes(x, y)) + geom_jitter()
    
  • https://stackoverflow.com/a/17560113
  • How to make scatterplot with geom_jitter plot reproducible?
    set.seed(1); data %>%
      ggplot() +
      geom_jitter(aes(T.categ, sex, colour = status))
    
  • Boxplot with jittered data points in ggplot2
  • # df2 is n x 2 
    ggplot(df2, aes(x=nboot, y=boot)) +
      geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
      geom_jitter(aes(color=nboot), position=position_jitter(width=.2, height=0, seed=1)) +
      labs(title="", y = "", x = "nboot")

    If we omit the outlier.shape=NA option in geom_boxplot(), we will get the following plot where some outliers will appear twice. (Another option is outlier.color = NA; see extra point at boxplot with jittered points (ggplot2)).

    Jitterboxplot.png

  • Base plot approach Batch effects and confounders
  • Another base plot approach. boxplot() + stripchart(). See Stripchart in R, How to Create a Strip Chart in R. Consider adding outline = FALSE to boxplot() to avoid drawing outliers in boxplot() when stripchart() has been added.
    ylim <- range(df$estimate, na.rm = TRUE)
    boxplot(estimate~type, data=df, xlab=NULL, ylab=NULL, ylim=ylim, outline=F)
    set.seed(1)
    stripchart(estimate~type, data=df, method = "jitter",
    		pch=19, col=c("salmon", "orange", "yellowgreen", "green"),
    		vertical=TRUE, add=TRUE)
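
A minimal self-contained sketch of a reproducible boxplot with jittered points in ggplot2. The simulated df2 is an assumption for illustration; it only mimics the nboot/boot columns used above:

```r
library(ggplot2)

set.seed(123)
df2 <- data.frame(nboot = factor(rep(c(100, 500, 1000), each = 30)),
                  boot  = rnorm(90))

p <- ggplot(df2, aes(x = nboot, y = boot)) +
  geom_boxplot(outlier.shape = NA) +          # avoid plotting outliers twice
  geom_jitter(aes(color = nboot),
              position = position_jitter(width = 0.2, height = 0, seed = 1))

# Layer 2 holds the jittered points; building the plot twice
# gives the same x positions because the seed is fixed
j1 <- layer_data(p, 2)
j2 <- layer_data(p, 2)
```

With height = 0 the jitter is horizontal only, so the y values still show the actual data.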

Groups of boxplots

  • How to Make Grouped Boxplot with Jittered Data Points in ggplot2. Use the color parameter in ggplot(aes()).
  • Boxplot With Jittered Points in R
  • How To Make Grouped Boxplots with ggplot2?, A review of Longitudinal Data Analysis in R. Use the fill parameter such as
    mydata %>%
      ggplot(aes(x=Factor1, y=Response, fill=factor(Factor2))) +   
      geom_boxplot() 
    
  • Another method is to use ggpubr::ggboxplot(). Papers TumorPurity.
    ggboxplot(df, "dose", "len",
               fill = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"), add.params=list(size=0.1),
               notch=T, add = "jitter", outlier.shape = NA, shape=16,
               size = 1/.pt, x.text.angle = 30, 
               ylab = "Silhouette Values", legend="right",
               ggtheme = theme_pubr(base_size = 8)) +
         theme(plot.title = element_text(size=8,hjust = 0.5), 
               text = element_text(size=8), 
               title = element_text(size=8),
               rect = element_rect(size = 0.75/.pt),
               line = element_line(size = 0.75/.pt),
               axis.text.x = element_text(size = 7),
               axis.line = element_line(colour = 'black', size = 0.75/.pt),
               legend.title = element_blank(),
               legend.position = c(0,1), 
               legend.justification = c(0,1),
               legend.key.size = unit(4,"mm"))
    

p-values on top of boxplots

Violin plot and sina plot

geom_density: Kernel density plot

A panel of density plots

  • Common xlim for all subplots
    ggplot(data = mpg, aes(x = hwy)) +
         geom_density() +
         facet_wrap(~ class)
    
  • Each subplot has its own xlim
    ggplot(data = mpg, aes(x = hwy)) +
         geom_density() +
         facet_wrap(~ class, scales = "free_x")
    

Bivariate analysis with ggpair

Correlation in R: Pearson & Spearman with Matrix Example

GGally::ggpairs

barplot/bar plot

Ordered barplot and facet

  • ?reorder. This, as relevel(), is a special case of simply calling factor(x, levels = levels(x)[....]).
    R> bymedian <- with(InsectSprays, reorder(spray, count, median))
    # bymedian will replace spray (a factor) 
    # The data is not changed except the order of levels (a factor) 
    # In this case, the order is determined by the median of count from each spray level
    #   from small to large.
    
    R> InsectSprays[1:3, ]
      count spray
    1    10     A
    2     7     A
    3    20     A
    R> bymedian
     [1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
    [44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
    attr(,"scores")
       A    B    C    D    E    F 
    14.0 16.5  1.5  5.0  3.0 15.0 
    Levels: C E D A F B
    R> InsectSprays$spray
     [1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D D D D D D
    [44] D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
    Levels: A B C D E F
    R> boxplot(count ~ bymedian, data = InsectSprays,
             xlab = "Type of spray", ylab = "Insect count",
             main = "InsectSprays data", varwidth = TRUE,
             col = "lightgray")

    Scatterplot

    tibble(y=sample(6), x=letters[1:6]) %>% 
      ggplot(aes(reorder(x, -y), y)) + geom_point(size=4)
    
  • Sorting the x-axis in bargraphs using ggplot2 or this one from Deeply Trivial. reorder(fac, value) was used.
    ggplot(df, aes(x=reorder(x, -y), y=y)) + geom_bar(stat = 'identity')
    
    df$order <- 1:nrow(df)
    # Assume df$y is a continuous variable and df$fac is a character/factor variable,
    #   and we want to show the categories in the order they appear in the data
    #   (not in R's default sorted order, which is used even when the variable
    #   is of type "character" rather than "factor").
    # We want to plot df$fac on the y-axis and df$y on the x-axis. Fortunately,
    #   ggplot2 draws the bars vertically or horizontally depending on the two variables' types.
    # "-order" is used so that the first name appears at the top.
    ggplot(df, aes(x=y, y=reorder(fac, -order))) + geom_col()
    
    ggplot(df, aes(x=reorder(x, desc(y)), y=y)) + geom_col()  # desc() is from dplyr
  • Predict #TidyTuesday giant pumpkin weights with workflowsets. fct_reorder()
  • Reordering and facetting for ggplot2. tidytext::reorder_within() was used.
  • Chapter2 of data.table cookbook. reorder(fac, value) was used.
  • PCA and UMAP with tidymodels
  • A simple example
    dat <- structure(list(gene = c("CAPN9", "CSF3R", "HPN", "KCNA5", "MTMR7", 
    "NRG3", "SMTNL2", "TMPRSS6"), coef = c(-1.238, -0.892, -0.224, 
    -0.057, 0.133, 0.377, 0.436, 0.804)), row.names = c("4976", "6467", 
    "12355", "13373", "18143", "19010", "23805", "25602"), class = "data.frame")
    
    # Base R plot
    par(mar=c(4,6,4,1))
    barplot(dat$coef, names.arg = dat$gene, horiz = TRUE, las=1,
            main='base R', xlab = "Coefficients")
    
    # ggplot2 (the %>% pipe comes from dplyr/magrittr)
    library(dplyr); library(ggplot2)
    dat %>% ggplot(aes(y=gene, x=coef)) + geom_col(fill = 'gray') + 
        theme(axis.ticks.y = element_blank()) + 
        theme(panel.background = element_blank(), 
              axis.line.x = element_line(colour = 'black')) +
        labs(x="Coefficients", y = '', title = "ggplot2")
    

    Barplot base.png, Barplot ggplot2.png

Proportion barplot

Back to back barplot

Pyramid Chart

ggcharts::pyramid_chart()

Flip x and y axes

coord_flip()
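
A minimal sketch of coord_flip(), using a made-up data frame (df_flip and its columns are assumptions for illustration, not from the original notes). The bars are defined vertically and then the two axes are swapped:

```r
library(ggplot2)

# Hypothetical toy data
df_flip <- data.frame(model = c("a", "b", "c"), value = c(3, 1, 2))

# Define vertical bars, then swap the axes so the categories run along the y-axis
p_flip <- ggplot(df_flip, aes(x = model, y = value)) +
  geom_col() +
  coord_flip()
p_flip
```

The same picture can usually be obtained without coord_flip() by mapping the categorical variable to y directly, as in the geom_col() example with aes(y = gene, x = coef) above.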

Rotate x-axis labels

ggplot(mydf) + geom_col(aes(x = model, y=value, fill = method), position="dodge")+
  theme(axis.text.x = element_text(angle = 45, hjust=1, size= 8))

Starts at zero

Starting bars and histograms at zero in ggplot2

scale_y_continuous(expand = c(0,0), limits = c(0, YourLimit))

Add patterns

Barplot with colors for a 2nd variable

How to basic: bar plots

By default, the barplots are stacked on top of each other. Use geom_col(position = "dodge") if we want the barplots to be side-by-side.

df <- data.frame(group = c("A", "A", "B", "B", "C", "C"), 
      count = c(3, 4, 5, 6, 7, 8), 
      fill = c("red", "blue", "red", "blue", "red", "blue"))
ggplot(df, aes(x = group, y = count, fill = fill)) + 
      geom_col(position = "dodge")

Ggplotbarplot.png

Base R approach.

Barplot with color gradient

Geomcolviridis.png

Barplot with only horizontal gridlines

Geom bar3.png Geom bar4.png

Barplot with text at the end

Geom bar1.png Geom bar2.png

Polygon and map plot

Polygon.png

geom_step: Step function

Connect observations: geom_path(), geom_step()

Example: KM curves (without legend)

library(survival)
sf <- survfit(Surv(time, status) ~ x, data = aml)
sf
str(sf) # the first 10 time points form one stratum and the remaining 10 form the other
ggplot() + 
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10])), 
            col='red') + 
  scale_x_continuous('Time', limits = c(0, 161)) + 
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20])), 
            col='black') 
# cf:  plot(sf, col = c('red', 'black'), mark.time=FALSE)

Same example but with legend (see Construct a manual legend for a complicated plot)

cols <- c("NEW"="#f04546","STD"="#3591d1")
ggplot() + 
  geom_step(aes(x=c(0, sf$time[1:10]), y=c(1, sf$surv[1:10]), col='NEW')) +
  scale_x_continuous('Time', limits = c(0, 161)) + 
  scale_y_continuous('Survival probability', limits = c(0, 1)) +
  geom_step(aes(x=c(0, sf$time[11:20]), y=c(1, sf$surv[11:20]), col='STD')) + 
  scale_colour_manual(name="Treatment", values = cols)

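To control the line width, use the linewidth parameter (called size before ggplot2 3.4.0); e.g. geom_step(aes(x, y), linewidth=.5). The default is .5 in ggplot2 3.x; a geom's defaults can be read from its default_aes field, e.g. GeomStep$default_aes.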

To allow different line types, use the linetype parameter. The first level is solid line, the 2nd level is dashed, ... We can change the default line types by using the scale_linetype_manual() function. See Line Types in R: The Ultimate Guide for R Base Plot and GGPLOT.
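
As a hedged illustration of the linetype mapping and scale_linetype_manual(), the sketch below uses a small made-up data frame (df_lt and its values are assumptions, not the aml data used above):

```r
library(ggplot2)

# Hypothetical data: two groups observed at the same time points
df_lt <- data.frame(
  time  = rep(c(0, 5, 10, 20), times = 2),
  surv  = c(1, 0.8, 0.6, 0.3, 1, 0.7, 0.4, 0.1),
  group = rep(c("NEW", "STD"), each = 4)
)

p_lt <- ggplot(df_lt, aes(x = time, y = surv, linetype = group)) +
  geom_step() +
  # Replace the default solid/dashed assignment with named line types
  scale_linetype_manual(name = "Treatment",
                        values = c(NEW = "dotted", STD = "longdash"))
p_lt
```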

Coefficients, intervals, errorbars

Comparing similarities / differences between groups

comparing similarities / differences between groups

Special plots

Dot plot & forest plot

Lollipop plot

geom_segment() + geom_point()
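
A minimal sketch of the geom_segment() + geom_point() recipe. The data frame df_lol is made up for illustration; each stick runs from 0 to the value and a point is drawn at its end:

```r
library(ggplot2)

# Hypothetical coefficients
df_lol <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                     coef = c(-1.2, -0.2, 0.4, 0.8))
# Order the categories by value so the lollipops are sorted
df_lol$gene <- reorder(df_lol$gene, df_lol$coef)

p_lol <- ggplot(df_lol, aes(x = coef, y = gene)) +
  geom_segment(aes(x = 0, xend = coef, yend = gene)) +  # the stick
  geom_point(size = 3) +                                # the head
  labs(x = "Coefficient", y = NULL)
p_lol
```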

ggpubr:: ggdotchart()

Correlation Analysis Different

Bump plot: plot ranking over time

https://github.com/davidsjoberg/ggbump

Gauge plots

Sankey diagrams

Horizon chart

Circos plots

Aesthetics

  • We can make up a new aesthetic name in the aes(aesthetic = variable) function; for example, "text2" below. In this case the name "text2" is not shown in the tooltip; only the value of the original variable is used.
    library(plotly)
    g <- ggplot(tail(iris), aes(Petal.Length, Sepal.Length, text2=Species)) + geom_point()
    ggplotly(g, tooltip = c("Petal.Length", "text2"))
    

Aesthetics finder

https://ggplot2tor.com/aesthetics/, video

aes_string()

group

https://ggplot2.tidyverse.org/reference/aes_group_order.html

GUI/Helper packages

ggedit & ggplotgui – interactive ggplot aesthetic and theme editor

esquisse (French, means 'sketch'): creating ggplot2 interactively

https://cran.rstudio.com/web/packages/esquisse/index.html

A 'shiny' gadget to create 'ggplot2' charts interactively with drag-and-drop to map your variables. You can quickly visualize your data according to their type, export to 'PNG' or 'PowerPoint', and retrieve the code to reproduce the chart.

The interface introduces basic terms used in ggplot2:

  • x, y,
  • fill (useful for geom_bar, geom_rect, geom_boxplot, & geom_raster, not useful for scatterplot),
  • color (edges for geom_bar, geom_line, geom_point),
  • size,
  • facet, split up your data by one or more variables and plot the subsets of data together.

It does not include all features in ggplot2. At the bottom of the interface,

  • Labels & title & caption.
  • Plot options. Palette, theme, legend position.
  • Data. Remove subset of data.
  • Export & code. Copy/save the R code. Export file as PNG or PowerPoint.

ggcharts

https://cran.r-project.org/web/packages/ggcharts/index.html

ggeasy

ggx

https://github.com/brandmaier/ggx Create ggplot in natural language

Interactive

plotly

R web → plotly

ggiraph

ggiraph: Make 'ggplot2' Graphics Interactive

ggconf: Simpler Appearance Modification of 'ggplot2'

https://github.com/caprice-j/ggconf

Plotting individual observations and group means

https://drsimonj.svbtle.com/plotting-individual-observations-and-group-means-with-ggplot2

subplot

Adding/Inserting an image to ggplot2

Inserting an image to ggplot2: See annotation_custom.

See also ggbernie, which takes a different approach: it uses ggplot2::layer() and a self-defined geom (geometric object).

Easy way to mix/combine multiple graphs on the same page

annotation_custom

  • predcurvePlot.R from TreatmentSelection. One issue is the font size is large for the text & labels at the bottom. The 2nd issue is the bottom part of the graph/annotation (marker value scale) can be truncated if the window size is too large. If the window is too small, the bottom part can overlap with the top part.
    p <- p + theme(plot.margin = unit(c(1,1,4,1), "lines"))  # hard coding
    p <- p + annotation_custom() # axis for marker value scale
    p <- p + annotation_custom() # label only
    
    • A similar plot using base R graphics instead of ggplot2. One issue is that the text is not below the scale (this can be fixed by par(mar) & mtext(text, side=1, line=4)); the 2nd issue is the same as in the ggplot2 approach.
      axis(1, at = breaks, labels = round(quantile(x1, prob = breaks/100), 1), pos = -0.26) # hard coding
      
    • Another common problem is that a plot saved by pdf() or png() can be truncated too. I have had better luck with png() though.

grid

gridExtra

Force a regular plot object into a Grob for use in grid.arrange

gridGraphics package

make one panel blank/create a placeholder

# Method 1: Blank
ggplot() + theme_void()
# Method 2: Display N/A
ggplot() +
    theme_void() +
    geom_text(aes(0,0,label='N/A'))

Overall title

multiple ggplots overall title

Remove vertical/horizontal grids but keep ticks

removeGrid()

patchwork

Common legend

Add a common Legend for combined ggplots

library(ggplot2)
library(patchwork)

p1 <- ggplot(df1, aes(x = x, y = y, colour = group)) + 
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)
p2 <- ggplot(df2, aes(x = x, y = y, colour = group)) + 
  geom_point(position = position_jitter(w = 0.04, h = 0.02), size = 1.8)

# Method 1:
p1 + p2 + plot_layout(guides = "collect") + theme(legend.position = "bottom") 
                                          # one legend on the bottom
# Method 2:
p1 + p2 + plot_layout(guides = "collect") # one legend on the RHS
# Method 3:
p1 + theme(legend.position="none") + p2  # legend (based on p2) is on the RHS
# Method 4:
p1 + p2 + theme(legend.position="none")  # legend (based on p1) is in the middle!!

Overall title

Common Main Title for Multiple Plots in Base R & ggplot2 (2 Examples)
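
With patchwork, one way to add an overall title is plot_annotation(). A minimal sketch using the built-in mtcars data (the two panels are chosen only for illustration):

```r
library(ggplot2)
library(patchwork)

p_left  <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p_right <- ggplot(mtcars, aes(mpg, hp)) + geom_point()

# plot_annotation() adds a title that spans the whole combined figure
combined <- p_left + p_right +
  plot_annotation(title = "Overall title for both panels")
combined
```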

egg

Common x or y labels

Base R plot vs ggplot2

  • My summary
    base R | ggplot2
    plot(x, y, col) | geom_point(aes(x, y, color, shape))
    xlim | scale_x_continuous(limits)
    log="x" | scale_x_continuous(trans="log10")
    xlab; mtext("Var", cex, line, adj, las, side) | scale_x_discrete(name="sample size"); labs(x); xlab()
    main | labs(x, y, title, colour); ggtitle()
    axis(2, labels) | scale_y_continuous(labels, breaks); scale_x_discrete(labels)
    ? | scale_color_discrete('new color title')
    ? | scale_shape_discrete('new shape title')
    col | scale_color_manual(name, values = NamedVector)
    pch, cex | geom_point(pch, size)
    plot(mpg, disp, col=factor(cyl)); legend("topleft", legend = sort(unique(cyl)), col=1:3, pch=1) | ggplot(mtcars, aes(mpg, disp, color = factor(cyl))) + geom_point() + labs(color = "Number of Cylinders")  # discrete case
    text() | geom_text()
    ? | theme(title = element_text(size=8), legend.title = element_blank(), legend.position = "none", legend.key = element_blank(), plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(size = 8))
    las in plot(), barplot(); text(x, y, labs, srt=45) | theme(axis.text.x = element_text(angle = 90))
    matplot() | geom_line() + geom_point()
    plot(type = 'l'), points() | geom_line() + geom_point()
    barplot() | geom_bar()
    par(mfrow) | facet_grid()

labs for x and y axes

x and y labels

https://stackoverflow.com/questions/10438752/adding-x-and-y-axis-labels-in-ggplot2 or the Labels part of the cheatsheet

You can set the labels with xlab() and ylab(), or make it part of the scale_*.* call.

labs(x = "sample size", y = "ngenes (glmnet)")

scale_x_discrete(name="sample size")
scale_y_continuous(name="ngenes (glmnet)", limits=c(100, 500))

Change tick mark labels

ggplot2 axis ticks : A guide to customize tick marks and labels
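
A minimal sketch of changing tick positions and labels through the breaks and labels arguments of a scale. The label strings are made up for illustration (wt in mtcars is recorded in units of 1000 lbs):

```r
library(ggplot2)

p_tick <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # breaks sets where the ticks go; labels sets what is printed at each tick
  scale_x_continuous(breaks = c(2, 3, 4, 5),
                     labels = c("2k", "3k", "4k", "5k"))
p_tick
```

breaks and labels must have the same length; for a discrete axis the analogous call is scale_x_discrete(labels = ...).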

name-value pairs

See several examples (color, fill, size, ...) from opioid prescribing habits in texas.

Prevent sorting of x labels

See Change the order of a discrete x scale.

The idea is to set the levels of x variable.

junk   # n x 2 table
colnames(junk) <- c("gset", "boot")
junk$gset <- factor(junk$gset, levels = as.character(junk$gset))
ggplot(data = junk, aes(x = gset, y = boot, group = 1)) + 
  geom_line() + 
  theme(axis.text.x=element_text(color = "black", angle=30, vjust=.8, hjust=0.8))

Legends

Legend title

  • labs() function
    p <- ggplot(df, aes(x, y)) + geom_point(aes(colour = z))
    p + labs(x = "X axis", y = "Y axis", colour = "Colour\nlegend")
           # The colour argument sets the legend title
    
    p <- ggplot(df) + geom_col(aes(x=x, y=y, fill=cat), position = "dodge") 
    p + labs(x = "X", y = "Y", fill = "Category")
           # The fill argument sets the legend title
    
  • scale_colour_manual()
    scale_colour_manual("Treatment", values = c("black", "red"))
    
  • scale_color_discrete() and scale_shape_discrete(). See Combine colors and shapes in legend.
    df <- data.frame(x = 1:3, y = 1:3, z = c("a", "b", "c"))
    ggplot(df, aes(x, y)) + geom_point(aes(shape = z, colour = z), size=5) + 
      scale_color_discrete('new title') + scale_shape_discrete('new title')
    

Layout: move the legend from right to top/bottom of the plot or inside the plot or hide it

gg + theme(legend.position = "top")

# Useful in the boxplot case
gg + theme(legend.position="none")

gg + theme(legend.position = c(0.87, 0.25))

# Customize the edge color and background color
gapminder %>%
  ggplot(aes(gdpPercap,lifeExp, color=continent)) +
  geom_point() +
  scale_x_log10()+
  theme(legend.position = c(0.87, 0.25),
        legend.background = element_rect(fill = "white", color = "black"))

Guide functions for finer control (legend, axis, color scales)

  • https://ggplot2-book.org/scales.html#guide-functions The guide functions, guide_colourbar() and guide_legend(), offer additional control over the fine details of the legend.
  • guide_legend() allows the modification of legends for scales, including fill, color, and shape. This function can be used in scale_fill_manual(), scale_fill_continuous(), ... functions.
    scale_fill_manual(values=c("orange", "blue"), 
                      guide=guide_legend(title = "My Legend Title",
                                         nrow=1,  # multiple items in one row
                                         label.position = "top", # move the texts on top of the color key
                                         keywidth=2.5)) # increase the color key width
    

    The problem with the default setting is it leaves a lot of white space above and below the legend. To change the position of the entire legend to the bottom of the plot, we use theme().

    theme(legend.position = 'bottom')
    
  • guides()
    • Legend. For example, to remove the legend title:
    ggplot(mtcars, aes(x = mpg, y = disp, color = factor(cyl))) +
      geom_point() +
      guides(color = guide_legend(title = NULL))
    
    • Axis. For example, to rotate the x-axis labels by 45 degrees:
    ggplot(mtcars, aes(x = mpg, y = disp)) +
      geom_point() +
      guides(x = guide_axis(angle = 45))
    
    • Color scales. For example, to change the number of bins used to draw the colour bar (nbin):
    ggplot(mtcars, aes(x = mpg, y = disp, color = hp)) +
      geom_point() +
      guides(color = guide_colorbar(nbin = 10))
    

Legend symbol background

ggplot() + geom_point(aes(x, y, color, size)) +
           theme(legend.key = element_blank())
           # remove the symbol background in legend

Construct a manual legend for a complicated plot

https://stackoverflow.com/a/17149021

Legend size

How to Change Legend Size in ggplot2 (With Examples)

data <- data.frame(x = 1:5, y = 1:5, label = c("A", "B", "C", "D", "E"))
ggplot(data, aes(x, y, color = as.factor(label))) +
  geom_point() +
  labs(title = "Legend Size Example with Theme Modification",
       color = "Label") +
  theme(
    legend.text = element_text(size = 12), 
    legend.title = element_text(size = 14)
    )

ggtitle()

Centered title

See the Legends part of the cheatsheet.

ggtitle("MY TITLE") +
  theme(plot.title = element_text(hjust = 0.5))

Subtitle

ggtitle("My title",
        subtitle = "My subtitle")

margins

https://stackoverflow.com/a/10840417

Aspect ratio

?coord_fixed

p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + coord_fixed() # one x unit equals one y unit, so the panel becomes wide and short
p  # fills up the plot region

Time series plot

Multiple lines plot https://stackoverflow.com/questions/14860078/plot-multiple-lines-data-series-each-with-unique-color-in-r

set.seed(45)
nc <- 9
df <- data.frame(x=rep(1:5, nc), val=sample(1:100, 5*nc), 
                   variable=rep(paste0("category", 1:nc), each=5))
# plot
# http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=9
ggplot(data = df, aes(x=x, y=val)) + 
    geom_line(aes(colour=variable)) + 
    scale_colour_manual(values=c("#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#cab2d6"))

Versus the old-fashioned way (base R)

dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend

calendR

Calendar plot in R using ggplot2

Github style calendar plot

geom_point()

See Scatterplot.

df <- data.frame(x=1:3, y=1:3, color=c("red", "green", "blue"))
# Use I() to take the values from the data as-is (here, as literal colour names)
ggplot(df, aes(x,y, color=I(color))) + geom_point(size=10) # no color legend
# VS
ggplot(df, aes(x,y, color=color)) + geom_point(size=10) # color is like a class label

geom_bar(), geom_col(), stat_count()

https://ggplot2.tidyverse.org/reference/geom_bar.html

  • geom_bar: Counts the number of cases at each x position and makes the height of the bar proportional to the count (or sum of weights if supplied)
  • geom_col: Leaves the data as is and makes the height of the bar proportional to the value in the data
  • geom_bar(): default statistic is stat_count()
    df2 <- data.frame(cat = c("A", "A", "A", "B", "B", 
       "B", "B", "B", "C", "C", "C", "C", "C", "C"))
    ggplot(df2, aes(x = cat)) + geom_bar()
    # Same as
    # barplot(table(df2$cat))
  • geom_col(): default statistic is stat_identity()
    df <- data.frame(group = c("A", "B", "C"), 
                     count = c(3, 5, 6))
    ggplot(df, aes(x = group, y = count)) + geom_col()
    # Same as
    # barplot(df$count, names.arg = df$group)
    geom_col(position = 'dodge')  # same as 
    geom_bar(stat = 'identity', position = 'dodge')

With its default stat_count(), geom_bar() does not take a y aesthetic. To supply the bar heights yourself, use geom_col() (or geom_bar(stat = 'identity')).

ggplot() + geom_col(mapping = aes(x, y))

Add colors to the plot

df <- data.frame(group = c("A", "B", "C"), 
                 count = c(3, 5, 6), 
                 fill = c("red", "green", "blue"))
ggplot(df, aes(x = group, y = count, fill = fill)) + 
  geom_col()
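
Note that in the example above the fill column is treated as a category label, so the bars are drawn with the default palette rather than in red, green and blue. If the column holds literal colour names that should be used as-is, one option is scale_fill_identity(). A minimal sketch (df_id is a renamed copy of the data above):

```r
library(ggplot2)

df_id <- data.frame(group = c("A", "B", "C"),
                    count = c(3, 5, 6),
                    fill  = c("red", "green", "blue"))

p_id <- ggplot(df_id, aes(x = group, y = count, fill = fill)) +
  geom_col() +
  scale_fill_identity()   # use the values in 'fill' as the actual colours
p_id
```

By default scale_fill_identity() draws no legend.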

Add numbers to the plot

An example

Ordered barplot and reorder()

Ordered barplot and facet

stat_function()

stat_summary()

https://ggplot2.tidyverse.org/reference/stat_summary.html
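
A minimal sketch of stat_summary(), assuming ggplot2 3.3.0 or later (where the fun argument is available). It overlays the per-group mean on the raw points of the built-in mtcars data:

```r
library(ggplot2)

p_ss <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(colour = "grey60") +
  # Summarise y within each x group and draw the result as a large red point
  stat_summary(fun = mean, geom = "point", colour = "red", size = 4)
p_ss
```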

stat_smooth(), geom_smooth()

?geom_smooth, ?stat_smooth

ggplot(data = mtcars, aes(x = wt, y = mpg)) + 
  geom_point() +
  stat_smooth(method = "glm", formula = "y ~ x", 
              method.args = list(family = poisson(link = "log")), 
              se = FALSE, color = "red") +
  labs(x = "Weight", y = "Miles per gallon")

To control the smoothness of a loess fit, use the "span" parameter. To disable the confidence band, use "se = FALSE".

geom_smooth(method = 'loess', se = FALSE, span = 0.3)

geom_area()

The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think

Square shaped plot

ggplot() + theme(aspect.ratio=1) # do not adjust xlim, ylim

xylim <- range(c(x, y))
ggplot() + coord_fixed(xlim=xylim, ylim=xylim) 

geom_line()

See also aes(..., group, ...).

Connect Paired Points with Lines in Scatterplot

Use geom_line() to create a square bracket to annotate the plot

Barchart with Significance Tests

Interaction plot

Randomized block design

geom_segment()

Line segments, arrows and curves. See an example in geom_errorbar section below.

Cf annotate("segment", ...)

geom_errorbar(): error bars

set.seed(301)
x <- rnorm(10)
SE <- rnorm(10)
y <- 1:10

par(mfrow=c(2,1))
par(mar=c(0,4,4,4))
xlim <- c(-4, 4)
plot(x[1:5], 1:5, xlim=xlim, ylim=c(0+.1,6-.1), yaxs="i", xaxt = "n", ylab = "", pch = 16, las=1)
mtext("group 1", 4, las = 1, adj = 0, line = 1) # las=text rotation, adj=alignment, line=spacing
par(mar=c(5,4,0,4))
plot(x[6:10], 6:10, xlim=xlim, ylim=c(5+.1,11-.1), yaxs="i", ylab ="", pch = 16, las=1, xlab="")
arrows(x[6:10]-SE[6:10], 6:10, x[6:10]+SE[6:10], 6:10, code=3, angle=90, length=0)
mtext("group 2", 4, las = 1, adj = 0, line = 1)

Stklnpt.svg
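
A rough ggplot2 counterpart of the base R code above, as a sketch only. It draws vertical rather than horizontal bars, and it takes abs() of the simulated standard errors so they are non-negative; both are assumptions made for this illustration, and the names df_eb, est and se are hypothetical:

```r
library(ggplot2)

set.seed(301)
df_eb <- data.frame(
  id    = factor(1:10),
  est   = rnorm(10),
  se    = abs(rnorm(10)),   # assumption: force non-negative standard errors
  group = rep(c("group 1", "group 2"), each = 5)
)

p_eb <- ggplot(df_eb, aes(x = id, y = est)) +
  geom_errorbar(aes(ymin = est - se, ymax = est + se), width = 0.2) +
  geom_point() +
  facet_wrap(~ group, scales = "free_x")   # one panel per group
p_eb
```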

  • Forest plot example using geom_errorbarh()

Geomerrorbarh.png

geom_rect(), geom_bar()

Note that we can use scale_fill_manual() to change the 'fill' colors (scheme/palette). The 'fill' parameter in geom_rect() is only used to define the discrete variable.

ggplot(data=) +
  geom_bar(aes(x=, fill=)) +
  scale_fill_manual(values = c("orange", "blue"))

geom_raster() and geom_tile()

Waterfall plot

geom_linerange

Circle

Circle in ggplot2

ggplot(data.frame(x = 0, y = 0), aes(x, y)) + geom_point(size = 25, pch = 1)

Annotation

Add a horizontal/vertical line

geom_hline(), geom_vline()

geom_hline(yintercept=1000)
geom_vline(xintercept=99)

text annotations, annotate() and geom_text(): ggrepel package

Text wrap

ggplot2 is there an easy way to wrap annotation text?

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Solution 1: Does not work with Chinese characters
wrapper <- function(x, ...) paste(strwrap(x, ...), collapse = "\n")
# The label
my_label <- "Some arbitrarily larger text"
# and finally your plot with the label
p + annotate("text", x = 4, y = 25, label = wrapper(my_label, width = 5))

# Solution 2: Does not work with Chinese characters
library(RGraphics)
library(ggplot2)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
grob1 <-  splitTextGrob("Some arbitrarily larger text")
p + annotation_custom(grob = grob1,  xmin = 3, xmax = 4, ymin = 25, ymax = 25) 

# Solution 3: stringr::str_wrap()
my_label <- "太極者無極而生。陰陽之母也。動之則分。靜之則合。無過不及。隨曲就伸。人剛我柔謂之走。我順人背謂之黏。"
p <- ggplot() + geom_point() + xlim(0, 400) + ylim(0, 300) # 400x300 e-paper
p + annotate("text", x = 0, y = 200, hjust=0, size=5,
             label = stringr::str_wrap(my_label, width =30)) +
    theme_bw () + 
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(), 
          panel.border = element_blank(),
          axis.title = element_blank(), 
          axis.text = element_blank(),
          axis.ticks = element_blank()) 

ggtext

ggtext: Improved text rendering support for ggplot2

ggforce - Annotate areas with ellipses

geom_mark_ellipse()

Other geoms

Exploring other {ggplot2} geoms

geomtextpath

geomtextpath- Create curved text in ggplot2

Build your own geom

Fonts, icons

Lines of best fit

Lines of best fit

Save the plots -- ggsave()

ggsave(). Note that the svglite package is required for svg output; see R Graphics Cookbook. The svglite package provides more standards-compliant output.

By default, width and height are in inches regardless of the output format.

(3/24/2022) If I save the plot in the svg format using the RStudio GUI (Export -> Save as Image...) or by the svg() function, the svg plot can't be converted to a png file by ImageMagick. But if I save the plot by using the ggsave() command, the svg plot can be converted to a png file.

$ convert -resize 100% Rerrorbar.svg tmp.png
convert-im6.q16: non-conforming drawing primitive definition `path' @ error/draw.c/RenderMVGContent/4300.
$ convert -resize 100% Rerrorbar2.svg tmp.png # Works

(1/31/2022) For some reason, the text in legend in svg files generated by ggsave() looks fine in browsers but when I insert it into ppt, the word "Sensitive" becomes "Sensitiv e". However, the svg files generated by svg() command looks fine in browsers AND in ppt.

ggsave() saves the plot at the size of the current graphics device if we don't specify width/height. That's why ggsave() reports the image size (in inches) after it runs. To get a fixed width/height, we need to specify them explicitly.

My experience is ggsave() is better than png() because ggsave() makes the text larger when we save a file with a higher resolution.

...
ggsave("filename.png", object, width=8, height=4)
# vs
png("filename.png", width=1200, height=600)
...
dev.off()

We can specify dpi to increase the resolution if we use the png format (svg is not affected); see Chapter 14.5 Outputting to Bitmap (PNG/TIFF) Files from R Graphics Cookbook.

g1 <- ggplot(data = mydf) 
g1
ggsave("myfile.png", g1, height = 7, width = 8, units = "in", dpi = 300)

I got an error - Error in loadNamespace(name) : there is no package called ‘svglite’. After I install the package, everything works fine.

ggsave("raw-output.bmp", p, width=4, height=3, dpi = 100)
# Will generate 4*100 x 3*100 pixel plot

Note:

  • For saving to "png" file, increasing dpi (from 72 to 300) will increase font & point size. dpi/ppi is not an inherent property of an image.
  • If we don't specify any parameters and don't resize the graphics device, the "png" file created by ggsave() will contain many more pixels than the "svg" file (e.g. 1200 vs 360).
  • How does ggsave() decide the width/height when an svg file is created in an Rmd file? A: 7x7 in my experiment. So the font/point size will look smaller compared to a 4x4 inch output.
  • When I created an svg file in Linux with 4x4 inch (width x height), the file is 360 x 360 pixels when I right click the file to get the properties of the file. But macOS cannot return this number nor am I able to find this number from the svg file??

Multiple pages in pdf

https://stackoverflow.com/a/53698682. The key is to save the plot in an object and use the print() function.

pdf("FileName", onefile = TRUE)
for(i in 1:I) {
  p <- ggplot()
  print(p)
}
dev.off()

graphics::smoothScatter: scatter plots with lots of points

Other tips/FAQs

Tips and tricks for working with images and figures in R Markdown documents

Ten Simple Rules for Better Figures

Ten Simple Rules for Better Figures

Recreating the Storytelling with Data look with ggplot

Recreating the Storytelling with Data look with ggplot

ggplot2 does not appear to work when inside a function

https://stackoverflow.com/a/17126172. Use print() or ggsave(). When you use these functions interactively at the command line, the result is automatically printed, but in source() or inside your own functions you will need an explicit print() statement.
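
A minimal sketch of the fix (the function name plot_in_function is hypothetical). A ggplot object is drawn automatically only when it is the visible result of a top-level expression; inside a function body or a loop it has to be printed explicitly:

```r
library(ggplot2)

plot_in_function <- function(data) {
  p <- ggplot(data, aes(mpg, wt)) + geom_point()
  print(p)       # explicit print(): nothing would be drawn here otherwise
  invisible(p)   # return the plot without auto-printing it a second time
}

p_fun <- plot_in_function(mtcars)
```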

BBC

Add your brand to ggplot graph

You Need to Start Branding Your Graphs. Here's How, with ggplot!

Animation and gganimate

ggstatsplot

ggstatsplot: ggplot2 Based Plots with Statistical Details

Write your own ggplot2 function: rlang

Some packages depend on ggplot2

dittoSeq from Bioconductor

Meme

Python

plotnine: A Grammar of Graphics for Python.

plotnine is an implementation of a grammar of graphics in Python, based on ggplot2. The grammar allows users to compose plots by explicitly mapping data to the visual objects that make up the plot.

The Hitchhiker’s Guide to Plotnine