November 30, 2022

What is ggplot2 and why use it?

An R package for data visualisation that

  • is elegant and versatile
  • implements the grammar of graphics
  • has a layered structure that lets you build a plot block by block
  • works differently from base graphics (base R)
    • the learning curve can be steep
      • but the result looks amazing when it works!

Building blocks of a graph

  • Understanding how your graph breaks down into parts is important

  • This is the core of ggplot2!

  • It makes plotting the perfect graph easier

  • A plot is built by adding layers one at a time
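
The layering idea can be sketched as follows. This is a minimal illustration that assumes ggplot2 is installed; the object names p_base, p_points and p_both are made up for this example and do not appear elsewhere in the slides.

```r
library(ggplot2)

# Start with an empty plot that only knows about the data
p_base <- ggplot(diamonds)

# Add one layer: points
p_points <- p_base +
  geom_point(aes(x = carat, y = price))

# Add a second layer on top: a smoothed trend line
p_both <- p_points +
  geom_smooth(aes(x = carat, y = price))

# Each geom added with `+` is stored as one layer in the plot object
length(p_base$layers)
length(p_points$layers)
length(p_both$layers)

# Printing the object draws the plot
p_both
```

Storing intermediate steps like this is optional; it mainly helps to see what each block contributes.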

Prepare library

Options:

  • tidyverse
    • bundles many packages (including ggplot2 and dplyr)
    • takes longer to load
  • ggplot2 on its own

library(ggplot2)

library(tidyverse)
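
A quick check after loading might look like the sketch below. It assumes only ggplot2 is loaded; the same calls should also work after library(tidyverse).

```r
# Load only ggplot2; library(tidyverse) would attach it as well
library(ggplot2)

# Which version is installed?
packageVersion("ggplot2")

# The diamonds data set used in these slides ships with ggplot2
dim(diamonds)
levels(diamonds$cut)
```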

ggplot2 layers

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price, colour = cut))

What is happening:

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price, colour = cut))

  • ggplot(diamonds) creates the plot and tells it which data frame to use
  • plus (+) adds the next layer to the plot
  • geom_point() / geom_smooth() define the type of plot
    • geom = geometric object
  • aes(x = , y = , colour = ) maps variables in the data frame to the axes and to colour
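
Because both layers above use the same aes(), the mapping can also be written once inside ggplot(). The sketch below shows both spellings side by side; the names p_shared and p_per_layer are hypothetical.

```r
library(ggplot2)

# Aesthetics given in ggplot() are inherited by every layer
p_shared <- ggplot(diamonds, aes(x = carat, y = price, colour = cut)) +
  geom_point() +
  geom_smooth()

# Aesthetics given inside a geom apply to that layer only
p_per_layer <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price, colour = cut))

# Both objects describe the same two-layer plot
p_shared
```

The per-layer form is what makes the next slide possible, where only the points keep the colour mapping.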

Back to a basic scatterplot

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price))

Adjusting x and y axis limits

  • Method 1: xlim() / ylim()
    • Deletes points outside the range
    • Deleted points are not considered in further calculations (e.g. the smooth line)!

  • Method 2: coord_cartesian()
    • Zooms in on the area of interest
    • Keeps the other points in the dataset
      • They are just not shown
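
The difference between the two methods can be checked on the data that ggplot2 builds for a layer. This is a rough sketch using layer_data(); the object names are made up, and only the point layer is used to keep it quick.

```r
library(ggplot2)

p <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price))

# Method 1: ylim() sets the scale limits, so out-of-range prices become NA
p_delete <- p + ylim(0, 3000)

# Method 2: coord_cartesian() only changes the visible window
p_zoom <- p + coord_cartesian(ylim = c(0, 3000))

# layer_data() returns the data a layer is built from (layer 1 = the points)
n_na_delete <- sum(is.na(layer_data(p_delete, 1)$y))
n_na_zoom   <- sum(is.na(layer_data(p_zoom, 1)$y))

n_na_delete  # points turned into NA by Method 1 (dropped when drawing)
n_na_zoom    # Method 2 leaves every point in place
```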

Method 1 - Delete points, y axis

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) + 
  ylim(0, 3000)

## Warning: Removed 23604 rows containing non-finite values (stat_smooth).
## Warning: Removed 23604 rows containing missing values (geom_point).

Method 1 - Delete points, x axis

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) + 
  ylim(0, 3000) +
  xlim(0, 1)

## Warning: Removed 23716 rows containing non-finite values (stat_smooth).
## Warning: Removed 23716 rows containing missing values (geom_point).

Method 2 - Zooming in

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000))

Add a title

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset")

Change axis labels

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)")
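
The slides repeat the full code each time a piece is added. One alternative, sketched here with hypothetical object names p_diamonds and p_labelled, is to store the plot built so far and add only the new piece.

```r
library(ggplot2)

# Store the plot built so far (the name p_diamonds is arbitrary)
p_diamonds <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price)) +
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000))

# Add the title and axis labels to the stored plot
p_labelled <- p_diamonds +
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)")

# Printing the object draws the plot
p_labelled
```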

Change line colour

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)")

Change colour palette of the scatterplot

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)") +
  scale_colour_brewer(palette = "Spectral")

Colours in R
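
The colour names and palettes used in these slides can be looked up from R itself. The sketch below assumes the RColorBrewer package is installed (it is normally pulled in together with ggplot2); the variable names are made up.

```r
# Base R knows a fixed set of named colours
head(colours())
length(colours())

# Check that the named colours used in these slides are in that set
slide_colours <- c("firebrick", "steelblue4", "slategray2")
known <- slide_colours %in% colours()
known

# Assumption: RColorBrewer is installed
library(RColorBrewer)

# Hex codes of 5 colours from the "Spectral" palette
spectral5 <- brewer.pal(5, "Spectral")
spectral5

# Draws an overview of all Brewer palettes
display.brewer.all()
```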

Change legend

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)") +
  scale_colour_brewer(palette = "Spectral") +
  theme(legend.position = "bottom",
        legend.title = element_text(colour = "Steelblue4", face = "bold", size = 15))

Change legend background

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)") +
  scale_colour_brewer(palette = "Spectral") +
  theme(legend.position = "bottom",
        legend.title = element_text(colour = "Steelblue4", face = "bold", size = 15),
        legend.background = element_rect(fill = "Slategray2", size = 0.5, linetype = "solid"))
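
theme() controls how the legend looks; the legend title text itself comes from the label of the mapped aesthetic. A small sketch, assuming the wording "Cut quality" purely for illustration:

```r
library(ggplot2)

p_legend <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  # The legend belongs to the colour aesthetic, so labs(colour = ) sets its title
  labs(colour = "Cut quality") +
  theme(legend.position = "bottom",
        legend.title = element_text(colour = "Steelblue4", face = "bold", size = 15))

p_legend
```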

Change x axis tick marks

  • To demonstrate this, the plot is zoomed in further (x axis from 0.2 to 1)

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0.2, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)") +
  scale_colour_brewer(palette = "Spectral") +
  theme(legend.position = "bottom",
        legend.title = element_text(colour = "Steelblue4", face = "bold", size = 15),
        legend.background = element_rect(fill = "Slategray2", size = 0.5, linetype = "solid")) +
  scale_x_continuous(breaks = seq(0.2, 1, 0.05))

Customize text for axis ticks

ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  geom_smooth(aes(x = carat, y = price), colour = "firebrick", size = 2) + 
  coord_cartesian(xlim = c(0.2, 1), ylim = c(0, 3000)) + 
  labs(title = "Carat vs Price",
       subtitle = "Diamond Dataset",
       x = "Carat Diamond",
       y = "Price ($)") +
  scale_colour_brewer(palette = "Spectral") +
  theme(legend.position = "bottom",
        legend.title = element_text(colour = "Steelblue4", face = "bold", size = 15),
        legend.background = element_rect(fill = "Slategray2", size = 0.5, linetype = "solid")) +
  scale_x_continuous(breaks = seq(0.2, 1, 0.05)) +
  scale_y_continuous(breaks = seq(0, 3000, 1000),
                     labels = c("free", "expensive", "more expensive", "really expensive"))
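
Custom labels are matched to the breaks by position, so the two vectors have to line up. The check below is a small base R sketch; y_breaks and y_labels are hypothetical names for the values used above.

```r
# Tick positions used on the y axis
y_breaks <- seq(0, 3000, 1000)

# One label per tick position, in the same order
y_labels <- c("free", "expensive", "more expensive", "really expensive")

# breaks and labels must have the same length, otherwise ggplot2 reports an error
length(y_breaks) == length(y_labels)

# A named vector makes the pairing explicit
setNames(y_labels, y_breaks)
```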

Change overall theme

  • The whole theme can be changed at once
    • ggplot2 ships with pre-coded themes
    • theme_bw()
    • theme_classic()
    • theme_gray()
    • …
  • Example with + theme_classic()

Change to theme_classic
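
A minimal sketch of adding theme_classic() is shown below. It uses a shortened version of the plot from the earlier slides, and the name p_classic is made up for this example.

```r
library(ggplot2)

# Scatterplot from the earlier slides, shortened, with a complete built-in theme
p_classic <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut)) +
  coord_cartesian(xlim = c(0.2, 1), ylim = c(0, 3000)) +
  theme_classic() +
  # theme() tweaks go after the complete theme, otherwise they are overwritten
  theme(legend.position = "bottom")

p_classic
```

The order matters because a complete theme such as theme_classic() replaces any theme() settings added before it.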

Your turn!

  • Explore theme()
    • Use ?theme to find the options
    • Change the axis labels
      • font, size, colour
  • Make a graph with the midwest data set (included in ggplot2)