Multiple packages:
- Basic R plot()
- Helpful for first data exploration
- ggplot2
- Allows for many modifications
November 16, 2022
Multiple packages:
Base plots can be quick
hist(diamonds$price)
But so is ggplot
ggplot(diamonds)+ geom_histogram(aes(x = price))
ggplot(diamonds)+ geom_histogram(aes(x = price))
ggplot(diamonds, aes(price, fill = cut)) + geom_histogram()
A package that is
Options:
library(ggplot2) library(tidyverse)
x y 6 1 3 3 5 4 1 5 2 6
colnames(diamonds)
## [1] "carat" "cut" "color" "clarity" "depth" "table" "price" ## [8] "x" "y" "z"
summary(diamonds)
carat cut color clarity depth
Min. :0.2000 Fair : 1610 D: 6775 SI1 :13065 Min. :43.00
1st Qu.:0.4000 Good : 4906 E: 9797 VS2 :12258 1st Qu.:61.00
Median :0.7000 Very Good:12082 F: 9542 SI2 : 9194 Median :61.80
Mean :0.7979 Premium :13791 G:11292 VS1 : 8171 Mean :61.75
3rd Qu.:1.0400 Ideal :21551 H: 8304 VVS2 : 5066 3rd Qu.:62.50
Max. :5.0100 I: 5422 VVS1 : 3655 Max. :79.00
J: 2808 (Other): 2531
table price x y
Min. :43.00 Min. : 326 Min. : 0.000 Min. : 0.000
1st Qu.:56.00 1st Qu.: 950 1st Qu.: 4.710 1st Qu.: 4.720
Median :57.00 Median : 2401 Median : 5.700 Median : 5.710
Mean :57.46 Mean : 3933 Mean : 5.731 Mean : 5.735
3rd Qu.:59.00 3rd Qu.: 5324 3rd Qu.: 6.540 3rd Qu.: 6.540
Max. :95.00 Max. :18823 Max. :10.740 Max. :58.900
z
Min. : 0.000
1st Qu.: 2.910
Median : 3.530
Mean : 3.539
3rd Qu.: 4.040
Max. :31.800
Tell ggplot what data set to use
ggplot(diamonds)
Tell ggplot what data set to use
ggplot(diamonds)
ggplot(diamonds, aes (x = carat, y = price))
ggplot(diamonds, aes (x = carat, y = price))
ggplot(diamonds, aes(x = carat, y = price)) + geom_point()
ggplot(diamonds, aes(x = carat, y = price)) + geom_point()
ggplot(diamonds, aes(x = carat, y = price)) + geom_point()
ggplot(diamonds, aes(x = carat, y = price, colour = cut)) + geom_point()
ggplot(diamonds, aes(x = carat, y = price, colour = cut)) + geom_point()
ggplot(diamonds, aes(x = carat, y = price, colour = cut)) + geom_point() + geom_smooth()
ggplot(diamonds, aes(x = carat, y = price, colour = cut)) + geom_point() + geom_smooth()
ggplot(diamonds)+ geom_point(aes(x = carat, y = price, colour = cut)) + geom_smooth(aes(x = carat, y = price, colour = cut))
ggplot(diamonds)+ geom_point(aes(x = carat, y = price, colour = cut)) + geom_smooth(aes(x = carat, y = price, colour = cut))
ggplot(diamonds) + geom_point(aes(x = carat, y = price, colour = cut)) + geom_smooth(aes(x = carat, y = price))
ggplot(diamonds) + geom_point(aes(x = carat, y = price, colour = cut)) + geom_smooth(aes(x = carat, y = price))
ggplot(diamonds) + geom_point(aes(x = carat, y = price, colour = cut, shape = cut)) + geom_smooth(aes(x = carat, y = price))
ggplot(diamonds) + geom_point(aes(x = carat, y = price, colour = cut, shape = cut)) + geom_smooth(aes(x = carat, y = price))
colnames(mpg)
## [1] "manufacturer" "model" "displ" "year" "cyl" ## [6] "trans" "drv" "cty" "hwy" "fl" ## [11] "class"
hist(mpg$displ)
hist(mpg$hwy)
ggplot(mpg) + geom_point(aes(x = displ, y = hwy, colour = class)) + geom_smooth(aes(x = displ, y = hwy))
Access to slides:
Slides will be available on github
Look for Purdue R cafe on github.com
Email us:
Dr Priyanka Baloni: pbaloni@purdue.edu
Dr Anke Tukker: atukker@purdue.edu