class: center, middle, inverse, title-slide

# Essentials 06: Data Visualisation with ggplot2

---
class: inverse

# Session Objectives

### Data Visualisation with ggplot2

- Introduction to ggplot2
- Boxplot
- Grouped Boxplot
- Scatterplot
- Means plot

---
class: inverse

# Setup

### .orange[Tasks]:

1. Open the slides for today
1. Open/create your seminRs project
1. Download the [Rmd document](https://and.netlify.app/seminr/06/essentials/essentials_06.Rmd) to the r_docs folder of your seminRs project & open it

<br> <br> <br> <br> <br> <br> <br> <br>

You can [download the Solutions Rmd](https://and.netlify.app/seminr/06/essentials/essentials_solutions_06.Rmd) to check your answers later

---
class: inverse

# Intro to ggplot2

Figures/graphs/plots are a useful way to look at your data

- Quickly see general trends in the data with minimal effort
- Easy way to check your data
- Gives readers a general overview of your results/data

<br>

ggplot2 is a really useful package for data visualisation

- Part of the tidyverse
- Follows .italic[The Grammar of Graphics]
- Plots are built in layers from 3 key components: data, coordinate system, & geometric elements (geoms)
- Super customisable

---
class: inverse

# Using ggplot()

- Built layer by layer, with each layer added using .orange[+]
- Variables are .italic[mapped] onto elements of the plot as 'aesthetics'
- Use aes() to include variable(s) as an aesthetic
- Geoms are 'visual marks' that represent data points e.g.
  - geom_bar() – creates a layer of bars
  - geom_point() – creates a layer of data points
  - geom_histogram() – creates a layer with a histogram
  - geom_text() – creates a layer of text

.italic[& many more options...]

---
class: inverse

# General Process

1. Create base layer by specifying the data & the variables to map onto the plot
1. Add geom layer (to display our data points)
1. Add/edit visual properties (scale, colours, shapes etc.)
1. Add semantic properties (e.g. title, labels)
1. Add a theme to make our plot pretty

<br> <br> <br> <br> <br> <br>

.orange[Today we're going to follow this process to build different types of common graphs!]

---
class: inverse

# Basic Boxplot

.panelset[
.panel[.panel-name[Code]

```r
peng_boxplot <- ggplot2::ggplot(peng_data, aes(island, bill_length_mm))

peng_boxplot +
  geom_boxplot() +
  labs(title = "Penguin Bill Length Split by Island", x = "Island", y = "Bill Length (mm)") +
  theme_classic()
```

<br> <br> <br>

### .orange[Task]: Create a boxplot of peng_data with sex on the x axis & flipper_length_mm on the y axis
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/box-1-plot-1.png" style="display: block; margin: auto;" />
]]

---
class: inverse

# Grouped Boxplot

.panelset[
.panel[.panel-name[Code]

```r
peng_boxplot_2 <- ggplot2::ggplot(peng_data, aes(island, bill_length_mm, fill = sex))

peng_boxplot_2 +
  geom_boxplot() +
  labs(title = "Penguin Bill Length Split by Island and Sex", x = "Island", y = "Bill Length (mm)", fill = "Sex") +
  theme_minimal()
```

<br> <br>

### .orange[Task]: Create a grouped boxplot of peng_data with sex on the x axis, flipper_length_mm on the y axis, & grouped by species
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/box-2-plot-1.png" style="display: block; margin: auto;" />
]]

---
class: inverse

# Scatterplots

.panelset[
.panel[.panel-name[Code]

```r
peng_scatter <- ggplot2::ggplot(peng_data, aes(bill_length_mm, bill_depth_mm))

peng_scatter +
  geom_point() +
  labs(title = "Scatterplot of Bill Length & Bill Depth", x = "Bill Length (mm)", y = "Bill Depth (mm)") +
  theme_bw()
```
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/scatter-1-plot-1.png" style="display: block; margin: auto;" />
]

.panel[.panel-name[Code]

```r
peng_scatter_2 <- ggplot2::ggplot(peng_data, aes(bill_length_mm, bill_depth_mm))

peng_scatter_2 +
  geom_point(colour = "dark blue", size = 3, shape = 20, alpha = .5) +
  labs(title = "Scatterplot of Bill Length & Bill Depth", x = "Bill Length (mm)", y = "Bill Depth (mm)") +
  theme_bw()
```

<br> <br>

### .orange[Task]: Create a scatterplot of peng_data with body_mass_g on the x axis & flipper_length_mm on the y axis
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/scatter-2-plot-1.png" style="display: block; margin: auto;" />
]]

---
class: inverse

# Means Plots

.panelset[
.panel[.panel-name[Code]

```r
peng_means <- ggplot2::ggplot(peng_data, aes(species, flipper_length_mm))

peng_means +
  stat_summary(fun = "mean", geom = "point", size = 4, colour = "pink") +
  labs(title = "Mean Flipper Length per Penguin Species", x = "Species", y = "Flipper Length (mm)") +
  theme_minimal()
```

<br> <br>

### .orange[Task]: Create a means plot of peng_data with island on the x axis & body_mass_g on the y axis
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/means-plot-1.png" style="display: block; margin: auto;" />
]]

---
class: inverse

# Don't Be Misleading!

.pull-left[
<img src="./images/axes.png" width="90%" />
]

.pull-right[
<br>
<img src="./images/img.jpg" width="90%" />
]

---
class: inverse

# Changing the Coordinate System

.panelset[
.panel[.panel-name[Code]

```r
peng_means <- ggplot2::ggplot(peng_data, aes(species, flipper_length_mm))

peng_means +
  stat_summary(fun = "mean", geom = "point", size = 4, colour = "pink") +
  labs(title = "Mean Flipper Length per Penguin Species", x = "Species", y = "Flipper Length (mm)") +
  coord_cartesian(ylim = c(0, 250)) +
  scale_y_continuous(breaks = seq(0, 250, 50)) +
  theme_minimal()
```

<br>

### .orange[Task]: Using the means plot from the previous task, alter the y axis & add appropriate scale breaks (use your judgement here for what would be appropriate)
]

.panel[.panel-name[Plot]

<img src="index_files/figure-html/coord-plot-1.png" style="display: block; margin: auto;" />
]]

---
class: inverse

# General Tips

Think about what makes a good graph when creating your own:

- Clear & tidy
- Simple – presents a lot with a little!
- Coherent
- Accurate – doesn’t distort results
- Informative captions/titles & labels
- Is referred to in the text ("see Figure 1")

### To Avoid Problems

- Run & inspect every change you make
- Be aware that creating plots with ggplot is often a long process
- A really common error is forgetting the .orange[+] for each layer
- Set code chunk options to echo = FALSE (plot is displayed but the code isn't)

---
class: center, middle

<div class="padlet-embed" style="border:1px solid rgba(0,0,0,0.1);border-radius:2px;box-sizing:border-box;overflow:hidden;position:relative;width:100%;background:#F4F4F4"><p style="padding:0;margin:0"><iframe src="https://uofsussex.padlet.org/embed/nrud4gk8x63gbfdc" frameborder="0" allow="camera;microphone;geolocation" style="width:100%;height:608px;display:block;padding:0;margin:0"></iframe></p><div style="padding:8px;text-align:right;margin:0;"><a href="https://padlet.com?ref=embed" style="padding:0;margin:0;border:none;display:block;line-height:1;height:16px" target="_blank"><img src="https://padlet.net/embeds/made_with_padlet.png" width="86" height="16" style="padding:0;margin:0;background:none;border:none;display:inline;box-shadow:none" alt="Made with Padlet"></a></div></div>
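
---
class: inverse

# Appendix: The Missing +

A minimal sketch of the "forgetting the +" error mentioned under To Avoid Problems. The `toy_data` data frame and the object names `with_plus` and `missing_plus` are hypothetical stand-ins (they are not part of the session materials), and the sketch assumes only that ggplot2 is installed. When a line does not end in +, R treats the expression as finished, so the geom on the next line is evaluated by itself and never joins the plot.

```r
library(ggplot2)

# Hypothetical toy data, standing in for peng_data
toy_data <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5, 6)
)

# Correct: every line except the last ends with +
with_plus <- ggplot(toy_data, aes(group, value)) +
  geom_boxplot() +
  theme_minimal()

# Mistake: no + after the base layer, so the assignment ends on that line
missing_plus <- ggplot(toy_data, aes(group, value))
  geom_boxplot()  # runs on its own (R just prints the layer), never added to the plot

# Counting the layers shows whether the geom was attached
length(with_plus$layers)
length(missing_plus$layers)
```

The first count should be 1 and the second 0, which is why the second plot would come out as an empty panel with axes only. R raises no error in this case, so running & inspecting the plot after each added layer is the most reliable way to catch it.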