Custom Visualisations

1 Learning Objectives

  • Explore tools for customising plot appearance in R.

2 Introduction

We have made some very basic plots so far, but we can do a lot more. Customising your graphs may appear trivial at first glance, but many graphing packages in R have poor defaults that produce a lot of chart junk (i.e., unnecessary elements that distract the audience), as we saw in Lab 3.1. Thankfully, R is also flexible enough to let us customise almost every component of a graph. In fact, many major organisations use custom ggplot2 themes to create their stylised graphics (e.g., the BBC and The Economist). In this section, we will look at some of the ways we can customise our plots.

3 Start a Script

For this lab or project, begin by:

  • Starting a new R script
  • Creating a good header section and table of contents
  • Saving the script file with an informative name
  • Setting your working directory

Aim to make the script a future reference for doing things in R!

4 Data and Packages

# Load packages
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(palmerpenguins)

# Load data
data(penguins)
# Simulate data for line plots
newdata <- data.frame(x = runif(24, -2, 2), # Create a variable x with 24 values between -2 and 2
                      y = rnorm(24))        # Create a variable y with 24 random normal values

5 Customising base R Plots

5.1 Histograms

Let’s start by re-creating a histogram of the flipper length of the penguins. We can use the hist() function to do this:

# Create histogram of flipper length
hist(penguins$flipper_length_mm) # Specify data

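We can modify the appearance of the histogram using the breaks and col arguments and provide context using the main and xlab arguments. Note that a single number passed to breaks is only a suggestion: R picks “pretty” break points close to that number of bins.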

# Create histogram of flipper length  
hist(penguins$flipper_length_mm,           # Specify data
     breaks = 20,                          # Change number of bins
     col = "darkblue",                     # Change colour
     main = "Histogram of flipper length", # Add title
     xlab = "Flipper length (mm)")         # Add x-axis label

These are only some of the aesthetic options we can modify. We can also change the axis limits, axis labels, axis tick marks, and more. Let’s say we want to change the x-axis limits to 160 and 240, the x-axis label to “Flipper length (mm)”, and remove the y-axis tick marks and title. We can do this using the xlim, xlab, yaxt and ylab arguments, respectively:

# Create histogram of flipper length
hist(penguins$flipper_length_mm,           # Specify data
     breaks = 20,                          # Change number of bins
     col = "darkblue",                     # Change colour
     main = "Histogram of flipper length", # Add title
     xlab = "Flipper length (mm)",         # Add x-axis label
     xlim = c(160, 240),                   # Change x-axis limits
     yaxt = "n",                           # Remove y-axis tick marks
     ylab = "")                            # Remove y-axis label    

We can modify the plot margins using the par() function, which sets or queries graphical parameters. The mar argument sets the margins as a numeric vector giving the number of lines of margin on the four sides of the plot, in the order bottom, left, top, and right. The default is c(5, 4, 4, 2) + 0.1. The mar argument is always measured in lines of text; to specify the margins in inches, use the mai argument instead. For example, if we want to change the bottom margin to 10 lines, we can do the following:

# Change the bottom margin to 10 lines
par(mar = c(10, 4, 4, 2) + 0.1)

# Create histogram of flipper length
hist(penguins$flipper_length_mm,           # Specify data
     breaks = 20,                          # Change number of bins
     col = "darkblue",                     # Change colour
     main = "Histogram of flipper length", # Add title
     xlab = "Flipper length (mm)",         # Add x-axis label
     xlim = c(160, 240),                   # Change x-axis limits
     yaxt = "n",                           # Remove y-axis tick marks
     ylab = "")                            # Remove y-axis label    
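Changes made with par() persist until they are reset or the graphics device is closed. One common pattern, sketched here, is to store the previous settings that par() returns and restore them afterwards. The name old_par is just an illustrative choice.

```r
library(palmerpenguins)

# par() invisibly returns the previous values of the parameters it changes
old_par <- par(mar = c(10, 4, 4, 2) + 0.1)

# Draw the histogram with the enlarged bottom margin
hist(penguins$flipper_length_mm,
     main = "Histogram of flipper length",
     xlab = "Flipper length (mm)")

# Restore the margins that were in place before the change
par(old_par)
```

Restoring from the saved list avoids having to remember the default values by hand.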

We can also modify the background if we really want! We can do this using the par() function and the bg argument. The bg argument is used to set the background colour of the plot. The default background colour is white. We can change the background colour to light blue using the following code:

# Change the bottom margin back to the default
par(mar = c(5, 4, 4, 2) + 0.1)

# Change the background colour to light blue
par(bg = "lightblue")

# Create histogram of flipper length
hist(penguins$flipper_length_mm,           # Specify data
     breaks = 20,                          # Change number of bins
     col = "darkblue",                     # Change colour
     main = "Histogram of flipper length", # Add title
     xlab = "Flipper length (mm)",         # Add x-axis label
     xlim = c(160, 240),                   # Change x-axis limits
     yaxt = "n",                           # Remove y-axis tick marks
     ylab = "")                            # Remove y-axis label    

5.2 Boxplots

Let’s start by re-creating a boxplot of the flipper length of the penguins. We can use the boxplot() function to do this:

# Change the background colour back to white
par(bg = "white")

# Create boxplot of flipper length
boxplot(penguins$flipper_length_mm) # Specify data

We can split the boxplot by species using a formula (flipper_length_mm ~ species), modify the colours using the col argument, and provide context using the main, xlab, and ylab arguments:

# Boxplot showing flipper lengths by species
boxplot(flipper_length_mm ~ species,                    # Specify data
        data = penguins,                                # Specify data frame
        main = "Boxplot of Flipper Lengths by Species", # Add title
        xlab = "Species",                               # Add x-axis label
        ylab = "Flipper Length (mm)",                   # Add y-axis label
        col = c("red", "green", "blue"))                # Change colour

Now it isn’t strictly necessary here as each penguin species is already labelled on the x-axis, but we could add a legend to the plot using the legend() function:

# Boxplot showing flipper lengths by species
boxplot(flipper_length_mm ~ species,                    # Specify data
        data = penguins,                                # Specify data frame
        main = "Boxplot of Flipper Lengths by Species", # Add title
        xlab = "Species",                               # Add x-axis label
        ylab = "Flipper Length (mm)",                   # Add y-axis label
        col = c("red", "green", "blue"))                # Change colour

# Adding legend (optional but useful)
legend("topleft",                                       # Specify location
       legend = levels(penguins$species),               # Specify legend labels
       fill = c("red", "green", "blue"))                # Specify legend colours

5.3 Barplots

Let’s start by re-creating a barplot of the species of penguins in the dataset. We can use the barplot() function to do this:

# Create barplot of species
barplot(table(penguins$species)) # Specify data

The barplot() function has a number of arguments that you can use to customise the barplot. We can change the aesthetics of the barplot using the col argument and provide context using the main, xlab, and ylab arguments:

# Barplot showing species
barplot(table(penguins$species),         # Specify data
        main = "Barplot of Species",     # Add title
        xlab = "Species",                # Add x-axis label
        ylab = "Count",                  # Add y-axis label
        col = c("red", "green", "blue")) # Change colour

To view the names of the built-in colours, you can run colors() in the console (or ?colors to open the help page). If you want to change the order of the bars, you can also re-order the factor levels of the species variable to make the plot easier to interpret:

# Reorder the levels of the factor in a copy, leaving penguins$species unchanged
species_reordered <- factor(penguins$species, levels = c("Chinstrap", "Gentoo", "Adelie"))

# Barplot showing species
barplot(table(species_reordered),        # Specify data
        main = "Barplot of Species",     # Add title
        xlab = "Species",                # Add x-axis label
        ylab = "Count",                  # Add y-axis label
        col = c("red", "green", "blue")) # Change colour
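The colour names used in these plots come from R's built-in list of named colours. A short sketch of how to browse that list follows; blue_names is an illustrative variable name.

```r
# Number of named colours R knows about
length(colors())

# Show the first few names
head(colors())

# Find all named colours containing "blue"
blue_names <- grep("blue", colors(), value = TRUE)
head(blue_names)

# Check that the colours used in this lab are valid names
all(c("darkblue", "lightblue", "red", "green", "blue") %in% colors())
```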

We can also change the y-axis limits, because the default axis stops short of the tallest bar (the Adelie penguins):

# Barplot showing species
barplot(table(penguins$species),         # Specify data
        main = "Barplot of Species",     # Add title
        xlab = "Species",                # Add x-axis label
        ylab = "Count",                  # Add y-axis label
        col = c("red", "green", "blue"), # Change colour
        ylim = c(0, 160),                # Change y-axis limits
        yaxt = "n")                      # Remove y-axis tick marks  

# Add custom y-axis tick marks
axis(side = 2,                            # Specify side of plot
     at = seq(0, 160, by = 20),           # Specify tick mark positions
     las = 2)                             # Specify tick label orientation - 2 = perpendicular to the axis (horizontal here) and 3 = always vertical
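Rather than typing the upper limit by hand, it can be derived from the data. The sketch below rounds the largest count up to the next multiple of 20; the step of 20 and the names counts, step, and y_max are illustrative choices.

```r
library(palmerpenguins)

# Count the penguins of each species
counts <- table(penguins$species)

# Round the largest count up to the next multiple of the step
step  <- 20
y_max <- ceiling(max(counts) / step) * step

# Draw the bars without the default y-axis, then add a custom one
barplot(counts,
        main = "Barplot of Species",
        xlab = "Species",
        ylab = "Count",
        ylim = c(0, y_max),
        yaxt = "n")
axis(side = 2, at = seq(0, y_max, by = step), las = 2)
```

Because the limit is computed, the same code keeps working if the data change.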

5.4 Scatterplots

Let’s start by re-creating a scatterplot of flipper length against body mass for the penguins dataset. We can do this using the plot() function:

# Scatterplot of flipper length against body mass
plot(penguins$flipper_length_mm,                          # Specify data
     penguins$body_mass_g,                                # Specify data
     main = "Scatterplot of Flipper Length vs Body Mass", # Add title
     xlab = "Flipper Length (mm)",                        # Add x-axis label
     ylab = "Body Mass (g)")                              # Add y-axis label

We can also add some additional context by colouring the points by species:

# Scatterplot of flipper length against body mass
plot(penguins$flipper_length_mm,                          # Specify data
     penguins$body_mass_g,                                # Specify data
     main = "Scatterplot of Flipper Length vs Body Mass", # Add title
     xlab = "Flipper Length (mm)",                        # Add x-axis label
     ylab = "Body Mass (g)",                              # Add y-axis label
     col = c("red", "green", "blue")[penguins$species],  # Change colour
     pch = 16)                                            # Change point symbol

# Add legend
legend("topleft",                          # Specify position of legend
       legend = levels(penguins$species),  # Specify legend labels
       col = c("red", "green", "blue"),    # Specify legend colours
       pch = 16)                           # Specify legend symbol

If you want to see which point symbols are available, run ?points in the console to open the help file for the points() function, which lists the symbols that can be passed to the pch argument. You can also change the size of the symbols using the cex argument.

You can also add trend lines to the scatterplot using the abline() function. For example, if we want to add a linear trend line to the scatterplot, we can do the following:

# Scatterplot of flipper length against body mass
plot(penguins$flipper_length_mm,                          # Specify data
     penguins$body_mass_g,                                # Specify data
     main = "Scatterplot of Flipper Length vs Body Mass", # Add title
     xlab = "Flipper Length (mm)",                        # Add x-axis label
     ylab = "Body Mass (g)",                              # Add y-axis label
     col = c("red", "green", "blue")[penguins$species],  # Change colour
     pch = 16)                                            # Change point symbol

# Add legend
legend("topleft",                          # Specify position of legend
       legend = levels(penguins$species),  # Specify legend labels
       col = c("red", "green", "blue"),    # Specify legend colours
       pch = 16)                           # Specify legend symbol

# Add a single linear trend line fitted to all species together
abline(lm(penguins$body_mass_g ~ penguins$flipper_length_mm), # Specify linear model
       col = "black",                                         # Change colour
       lwd = 2)                                               # Change line width

This trend line is for all three species. If we want to add one for each species it becomes a bit more complicated as we have to fit a linear model for each species and then add the trend line for each species:

# Scatterplot of flipper length against body mass
plot(penguins$flipper_length_mm,                          # Specify data
     penguins$body_mass_g,                                # Specify data
     main = "Scatterplot of Flipper Length vs Body Mass", # Add title
     xlab = "Flipper Length (mm)",                        # Add x-axis label
     ylab = "Body Mass (g)",                              # Add y-axis label
     col = c("red", "green", "blue")[penguins$species],   # Change colour
     pch = 16)                                            # Change point symbol

# Add legend
legend("topleft",                          # Specify position of legend
       legend = levels(penguins$species),  # Specify legend labels
       col = c("red", "green", "blue"),    # Specify legend colours
       pch = 16)                           # Specify legend symbol

# Add trend lines for each species
species_levels <- levels(penguins$species)
colors <- c("red", "green", "blue")
for (i in seq_along(species_levels)) {
    species_data <- subset(penguins, species == species_levels[i])
    fit <- lm(body_mass_g ~ flipper_length_mm, data = species_data)
    abline(fit, col = colors[i], lwd = 2)
}
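The indexing c("red", "green", "blue")[penguins$species] used in the scatterplots above depends on the order of the factor levels, so the colours silently change if the levels are ever re-ordered. A sketch of a more explicit alternative is to name each colour and look it up by species name; species_colours and point_colours are illustrative names.

```r
library(palmerpenguins)

# Map each species name to a colour explicitly
species_colours <- c(Adelie = "red", Chinstrap = "green", Gentoo = "blue")

# Look up the colour of every penguin by its species name
point_colours <- species_colours[as.character(penguins$species)]

plot(penguins$flipper_length_mm,
     penguins$body_mass_g,
     main = "Scatterplot of Flipper Length vs Body Mass",
     xlab = "Flipper Length (mm)",
     ylab = "Body Mass (g)",
     col = point_colours,
     pch = 16)

# Build the legend from the same named vector so it always matches the points
legend("topleft",
       legend = names(species_colours),
       col = species_colours,
       pch = 16)
```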

6 Customising ggplot2 Plots

6.1 Histograms

As in base R, we can modify many aspects of a ggplot2 histogram, including the axis limits, axis labels, and axis tick marks. Let’s say we want to change the x-axis limits to 160 and 240, the x-axis label to “Flipper length (mm)”, and remove the y-axis tick marks and title. We can do this by using the xlim(), xlab(), ylab(), and theme() functions:

# Create histogram of flipper length
ggplot(data = penguins, aes(x = flipper_length_mm)) + # Specify data and aesthetic mappings
  geom_histogram() +                                  # Add histogram layer
  xlim(160, 240) +                                    # Change x-axis limits
  xlab("Flipper length (mm)") +                       # Change x-axis label
  ylab(NULL) +                                        # Remove y-axis label
  theme(axis.ticks.y = element_blank(),               # Remove y-axis tick marks
        axis.title.y = element_blank(),               # Remove y-axis title
        axis.text.y = element_blank())                # Remove y-axis text
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

This produces a ggplot2 plot that closely resembles the base R plot we created earlier. However, we can do more with ggplot2. For example, we can change the colour of the bars using the fill argument:

# Create histogram of flipper length
ggplot(data = penguins, aes(x = flipper_length_mm)) +  # Specify data and aesthetic mappings
  geom_histogram(fill = "darkblue") +                  # Change colour
  xlim(160, 240) +                                     # Change x-axis limits
  xlab("Flipper length (mm)") +                        # Change x-axis label
  ylab(NULL) +                                         # Remove y-axis label
  ggtitle("Histogram of Flipper Length") +             # Add title
  theme(axis.ticks.y = element_blank(),                # Remove y-axis tick marks
        axis.title.y = element_blank(),                # Remove y-axis title
        axis.text.y = element_blank())                 # Remove y-axis text
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
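The message printed after each histogram above is ggplot2 noting that it fell back to 30 bins. A sketch of setting the bin width explicitly follows. The width of 2 mm is an arbitrary choice for illustration, and na.rm = TRUE removes the missing flipper lengths without a warning.

```r
library(ggplot2)
library(palmerpenguins)

p_hist <- ggplot(data = penguins, aes(x = flipper_length_mm)) +
  geom_histogram(binwidth = 2,             # Each bar spans 2 mm
                 fill = "darkblue",
                 na.rm = TRUE) +           # Drop missing values quietly
  xlab("Flipper length (mm)") +
  ylab(NULL) +
  ggtitle("Histogram of Flipper Length")

p_hist
```

With binwidth set, the stat_bin() message no longer appears.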

You can change the size of text elements in a ggplot2 plot using the size argument of element_text() inside theme(). For example, let’s say we want to increase the size of the title and the x-axis label:

# Create histogram of flipper length
ggplot(data = penguins, aes(x = flipper_length_mm)) +  # Specify data and aesthetic mappings
  geom_histogram(fill = "darkblue") +                  # Change colour
  xlim(160, 240) +                                     # Change x-axis limits
  xlab("Flipper length (mm)") +                        # Change x-axis label
  ylab(NULL) +                                         # Remove y-axis label
  ggtitle("Histogram of Flipper Length") +             # Add title
  theme(axis.ticks.y = element_blank(),                # Remove y-axis tick marks
        axis.title.y = element_blank(),                # Remove y-axis title
        axis.text.y = element_blank(),                 # Remove y-axis text
        plot.title = element_text(size = 20),          # Change title text size
        axis.title.x = element_text(size = 15))        # Change x-axis label text size
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

6.2 Boxplots

Let’s re-create the same boxplot using ggplot2:

# Create boxplot of flipper length
ggplot(data = penguins, aes(x = species, y = flipper_length_mm)) + # Specify data and aesthetic mappings
  geom_boxplot()                                                   # Add boxplot layer

We can also add the same customisations as before:

# Create boxplot of flipper length
ggplot(data = penguins, aes(x = species, y = flipper_length_mm)) + # Specify data and aesthetic mappings
  geom_boxplot(data = penguins, aes(fill = species)) +             # Add boxplot layer and change colour by species 
  labs(title = "Boxplot of flipper length",                        # Add title
       x = "Species",                                              # Add x-axis label
       y = "Flipper length (mm)")                                  # Add y-axis label

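It is possible to modify the legend title using the labs() function and the legend labels using a scale function such as scale_fill_discrete(). For example, let’s say we want to set the legend title to “Species” and the legend labels to “Adelie”, “Chinstrap”, and “Gentoo”. The labels must be given in the same order as the factor levels. We can also change the legend position using theme():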

# Create boxplot of flipper length
ggplot(data = penguins, aes(x = species, y = flipper_length_mm)) +   # Specify data and aesthetic mappings
  geom_boxplot(data = penguins, aes(fill = species)) +               # Add boxplot layer and change colour by species 
  labs(title = "Boxplot of flipper length",                          # Add title
       x = "Species",                                                # Add x-axis label
       y = "Flipper length (mm)",                                    # Add y-axis label
       fill = "Species") +                                           # Change legend title
  scale_fill_discrete(labels = c("Adelie", "Chinstrap", "Gentoo")) + # Change legend labels
  theme(legend.position = "bottom")                                  # Change legend position
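To reuse the red, green, and blue fills from the base R boxplot, one option is scale_fill_manual() with a named vector, which ties each colour to a species regardless of the level order. This is a sketch; the colour choices simply mirror the earlier base R example, and species_colours is an illustrative name.

```r
library(ggplot2)
library(palmerpenguins)

# Named vector: each species is tied to its own colour
species_colours <- c(Adelie = "red", Chinstrap = "green", Gentoo = "blue")

p_box <- ggplot(data = penguins,
                aes(x = species, y = flipper_length_mm, fill = species)) +
  geom_boxplot(na.rm = TRUE) +                   # Drop missing values quietly
  scale_fill_manual(values = species_colours) +  # Apply the named colours
  labs(title = "Boxplot of flipper length",
       x = "Species",
       y = "Flipper length (mm)",
       fill = "Species") +
  theme(legend.position = "bottom")

p_box
```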

6.3 Barplots

Now let’s re-create the same barplot using ggplot2:

# Create barplot of species
ggplot(data = penguins, aes(x = species)) + # Specify data and aesthetic mappings
  geom_bar()                                # Add bar plot layer

We can see that this produces the same barplot as the base R version. We can also add the same customisations as before:

# Create barplot of species
ggplot(data = penguins, aes(x = species)) +    # Specify data and aesthetic mappings
  geom_bar(fill = c("red", "green", "blue")) + # Add bar plot layer and change colour
  labs(title = "Barplot of Species",           # Add title
       x = "Species",                          # Add x-axis label
       y = "Count")                            # Add y-axis label

We can change the overall look of the plot by adding a complete theme, whereas theme() adjusts individual elements. There are several pre-defined complete themes available in ggplot2, such as theme_bw(), theme_classic(), and theme_minimal().

# Create barplot of species
ggplot(data = penguins, aes(x = species)) +    # Specify data and aesthetic mappings
  geom_bar(fill = c("red", "green", "blue")) + # Add bar plot layer and change colour
  labs(title = "Barplot of Species",           # Add title
       x = "Species",                          # Add x-axis label
       y = "Count") +                          # Add y-axis label
  theme_bw()                                   # Change theme
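A complete theme and theme() can be combined: the complete theme goes first, followed by theme() tweaks so that they are not overwritten. A small sketch, where the base_size of 14, the removed grid lines, and the bold title are arbitrary choices:

```r
library(ggplot2)
library(palmerpenguins)

p_bar <- ggplot(data = penguins, aes(x = species)) +
  geom_bar(fill = c("red", "green", "blue")) +
  labs(title = "Barplot of Species",
       x = "Species",
       y = "Count") +
  theme_minimal(base_size = 14) +                  # Complete theme with larger text
  theme(panel.grid.major.x = element_blank(),      # Then remove vertical grid lines
        plot.title = element_text(face = "bold"))  # and make the title bold

p_bar
```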

You can also create your own custom theme, but this is more advanced and beyond the scope of this course.

6.4 Scatterplots

Let’s start by re-creating a scatterplot of flipper length against body mass for the penguins data set using ggplot2:

# Create scatterplot of flipper length against body mass
ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) + # Specify data and aesthetic mappings
  geom_point() +                                                       # Add scatterplot layer
  labs(title = "Scatterplot of Flipper Length vs Body Mass",           # Add title
       x = "Flipper Length (mm)",                                      # Add x-axis label
       y = "Body Mass (g)")                                            # Add y-axis label

This is a nice plot, but we can add some additional context by colouring the points by species:

# Create scatterplot of flipper length against body mass
ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g, colour = species)) + # Specify data and aesthetic mappings
  geom_point() +                                                                         # Add scatterplot layer
  labs(title = "Scatterplot of Flipper Length vs Body Mass",                             # Add title
       x = "Flipper Length (mm)",                                                        # Add x-axis label
       y = "Body Mass (g)",                                                              # Add y-axis label
       colour = "Species") +                                                             # Add legend title    
  theme(legend.position = "bottom")                                                      # Change legend position

Adding trendlines for each species is much easier in ggplot2 than in base R. We can do this using the geom_smooth() function:

# Create scatterplot of flipper length against body mass
ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g, colour = species)) + # Specify data and aesthetic mappings
  geom_point() +                                                                         # Add scatterplot layer
  geom_smooth(method = "lm", se = FALSE) +                                               # Add trendline layer
  labs(title = "Scatterplot of Flipper Length vs Body Mass",                             # Add title
       x = "Flipper Length (mm)",                                                        # Add x-axis label
       y = "Body Mass (g)",                                                              # Add y-axis label
       colour = "Species") +                                                             # Add legend title    
  theme(legend.position = "bottom")                                                      # Change legend position
`geom_smooth()` using formula = 'y ~ x'

7 Saving Plots

7.1 Saving Plots in RStudio

You can save plots in RStudio by clicking on the Export button in the Plots pane. This will open a pop-up window where you can select the file type and location to save the plot.

7.2 Saving Plots in R

You can also save ggplot2 plots from code using the ggsave() function. Its main arguments are:

  • filename: The name of the file to save the plot to.
  • plot: The plot to save. The default is the last plot displayed.
  • device: The graphics device to use. By default it is guessed from the file extension of filename.
  • width: The width of the plot, in the units given by units. The default is the size of the current graphics device.
  • height: The height of the plot, in the units given by units. The default is the size of the current graphics device.
  • units: The units to use for the width and height. The default is in (inches).
  • dpi: The resolution of the plot in dots per inch. The default is 300.

Let’s save the scatterplot we created earlier as a png file. This will be saved to your working directory!

# Create scatterplot of flipper length against body mass
ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g, colour = species)) + # Specify data and aesthetic mappings
  geom_point() +                                                                         # Add scatterplot layer
  geom_smooth(method = "lm", se = FALSE) +                                               # Add trendline layer
  labs(title = "Scatterplot of Flipper Length vs Body Mass",                             # Add title
       x = "Flipper Length (mm)",                                                        # Add x-axis label
       y = "Body Mass (g)",                                                              # Add y-axis label
       colour = "Species") +                                                             # Add legend title    
  theme(legend.position = "bottom")                                                      # Change legend position

# Save plot as png file
ggsave(filename = "scatterplot.png", plot = last_plot())
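The call above relies on the ggsave() defaults. A sketch with the size and resolution spelled out follows, together with a base R equivalent, since ggsave() only handles ggplot2 plots. The file names, the 6 by 4 inch size, and the use of tempdir() as the output folder are all illustrative assumptions; swap in your own path to save to the working directory.

```r
library(ggplot2)
library(palmerpenguins)

p_scatter <- ggplot(data = penguins,
                    aes(x = flipper_length_mm, y = body_mass_g, colour = species)) +
  geom_point(na.rm = TRUE) +
  labs(x = "Flipper Length (mm)", y = "Body Mass (g)", colour = "Species")

# Save the ggplot2 plot with an explicit size and resolution
gg_file <- file.path(tempdir(), "scatterplot_ggplot2.png")
ggsave(filename = gg_file, plot = p_scatter,
       width = 6, height = 4, units = "in", dpi = 300)

# Base R plots are saved by opening a device, plotting, then closing it
base_file <- file.path(tempdir(), "scatterplot_base.png")
png(filename = base_file, width = 6, height = 4, units = "in", res = 300)
plot(penguins$flipper_length_mm, penguins$body_mass_g,
     xlab = "Flipper Length (mm)", ylab = "Body Mass (g)", pch = 16)
dev.off()
```

Passing the plot object explicitly, rather than relying on last_plot(), makes it clear which plot is being written.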

8 Activities

Let’s use the data from an R package called gapminder to practice creating different types of plots. The gapminder package contains data from the Gapminder Foundation, which collects and organises data from around the world. The data set we will be using contains information about life expectancy, population, and GDP per capita for 142 countries from 1952 to 2007.

8.1 Load the data

Start by installing the gapminder package and loading it into your workspace. Then, load the data into your workspace using the data() function.

💡 Click here to view a solution
# Install the gapminder package (only needed once)
install.packages("gapminder")

# Load the gapminder package
library(gapminder)

# Load the data
data(gapminder)

8.2 Create a Histogram

Create a histogram of the lifeExp variable using either Base R or ggplot2. Use a binwidth of 1 and colour the bars by continent if using ggplot2 (this won’t easily work in Base R). Add a title and axis labels.

💡 Click here to view a Base R solution
# Create histogram using base R
hist(gapminder$lifeExp,                     # Specify data
     breaks = seq(0, 100, 1),               # Specify breaks
     main = "Histogram of Life Expectancy", # Add title
     xlab = "Life Expectancy",              # Add x-axis label
     ylab = "Count")                        # Add y-axis label


# Advanced solution for base R
# Split the data by continent
data_split <- split(gapminder, gapminder$continent) 

# Set up colors
colors <- rainbow(length(data_split))                   # Create a vector of colors

# Plot the first histogram to set up the plot
hist(data_split[[1]]$lifeExp, breaks = seq(0, 100, 1),  # Specify data and breaks
     col = colors[1], xlim = c(0, 100),                 # Specify color and x-axis limits
     main = "Histogram of Life Expectancy",             # Add title
     xlab = "Life Expectancy", ylab = "Count")          # Add axis labels

# Add the other histograms
for(i in 2:length(data_split)) {
    hist(data_split[[i]]$lifeExp, breaks = seq(0, 100, 1), 
         col = colors[i], add = TRUE)
}

# Add a legend
legend("topleft", legend = names(data_split), fill = colors)
💡 Click here to view a ggplot2 solution
# Create histogram using ggplot2
ggplot(data = gapminder, aes(x = lifeExp, fill = continent)) + # Specify data and aesthetic mappings
  geom_histogram(binwidth = 1) +                               # Add histogram layer
  labs(title = "Histogram of Life Expectancy",                 # Add title
       x = "Life Expectancy",                                  # Add x-axis label
       y = "Count")                                            # Add y-axis label

8.3 Create a Boxplot

Create a boxplot of the gdpPercap variable by continent using either Base R or ggplot2. Add a title and axis labels.

💡 Click here to view a Base R solution
# Create boxplot using base R
boxplot(gapminder$gdpPercap ~ gapminder$continent,   # Specify data
        main = "Boxplot of GDP per Capita",          # Add title
        xlab = "Continent", ylab = "GDP per Capita") # Add axis labels
💡 Click here to view a ggplot2 solution
# Create boxplot using ggplot2
ggplot(data = gapminder, aes(x = continent, y = gdpPercap)) + # Specify data and aesthetic mappings
  geom_boxplot() +                                            # Add boxplot layer
  labs(title = "Boxplot of GDP per Capita",                   # Add title
       x = "Continent", y = "GDP per Capita")                 # Add axis labels

8.4 Create a Barplot

Create a barplot of the average lifeExp by continent using either Base R or ggplot2. Add a title and axis labels. You will need to calculate the average lifeExp by continent first as well as the standard error of the mean using the tapply() function.

💡 Click here to view a Base R solution
# Calculate the average life expectancy by continent
lifeExp_avg <- tapply(gapminder$lifeExp, gapminder$continent, mean)

# Calculate the standard error of the mean
lifeExp_sem <- tapply(gapminder$lifeExp, gapminder$continent, sd) / sqrt(tapply(gapminder$lifeExp, gapminder$continent, length))

# Create barplot using base R
midpoints <- barplot(lifeExp_avg, main = "Average Life Expectancy by Continent", # Specify data and add title
                     xlab = "Continent", ylab = "Life Expectancy",               # Add axis labels
                     ylim = c(0, 100),                                           # Specify y-axis limits
                     col = c("red", "blue", "green", "yellow", "purple"))        # Specify colors

# Add error bars
for(i in 1:length(lifeExp_avg)) {
  # Coordinates for the error bars
  x0 <- midpoints[i]
  y0 <- lifeExp_avg[i] - lifeExp_sem[i] # Lower point of the error bar
  y1 <- lifeExp_avg[i] + lifeExp_sem[i] # Upper point of the error bar

  # Draw the error bars
  arrows(x0, y0, x0, y1, angle = 90, code = 3, length = 0.05)
}
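The standard error calculation above calls tapply() three times. One way to tidy this, sketched here, is a small helper function; sem() is a hypothetical name, not a function from base R or gapminder.

```r
library(gapminder)

# Standard error of the mean: standard deviation divided by the square root of n
sem <- function(x) sd(x) / sqrt(length(x))

# Mean and standard error of life expectancy for each continent
lifeExp_avg <- tapply(gapminder$lifeExp, gapminder$continent, mean)
lifeExp_sem <- tapply(gapminder$lifeExp, gapminder$continent, sem)

# barplot() returns the x positions of the bar midpoints
midpoints <- barplot(lifeExp_avg,
                     main = "Average Life Expectancy by Continent",
                     xlab = "Continent", ylab = "Life Expectancy",
                     ylim = c(0, 100))

# arrows() is vectorised, so all error bars can be drawn in one call
arrows(midpoints, lifeExp_avg - lifeExp_sem,
       midpoints, lifeExp_avg + lifeExp_sem,
       angle = 90, code = 3, length = 0.05)
```

Because arrows() accepts vectors, the explicit loop over continents is not needed in this version.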
💡 Click here to view a ggplot2 solution
# Create barplot using ggplot2
ggplot(data = gapminder, aes(x = continent, y = lifeExp)) +      # Specify data and aesthetic mappings
  geom_bar(stat = "summary", fun = "mean", fill = c("red", 
                                                    "blue", 
                                                    "green", 
                                                    "yellow", 
                                                    "purple")) + # Add barplot layer
  geom_errorbar(stat = "summary", fun.data = "mean_se",          # Add error bars
                width = 0.2, color = "black") +                  # Specify width and color
  labs(title = "Average Life Expectancy by Continent",           # Add title
       x = "Continent", y = "Life Expectancy")                   # Add axis labels

8.5 Create a Scatterplot

Create a scatterplot of lifeExp vs. gdpPercap using either Base R or ggplot2. Add a title and axis labels. Colour the points by continent. Describe the relationship between lifeExp and gdpPercap.

💡 Click here to view a Base R solution
# Define a colour for each continent
continent_colours <- setNames(rainbow(length(levels(gapminder$continent))), 
                             levels(gapminder$continent))

# Create scatterplot using base R
plot(gapminder$gdpPercap, gapminder$lifeExp,            # Specify data
     main = "Life Expectancy vs. GDP per Capita",       # Add title
     xlab = "GDP per Capita", ylab = "Life Expectancy", # Add axis labels
     col = continent_colours[gapminder$continent],       # Specify colors
     pch = 16)                                          # Specify point type

# Add a legend to the bottom right corner
legend("bottomright", legend = names(continent_colours), 
       col = continent_colours, 
       pch = 16, title = "Continent")
💡 Click here to view a ggplot2 solution
# Create scatterplot using ggplot2
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) + # Specify data and aesthetic mappings
  geom_point() +                                                               # Add scatterplot layer
  labs(title = "Life Expectancy vs. GDP per Capita",                           # Add title
       x = "GDP per Capita", y = "Life Expectancy")                            # Add axis labels

8.6 Create a Line Plot

Create a line plot of the average lifeExp by year using either Base R or ggplot2. Add a title and axis labels. Colour the lines by continent. Which continent has the highest average life expectancy? Which continent has the lowest average life expectancy?

💡 Click here to view a Base R solution
# Calculate the average life expectancy by year and continent
lifeExp_avg <- tapply(gapminder$lifeExp,                # Specify data
                      list(gapminder$year,              # Group by year
                           gapminder$continent), mean)  # Group by continent

# Create an empty plot
plot(0, 0, type = "n", xlim = range(gapminder$year),         # Specify x-axis limits
     ylim = range(lifeExp_avg, na.rm = TRUE),                # Specify y-axis limits
     main = "Average Life Expectancy by Year and Continent", # Add title
     xlab = "Year", ylab = "Life Expectancy")                # Add axis labels

# Define colors
colors <- rainbow(ncol(lifeExp_avg)) # One color for each continent

# Add lines for each continent
continents <- colnames(lifeExp_avg) # Get the names of the continents
for(i in 1:ncol(lifeExp_avg)) {     # Loop through each continent
  lines(as.numeric(row.names(lifeExp_avg)), # Specify x-axis values (row names are text, so convert to numbers)
        lifeExp_avg[, i],           # Specify y-axis values
        col = colors[i],            # Specify color
        type = "b", pch = 19)       # Specify line type and point type
}

# Add a legend
legend("bottomright", 
       legend = continents, # Specify legend labels
       col = colors,        # Specify colors
       lty = 1,             # Specify line type
       pch = 19)            # Specify point type
💡 Click here to view a ggplot2 solution
# Create line plot using ggplot2
ggplot(data = gapminder, aes(x = year, y = lifeExp, color = continent)) + # Specify data and aesthetic mappings
  geom_line(stat = "summary", fun = "mean") +                             # Add line plot layer
  geom_point(stat = "summary", fun = "mean", pch = 19) +                  # Add points
  labs(title = "Average Life Expectancy by Year and Continent",           # Add title
       x = "Year", y = "Life Expectancy")                                 # Add axis labels

8.7 Which framework is easier to customise?

Which framework do you prefer for creating plots? Why? Write a short paragraph describing your thoughts.

8.8 Make your own plot

Create a plot of your choice using either Base R or ggplot2 and the gapminder data. Be creative! You can use tutorials like this excellent one to help you.