Interactive Plots

1 Objectives

  • Overview of plotly and htmlwidgets;
  • Draw an interactive graph using ggplotly();
  • Draw an interactive graph using plot_ly().

2 Start a Script

For this lab or project, begin by:

  • Starting a new R script;
  • Creating a good header section and table of contents;
  • Saving the script file with an informative name;
  • Setting your working directory.

Aim to make the script a future reference for doing things in R!
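
As a rough illustration of the steps above, a script header might look something like the sketch below. The title, section names and folder path are placeholders rather than anything prescribed by this lab, so adapt them to your own work.

```r
# ------------------------------------------------------------
# Title:   Interactive plots with plotly
# Author:  <your name>
# Date:    <date>
# Purpose: Practise ggplotly() and plot_ly()
# ------------------------------------------------------------
# Contents (hypothetical section names)
#   1. Setup
#   2. Load packages and data
#   3. ggplotly()
#   4. plot_ly()
# ------------------------------------------------------------

# 1. Setup ----
# setwd("path/to/your/lab/folder")  # hypothetical path: replace with your own
current_dir <- getwd()              # check which folder R is working in
current_dir
```

Ending a comment with four dashes (as in "# 1. Setup ----") lets RStudio list the section in the script outline, which makes the table of contents easy to navigate.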

3 Introduction

ggplot2 is the most widely used package for data visualisation in R. Its consistent syntax, useful defaults, and flexibility make it a fantastic tool for creating high-quality figures. Although ggplot2 is great, there are other data visualisation tools that deserve a place in a data scientist’s toolbox. We’ll begin our foray into the interactive world with plotly, an R package that provides a high-level interface to the plotly.js library and makes it easy to generate slick interactive graphics. These interactive graphs let the user zoom in and out, hover over a point to get additional information, filter to groups of points, and much more. Such interactivity contributes to an engaging user experience and allows information to be displayed in ways that are not possible with static figures.

3.1 htmlwidgets

The .js in plotly.js is short for JavaScript. JavaScript is the programming language behind the majority of the internet’s interactive webpages. To make a webpage interactive, JavaScript code is embedded into the HTML, which is run by the user’s web browser. As the user interacts with the page, the JavaScript updates the HTML, providing the interactive experience that we are looking for. htmlwidgets is the framework that allows for the creation of R bindings to JavaScript libraries. These JavaScript visualisations can be embedded into R Markdown documents (with HTML output) or shiny apps.
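
To make the link between plotly and htmlwidgets concrete, here is a minimal sketch. It assumes the plotly package is installed, uses a small stand-in data frame (the first five av_rating values from the dataset introduced in the next section), and writes to a temporary file; the names demo, p and out_file are only for illustration.

```r
library(plotly)

# Small stand-in data frame (first five av_rating values from tv_shows)
demo <- data.frame(
  seasonNumber = 1:5,
  av_rating = c(7.7871, 8.0440, 8.3554, 8.6384, 9.4738)
)

# Build a simple interactive line chart
p <- plot_ly(demo, x = ~seasonNumber, y = ~av_rating,
             type = "scatter", mode = "lines")

# A plotly object is also an htmlwidget
inherits(p, "htmlwidget")

# Because it is an htmlwidget, it can be saved as a standalone web page.
# selfcontained = FALSE keeps the JavaScript in a folder next to the file.
out_file <- tempfile(fileext = ".html")
htmlwidgets::saveWidget(p, out_file, selfcontained = FALSE)
file.exists(out_file)
```

Opening the saved file in a web browser should show the same interactive chart, with no R session running.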

4 Packages and Data

We’re going to look at a dataset about TV shows that was collected from IMDb (The Internet Movie Database). You will need to download this dataset from here and save it with your other lab datasets. It’s a relatively basic dataset. It contains 48 observations of 6 variables. The variables are:

  • title - the title of the TV show;
  • seasonNumber - the number of seasons the show has had;
  • av_rating - the average rating of the show;
  • share - percentage share of all TV sets in use that were tuned to the show;
  • genres - the genre(s) of the show;
  • status - whether the show is considered to be a riser (better) or faller (worse).

Let’s load the packages and data:

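# Load packages (install them first if they are missing, then load them)
if (!require("tidyverse")) { install.packages("tidyverse"); library(tidyverse) }
if (!require("plotly")) { install.packages("plotly"); library(plotly) }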

# Load data
tv_shows <- read.csv("data/tv_shows.csv", header = TRUE)

# Inspect data
glimpse(tv_shows)
Rows: 48
Columns: 6
$ title        <chr> "BoJack Horseman", "BoJack Horseman", "BoJack Horseman", …
$ seasonNumber <int> 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9, …
$ av_rating    <dbl> 7.7871, 8.0440, 8.3554, 8.6384, 9.4738, 8.7110, 8.7655, 8…
$ share        <dbl> 0.88, 0.73, 0.62, 0.78, 0.45, 11.55, 15.42, 12.34, 10.79,…
$ genres       <chr> "Animation,Comedy,Drama", "Animation,Comedy,Drama", "Anim…
$ status       <chr> "riser", "riser", "riser", "riser", "riser", "riser", "ri…

As we can see, the tv_shows data frame contains basic information such as the show title, season number and genre as well as the average rating, audience share and status. We’ll use this data to create some interactive plots.

5 plotly

There are two main ways to initialise a plotly object: transforming a ggplot2 object with ggplotly(), or setting up aesthetic mappings directly with plot_ly().

5.1 ggplotly()

ggplotly() takes existing ggplot2 objects and converts them into interactive plotly graphics. This makes it easy to create interactive figures while using the ggplot2 syntax that we’re already used to. Additionally, ggplotly() allows us to use ggplot2 functionality that would not be as easily replicated with plotly and tap into the wide range of ggplot2 extension packages. Let’s start by making a static graph:

# create a static ggplot2 plot
my_plot <- # assign your ggplot2 object to a name
  ggplot(tv_shows) + # call ggplot() on the tv_shows data frame
  aes(x = seasonNumber,  # set the x aesthetic to seasonNumber
      y = av_rating, # set the y aesthetic to av_rating
      group = title, # set the group aesthetic to title
      colour = title) + # set the colour aesthetic to title
  geom_line() # add a line layer

# view the plot
my_plot

Okay, this graph isn’t very pretty. Let’s make it look a little nicer:

# create a nicer looking plot
my_improved_plot <- # assign your ggplot2 object to a name
  ggplot(tv_shows) + # call ggplot() on the tv_shows data frame
  aes(x = seasonNumber, # set the x aesthetic to seasonNumber
      y = av_rating, # set the y aesthetic to av_rating
      group = title, # set the group aesthetic to title
      colour = status) + # set the colour aesthetic to status
  geom_line() + # add a line layer
  theme_minimal() + # set the theme to minimal
  labs(title = "Quitting while you're ahead - which TV shows ended at their peak?", # set the title
       x = "Season number", # set the x axis label
       y = "Average rating", # set the y axis label
       caption = "Data: IMDb", # set the caption
       colour = "Ratings trend") + # set the colour legend title
  scale_x_continuous(breaks = 1:10) + # set the x axis breaks
  expand_limits(y = c(5, 10)) + # make sure the y axis covers at least 5 to 10
  theme(panel.grid.minor = element_blank()) + # remove minor grid lines
  scale_colour_manual(values = c("riser" = "blue", "faller" = "grey")) # set the colour scale

# view the plot
my_improved_plot

After assigning your ggplot2 object to a name, the only step to plotly-ize it is calling ggplotly() on that object. The difference between the two is that the plotly figure is interactive. Try it out for yourself! Some of the interactive features to try out include hovering over a point to see the exact x and y values, zooming in by selecting (click+drag) a region, and subsetting to specific groups by clicking their names in the legend.

# Convert to plotly
ggplotly(my_improved_plot)
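
By default the hover label lists every mapped aesthetic. The sketch below shows one way to control it, using the tooltip argument of ggplotly() together with the unofficial text aesthetic. It uses a small stand-in data frame (one show, with values taken from the glimpse() output above) so that it runs on its own; demo, p and interactive_p are illustrative names.

```r
library(ggplot2)
library(plotly)

# Stand-in data: the first five rows shown in the glimpse() output
demo <- data.frame(
  title = "BoJack Horseman",
  seasonNumber = 1:5,
  av_rating = c(7.7871, 8.0440, 8.3554, 8.6384, 9.4738),
  status = "riser"
)

# Map a custom label to the (unofficial) text aesthetic
p <- ggplot(demo) +
  aes(x = seasonNumber,
      y = av_rating,
      group = title,
      colour = status,
      text = paste0(title, "<br>Season ", seasonNumber,
                    ": ", round(av_rating, 2))) +
  geom_line()

# Show only the custom label when hovering
interactive_p <- ggplotly(p, tooltip = "text")
interactive_p
```

The same idea should carry over to my_improved_plot: add a text mapping to aes() and pass tooltip = "text" when converting.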

5.2 plot_ly()

plot_ly() is the base plotly command to initialize a plot from a data frame, similar to ggplot() from ggplot2. Let’s dig in to see how this works:

# create a plotly plot
tv_shows %>% # call the tv_shows data frame
  plot_ly(x = ~ seasonNumber, # set the x aesthetic to seasonNumber
          y = ~ av_rating, # set the y aesthetic to av_rating
          color = ~ status) # set the colour aesthetic to status

Although we did not specify the plot type, it defaulted to a scatter plot. This is no good to us as we want a line graph, so we need to use the add_lines() function:

# add a line layer
plot_ly(tv_shows) %>% # call the tv_shows data frame
  add_lines( # add a line layer
    x = ~ seasonNumber, # set the x aesthetic to seasonNumber
    y = ~ av_rating, # set the y aesthetic to av_rating
    color = ~ status) # set the colour aesthetic to status

Whoa, that doesn’t look right! We need to group our data by TV show so that each show gets its own line, rather than all the points being joined together:

# replot with grouping
plot_ly(tv_shows) %>% # call the tv_shows data frame
  group_by(title) %>%  # group_by() function from dplyr
  add_lines( # add a line layer
    x = ~ seasonNumber, # set the x aesthetic to seasonNumber
    y = ~ av_rating, # set the y aesthetic to av_rating
    color = ~ status) # set the colour aesthetic to status

Much better! We still need to improve the graph though. Let’s add the TV show title to points when we hover over them:

# add hover text
plot_ly(tv_shows) %>% # call the tv_shows data frame
  group_by(title) %>% # group_by() function from dplyr
  add_lines( # add a line layer
    x = ~ seasonNumber, # set the x aesthetic to seasonNumber
    y = ~ av_rating, # set the y aesthetic to av_rating
    color = ~ status, # set the colour aesthetic to status
    text = ~ title, # set the hover text to the show title
    hoverinfo = 'text') # show only this text when hovering

We can also tidy up our axis titles and main title:

# tidy up the plot
plot_ly(tv_shows) %>% # call the tv_shows data frame
  group_by(title) %>% # group_by() function from dplyr
  add_lines( # add a line layer
    x = ~ seasonNumber, # set the x aesthetic to seasonNumber
    y = ~ av_rating, # set the y aesthetic to av_rating
    color = ~ status, # set the colour aesthetic to status
    text = ~ title, # set the hover text to the show title
    hoverinfo = 'text' # show only this text when hovering
  ) %>%
  layout(title = "Quitting while you're ahead - which TV shows ended at their peak?", # set the title
         xaxis = list(title = "Season number"), # set the x axis label
         yaxis = list(title = "Average rating")) # set the y axis label

I personally prefer to create my graphs using ggplot2 then convert them using ggplotly(). If you want to learn more about using plot_ly() I recommend you take a look at Sievert (2019), which is a free online book dedicated to interactive visualisations using plotly in R. I hope you will agree that there is not much of a learning curve for plotly due to the intuitive syntax.
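
Building on the hover text step above, the label does not have to be a single column. The following sketch pastes several columns into one label before plotting. It again uses a small stand-in data frame (values from the glimpse() output earlier), and the hover column is a name introduced here purely for illustration.

```r
library(plotly)

# Stand-in data: the first five rows shown in the glimpse() output
demo <- data.frame(
  title = "BoJack Horseman",
  seasonNumber = 1:5,
  av_rating = c(7.7871, 8.0440, 8.3554, 8.6384, 9.4738),
  status = "riser"
)

# Build the hover label; <br> starts a new line inside the label
demo$hover <- paste0(demo$title, "<br>Season ", demo$seasonNumber,
                     ": ", round(demo$av_rating, 2))

p <- plot_ly(demo) %>%
  group_by(title) %>%              # one line per show
  add_lines(
    x = ~ seasonNumber,
    y = ~ av_rating,
    color = ~ status,
    text = ~ hover,                # use the custom label
    hoverinfo = "text") %>%        # show only the custom label
  layout(xaxis = list(title = "Season number"),
         yaxis = list(title = "Average rating"))

p
```

Preparing the label as a column keeps the plotting code short and makes it easy to check the text before it appears in the plot.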

6 Activities

6.1 Interactive Plot

Take one of your existing ggplots and turn it into an interactive visualisation using ggplotly().

6.2 Plotly

Try to replicate your ggplot from Activity 6.1 using plot_ly().

6.3 Penguins

Use the penguins dataset from the palmerpenguins package to create an interactive plotly graph. You can use the ggplotly() function to convert your ggplot to plotly. I would suggest that you make a scatterplot of bill length vs bill depth and colour the points by species. You can also add a trendline using geom_smooth().

💡 One possible solution:
# Load packages
library(palmerpenguins)
library(ggplot2)
library(plotly)

# Import data
penguins <- palmerpenguins::penguins # take the dataset explicitly from palmerpenguins

# Create ggplot
penguins_plot <- # Create a ggplot
  ggplot(penguins) + # Call the penguins data frame
  aes(x = bill_length_mm, # Set the x aesthetic to bill_length_mm
      y = bill_depth_mm, # Set the y aesthetic to bill_depth_mm
      colour = species) + # Set the colour aesthetic to species
  geom_point() + # Add a point layer
  geom_smooth(method = "lm", se = FALSE) + # Add a trendline
  labs(title = "Bill length vs bill depth", # Set the title
       x = "Bill length (mm)", # Set the x axis label
       y = "Bill depth (mm)", # Set the y axis label
       colour = "Species") + # Set the colour legend title
  theme_minimal() # Set the theme

# Convert to plotly
ggplotly(penguins_plot)
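
For comparison, a similar scatterplot can be sketched directly with plot_ly(). This is only one possible approach: it assumes the palmerpenguins and plotly packages are installed, drops rows with missing bill measurements first, and leaves out the trendline. The names penguins_complete and penguins_plotly are illustrative.

```r
library(palmerpenguins)
library(plotly)

# Keep only rows where both bill measurements are present
penguins_complete <- subset(
  palmerpenguins::penguins,
  !is.na(bill_length_mm) & !is.na(bill_depth_mm)
)

# Scatterplot of bill length vs bill depth, coloured by species
penguins_plotly <- plot_ly(
  penguins_complete,
  x = ~ bill_length_mm,
  y = ~ bill_depth_mm,
  color = ~ species,
  type = "scatter",
  mode = "markers"
) %>%
  layout(title = "Bill length vs bill depth",
         xaxis = list(title = "Bill length (mm)"),
         yaxis = list(title = "Bill depth (mm)"))

penguins_plotly
```

Setting type and mode explicitly avoids relying on the default trace type that plot_ly() picks when none is given.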

7 Recap

  • ggplot2 is a package for creating static graphs;
  • plotly is a package for creating interactive graphs;
  • ggplotly() is a function that converts ggplot2 graphs to plotly graphs;
  • plot_ly() is the base function for creating plotly graphs.