Visualising COVID data using R and leaflet

Data visualisation series (Post #2)

Riddhiman

Leaflet is a JavaScript library for interactive maps. Leaflet for R is super easy to use and a great way to visualise data that has a spatial dimension. Below are some examples on how to use the leaflet package in R to visualise some COVID-19 data.

Packages

# Pacman is a package management tool 
install.packages("pacman")
library(pacman)

# p_load automatically installs packages if needed
p_load(dplyr, leaflet)

Sample dataset

A summarised version of the COVID-19 Data Repository hosted by JHU is available for download here

df <- read.csv("covid_data.csv")

Confirmed cases across countries

Let’s say we wanted to visualise how the cumulative case load is distributed across different countries on a map (as of a certain date).

plt <- df %>% 
    filter(date == "2021-09-01") %>% 
    
    # Circle radius 
    # Arbitrary scaling function for dramatic effect
    mutate(rad = sqrt(confirmed/max(confirmed)) * 100) %>% 
    
    # Leaflet
    leaflet(options = leafletOptions(minZoom = 2, maxZoom = 6)) %>% 
    
    # Base map layer
    # Lots of other options see https://rstudio.github.io/leaflet/basemaps.html
    addProviderTiles(providers$Stamen.Toner,
                     options = providerTileOptions(opacity = 0.6)) %>%
    
    addCircleMarkers(lng = ~Longitude, 
                     lat = ~Latitude, 
                     radius = ~rad, 
                     popup = ~text,  ## Pop-ups understand html (see df$text)
                     weight = 0.7,
                     stroke =T, 
                     color = "#000000",
                     fillColor = "#79B4B7", 
                     fillOpacity = 0.3)
plt

The nice thing about leaflet is that visualisations are interactive (try clicking one of the bubbles). Note that the labels have already been generated.

head(df$text)
## [1] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"
## [2] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"
## [3] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"
## [4] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"
## [5] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"
## [6] "<b>Afghanistan</b><br>Confirmed: <b>0.0</b><br>Death: <b>0.0</b><br>Fatality Ratio: <b>0.0%</b><br>"

Tracking confirmed cases over time

Now say we wanted to understand how the number of confirmed cases were distributed across countries at different points in time. Leaflet has a nice feature which allows users to add layers to the same graphic and UI elements (like radio buttons) to help switch between layers using the group argument and the addLayersControl() function.


# Date series (every 21 days)
vec <- seq.Date(as.Date("2020-03-01"), to = as.Date("2021-09-01"), by = "21 days")
vec <- as.character(vec)

plt <- df %>% 
    
    # Track the top 20 countries for each date
    group_by(date) %>% 
    
    arrange(desc(confirmed)) %>% 
    
    filter(row_number() <= 50, 
           date %in% vec) %>% 
    
    # Circle radius 
    # Arbitrary scaling function for dramatic effect
    mutate(rad = sqrt(confirmed/max(confirmed)) * 80) %>% 
    
    # Leaflet
    leaflet(options = leafletOptions(minZoom = 2, maxZoom = 6, )) %>% 
    
    # Base map layer
    # Lots of other options see https://rstudio.github.io/leaflet/basemaps.html
    addProviderTiles(providers$Stamen.TonerLite,
                     options = providerTileOptions(opacity = 0.8)) %>%
    
    addCircleMarkers(lng = ~Longitude, 
                     lat = ~Latitude, 
                     radius = ~rad, 
                     popup = ~text,
                     weight = 0.7,
                     stroke = T, 
                     color = "#000000",
                     fillColor = "#525c63", 
                     fillOpacity = 0.5, 
                     group = ~date, 
                     labelOptions = labelOptions(noHide = F)) %>% 
    # Layer control
    addLayersControl(
        
        # Using baseGroups adds radio buttons which makes it easier to switch
        baseGroups = vec,
        
        # Using overlayGroups adds checkboxes        
        # overlayGroups = vec
        
        options = layersControlOptions(collapsed = FALSE))
plt

It is interesting to note which countries were affected by COVID-19 in early 2020, how it then rapidly spread across Europe, the impact of lock-downs onthe number of confirmed cases, India’s terrible second wave in mid 2021 and now the increasing number of cases in South East Asia.

Thoughts? Comments? Helpful? Not helpful? Like to see anything else added in here? Let me know!