Density plots

Learn how to visualise continuous variables with density plots

Published

September 13, 2024

1 Density plots

Density plots

Density plots show the distribution (i.e. the probability density) of the values of a continuous variable.

They are created with geom_density().
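As a minimal sketch of how the geometry is used, the example below draws a density plot of a different dataset, so the exercise further down is not spoiled. It assumes that ggplot2 is installed and uses `faithful`, a dataset that ships with R and is not part of this tutorial. The object name `p_density` is chosen here for illustration only.

```r
library(ggplot2)

# `faithful` ships with R: waiting times (in minutes) between eruptions
# of the Old Faithful geyser. It stands in for the tutorial data here.
p_density <- ggplot(faithful, aes(x = waiting)) +
  # Only x is mapped; the density on the y-axis is computed by the geometry
  geom_density()

# Printing the object draws the plot
p_density
```

The pattern is the same as for other geometries: map a variable to an aesthetic inside `aes()`, then add the geometry with `+`.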

Let’s plot the VOT data from alb_vot.

library(tidyverse)

alb_vot <- read_csv("data/coretta2021/alb-vot.csv") |>
  mutate(
    # VOT is the voicing onset minus the release;
    # multiply by 1000 to get ms from s
    vot = (voi_onset - release) * 1000
  )
Rows: 180 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): speaker, file, label, consonant
dbl (2): release, voi_onset

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
alb_vot

VOT is a numeric continuous variable so density plots are appropriate.

To plot the probability density of a continuous variable, you can use the density geometry. Remember, all geometry functions start with geom_.

Replace the ... in the following code with the appropriate geometry function to create a density plot of the VOT values in alb_vot.

alb_vot %>%
  ggplot(aes(x = vot)) +
  ...

Note that to create a density plot, you only need to specify the x-axis. The y-axis is the probability density, which is automatically calculated (a bit like counts in bar charts, remember?).
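To get a feel for what "automatically calculated" means, here is a rough sketch in base R using `density()`, which computes a kernel density estimate. The values are simulated (they are not the VOT data), and the names `x`, `d` and `area` are made up for this illustration. The sketch checks one defining property of a probability density: the area under the curve is approximately 1.

```r
# Simulated values (hypothetical, not the tutorial's VOT data)
set.seed(1)
x <- rnorm(200, mean = 20, sd = 5)

# Kernel density estimate: a smoothed curve evaluated on a grid of x values
d <- density(x)

# Approximate the area under the curve with the trapezoid rule
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

# The result should be very close to 1
area
```

This is why the y-axis of a density plot does not show counts: the heights are scaled so that the whole curve encloses an area of 1.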

This is what the density plot of the VOT values should look like.

[Figure: density plot of vot from alb_vot]

1.1 Make things cosy with a rug

The density line shows you a smoothed representation of how the VOT values are distributed, but you might also want to see the raw data.

You can do so by adding the rug geometry. Go ahead and add a rug…

alb_vot %>%
  ggplot(aes(vot)) +
  geom_density() +
  ...

You should get the following:

[Figure: density plot of vot from alb_vot with a rug]

Nice, huh?

Rug

Raw data can be shown with a rug, i.e. small ticks along an axis that mark where each observation falls.

You can add a rug with geom_rug().
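The sketch below layers a rug on top of a density curve. As an assumption for illustration it again uses R's built-in `faithful` dataset rather than the tutorial data, and the object name `p_rug` is hypothetical. The `alpha` argument is optional and only makes overlapping ticks easier to see.

```r
library(ggplot2)

# Built-in dataset used as a stand-in for the tutorial data
p_rug <- ggplot(faithful, aes(x = waiting)) +
  # Smoothed distribution
  geom_density() +
  # One tick per observation along the x-axis; semi-transparent ticks
  # make areas with many overlapping values look darker
  geom_rug(alpha = 0.5)

p_rug
```

Because only `x` is mapped, the ticks appear along the x-axis. Dense clusters of ticks should line up with the peaks of the density curve.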

Quiz 1

What do you notice about the distribution of VOT values?

Are there multiple peaks in the distribution?