Class 24

More on Iteration and Tables

Materials for class on 2024-11-19

Agenda

Today we’ll focus on:

  • Practicing Iteration
  • Tables!

Iteration

We’re going to work through some of the in-text examples from the Iteration chapter together.

If you want to try the file manipulation examples, you can download the gapminder folder from our drive folder.
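As a warm-up for the file manipulation examples, here is a minimal sketch of the "read many files, then combine them" pattern. It does not use the class gapminder folder: the two small CSV files, their names, and their contents are made up for illustration and are written to a temporary directory so the code runs anywhere.

```r
library(tidyverse)

# Create a throwaway folder with two small CSV files (hypothetical data)
demo_dir <- file.path(tempdir(), "demo_files")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(tibble(country = c("A", "B"), pop = c(1, 2)), file.path(demo_dir, "1952.csv"))
write_csv(tibble(country = c("A", "B"), pop = c(3, 4)), file.path(demo_dir, "1957.csv"))

# Collect the paths of every CSV file in the folder
paths <- list.files(demo_dir, pattern = "[.]csv$", full.names = TRUE)

df_all <- paths |>
  set_names(basename) |>                         # name each path by its file name
  map(\(p) read_csv(p, show_col_types = FALSE)) |>  # read each file into a list of tibbles
  list_rbind(names_to = "file") |>               # stack them, keeping the file name
  mutate(year = parse_number(file))              # pull the year out of the file name

df_all
```

To adapt this to a real folder, point list.files() at that folder and swap read_csv() for whichever reader matches the file type.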

Tables

Sometimes it’s nice to make your data output look better than raw console output! You might also need to include other kinds of tables in your reports. There are many different ways to create tables in Quarto (and R Markdown), but today we’ll focus on the kable() ecosystem.

kable() comes from the {knitr} package and is extended by the {kableExtra} package. Tables need to be created in different ways depending on your final output, so one thing these packages manage is turning your data output or Quarto markdown into a format that is suitable for HTML, slides, PDF, etc. As a result, some packages excel at one of these outputs and don’t really support the others. {kableExtra} is a good multi-output option.
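To make the point about output formats concrete, here is a small sketch that renders the same data frame as HTML and as LaTeX by setting the format argument of kbl() explicitly. The data frame df_demo is made up for illustration. In a real Quarto document you would normally leave format unset and let it be chosen from the output type.

```r
library(kableExtra)

# A tiny made-up data frame for illustration
df_demo <- data.frame(item = 1:2, condition = c("expected", "unexpected"))

# Same data, two explicit output formats
tbl_html  <- kbl(df_demo, format = "html")
tbl_latex <- kbl(df_demo, format = "latex", booktabs = TRUE)

# Peek at the raw markup that would be written into the document
cat(as.character(tbl_html))
cat(as.character(tbl_latex))
```

The HTML version is a table element, while the LaTeX version is a tabular environment, which is why a table package has to know where the output is going.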

The R Markdown Cookbook is written for R Markdown, but its advice also applies to Quarto, and it gives an overview of some other options: https://bookdown.org/yihui/rmarkdown-cookbook/table-other.html

R Graph Gallery also has an overview: https://r-graph-gallery.com/table.html

A nice recent (if biased) review of major options: https://vincentarelbundock.github.io/tinytable/vignettes/alternatives.html

Some packages to explore:

Tables from Tibbles

Let’s look at the stimuli from Husband (2022), the experiment we looked at a while ago, and show them as a formatted table using {kableExtra}.

library(tidyverse)
library(kableExtra)
df_stimuli <- tibble::tribble(
  ~condition,   ~item, ~sentence,
  "expected",   1, "The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.",
  "unexpected", 1, "The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.",
  "expected",   2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)
df_stimuli
#> # A tibble: 4 × 3
#>   condition   item sentence                                                     
#>   <chr>      <dbl> <chr>                                                        
#> 1 expected       1 The highlight of Jack’s trip to India was when he got to rid…
#> 2 unexpected     1 The highlight of Jack’s trip to India was when he got to rid…
#> 3 expected       2 You never forget how to ride a bicycle once you’ve learned.  
#> 4 unexpected     2 You never forget how to ride an elephant once you’ve learned.

We can make it look like a nice report table by processing it through the {kableExtra} styling:

df_stimuli |> 
  kbl() |> 
  kable_styling()
condition   item  sentence
expected    1     The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected  1     The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected    2     You never forget how to ride a bicycle once you’ve learned.
unexpected  2     You never forget how to ride an elephant once you’ve learned.

We can also use different style settings:

df_stimuli |> 
  kbl() |> 
  kable_styling(bootstrap_options = c("striped", "hover"))
condition   item  sentence
expected    1     The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected  1     The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected    2     You never forget how to ride a bicycle once you’ve learned.
unexpected  2     You never forget how to ride an elephant once you’ve learned.
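Beyond the bootstrap options, a few other settings are often useful for tables with long text. The sketch below adds a caption, friendlier column names, and a fixed width for the sentence column so that long sentences wrap. The data, the caption text, and the "20em" width are made up for illustration.

```r
library(tibble)
library(kableExtra)

# A small made-up table with one long text column
df_demo <- tribble(
  ~condition,   ~item, ~sentence,
  "expected",   1, "You never forget how to ride a bicycle once you've learned.",
  "unexpected", 1, "You never forget how to ride an elephant once you've learned."
)

tbl_demo <- df_demo |>
  kbl(
    caption = "Example stimuli",                      # table caption
    col.names = c("Condition", "Item", "Sentence")    # display names for the columns
  ) |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE) |>
  column_spec(3, width = "20em")                      # limit the width of column 3

tbl_demo
```

Setting full_width = FALSE keeps the table from stretching across the whole page, which tends to look better for narrow tables.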

You can use gt() in a similar fashion:

library(gt)
df_stimuli |> 
  gt()
condition   item  sentence
expected    1     The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected  1     The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected    2     You never forget how to ride a bicycle once you’ve learned.
unexpected  2     You never forget how to ride an elephant once you’ve learned.

And adjust the visual style:

library(gt)
df_stimuli |> 
  gt() |> 
  opt_row_striping() |> 
  opt_stylize(style = 4, color = "pink")
condition   item  sentence
expected    1     The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected  1     The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected    2     You never forget how to ride a bicycle once you’ve learned.
unexpected  2     You never forget how to ride an elephant once you’ve learned.
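{gt} also lets you add a title and relabel columns without changing the underlying data. This is a minimal sketch using a made-up two-row table; the title, subtitle, and labels are placeholders.

```r
library(tibble)
library(gt)

# A small made-up table for illustration
df_demo <- tribble(
  ~condition,   ~item, ~sentence,
  "expected",   1, "You never forget how to ride a bicycle once you've learned.",
  "unexpected", 1, "You never forget how to ride an elephant once you've learned."
)

tbl_gt <- df_demo |>
  gt() |>
  tab_header(
    title = "Example stimuli",
    subtitle = "Two conditions for one item"
  ) |>
  cols_label(                       # display labels; the column names stay the same
    condition = "Condition",
    item = "Item",
    sentence = "Sentence"
  ) |>
  cols_align(align = "left", columns = sentence)

tbl_gt
```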

Tables from Summary Output

You can use the same approach to turn your summary data into a nice table.

msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  kbl() |> 
  kable_styling()
vore     sleep_total  sleep_rem  sleep_cycle  awake     brainwt    bodywt
carni    10.378947    2.290000   0.3733333    13.62632  0.0792556  90.75111
herbi    9.509375     1.366667   0.4180556    14.49062  0.6215975  366.87725
insecti  14.940000    3.525000   0.1611111    9.06000   0.0215500  12.92160
omni     10.925000    1.955556   0.5924242    13.07500  0.1457312  12.71800
NA       10.185714    1.880000   0.1833333    13.81429  0.0076260  0.85800
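The table above shows more decimal places than a report usually needs. One way to tidy this up, sketched here, is the digits argument of kbl(), which rounds numeric columns before they are displayed. The choice of two digits is arbitrary.

```r
library(tidyverse)
library(kableExtra)

tbl_sleep <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  kbl(digits = 2) |>     # round numeric columns to 2 decimal places for display
  kable_styling()

tbl_sleep
```

Rounding here only affects how the table is displayed; the summarized data itself is unchanged if you store it in a separate object first.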

Sometimes you want to create a summary row in your table, which can be a bit awkward in R, since rows are intended to be observations. If we wanted to get the mean of the means at the bottom, we could compute that and use bind_rows() to put it back together:

df_meansleep <- msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) 

total_meansleep <- df_meansleep |>  
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  mutate(vore = "TOTAL")

bind_rows(df_meansleep, total_meansleep)
#> # A tibble: 6 × 7
#>   vore    sleep_total sleep_rem sleep_cycle awake brainwt  bodywt
#>   <chr>         <dbl>     <dbl>       <dbl> <dbl>   <dbl>   <dbl>
#> 1 carni         10.4       2.29       0.373 13.6  0.0793   90.8  
#> 2 herbi          9.51      1.37       0.418 14.5  0.622   367.   
#> 3 insecti       14.9       3.52       0.161  9.06 0.0216   12.9  
#> 4 omni          10.9       1.96       0.592 13.1  0.146    12.7  
#> 5 <NA>          10.2       1.88       0.183 13.8  0.00763   0.858
#> 6 TOTAL         11.2       2.20       0.346 12.8  0.175    96.8

If you are using {gt}, there are some other ways to create summaries, such as grand_summary_rows(), described in this vignette.

msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  gt() |> 
  grand_summary_rows(
    columns = c(sleep_total, sleep_rem, sleep_cycle, awake, brainwt, bodywt),
    fns = "mean"
  )
vore     sleep_total  sleep_rem  sleep_cycle  awake     brainwt     bodywt
carni    10.378947    2.290000   0.3733333    13.62632  0.07925556  90.75111
herbi    9.509375     1.366667   0.4180556    14.49062  0.62159750  366.87725
insecti  14.940000    3.525000   0.1611111    9.06000   0.02155000  12.92160
omni     10.925000    1.955556   0.5924242    13.07500  0.14573118  12.71800
NA       10.185714    1.880000   0.1833333    13.81429  0.00762600  0.85800
mean     11.18781     2.203444   0.3456515    12.81325  0.175152    96.82519

The {janitor} package provides some useful utility functions for adding totals and percentages to count tables, by “adorning” them. For example:

library(janitor)
starwars |> 
  filter(species == "Human") |> 
  tabyl(gender, eye_color) |> 
  adorn_totals(c("row", "col")) |> 
  adorn_percentages("row") |> 
  adorn_pct_formatting(rounding = "half up", digits = 0) |> 
  adorn_ns() |> 
  adorn_title("combined") |> 
  gt()
gender/eye_color  blue      blue-gray  brown     dark    hazel    unknown  yellow  Total
feminine          33% (3)   0% (0)     44% (4)   0% (0)  11% (1)  11% (1)  0% (0)  100% (9)
masculine         35% (9)   4% (1)     46% (12)  4% (1)  4% (1)   0% (0)   8% (2)  100% (26)
Total             34% (12)  3% (1)     46% (16)  3% (1)  6% (2)   3% (1)   6% (2)  100% (35)
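The pipeline above chains several adorn_*() steps, so it can help to see the simplest case first. This sketch uses the built-in mtcars data (not part of the class materials) to cross-tabulate cylinders by gears and add a single totals row.

```r
library(janitor)

# Count cars by number of cylinders (rows) and gears (columns),
# then append a "Total" row that sums each column
counts <- mtcars |>
  tabyl(cyl, gear) |>
  adorn_totals("row")

counts
```

The result is still a data frame, so it can be passed on to kbl() or gt() for styling, just like the tables earlier in these notes.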