Class 24

Agenda

Today we’ll focus on:

- Practicing Iteration
- Tables!

Iteration

We’re going to work through some of the in-text examples from the Iteration chapter together.

If you want to try the file manipulation examples, you can download the gapminder folder from our drive folder.
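A common pattern for this kind of file work is: list the files, read each one with map(), then stack the results. The sketch below is a minimal illustration under stated assumptions: it writes two throwaway CSV files to a temporary folder instead of using the gapminder folder, the file names and columns are made up, and it assumes {purrr} 1.0.0 or later for list_rbind().

```r
library(purrr)
library(readr)

# Set up: write two small CSV files into a temporary folder.
# These are hypothetical stand-ins, not the real gapminder files.
dir_demo <- file.path(tempdir(), "iteration-demo")
dir.create(dir_demo, showWarnings = FALSE)
write_csv(
  data.frame(country = c("A", "B"), pop = c(10, 20)),
  file.path(dir_demo, "1952.csv")
)
write_csv(
  data.frame(country = c("A", "B"), pop = c(12, 25)),
  file.path(dir_demo, "1957.csv")
)

# 1. List the files we want to read.
paths <- list.files(dir_demo, pattern = "[.]csv$", full.names = TRUE)

# 2. Read each file, 3. stack the results into a single tibble.
df_all <- paths |>
  set_names(basename) |>                                  # name each element after its file
  map(\(path) read_csv(path, show_col_types = FALSE)) |>  # one tibble per file
  list_rbind(names_to = "file")                           # keep the file name as a column

df_all
```

Naming the list elements before binding is what lets names_to record which file each row came from.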

Tables

Raw console output isn’t always the best way to present your data, and you may need to include other kinds of tables in your reports as well. There are many different ways to create tables in Quarto (and Rmarkdown), but today we’ll focus on the kable() ecosystem.

kable() comes from the {knitr} package and is extended by the {kableExtra} package. Tables need to be created in different ways depending on your final output, so one job these packages handle is turning your data or Quarto markdown into a format that is suitable for HTML, slides, PDF, etc. As a result, some packages excel at one of these outputs and don’t really support the others. {kableExtra} is a good multi-output option.
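To make the point about output formats concrete, here is a small sketch that renders the same data frame three ways with knitr::kable(). The data frame df_demo is made up for illustration; normally you would leave format unset and let knitr pick based on the document you are rendering.

```r
library(knitr)

# A tiny, made-up data frame for illustration
df_demo <- data.frame(
  condition = c("expected", "unexpected"),
  item = c(1, 1)
)

# The same data, rendered as three different kinds of table markup
tbl_md   <- kable(df_demo, format = "pipe")   # Markdown pipe table
tbl_html <- kable(df_demo, format = "html")   # HTML <table>
tbl_tex  <- kable(df_demo, format = "latex")  # LaTeX tabular

tbl_md
```

Printing tbl_html or tbl_tex shows how different the underlying markup is, which is why not every table package supports every output type.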

The R Markdown Cookbook chapter on tables is written for Rmarkdown but also applies to Quarto, and gives an overview of some other options: https://bookdown.org/yihui/rmarkdown-cookbook/table-other.html

R Graph Gallery also has an overview: https://r-graph-gallery.com/table.html

A nice recent (if biased) review of major options: https://vincentarelbundock.github.io/tinytable/vignettes/alternatives.html

Some packages to explore:

- gt: based on a “grammar of tables”
  - https://themockup.blog/static/resources/gt-cookbook.html
  - https://themockup.blog/static/resources/gt-cookbook-advanced.html
- kable and kableExtra
- gtsummary
- reactable
- flextable: focus on Word and Powerpoint outputs (ftExtra)
- huxtable: emphasis on text/number based tables, especially for PDF (LaTeX) output styling while also working for html/markdown
- tinytable
- modelsummary: statistical tables (some overlap with gtsummary)

Tables from Tibbles

Let’s take the stimuli from an experiment we looked at a while ago, Husband (2022), and show them as a formatted table using {kableExtra}.

library(tidyverse)
library(kableExtra)

df_stimuli <- tibble::tribble(
  ~condition,   ~item, ~sentence,
  "expected",   1, "The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.",
  "unexpected", 1, "The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.",
  "expected",   2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)
df_stimuli

#> # A tibble: 4 × 3
#>   condition   item sentence                                                     
#>   <chr>      <dbl> <chr>                                                        
#> 1 expected       1 The highlight of Jack’s trip to India was when he got to rid…
#> 2 unexpected     1 The highlight of Jack’s trip to India was when he got to rid…
#> 3 expected       2 You never forget how to ride a bicycle once you’ve learned.  
#> 4 unexpected     2 You never forget how to ride an elephant once you’ve learned.

We can make it look like a nice report table by passing it to kbl() and then kable_styling(), both from {kableExtra}:

df_stimuli |> 
  kbl() |> 
  kable_styling()

condition	item	sentence
expected	1	The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected	1	The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected	2	You never forget how to ride a bicycle once you’ve learned.
unexpected	2	You never forget how to ride an elephant once you’ve learned.

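We can also change the style settings. Here, "striped" shades alternating rows and "hover" highlights the row under the cursor (both apply to HTML output):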

df_stimuli |> 
  kbl() |> 
  kable_styling(bootstrap_options = c("striped", "hover"))

condition	item	sentence
expected	1	The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected	1	The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected	2	You never forget how to ride a bicycle once you’ve learned.
unexpected	2	You never forget how to ride an elephant once you’ve learned.
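A few more {kableExtra} options are often useful for text-heavy tables like this one. The sketch below is one possible combination, not a recommended default: it relabels the columns, stops the table from stretching to the full page width, and caps the width of the sentence column. The two-row df_demo and the "20em" width are arbitrary choices for illustration.

```r
library(tibble)
library(kableExtra)

# Two of the stimuli, repeated here so the example is self-contained
df_demo <- tribble(
  ~condition,   ~item, ~sentence,
  "expected",   2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)

tbl_styled <- df_demo |>
  kbl(col.names = c("Condition", "Item", "Sentence")) |>  # nicer column labels
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,   # don't stretch the table across the page
    position = "left"     # align the table to the left margin
  ) |>
  column_spec(3, width = "20em")  # limit the width of the sentence column

tbl_styled
```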

You can use gt() in a similar fashion:

library(gt)
df_stimuli |> 
  gt()

condition	item	sentence
expected	1	The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected	1	The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected	2	You never forget how to ride a bicycle once you’ve learned.
unexpected	2	You never forget how to ride an elephant once you’ve learned.

And adjust the visual style:

library(gt)
df_stimuli |> 
  gt() |> 
  opt_row_striping() |> 
  opt_stylize(style = 4, color = "pink")

condition	item	sentence
expected	1	The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.
unexpected	1	The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.
expected	2	You never forget how to ride a bicycle once you’ve learned.
unexpected	2	You never forget how to ride an elephant once you’ve learned.
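Beyond the preset styles, {gt} builds a table up in layers: a header, column labels, alignment, and so on. Here is a minimal sketch of that idea, using a two-row copy of the stimuli; the title text and the font size are arbitrary choices for illustration.

```r
library(tibble)
library(gt)

# Two of the stimuli, repeated here so the example is self-contained
df_demo <- tribble(
  ~condition,   ~item, ~sentence,
  "expected",   2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)

tbl_gt <- df_demo |>
  gt() |>
  tab_header(title = "Example stimuli", subtitle = "Husband (2022)") |>  # table title
  cols_label(
    condition = "Condition",
    item = "Item",
    sentence = "Sentence"
  ) |>                                                # display names for columns
  cols_align(align = "left", columns = sentence) |>   # left-align the long text
  tab_options(table.font.size = px(14))               # overall font size

tbl_gt
```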

Tables from Summary Output

You can use the same approach to turn your summary data into a nice table.

msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  kbl() |> 
  kable_styling()

vore	sleep_total	sleep_rem	sleep_cycle	awake	brainwt	bodywt
carni	10.378947	2.290000	0.3733333	13.62632	0.0792556	90.75111
herbi	9.509375	1.366667	0.4180556	14.49062	0.6215975	366.87725
insecti	14.940000	3.525000	0.1611111	9.06000	0.0215500	12.92160
omni	10.925000	1.955556	0.5924242	13.07500	0.1457312	12.71800
NA	10.185714	1.880000	0.1833333	13.81429	0.0076260	0.85800
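Seven decimal places is more than a report table usually needs. One way to tidy this up, sketched below, is the digits argument of kbl(), which rounds numeric columns before display; the caption text is just an example.

```r
library(dplyr)
library(ggplot2)     # provides the msleep data
library(kableExtra)

tbl_rounded <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  kbl(digits = 2, caption = "Mean sleep measures by diet") |>  # round to 2 decimals
  kable_styling(full_width = FALSE)

tbl_rounded
```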

Sometimes you want to create a summary row in your table, which can be a bit awkward in R, since rows are meant to be observations. If we want the mean of the group means at the bottom, we can compute it separately and use bind_rows() to attach it:

df_meansleep <- msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) 

total_meansleep <- df_meansleep |>  
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  mutate(vore = "TOTAL")

bind_rows(df_meansleep, total_meansleep)

#> # A tibble: 6 × 7
#>   vore    sleep_total sleep_rem sleep_cycle awake brainwt  bodywt
#>   <chr>         <dbl>     <dbl>       <dbl> <dbl>   <dbl>   <dbl>
#> 1 carni         10.4       2.29       0.373 13.6  0.0793   90.8  
#> 2 herbi          9.51      1.37       0.418 14.5  0.622   367.   
#> 3 insecti       14.9       3.52       0.161  9.06 0.0216   12.9  
#> 4 omni          10.9       1.96       0.592 13.1  0.146    12.7  
#> 5 <NA>          10.2       1.88       0.183 13.8  0.00763   0.858
#> 6 TOTAL         11.2       2.20       0.346 12.8  0.175    96.8
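The combined tibble can then go through the same kbl() pipeline as before. As a sketch of one way to make the summary row stand out, the example below bolds the last row with row_spec(); the name df_with_total is introduced here for illustration.

```r
library(dplyr)
library(ggplot2)     # provides the msleep data
library(kableExtra)

# Group means, as computed above
df_meansleep <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  )

# Mean of the group means, labelled as the TOTAL row
total_meansleep <- df_meansleep |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  mutate(vore = "TOTAL")

df_with_total <- bind_rows(df_meansleep, total_meansleep)

tbl_total <- df_with_total |>
  kbl(digits = 2) |>
  kable_styling() |>
  row_spec(nrow(df_with_total), bold = TRUE)  # bold the last (TOTAL) row

tbl_total
```

Using nrow() rather than a hard-coded row number keeps the styling attached to the last row even if the number of groups changes.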

If you are using {gt}, there are some other ways to create summaries, described in this vignette. For example, grand_summary_rows() adds a summary row computed over all rows of the table:

msleep |> 
  group_by(vore) |> 
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |> 
  gt() |> 
  grand_summary_rows(
    columns = c(sleep_total, sleep_rem, sleep_cycle, awake, brainwt, bodywt),
    fns = "mean"
  )

	vore	sleep_total	sleep_rem	sleep_cycle	awake	brainwt	bodywt
	carni	10.378947	2.290000	0.3733333	13.62632	0.07925556	90.75111
	herbi	9.509375	1.366667	0.4180556	14.49062	0.62159750	366.87725
	insecti	14.940000	3.525000	0.1611111	9.06000	0.02155000	12.92160
	omni	10.925000	1.955556	0.5924242	13.07500	0.14573118	12.71800
	NA	10.185714	1.880000	0.1833333	13.81429	0.00762600	0.85800
mean	—	11.18781	2.203444	0.3456515	12.81325	0.175152	96.82519
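The {gt} version also shows many decimal places. The sketch below rounds both the table body and the summary row to two decimals. It assumes a recent {gt} (0.9.0 or later), where grand_summary_rows() takes a fmt argument; older versions used a different interface, so treat this as illustrative rather than definitive.

```r
library(dplyr)
library(ggplot2)  # provides the msleep data
library(gt)

tbl_gt_summary <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  gt() |>
  fmt_number(columns = where(is.numeric), decimals = 2) |>  # round the table body
  grand_summary_rows(
    columns = where(is.numeric),
    fns = "mean",
    fmt = ~ fmt_number(., decimals = 2)                     # round the summary row
  )

tbl_gt_summary
```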

The {janitor} package provides some useful utility functions for adding totals, percentages, and counts to a table, which it calls “adorning” the table. For example:

library(janitor)
starwars |> 
  filter(species == "Human") |> 
  tabyl(gender, eye_color) |> 
  adorn_totals(c("row", "col")) |> 
  adorn_percentages("row") |> 
  adorn_pct_formatting(rounding = "half up", digits = 0) |> 
  adorn_ns() |> 
  adorn_title("combined") |> 
  gt()

gender/eye_color	blue	blue-gray	brown	dark	hazel	unknown	yellow	Total
feminine	33% (3)	0% (0)	44% (4)	0% (0)	11% (1)	11% (1)	0% (0)	100% (9)
masculine	35% (9)	4% (1)	46% (12)	4% (1)	4% (1)	0% (0)	8% (2)	100% (26)
Total	34% (12)	3% (1)	46% (16)	3% (1)	6% (2)	3% (1)	6% (2)	100% (35)
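To see what each adorn_*() step contributes, it can help to stop the pipeline early. This sketch runs only the first two steps, so the result is still a plain table of counts with a Total row and a Total column, before any percentages are computed.

```r
library(dplyr)    # provides the starwars data
library(janitor)

counts <- starwars |>
  filter(species == "Human") |>
  tabyl(gender, eye_color) |>       # cross-tabulate: rows = gender, columns = eye color
  adorn_totals(c("row", "col"))     # add a Total row and a Total column

counts
```

Adding the remaining adorn_*() calls back one at a time shows how the counts are turned into the formatted percentages above.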
