Class 24
More on Iteration and Tables
Agenda
Today we’ll focus on:
- Practicing Iteration
- Tables!
Iteration
We’re going to work through some of the in-text examples from the Iteration chapter together.
If you want to try the file manipulation examples, you can download the gapminder folder from our drive folder.
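As a warm-up for the file manipulation examples, here is a minimal sketch of the "list the files, read each one, stack the results" pattern. It does not use the real gapminder folder: the directory name `gapminder_demo`, the two small CSV files, and their columns are made up so the example runs on its own. It assumes {purrr} (version 1.0 or later, for `list_rbind()`) and {readr} are installed.

```r
library(purrr)
library(readr)

# Hypothetical stand-in for the downloaded folder: write two tiny CSV files
# to a temporary directory so the example is self-contained.
data_dir <- file.path(tempdir(), "gapminder_demo")
dir.create(data_dir, showWarnings = FALSE)
write_csv(data.frame(country = c("A", "B"), pop = c(10, 20)),
          file.path(data_dir, "1952.csv"))
write_csv(data.frame(country = c("A", "B"), pop = c(12, 25)),
          file.path(data_dir, "1957.csv"))

# Collect the file paths, name each path by its file name,
# read every file, and stack the results into one data frame.
paths <- list.files(data_dir, pattern = "[.]csv$", full.names = TRUE)

combined <- paths |>
  set_names(basename) |>
  map(\(path) read_csv(path, show_col_types = FALSE)) |>
  list_rbind(names_to = "file")

combined
```

Naming the paths before `map()` is what lets `list_rbind(names_to = "file")` record which file each row came from.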
Tables
Sometimes you want your data output to look nicer than raw console output! And you might need to include other kinds of tables in your reports as well. There are many different ways to create tables in Quarto (and R Markdown), but today we’ll focus on the kable ecosystem.
The kable() function comes from the {knitr} package and is extended by the {kableExtra} package. Tables need to be created in different ways depending on your final output, so one thing these packages manage is turning your data output or Quarto markdown into a format that is suitable for html, slides, PDF, etc. This means some packages excel at one of these outputs, and don’t really do the others. {kableExtra} is a good multi-output option.
This chapter of the R Markdown Cookbook is written for R Markdown, but it also applies to Quarto, and it gives an overview of some other options: https://bookdown.org/yihui/rmarkdown-cookbook/table-other.html
R Graph Gallery also has an overview: https://r-graph-gallery.com/table.html
A nice recent (if biased) review of major options: https://vincentarelbundock.github.io/tinytable/vignettes/alternatives.html
Some packages to explore:
- gt - based on a “grammar of tables”
- kable and kableExtra
- gtsummary
- reactable
- flextable - focus on Word and PowerPoint outputs (extended by ftExtra)
- huxtable - emphasis on text/number based tables, especially for PDF (LaTeX) output styling while also working for html/markdown
- tinytable
- modelsummary - statistical tables (some overlap with gtsummary)
Tables from Tibbles
Let’s take the stimuli from the experiment we looked at a while ago, Husband (2022), and show them as a formatted table using kableExtra.
df_stimuli <- tibble::tribble(
  ~condition, ~item, ~sentence,
  "expected", 1, "The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.",
  "unexpected", 1, "The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.",
  "expected", 2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)
df_stimuli
#> # A tibble: 4 × 3
#> condition item sentence
#> <chr> <dbl> <chr>
#> 1 expected 1 The highlight of Jack’s trip to India was when he got to rid…
#> 2 unexpected 1 The highlight of Jack’s trip to India was when he got to rid…
#> 3 expected 2 You never forget how to ride a bicycle once you’ve learned.
#> 4 unexpected 2 You never forget how to ride an elephant once you’ve learned.
We can make it look like a nice report table by processing it through the {kableExtra} styling:
| condition | item | sentence |
|---|---|---|
| expected | 1 | The highlight of Jack’s trip to India was when he got to ride an elephant in the parade. |
| unexpected | 1 | The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade. |
| expected | 2 | You never forget how to ride a bicycle once you’ve learned. |
| unexpected | 2 | You never forget how to ride an elephant once you’ve learned. |
We can also use different style settings:
| condition | item | sentence |
|---|---|---|
| expected | 1 | The highlight of Jack’s trip to India was when he got to ride an elephant in the parade. |
| unexpected | 1 | The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade. |
| expected | 2 | You never forget how to ride a bicycle once you’ve learned. |
| unexpected | 2 | You never forget how to ride an elephant once you’ve learned. |
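The code that produced the two kableExtra tables above is not shown, so here is a sketch of how tables like these could be made. The specific style settings (striped rows, hover highlighting, a narrower left-aligned table, the "classic" theme) are illustrative choices, not necessarily the ones used for the tables above.

```r
library(kableExtra)

# Same stimuli as above, repeated so this block runs on its own
df_stimuli <- tibble::tribble(
  ~condition, ~item, ~sentence,
  "expected", 1, "The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.",
  "unexpected", 1, "The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.",
  "expected", 2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)

# Default kableExtra styling
tbl_default <- df_stimuli |>
  kbl() |>
  kable_styling()

# Different style settings (illustrative): striped, hover, not full width
tbl_striped <- df_stimuli |>
  kbl() |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,
    position = "left"
  )

# One of the ready-made themes that ship with kableExtra
tbl_classic <- df_stimuli |>
  kbl() |>
  kable_classic(full_width = FALSE)

tbl_striped
```

In a Quarto document, printing any of these objects in a chunk renders the table in the output. The `bootstrap_options` mostly affect HTML output.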
You can use gt() in a similar fashion:
| condition | item | sentence |
|---|---|---|
| expected | 1 | The highlight of Jack’s trip to India was when he got to ride an elephant in the parade. |
| unexpected | 1 | The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade. |
| expected | 2 | You never forget how to ride a bicycle once you’ve learned. |
| unexpected | 2 | You never forget how to ride an elephant once you’ve learned. |
And adjust the visual style:
| condition | item | sentence |
|---|---|---|
| expected | 1 | The highlight of Jack’s trip to India was when he got to ride an elephant in the parade. |
| unexpected | 1 | The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade. |
| expected | 2 | You never forget how to ride a bicycle once you’ve learned. |
| unexpected | 2 | You never forget how to ride an elephant once you’ve learned. |
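The gt code for the two tables above is likewise not shown. A minimal sketch, assuming {gt} is installed, might look like the following. The title text, column labels, striping, and font size are illustrative assumptions rather than the settings used above.

```r
library(gt)

# Same stimuli as above, repeated so this block runs on its own
df_stimuli <- tibble::tribble(
  ~condition, ~item, ~sentence,
  "expected", 1, "The highlight of Jack’s trip to India was when he got to ride an elephant in the parade.",
  "unexpected", 1, "The highlight of Jack’s trip to India was when he got to ride a bicycle in the parade.",
  "expected", 2, "You never forget how to ride a bicycle once you’ve learned.",
  "unexpected", 2, "You never forget how to ride an elephant once you’ve learned."
)

# Plain gt table
tbl_gt <- df_stimuli |>
  gt()

# Adjusted visual style (illustrative choices)
tbl_gt_styled <- df_stimuli |>
  gt() |>
  tab_header(title = "Example stimuli") |>   # hypothetical title
  cols_label(
    condition = "Condition",
    item = "Item",
    sentence = "Sentence"
  ) |>
  opt_row_striping() |>
  tab_options(table.font.size = px(14))

tbl_gt_styled
```

Where kableExtra adds styling through `kable_styling()` and related helpers, gt builds the table up in layers: header, column labels, then options.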
Tables from Summary Output
You can use the same approach to turn your summary data into a nice table.
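library(tidyverse)   # dplyr verbs, plus the msleep data from ggplot2
library(kableExtra)  # kbl() and kable_styling()

# Mean of every numeric column within each diet group
msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  kbl() |>
  kable_styling()
| vore | sleep_total | sleep_rem | sleep_cycle | awake | brainwt | bodywt |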
|---|---|---|---|---|---|---|
| carni | 10.378947 | 2.290000 | 0.3733333 | 13.62632 | 0.0792556 | 90.75111 |
| herbi | 9.509375 | 1.366667 | 0.4180556 | 14.49062 | 0.6215975 | 366.87725 |
| insecti | 14.940000 | 3.525000 | 0.1611111 | 9.06000 | 0.0215500 | 12.92160 |
| omni | 10.925000 | 1.955556 | 0.5924242 | 13.07500 | 0.1457312 | 12.71800 |
| NA | 10.185714 | 1.880000 | 0.1833333 | 13.81429 | 0.0076260 | 0.85800 |
Sometimes you want to create a summary row in your table, which can be a bit awkward in R, since rows are intended to be observations. If we wanted to get the mean of the means at the bottom, we could compute that and use bind_rows() to put it back together:
df_meansleep <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  )
total_meansleep <- df_meansleep |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  mutate(vore = "TOTAL")
bind_rows(df_meansleep, total_meansleep)
#> # A tibble: 6 × 7
#> vore sleep_total sleep_rem sleep_cycle awake brainwt bodywt
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 carni 10.4 2.29 0.373 13.6 0.0793 90.8
#> 2 herbi 9.51 1.37 0.418 14.5 0.622 367.
#> 3 insecti 14.9 3.52 0.161 9.06 0.0216 12.9
#> 4 omni 10.9 1.96 0.592 13.1 0.146 12.7
#> 5 <NA> 10.2 1.88 0.183 13.8 0.00763 0.858
#> 6 TOTAL 11.2 2.20 0.346 12.8 0.175 96.8
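To carry this combined data frame through to a formatted table, one option is to pipe it into kableExtra and emphasize the final row. This is a sketch rather than the only way to do it: the name `df_with_total`, the rounding to two digits, and the bold summary row are illustrative choices.

```r
library(dplyr)
library(ggplot2)     # provides the msleep data set
library(kableExtra)

# Group means, as above
df_meansleep <- msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  )

# Mean of the group means, labelled as a summary row
total_meansleep <- df_meansleep |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  mutate(vore = "TOTAL")

df_with_total <- bind_rows(df_meansleep, total_meansleep)

# Round for display and bold the last (summary) row
tbl_total <- df_with_total |>
  kbl(digits = 2) |>
  kable_styling(full_width = FALSE) |>
  row_spec(nrow(df_with_total), bold = TRUE)

tbl_total
```

Using `nrow(df_with_total)` rather than a hard-coded row number keeps the bold row in the right place if the number of groups changes.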
If you are using {gt}, there are some other ways to create summaries, described in this vignette.
msleep |>
  group_by(vore) |>
  summarize(
    across(where(is.numeric), \(x) mean(x, na.rm = TRUE))
  ) |>
  gt() |>
  grand_summary_rows(
    columns = c(sleep_total, sleep_rem, sleep_cycle, awake, brainwt, bodywt),
    fns = "mean"
  )
| | vore | sleep_total | sleep_rem | sleep_cycle | awake | brainwt | bodywt |
|---|---|---|---|---|---|---|---|
| | carni | 10.378947 | 2.290000 | 0.3733333 | 13.62632 | 0.07925556 | 90.75111 |
| | herbi | 9.509375 | 1.366667 | 0.4180556 | 14.49062 | 0.62159750 | 366.87725 |
| | insecti | 14.940000 | 3.525000 | 0.1611111 | 9.06000 | 0.02155000 | 12.92160 |
| | omni | 10.925000 | 1.955556 | 0.5924242 | 13.07500 | 0.14573118 | 12.71800 |
| | NA | 10.185714 | 1.880000 | 0.1833333 | 13.81429 | 0.00762600 | 0.85800 |
| mean | — | 11.18781 | 2.203444 | 0.3456515 | 12.81325 | 0.175152 | 96.82519 |
The {janitor} package provides some useful utility functions for doing this, by “adorning” tables. For example:
library(janitor)
starwars |>
  filter(species == "Human") |>
  tabyl(gender, eye_color) |>
  adorn_totals(c("row", "col")) |>
  adorn_percentages("row") |>
  adorn_pct_formatting(rounding = "half up", digits = 0) |>
  adorn_ns() |>
  adorn_title("combined") |>
  gt()
| gender/eye_color | blue | blue-gray | brown | dark | hazel | unknown | yellow | Total |
|---|---|---|---|---|---|---|---|---|
| feminine | 33% (3) | 0% (0) | 44% (4) | 0% (0) | 11% (1) | 11% (1) | 0% (0) | 100% (9) |
| masculine | 35% (9) | 4% (1) | 46% (12) | 4% (1) | 4% (1) | 0% (0) | 8% (2) | 100% (26) |
| Total | 34% (12) | 3% (1) | 46% (16) | 3% (1) | 6% (2) | 3% (1) | 6% (2) | 100% (35) |