Class 22

Navigating Lists

Materials for class on

2024-11-12

Follow-up Reading and Working Materials

Agenda

Today we’ll focus on:

  • Lists
  • Rectangling
  • Homework 3 work time

Poll

Have you ever worked with JSON files before?

  1. yes
  2. no

Working with Lists and Rectangling

We’ll work through material from the Rectangling chapter together in class and use it to discuss lists.

Preparation for Homework 4: Rectangling

You worked through some of the Rectangling chapter in class; continue working through it before next class. For the next homework, you will do more text analysis and practice using the {RedditExtractoR} package to pull posts, comments, and other data from Reddit. If we have time, we’ll look at doing this in an interactive script, as below:

library(RedditExtractoR)
library(tidyverse)
library(here)
# Run these two lines interactively (not when rendering) to fetch the data and save it as an .rds file
top_hk_urls <- find_thread_urls(subreddit = "hollowknight", sort_by = "top")
write_rds(top_hk_urls, here("top_hk_urls.rds"))
# This line runs when rendering and reads the saved .rds file back in
top_hk_urls <- read_rds(here("top_hk_urls.rds"))
# Same pattern for one user's content: fetch and save interactively...
nat_geo_user <- get_user_content("nationalgeographic")
write_rds(nat_geo_user, here("natgeouser.rds"))
# ...then read the saved file back in when rendering
nat_geo_user <- read_rds(here("natgeouser.rds"))
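
The save-once, read-many pattern above can also be wrapped in a small helper so the slow request only runs when no saved file exists yet. This is a minimal sketch, not part of {RedditExtractoR}: `fetch_data()` and `read_or_fetch()` are hypothetical names, the data frame is a made-up stand-in for a real Reddit request, and base R's `saveRDS()`/`readRDS()` are used in place of `write_rds()`/`read_rds()` so the example has no package dependencies.

```r
# Hypothetical stand-in for a slow or rate-limited call such as find_thread_urls()
fetch_data <- function() {
  data.frame(title = c("first post", "second post"), comments = c(12, 3))
}

# Return the saved copy if it exists; otherwise fetch once, save, then read
read_or_fetch <- function(path, fetch) {
  if (!file.exists(path)) {
    saveRDS(fetch(), path)
  }
  readRDS(path)
}

# tempfile() only generates a path, so the file does not exist yet
cache_path <- tempfile(fileext = ".rds")

first_read <- read_or_fetch(cache_path, fetch_data)   # fetches and writes the file
second_read <- read_or_fetch(cache_path, fetch_data)  # reads the saved file
```

In a real project the path would more likely be built with `here()`, as in the script above, so the saved file sits next to the document that reads it.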

str(nat_geo_user)
#> List of 1
#>  $ nationalgeographic:List of 3
#>   ..$ about   :List of 8
#>   .. ..$ created_utc  : chr "2017-08-24"
#>   .. ..$ timestamp    : num 1.5e+09
#>   .. ..$ name         : chr "nationalgeographic"
#>   .. ..$ is_employee  : logi FALSE
#>   .. ..$ is_mod       : logi TRUE
#>   .. ..$ is_gold      : logi FALSE
#>   .. ..$ thread_karma : num 445619
#>   .. ..$ comment_karma: num 114478
#>   ..$ comments:'data.frame': 1000 obs. of  12 variables:
#>   .. ..$ url           : chr [1:1000] "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" ...
#>   .. ..$ date_utc      : chr [1:1000] "2021-05-03" "2021-05-03" "2021-05-03" "2021-05-03" ...
#>   .. ..$ timestamp     : num [1:1000] 1.62e+09 1.62e+09 1.62e+09 1.62e+09 1.62e+09 ...
#>   .. ..$ subreddit     : chr [1:1000] "Dinosaurs" "Dinosaurs" "Dinosaurs" "Dinosaurs" ...
#>   .. ..$ thread_author : chr [1:1000] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#>   .. ..$ comment_author: chr [1:1000] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#>   .. ..$ thread_title  : chr [1:1000] "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" ...
#>   .. ..$ comment       : chr [1:1000] "Difficult question! I don't think I can really pick one, there are several candidates. I would say it's usually"| __truncated__ "There are a lot of interesting dinosaur names out there, with amusing background stories. I can't tell the enti"| __truncated__ "It's always a bit challenging when we are reconstructing extinct animals. We can look at living reptiles and, o"| __truncated__ "Dinosaurs are amazing, and so, understandably, we  get asked many questions about these incredible animals, and"| __truncated__ ...
#>   .. ..$ score         : num [1:1000] 15 15 15 13 12 11 21 24 10 17 ...
#>   .. ..$ up            : num [1:1000] 15 15 15 13 12 11 21 24 10 17 ...
#>   .. ..$ downs         : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. ..$ golds         : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ threads :'data.frame': 926 obs. of  11 variables:
#>   .. ..$ url      : chr [1:926] "https://www.nationalgeographic.com/science/phenomena/2009/07/20/photographing-the-glow-of-the-human-body/" "https://www.nationalgeographic.com/news/2004/10/human-ancestor-skeletons-indonesia/" "https://www.nationalgeographic.com/science/prehistoric-world/mass-extinction/" "https://www.nationalgeographic.com/animals/2020/06/dolphins-use-tools-peers-similar-great-apes/?cmpid=org=ngp::"| __truncated__ ...
#>   .. ..$ date_utc : chr [1:926] "2020-06-27" "2020-06-26" "2020-06-25" "2020-06-25" ...
#>   .. ..$ timestamp: num [1:926] 1.59e+09 1.59e+09 1.59e+09 1.59e+09 1.59e+09 ...
#>   .. ..$ subreddit: chr [1:926] "u_nationalgeographic" "u_nationalgeographic" "u_nationalgeographic" "u_nationalgeographic" ...
#>   .. ..$ author   : chr [1:926] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#>   .. ..$ title    : chr [1:926] "Photographing the glow of the human body" "Hobbit-Like Human Ancestor Found in Asia" "What are mass extinctions, and what causes them?" "Dolphins learn how to use tools from peers, just like great apes\024The study upends the belief that only mothe"| __truncated__ ...
#>   .. ..$ text     : chr [1:926] "" "" "" "" ...
#>   .. ..$ golds    : num [1:926] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. ..$ score    : num [1:926] 58 145 3 41 22 ...
#>   .. ..$ ups      : num [1:926] 58 145 3 41 22 ...
#>   .. ..$ downs    : num [1:926] 0 0 0 0 0 0 0 0 0 0 ...
str(nat_geo_user[["nationalgeographic"]]$about)
#> List of 8
#>  $ created_utc  : chr "2017-08-24"
#>  $ timestamp    : num 1.5e+09
#>  $ name         : chr "nationalgeographic"
#>  $ is_employee  : logi FALSE
#>  $ is_mod       : logi TRUE
#>  $ is_gold      : logi FALSE
#>  $ thread_karma : num 445619
#>  $ comment_karma: num 114478
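
Because every element of `about` holds a single value, one possible way to rectangle it is to turn the list into a one-row tibble. The sketch below assumes a hand-built list that copies the field names and values printed above (the rounded `timestamp` is left out); it does not call Reddit.

```r
library(tibble)

# Hand-built stand-in for nat_geo_user[["nationalgeographic"]]$about
about <- list(
  created_utc   = "2017-08-24",
  name          = "nationalgeographic",
  is_employee   = FALSE,
  is_mod        = TRUE,
  is_gold       = FALSE,
  thread_karma  = 445619,
  comment_karma = 114478
)

# Each length-1 element becomes one column, giving a single row
about_tbl <- as_tibble(about)
about_tbl
```

This only works cleanly when the elements all have the same length; deeper or ragged lists are where the unnesting tools from the Rectangling chapter come in.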
# Navigate to the comments data frame with [[ and $, then keep comments scoring above 40
basecomments <- nat_geo_user[["nationalgeographic"]]$comments |>
  as_tibble() |>
  filter(score > 40)
basecomments
#> # A tibble: 146 × 12
#>    url    date_utc timestamp subreddit thread_author comment_author thread_title
#>    <chr>  <chr>        <dbl> <chr>     <chr>         <chr>          <chr>       
#>  1 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  2 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  3 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  4 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  5 https… 2021-04…    1.62e9 IAmA      nationalgeog… nationalgeogr… "We are res…
#>  6 https… 2021-04…    1.62e9 IAmA      nationalgeog… nationalgeogr… "We are res…
#>  7 https… 2021-05…    1.62e9 IAmA      nationalgeog… nationalgeogr… "I study a …
#>  8 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#>  9 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#> 10 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#> # ℹ 136 more rows
#> # ℹ 5 more variables: comment <chr>, score <dbl>, up <dbl>, downs <dbl>,
#> #   golds <dbl>
# Same navigation with purrr::pluck(), which takes the path as a sequence of names
pluckcomments <- pluck(nat_geo_user, "nationalgeographic", "comments") |>
  as_tibble() |>
  filter(score > 40)

pluckcomments
#> # A tibble: 146 × 12
#>    url    date_utc timestamp subreddit thread_author comment_author thread_title
#>    <chr>  <chr>        <dbl> <chr>     <chr>         <chr>          <chr>       
#>  1 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  2 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  3 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  4 https… 2021-04…    1.62e9 climbing  nationalgeog… nationalgeogr… "I journeye…
#>  5 https… 2021-04…    1.62e9 IAmA      nationalgeog… nationalgeogr… "We are res…
#>  6 https… 2021-04…    1.62e9 IAmA      nationalgeog… nationalgeogr… "We are res…
#>  7 https… 2021-05…    1.62e9 IAmA      nationalgeog… nationalgeogr… "I study a …
#>  8 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#>  9 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#> 10 https… 2021-09…    1.63e9 IAmA      nationalgeog… nationalgeogr… "I\u0019m P…
#> # ℹ 136 more rows
#> # ℹ 5 more variables: comment <chr>, score <dbl>, up <dbl>, downs <dbl>,
#> #   golds <dbl>
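
The two approaches above return the same data. A small sketch on a toy list makes the comparison easier to experiment with; the `user` object below is made up for illustration (its scores are not real data) and only mimics the nesting of `nat_geo_user`.

```r
library(purrr)

# Toy list with the same shape as nat_geo_user: user -> about / comments
user <- list(
  nationalgeographic = list(
    about = list(name = "nationalgeographic", is_mod = TRUE),
    comments = data.frame(
      subreddit = c("Dinosaurs", "climbing", "IAmA"),
      score = c(15, 52, 61)
    )
  )
)

# Base R navigation and pluck() reach the same element
via_brackets <- user[["nationalgeographic"]]$comments
via_pluck <- pluck(user, "nationalgeographic", "comments")
same_result <- identical(via_brackets, via_pluck)

# pluck() can mix names and positions: first value of the subreddit column
first_subreddit <- pluck(user, "nationalgeographic", "comments", "subreddit", 1)

# A missing element gives NULL by default, or .default if one is supplied
missing_element <- pluck(user, "nationalgeographic", "posts", .default = NA)
```

One reason to prefer `pluck()` in longer pipelines is that the whole path sits in a single call and a missing element can be given a default value.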