Class 22
Navigating Lists
Follow-up Reading and Working Materials
Agenda
Today we’ll focus on:
- Lists
- Rectangling
- Homework 3 work time
Poll
Have you ever worked with JSON files before?
- yes
- no
Working with Lists and Rectangling
We’ll work through material from the Rectangling chapter together in class and use it to discuss lists.
Preparation for Homework 4: Rectangling
You have worked through part of the Rectangling chapter; please continue with it before next class. For the next homework, you will do some more text analysis and practice using the {RedditExtractoR} package to pull posts, comments, and other data from Reddit. If we have time, we’ll look at doing this in an interactive script, as below:
#> List of 1
#> $ nationalgeographic:List of 3
#> ..$ about :List of 8
#> .. ..$ created_utc : chr "2017-08-24"
#> .. ..$ timestamp : num 1.5e+09
#> .. ..$ name : chr "nationalgeographic"
#> .. ..$ is_employee : logi FALSE
#> .. ..$ is_mod : logi TRUE
#> .. ..$ is_gold : logi FALSE
#> .. ..$ thread_karma : num 445619
#> .. ..$ comment_karma: num 114478
#> ..$ comments:'data.frame': 1000 obs. of 12 variables:
#> .. ..$ url : chr [1:1000] "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" "https://www.reddit.com/r/Dinosaurs/comments/n418mx/im_dr_nizar_ibrahim_a_paleontologist_and_nat_geo/" ...
#> .. ..$ date_utc : chr [1:1000] "2021-05-03" "2021-05-03" "2021-05-03" "2021-05-03" ...
#> .. ..$ timestamp : num [1:1000] 1.62e+09 1.62e+09 1.62e+09 1.62e+09 1.62e+09 ...
#> .. ..$ subreddit : chr [1:1000] "Dinosaurs" "Dinosaurs" "Dinosaurs" "Dinosaurs" ...
#> .. ..$ thread_author : chr [1:1000] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#> .. ..$ comment_author: chr [1:1000] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#> .. ..$ thread_title : chr [1:1000] "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" "I\031m Dr. Nizar Ibrahim, a paleontologist and Nat Geo Explorer, here to talk about all things related to dinosaurs\024AMA!" ...
#> .. ..$ comment : chr [1:1000] "Difficult question! I don't think I can really pick one, there are several candidates. I would say it's usually"| __truncated__ "There are a lot of interesting dinosaur names out there, with amusing background stories. I can't tell the enti"| __truncated__ "It's always a bit challenging when we are reconstructing extinct animals. We can look at living reptiles and, o"| __truncated__ "Dinosaurs are amazing, and so, understandably, we get asked many questions about these incredible animals, and"| __truncated__ ...
#> .. ..$ score : num [1:1000] 15 15 15 13 12 11 21 24 10 17 ...
#> .. ..$ up : num [1:1000] 15 15 15 13 12 11 21 24 10 17 ...
#> .. ..$ downs : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ golds : num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ threads :'data.frame': 926 obs. of 11 variables:
#> .. ..$ url : chr [1:926] "https://www.nationalgeographic.com/science/phenomena/2009/07/20/photographing-the-glow-of-the-human-body/" "https://www.nationalgeographic.com/news/2004/10/human-ancestor-skeletons-indonesia/" "https://www.nationalgeographic.com/science/prehistoric-world/mass-extinction/" "https://www.nationalgeographic.com/animals/2020/06/dolphins-use-tools-peers-similar-great-apes/?cmpid=org=ngp::"| __truncated__ ...
#> .. ..$ date_utc : chr [1:926] "2020-06-27" "2020-06-26" "2020-06-25" "2020-06-25" ...
#> .. ..$ timestamp: num [1:926] 1.59e+09 1.59e+09 1.59e+09 1.59e+09 1.59e+09 ...
#> .. ..$ subreddit: chr [1:926] "u_nationalgeographic" "u_nationalgeographic" "u_nationalgeographic" "u_nationalgeographic" ...
#> .. ..$ author : chr [1:926] "nationalgeographic" "nationalgeographic" "nationalgeographic" "nationalgeographic" ...
#> .. ..$ title : chr [1:926] "Photographing the glow of the human body" "Hobbit-Like Human Ancestor Found in Asia" "What are mass extinctions, and what causes them?" "Dolphins learn how to use tools from peers, just like great apes\024The study upends the belief that only mothe"| __truncated__ ...
#> .. ..$ text : chr [1:926] "" "" "" "" ...
#> .. ..$ golds : num [1:926] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ score : num [1:926] 58 145 3 41 22 ...
#> .. ..$ ups : num [1:926] 58 145 3 41 22 ...
#> .. ..$ downs : num [1:926] 0 0 0 0 0 0 0 0 0 0 ...
#> List of 8
#> $ created_utc : chr "2017-08-24"
#> $ timestamp : num 1.5e+09
#> $ name : chr "nationalgeographic"
#> $ is_employee : logi FALSE
#> $ is_mod : logi TRUE
#> $ is_gold : logi FALSE
#> $ thread_karma : num 445619
#> $ comment_karma: num 114478
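To practice reading output like the two listings above without calling Reddit, here is a minimal sketch that builds a small nested list of the same shape (one user containing `about`, `comments`, and `threads`) and inspects it. The object `toy_user` and all of its values are hypothetical stand-ins, not data from the class example.

```r
# Hypothetical toy data: a list of one user, holding a list and two data frames,
# mirroring the shape of the output shown above (values are made up).
toy_user <- list(
  some_user = list(
    about = list(name = "some_user", is_mod = TRUE, thread_karma = 100),
    comments = data.frame(
      subreddit = c("IAmA", "climbing"),
      score = c(55, 12)
    ),
    threads = data.frame(
      title = c("First post", "Second post"),
      score = c(3, 41)
    )
  )
)

# Show only the first two levels of nesting so the overview stays short
str(toy_user, max.level = 2)

# Names at each level tell you which index to use next
names(toy_user)
names(toy_user[["some_user"]])

# Step one level down and inspect just the `about` element
about <- toy_user[["some_user"]][["about"]]
str(about)
```

Limiting `max.level` is handy when a list contains large data frames, because the full `str()` output can run to many lines.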
# Base R extraction: [[ ]] selects the user, $ selects the comments data frame
basecomments <- nat_geo_user[["nationalgeographic"]]$comments |>
  tibble() |>
  filter(score > 40)
basecomments
#> # A tibble: 146 × 12
#> url date_utc timestamp subreddit thread_author comment_author thread_title
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
#> 1 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 2 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 3 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 4 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 5 https… 2021-04… 1.62e9 IAmA nationalgeog… nationalgeogr… "We are res…
#> 6 https… 2021-04… 1.62e9 IAmA nationalgeog… nationalgeogr… "We are res…
#> 7 https… 2021-05… 1.62e9 IAmA nationalgeog… nationalgeogr… "I study a …
#> 8 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> 9 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> 10 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> # ℹ 136 more rows
#> # ℹ 5 more variables: comment <chr>, score <dbl>, up <dbl>, downs <dbl>,
#> # golds <dbl>
# Same extraction with purrr::pluck(), which walks the list by name
pluckcomments <- pluck(nat_geo_user, "nationalgeographic", "comments") |>
  tibble() |>
  filter(score > 40)
pluckcomments
#> # A tibble: 146 × 12
#> url date_utc timestamp subreddit thread_author comment_author thread_title
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
#> 1 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 2 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 3 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 4 https… 2021-04… 1.62e9 climbing nationalgeog… nationalgeogr… "I journeye…
#> 5 https… 2021-04… 1.62e9 IAmA nationalgeog… nationalgeogr… "We are res…
#> 6 https… 2021-04… 1.62e9 IAmA nationalgeog… nationalgeogr… "We are res…
#> 7 https… 2021-05… 1.62e9 IAmA nationalgeog… nationalgeogr… "I study a …
#> 8 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> 9 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> 10 https… 2021-09… 1.63e9 IAmA nationalgeog… nationalgeogr… "I\u0019m P…
#> # ℹ 136 more rows
#> # ℹ 5 more variables: comment <chr>, score <dbl>, up <dbl>, downs <dbl>,
#> # golds <dbl>
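The two approaches return the same table. The sketch below checks this on a small hypothetical list (`toy_user` and its values are made up for illustration) and shows one difference worth knowing: `pluck()` accepts a `.default` value to return when an element is missing.

```r
library(purrr)

# Hypothetical toy data shaped like the user-content list used in class
toy_user <- list(
  some_user = list(
    about = list(name = "some_user", is_mod = TRUE),
    comments = data.frame(
      subreddit = c("IAmA", "climbing", "IAmA"),
      score = c(55, 12, 41)
    )
  )
)

# Base R indexing and pluck() reach the same element
base_way  <- toy_user[["some_user"]]$comments
pluck_way <- pluck(toy_user, "some_user", "comments")
same_result <- identical(base_way, pluck_way)

# Keep only the higher-scoring comments (base R subsetting, no dplyr needed)
high_scores <- pluck_way[pluck_way$score > 40, ]

# A missing element gives NULL by default; .default supplies a fallback instead
missing_plain    <- pluck(toy_user, "some_user", "posts")
missing_fallback <- pluck(toy_user, "some_user", "posts", .default = NA)
```

One reason to prefer `pluck()` for deeply nested data is that the whole path reads as a single call, which is easier to scan than a chain of `[[` and `$`.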