Getting data from the web
Using APIs to get data
library(tidyverse)
library(forcats)
library(broom)
library(wbstats)
library(wordcloud)
library(tidytext)
library(viridis)
set.seed(1234)
theme_set(theme_minimal())

There are many ways to obtain data from the Internet. Four major categories are: click-and-download on the internet as a “flat” file, such as .
Practice getting data from the Twitter API
library(tidyverse)
library(rtweet)
set.seed(1234)
theme_set(theme_minimal())

Run the code below in your console to download this exercise as a set of R scripts.

usethis::use_course("cis-ds/getting-data-from-the-web-api-access")

Several R packages are available for accessing and searching Twitter.
Writing API queries
library(tidyverse)
library(stringr)
library(jsonlite)
library(httr)
theme_set(theme_minimal())

What happens if no one has already written a package for the API we want to obtain data from? We have to write our own function!
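As a rough illustration of what "writing our own function" can look like, here is a minimal sketch built on httr and jsonlite. The endpoint `https://api.example.com/v1/search` and the parameter names `q`, `page`, and `apikey` are hypothetical placeholders, not a real service; a real API's documentation would dictate the actual URL and parameters.

```r
library(httr)
library(jsonlite)

# Hypothetical endpoint, used only for illustration.
base_url <- "https://api.example.com/v1/search"

# Build the request URL from named query parameters.
# The parameter names (q, page, apikey) are assumptions.
build_query_url <- function(term, page = 1, api_key = "YOUR_KEY") {
  modify_url(base_url, query = list(q = term, page = page, apikey = api_key))
}

# Send the request and parse a JSON response into R objects.
get_results <- function(term, page = 1, api_key = "YOUR_KEY") {
  response <- GET(build_query_url(term, page, api_key))
  stop_for_status(response)  # turn HTTP 4xx/5xx into an R error
  fromJSON(content(response, as = "text", encoding = "UTF-8"))
}

# Inspect the URL without sending a request.
query_url <- build_query_url("sharknado", page = 2)
print(query_url)
```

Separating URL construction from the request itself makes it possible to check the query string before spending any of an API's rate limit.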
Simplifying lists
library(tidyverse)
library(httr)
library(repurrrsive)
set.seed(123)
theme_set(theme_minimal())

Run the code below in your console to download this exercise as a set of R scripts.

usethis::use_course("cis-ds/getting-data-from-the-web-api-access")

Not all lists are easily coerced into data frames by simply calling content() %>% as_tibble().
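To make the problem concrete, the sketch below simplifies a small nested list by pulling out one element at a time with purrr. The `records` list is made-up data standing in for a parsed API response; its field names are assumptions chosen for illustration.

```r
library(purrr)
library(tibble)

# Made-up nested list standing in for parsed JSON from an API.
# The "tags" element varies in length, so the list is not rectangular.
records <- list(
  list(id = 1L, name = "alpha", tags = list("x", "y")),
  list(id = 2L, name = "beta",  tags = list("z")),
  list(id = 3L, name = "gamma", tags = list())
)

# Extract each field with a typed map function, one column at a time.
simple <- tibble(
  id     = map_int(records, "id"),
  name   = map_chr(records, "name"),
  n_tags = map_int(records, ~ length(.x$tags))
)

print(simple)
```

The typed variants (`map_int()`, `map_chr()`) fail loudly if an element is missing or has the wrong type, which tends to surface irregular records early.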
Scraping web pages
library(tidyverse)
library(rvest)
library(lubridate)
theme_set(theme_minimal())

Run the code below in your console to download this exercise as a set of R scripts.

usethis::use_course("cis-ds/getting-data-from-the-web-scraping")

What if data is present on a website, but isn’t provided through an API at all?
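In that case the page's HTML can be parsed directly. The sketch below uses rvest (version 1.0.0 or later is assumed, for `html_elements()` and `html_text2()`) on an inline HTML string so that it runs without a network connection. The markup, class name, and link paths are invented for illustration; for a live page, `read_html()` would be given the URL instead.

```r
library(rvest)

# Invented HTML standing in for a downloaded web page.
page <- read_html('
  <html><body>
    <ul>
      <li class="item"><a href="/reports/1">First report</a></li>
      <li class="item"><a href="/reports/2">Second report</a></li>
    </ul>
  </body></html>
')

# Select the link inside each list item with a CSS selector.
anchors <- html_elements(page, "li.item a")

# Pull out the visible text and the href attribute of each link.
titles <- html_text2(anchors)
links  <- html_attr(anchors, "href")

print(data.frame(title = titles, link = links))
```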