Computing for Information Science
text
Basic workflow for text analysis
Obtain your text sources
Text data can come from many sources:
Web sites
Twitter
Databases
PDF documents
Digital scans of printed materials
The easier it is to convert your text data into digitally stored text, the cleaner your results and the fewer transcription errors.
Mar 1, 2019
text
Practicing tidytext with song titles
library(tidyverse)
library(acs)
library(tidytext)
library(here)
set.seed(1234)
theme_set(theme_minimal())
Run the code below in your console to download this exercise as a set of R scripts.
usethis::use_course("cis-ds/text-analysis-fundamentals-and-sentiment-analysis")
Today let’s practice our tidytext skills with a basic analysis of song titles.
Mar 1, 2019
text
Practicing sentiment analysis with Harry Potter
library(tidyverse)
library(tidytext)
library(harrypotter)
set.seed(1234)
theme_set(theme_minimal())
Run the code below in your console to download this exercise as a set of R scripts.
usethis::use_course("cis-ds/text-analysis-fundamentals-and-sentiment-analysis")
Load Harry Potter text
Run the following code to download the harrypotter package:
Mar 1, 2019
text
Practicing tidytext with Hamilton
library(tidyverse)
library(tidytext)
library(ggtext)
library(here)
set.seed(123)
theme_set(theme_minimal())
About seven months ago, my wife and I became addicted to Hamilton.
My name is Alexander Hamilton
I admit, we were quite late to the party.
Mar 1, 2019
text
Supervised classification with text data
library(tidyverse)
library(tidymodels)
library(tidytext)
set.seed(1234)
theme_set(theme_minimal())
A common task in social science involves hand-labeling sets of documents for specific variables (e.g. manual coding). In previous years, this required hiring a set of research assistants and training them to read and evaluate text by hand.
Mar 1, 2019
text
Predicting song artist from lyrics
library(tidyverse)
library(tidymodels)
library(here)
library(stringr)
library(textrecipes)
library(themis)
library(vip)
set.seed(123)
theme_set(theme_minimal())
Run the code below in your console to download this exercise as a set of R scripts.
usethis::use_course("cis-ds/text-analysis-classification-and-topic-modeling")
Beyoncé and Taylor Swift at the 2009 MTV Video Music Awards.
Mar 1, 2019
text
Topic modeling
library(tidyverse)
library(tidymodels)
library(tidytext)
library(textrecipes)
library(topicmodels)
library(here)
library(rjson)
library(tm)
library(tictoc)
library(appa)
set.seed(1234)
theme_set(theme_minimal())
Typically when we search for information online, there are two primary methods:
Keywords - use a search engine and type in words that relate to whatever it is we want to find
Links - use the networked structure of the web to travel from page to page.
Mar 1, 2019
text