R: tidytext: tidytext hgwells

From OnnoWiki
Jump to navigation Jump to search
# Ref: https://github.com/dgrtwo/tidy-text-mining/blob/master/01-tidy-text.Rmd
library(knitr)
opts_chunk$set(message = FALSE, warning = FALSE, cache = TRUE)
options(width = 100, dplyr.width = 100)
library(ggplot2)
theme_set(theme_light())
## Word frequencies
library(gutenbergr)
hgwells <- gutenberg_download(c(35, 36, 5230, 159))
hgwells
# load("data/hgwells.rda")
tidy_hgwells <- hgwells %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words)
# word count
tidy_hgwells %>%
  count(word, sort = TRUE)


Referensi

Pranala Menarik