A word cloud
I did a short analysis of Winston Churchill’s “Their finest hour”. I did this for a class on data collection and production. Here is a word cloud I created from this analysis:
If you are interested in the code that I used, you can have a look below:
# Data Method: Text mining
# File: textmining1.R
# Theme: Download text data from web and create wordcloud
# Install the easypackages package
install.packages("easypackages")
library(easypackages)
# Load multiple packages using easypackage function "packages"
packages("XML","wordcloud","RColorBrewer","NLP","tm","quanteda", prompt = T)
# Download text data from website
churchLocation <-URLencode("http://www.historyplace.com/speeches/churchill-hour.htm")
# use htmlTreeParse function to read and parse paragraphs
doc.html<- htmlTreeParse(churchLocation, useInternal=TRUE)
church <- unlist(xpathApply(doc.html, '//b', xmlValue))
church <- church[-26] #Getting rid of the unnecessary last document
head(church, 3)
# Vectorize mlk
words.vec <- VectorSource(church)
# Check the class of words.vec
class(words.vec)
# Create Corpus object for preprocessing
words.corpus <- Corpus(words.vec)
inspect(words.corpus)
# Turn all words to lower case
words.corpus <- tm_map(words.corpus, content_transformer(tolower))
# Remove punctuations, numbers
words.corpus <- tm_map(words.corpus, removePunctuation)
words.corpus <- tm_map(words.corpus, removeNumbers)
# How about stopwords, then uniform bag of words created
words.corpus <- tm_map(words.corpus, removeWords, stopwords("english"))
# Create Term Document Matrix
tdm <- TermDocumentMatrix(words.corpus)
inspect(tdm)
m <- as.matrix(tdm)
wordCounts <- rowSums(m)
wordCounts <- sort(wordCounts, decreasing=TRUE)
head(wordCounts)
# Create Wordcloud
cloudFrame<-data.frame(word=names(wordCounts),freq=wordCounts)
set.seed(1234)
wordcloud(cloudFrame$word,cloudFrame$freq)
wordcloud(names(wordCounts),wordCounts, min.freq=3,random.order=FALSE, max.words=500,scale=c(3,.5), rot.per=0.35,colors=brewer.pal(8,"Dark2"))
# Run the program on Winston Churchill's Finest Hour speech?
# http://www.historyplace.com/speeches/churchill-hour.htm