The Hunger Games Trilogy Sentiment Analysis

By Annie Harris

The Hunger Games trilogy by Suzanne Collins is one of the most popular and widely translated trilogies of its genre. It consists of The Hunger Games, Catching Fire, and Mockingjay. Published between 2008 and 2010, the novels won over teens and young adults with their fierce display of strength, loyalty, identity, and love.

The first book is set in the dystopian nation of Panem. Here, the government forces young residents of the nation to battle for their lives in a series of games called the Hunger Games. The series follows the main characters, Katniss, Peeta, and Gale, as they fight against the powerful Capitol to gain independence. Throughout the series, the main characters suffer loss, experience the deaths of friends and family, and endure harsh conditions while battling the Capitol. As the stakes of that fight grow, I expect the sentiment of the novels to grow harsher. I therefore hypothesize that the Hunger Games trilogy develops a more negative sentiment towards the Capitol over the course of the books.

Top 20 Words

To start this analysis, I wanted to look at the twenty most frequent words in each book to better understand the context of the sentiment.

Step 1: Download The Hunger Games Trilogy in .txt format.

Step 2: Install packages and load: scales, tidyverse, tidytext, ggplot2, stringr, and readtext.

Step 3: Read the .txt file and count the twenty most frequent words in the novel, excluding stop words. Graph the results. The example code below is for The Hunger Games; the same code was reused for Catching Fire and Mockingjay with the file and variable names substituted.

library(tidyverse)

library(tidytext)

thehungergames = readLines("THEHUNGERGAME.txt")

hungergamesDF = data.frame(text = thehungergames)

document_linesHG = unnest_tokens(hungergamesDF, input = text, output = line, token = "sentences", to_lower = F)

df_text_bigrams_tidyHG = document_linesHG %>% unnest_tokens(output = bigram, input= line, token = "ngrams", n=2)

df_text_bigrams_tidyHG %>% count(bigram, sort = TRUE)

bigrams_separatedHG = df_text_bigrams_tidyHG %>% separate(bigram, c("word1", "word2"), sep = " ")

bigrams_filteredHG = bigrams_separatedHG %>% filter(!word1 %in% stop_words$word) %>% filter(!word2 %in% stop_words$word)

document_linesHG$lineNo = seq_along(document_linesHG$line)

thehungergamesWords = document_linesHG %>% unnest_tokens(output = word, input = line, token = "words")

thehungergamesWords2 = thehungergamesWords %>% anti_join(stop_words, by= c("word"))

thehungergamesWords3 = thehungergamesWords2 %>% count(word, sort = TRUE)

thehungergamesWords4 = thehungergamesWords3 %>% filter(!word %in% c("im", "hes", "dont", "ive" ))

thehungergamesWords4 %>% top_n(20, n) %>% ggplot(aes(reorder(word, n), n)) + geom_bar(stat="identity") + coord_flip() + ggtitle("The Hunger Games")
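
Because the same steps were repeated for all three books, they could also be wrapped in a function. The following is a minimal sketch, not the code used for the graphs in this post: the helper name top_words, its arguments, and the toy input are assumptions made for illustration.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper: count the most frequent non-stop-words in a character
# vector of text lines (for example, the result of readLines()).
top_words <- function(lines, n_top = 20, extra_stop = character()) {
  data.frame(text = lines, stringsAsFactors = FALSE) %>%
    unnest_tokens(output = word, input = text) %>%  # one lowercase word per row
    anti_join(stop_words, by = "word") %>%          # drop common stop words
    filter(!word %in% extra_stop) %>%               # drop leftovers such as "im"
    count(word, sort = TRUE) %>%                    # most frequent first
    slice_head(n = n_top)
}

# Toy input standing in for a book (made-up sentences, not text from the novels).
toy_lines <- c("Peeta looks at Katniss.",
               "Katniss runs from the Capitol.",
               "Katniss hides.")
toy_top <- top_words(toy_lines, n_top = 3)
print(toy_top)
```

With real data, the call would look like top_words(readLines("THEHUNGERGAME.txt"), extra_stop = c("im", "hes", "dont", "ive")), and the result could be passed to the same ggplot call as above.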

Top 20 Findings

After analyzing the top twenty words from each book, I was not surprised to see that Peeta was the top word in each. I was interested to see that “Capitol” rose to number 6 in Catching Fire and then to number 3 in Mockingjay. I was also interested to see that “kill” became one of the top twenty words in Mockingjay. These graphs tend to support my hypothesis that the books develop a more negative sentiment throughout.
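
The rank of a single word such as “Capitol” can also be computed directly rather than read off the graphs. This is a minimal sketch under stated assumptions: the counts data frame and its numbers are placeholders invented for illustration, not the counts from the novels, and the book labels A and B are hypothetical.

```r
library(dplyr)

# Placeholder word counts for two hypothetical books (not real results).
counts <- data.frame(
  book = c("A", "A", "A", "B", "B", "B"),
  word = c("peeta", "capitol", "bread", "peeta", "capitol", "bread"),
  n    = c(10, 3, 5, 9, 7, 2)
)

# Rank words within each book by frequency, then keep only the word of interest.
word_ranks <- counts %>%
  group_by(book) %>%
  mutate(rank = min_rank(desc(n))) %>%  # rank 1 = most frequent word in that book
  ungroup() %>%
  filter(word == "capitol") %>%
  select(book, n, rank)

print(word_ranks)
```

In practice, counts would be built by stacking the per-book word counts with a book column added to each.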

Sentiment Analysis

I next wanted to look at the overall sentiment of the words in each book. Here, I would expect The Hunger Games to have the most positive sentiment, and Catching Fire and Mockingjay to have less positive sentiment.

Step 1: Install packages and load: tidyverse, tidytext, and devtools.

Step 2: Call sentiment packages.

Step 3: Filter the sentiment packages and graph the results. This process was repeated for each book.

get_sentiments("afinn")

hungersentiment = thehungergamesWords4 %>% inner_join(get_sentiments("nrc"), by = "word")

ggplot(data = hungersentiment, aes(x = sentiment, y = n)) + geom_bar(aes(fill = sentiment), stat = "identity") + theme(legend.position = "none") + xlab("Sentiment") + ylab("Total Count") + ggtitle("Total Sentiment Score for Hunger Games")

Sentiment Findings

I was not surprised to see that The Hunger Games had very similar negative and positive sentiment word scores. However, I was surprised to see that Catching Fire had a more positive word score than negative. Similarly, Catching Fire also saw an increase in words associated with joy. Finally, I was not surprised to see that Mockingjay had a more negative word sentiment than the previous two books.
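
Since the three books differ in length, raw totals are not directly comparable between them. One option is to convert each sentiment total into a share of that book's matched words. The sketch below is an illustration only: the word counts and the two-category toy_lexicon are made-up placeholders so that the example runs without downloading the nrc lexicon, and the book labels are hypothetical.

```r
library(dplyr)

# Placeholder word counts for two hypothetical books (not real results).
word_counts <- data.frame(
  book = c("A", "A", "A", "B", "B"),
  word = c("happy", "fear", "love", "fear", "kill"),
  n    = c(4, 2, 2, 6, 2)
)

# Tiny stand-in lexicon; a real analysis would use get_sentiments("nrc").
toy_lexicon <- data.frame(
  word      = c("happy", "love", "fear", "kill"),
  sentiment = c("positive", "positive", "negative", "negative")
)

sentiment_share <- word_counts %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(book, sentiment) %>%
  summarise(total = sum(n), .groups = "drop") %>%  # total count per sentiment
  group_by(book) %>%
  mutate(share = total / sum(total)) %>%           # fraction of matched words
  ungroup()

print(sentiment_share)
```

Plotting share instead of the total count would put all three books on the same scale.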

Popular Negative and Positive Sentiment Words

Finally, to test my hypothesis, I wanted to look at the ten most frequent negative and positive sentiment words in each book. I hypothesized that as the trilogy progressed, the negative sentiment words would relate to the characters' desperation for independence from the Capitol. I would therefore expect some of the negative sentiment words in the later novels to relate to government power and the abuse of power.

Step 1: Install packages and load: tidyverse, tidytext, and devtools.

Step 2: Call sentiment packages.

Step 3: Filter the sentiment packages and graph the results. This process was repeated for each book.

bing_word_countsHG = thehungergamesWords %>% inner_join(get_sentiments("bing")) %>% count(word, sentiment, sort = TRUE) %>% ungroup()

bing_word_countsHG %>% group_by(sentiment) %>% top_n(10) %>% ungroup() %>% mutate(word = reorder(word, n)) %>% ggplot(aes(word, n, fill = sentiment)) + geom_col(show.legend = FALSE) + facet_wrap(~sentiment, scales = "free_y") + labs(y = "Contribution to sentiment", x = NULL) + coord_flip()

Popular Negative and Positive Sentiment Words Analysis

Through this final analysis, I concluded that the data did not support my hypothesis. The negative and positive sentiment words reflect the individual characters' thoughts and emotions more than their outward emotions towards their environment and the Capitol. I was interested to see the spike in popularity of the word “peacekeepers.” I would hypothesize that despite the harsh conditions, the characters attempted to stay positive to continue their fight for independence.

The outlier in this analysis, however, would be the word “uprising.” While this does align with my initial hypothesis, the trend does not continue in Mockingjay.