Lyrical Analysis of Taylor Swift's Albums

By: Jacqueline Hickey

Introduction

For this text analysis project, I analyzed Taylor Swift’s albums to see whether the sentiment of her lyrics has changed over time. In 2006, Taylor Swift started her career in country music with her debut album, Taylor Swift. The album peaked at number five on the Billboard 200, and Swift became recognized in the country music community. Two years later, in 2008, listeners saw her music begin to shift from country toward pop with the release of her second album, Fearless, which featured a mixture of country and pop songs. In 2010, Swift released her third album, Speak Now. She went on to release three more albums, Red, 1989, and Reputation, in 2012, 2014, and 2017 respectively. As Swift’s career expanded, she moved from her early days in country music to fully pop music. As she gained popularity, she also began to include more songs about breakups and feuds with other celebrities.

Hypothesis: The negative sentiment in Taylor Swift’s lyrics, and the number of negative words she uses, has increased since her first album in 2006.

Required Packages

To perform the sentiment analysis on Taylor Swift’s six albums, I installed and loaded the following packages: tidyverse, tidytext, wordcloud2, and ggplot2.

Lyrics

To get the lyrics for each album, I used the genius package, which is available on GitHub. Once the package was installed in R, I retrieved each album with the following code, entering the album name in the album argument each time. I have provided the code for the “Taylor Swift” album below.

devtools::install_github("josiahparry/genius")
library(genius)
genius_album(artist="Taylor Swift", album="Taylor Swift") -> swift

Filtering The Data

To begin the analysis, I examined the structure of the lyrics data using the str function in R. I then unnested the lyrics into one-word tokens and removed stop words. After examining the top words for the Reputation album, I noticed that many of them were vocal sounds rather than actual words, so I filtered those out. These sounds include: di, da, ooh, ha, ah, uh, mm, ey, and la. Below I have provided the code for the “Taylor Swift” album.


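library(tidyverse)
library(tidytext)

# Inspect the structure of the lyrics data
str(swift)
# Split lyrics into single words, drop stop words, and count each word
swift %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE) -> swiftCount
# Remove vocal sounds that are not actual words
swiftCount %>%
  filter(!word %in% c("di", "da", "ooh", "ha", "ah", "uh", "mm", "ey", "la")) -> swiftCountFiltered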
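
The same steps have to be repeated for each of the six albums, so one option is to wrap them in a small function. The sketch below is only an illustration: count_album_words, filler_sounds, and the toy lyric lines are hypothetical names and made-up data standing in for the data frame returned by genius_album, and it assumes the lyrics sit in a column called lyric, as in the code above.

```r
library(dplyr)
library(tidytext)

# Vocal sounds listed in the text above.
filler_sounds <- c("di", "da", "ooh", "ha", "ah", "uh", "mm", "ey", "la")

# Hypothetical helper: takes a data frame with a `lyric` column and
# returns word counts with stop words and vocal sounds removed.
count_album_words <- function(album_lyrics) {
  album_lyrics %>%
    unnest_tokens(word, lyric) %>%
    anti_join(stop_words, by = "word") %>%
    filter(!word %in% filler_sounds) %>%
    count(word, sort = TRUE)
}

# Made-up lines standing in for the output of genius_album().
toy_album <- tibble(
  track_title = c("Song A", "Song A", "Song B"),
  lyric = c("Ooh la la, the rain keeps falling",
            "Rain on the window, ooh",
            "Dancing in the rain all night")
)

toy_counts <- count_album_words(toy_album)
print(toy_counts)
```

With real data, the same function could be called once per album, for example count_album_words(swift), so that every album is filtered in exactly the same way.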

Album Sentiments Using Bing

I then analyzed how each word contributed to each sentiment using the Bing lexicon, which labels words as either positive or negative. I wanted to compare the positive and negative words used most often in the albums. I inner joined the word counts with the Bing lexicon and kept only the words that appeared more than three times, because a large number of words appear only one, two, or three times in the songs. I then created a graph showing how much each word contributed to its sentiment in the album.


# Attach a positive or negative label to each word found in the Bing lexicon
swiftCountFiltered %>%
  inner_join(get_sentiments("bing"), by = "word") -> swiftSentBing
# Keep words used more than three times, ordered by count for plotting
swiftSentBing %>%
  mutate(word = reorder(word, n)) %>%
  filter(n > 3) -> swiftSentGraph
# One panel per sentiment, with horizontal bars
ggplot(swiftSentGraph, aes(word, n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment", x = NULL) +
  coord_flip() +
  ggtitle("Taylor Swift Album Contribution to Sentiment")

[Figures: Bing contribution-to-sentiment graphs for Taylor Swift, Fearless, Speak Now, Red, 1989, and Reputation]
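
Because the graphs only show words used more than three times, a per-album total can make the comparison between albums more direct. The sketch below is one possible way to compute it, not the method used for the graphs in this section. The album names, the counts, and the hand-labeled toy_bing table are made up for illustration; in practice the labels would come from get_sentiments("bing").

```r
library(dplyr)
library(tidyr)

# Hypothetical word counts for two albums (made-up numbers).
toy_counts <- tibble(
  album = c("Album A", "Album A", "Album A", "Album B", "Album B", "Album B"),
  word  = c("love", "bad", "smile", "love", "bad", "lost"),
  n     = c(10, 4, 5, 6, 8, 7)
)

# Tiny stand-in for get_sentiments("bing"), labeled by hand for this sketch.
toy_bing <- tibble(
  word      = c("love", "smile", "bad", "lost"),
  sentiment = c("positive", "positive", "negative", "negative")
)

# Total positive and negative word uses per album, plus the negative share.
album_summary <- toy_counts %>%
  inner_join(toy_bing, by = "word") %>%
  group_by(album, sentiment) %>%
  summarise(total = sum(n), .groups = "drop") %>%
  pivot_wider(names_from = sentiment, values_from = total, values_fill = 0) %>%
  mutate(negative_share = negative / (negative + positive))

print(album_summary)
```

A rising negative_share from the first album to the sixth would support the hypothesis, while a flat or falling share would not.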

Findings

As the six graphs of Swift’s albums show, the albums generally contain more negative words than positive words. In Swift’s first album, however, the positive words made a larger contribution to sentiment than the negative words. In her sixth album, there were about as many positive words as negative words, and the positive words contributed more. The album 1989 had words that contributed heavily to both positive and negative sentiment, namely love (positive) and shake (negative), although overall the word bad had the largest contribution to sentiment. Taken together, the graphs show that Taylor Swift used a large number of negative words in every album, while the number of positive words increased from album to album except in 1989. Overall, the Bing analysis does not show an increase in negative words or sentiment; it shows the opposite, an increase in positive words and sentiment.

Album Analysis Using Afinn

I then analyzed the albums using Afinn. The Afinn lexicon is a list of English words rated for valence with an integer between minus five (most negative) and plus five (most positive). This scale is useful for seeing how strongly positive or negative the words in the songs are. I inner joined the word counts with the Afinn lexicon and again kept only the words that appeared more than three times.


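# Attach an Afinn valence score to each word found in the lexicon
# (this code uses the column name score; newer tidytext releases call it value)
swiftCountFiltered %>%
  inner_join(get_sentiments("afinn"), by = "word") -> swiftSenAfinn
# Keep words used more than three times
swiftSenAfinn %>%
  mutate(word = reorder(word, n)) %>%
  filter(n > 3) -> swiftAfinnGraph
# Bars ordered and colored by Afinn score
ggplot(swiftAfinnGraph, aes(reorder(word, score), score, fill = score)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Afinn Score") +
  xlab("Word")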

Visualizations

[Figures: Afinn score graphs for Taylor Swift, Fearless, Speak Now, Red, 1989, and Reputation]
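
One way to condense each graph into a single number is a count-weighted mean valence: multiply each word's score by the number of times it appears, add the results, and divide by the total number of scored word uses. The sketch below illustrates the arithmetic and is not part of the original analysis. The album names and counts are made up, and toy_afinn is a hand-entered stand-in for get_sentiments("afinn") whose score column is assumed here to be named value.

```r
library(dplyr)

# Hypothetical word counts for two albums (made-up numbers).
toy_counts <- tibble(
  album = c("Album A", "Album A", "Album A", "Album B", "Album B", "Album B"),
  word  = c("love", "bad", "hate", "love", "happy", "bad"),
  n     = c(10, 5, 2, 8, 6, 2)
)

# Hand-entered stand-in for the Afinn lexicon; `value` is an assumed column name.
toy_afinn <- tibble(
  word  = c("love", "happy", "bad", "hate"),
  value = c(3, 3, -3, -3)
)

# Count-weighted mean valence per album: sum(n * value) / sum(n).
album_valence <- toy_counts %>%
  inner_join(toy_afinn, by = "word") %>%
  group_by(album) %>%
  summarise(mean_valence = sum(n * value) / sum(n), .groups = "drop")

print(album_valence)
```

In this toy data, Album B works out to (24 + 18 - 6) / 16 = 2.25, which is higher than Album A at 9 / 17, so Album B would count as the more positive album.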

Findings

In Taylor Swift’s first album, a little more than half of the words used more than three times had a positive Afinn score. This share increases in the second album, Fearless, where only one word has a score above 2. In the Speak Now album, more words have a negative Afinn score. Red and 1989 have the most words with a score of negative three. In Reputation, the words are split between negative and positive, and it is the first album to have a word with a positive Afinn score of four. It is interesting that none of Swift’s words score lower than negative three, probably because she does not use explicit language in her lyrics. Likewise, only one word scores above positive three. Across the six graphs, Swift’s music becomes more negative, with negative words used more often, until Reputation. I expected Reputation to be the most negative album, so it was interesting to see that 1989 actually was.

Conclusions

Overall, Taylor Swift's music did not increase in negativity or in the number of negative words over time. Although her music shifted from country to pop, the shift had no real effect on the sentiment of her language. I had expected both the number of negative words and the strength of their negativity to keep increasing from Taylor Swift to Reputation. Instead, the Bing analysis shows that the number of positive words increases from album to album, with 1989 as the exception. Although the Afinn analysis shows an increase in negative words and a decrease in valence for Red and 1989, the negative words decrease and the valence increases again for the Reputation album.