By Erik Webb
Austin Richard Post, professionally known as Post Malone, released a new album titled “beerbongs & bentleys” on April 27, 2018. Shooting up the Billboard charts, the album is breaking streaming records and making Post a better-known name in the music industry.
For those not familiar with his work, Post burst onto the scene in 2015 with his debut single “White Iverson,” which reached No. 14 on the Billboard Hot 100 before fizzling out. Post then released his first album, “Stoney,” in December 2016, gaining him more fame and popularity. Its most popular song, “Congratulations,” reached No. 8 on the Billboard charts, the highest ranking a Post song had achieved until this most recent album.
This new album, “beerbongs & bentleys,” is just Post Malone’s second, and it features guest appearances from Swae Lee, 21 Savage, Ty Dolla Sign, Nicki Minaj, G-Eazy and YG. Three singles, “Rockstar,” “Candy Paint” and “Psycho,” came out before the album and helped build its popularity and get fans excited for the release last month.
Because of the popularity of the album, I thought it would be interesting to study the lyrics and see if there was a correlation between how positive the album is and how successful “beerbongs & bentleys” has been on streaming platforms.
My hypothesis is that the lyrics of the album will be very positive in sentiment. I believe this because people would much rather listen to positive, upbeat songs than to songs with negative sentiment that are sad or depressing in nature.
To test my hypothesis, I first had to copy and paste all of the lyrics from the album into a single spreadsheet since no such dataset existed. Once I had it all compiled, I saved the 1,005-row spreadsheet as a .csv and imported it into R.
Once I had all of the lyrics loaded, I ran code that separated and counted each word used in the lyrics and made a data frame called “songs.” I then made two more data frames assigning each word a sentiment, one from the afinn lexicon and the other from the bing lexicon. I needed both: the afinn lexicon to score each song’s sentiment on a numeric scale and the bing lexicon to separate the words into positive and negative associations.
library(dplyr)
library(tidytext)
# Split the lyric lines into words, count each word, then attach lexicon sentiments
songs <- lyrics %>% dplyr::select(lines) %>% unnest_tokens(word, lines) %>% count(word, sort = TRUE)
songs_afinn <- songs %>% inner_join(get_sentiments("afinn"), by = "word")
songs_bing <- songs %>% inner_join(get_sentiments("bing"), by = "word")
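To make the tokenizing and joining steps easier to follow, here is a minimal sketch on made-up data. The `toy_lyrics` lines and the four-word `toy_lexicon` are invented for illustration and only stand in for the real spreadsheet and the afinn lexicon; the `dplyr` and `tidytext` calls are the same ones used in the code above.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the lyrics spreadsheet: one row per lyric line.
toy_lyrics <- tibble(
  song  = c("Song A", "Song A", "Song B"),
  lines = c("good night", "bad bad luck", "love this good life")
)

# Made-up miniature lexicon in the afinn style (a word plus a numeric score).
toy_lexicon <- tibble(
  word  = c("good", "bad", "love", "luck"),
  score = c(3, -3, 3, 3)
)

# One row per distinct word with its frequency, like the "songs" data frame.
toy_words <- toy_lyrics %>%
  select(lines) %>%
  unnest_tokens(word, lines) %>%
  count(word, sort = TRUE)

# inner_join keeps only the words that appear in the lexicon.
toy_scored <- toy_words %>%
  inner_join(toy_lexicon, by = "word")

print(toy_scored)
```

Words such as “night” and “this” drop out of `toy_scored` because the toy lexicon has no entry for them, which is the same thing that happens to slang in the real analysis.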
As it turned out, I only needed the bing sentiment at this point in the project; I used the afinn sentiment again later.
From there I took the 20 most frequent words with negative sentiment and the 20 most frequent words with positive sentiment and made a graph of each group.
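library(ggplot2)
# Keep the negative words, then plot the 20 most frequent
# (the positive chart is the same with sentiment == "positive")
negative_sent <- songs_bing %>% filter(sentiment == "negative")
negative_sent %>%
  head(20) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  coord_flip()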
I also decided to make a word cloud of the most popular words, separating out the stop words from the data frame to get a more accurate account of the most popular words. I then made the cloud using the top 100 words used in the album.
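library(wordcloud2)
# Drop stop words, keep the 100 most frequent words, and rename the count column for wordcloud2
songs_minus <- songs %>% anti_join(stop_words, by = "word") %>% head(100)
names(songs_minus)[2] <- "freq"
wordcloud2(songs_minus, size = 1, shape = "oval", fontFamily = "Arial", color = "random-dark")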
As you can see from the word cloud, the most popular word is “yeah,” followed by “baby,” “ayy,” “fault” and “night.” If you look at the rest of the words on the cloud, it shouldn’t be a surprise to see a lot of curse words and words from the titles, and therefore from the choruses, of songs on the album.
It wasn’t surprising to see the word “like” used so much in the lyrics, as it is a very common word and obviously used far more frequently than most. What did surprise me was the list of other words that are associated with positive sentiment, mostly because I forgot that they are used in the lyrics. These include the words “saint,” “promise” and “rich.” It didn’t surprise me, however, that words like “rockstar” and “better” appeared on the list because they are
There were a lot more negative words on the lists than positive words, which was surprising to see and later confirmed by my sentiment analysis of each song. I forgot how negative a sentiment curse words have associated with them, so it makes sense that a couple of the words on the list are curse words. I forgot the word “spoil” was in one of the song titles and was surprised to see it on the list at first, but that feeling quickly subsided.
To get the general sentiment of the album, I also broke it down by song and analyzed which songs have a more positive sentiment and which have a more negative one. I did this by adding up the afinn sentiment score attached to each word to get an aggregate score for each song.
# Score one song: keep its lines, tokenize, join afinn, and weight each word's score by its count
spoil_my_night <- lyrics %>%
  filter(song == "Spoil My Night") %>%
  dplyr::select(lines) %>%
  unnest_tokens(word, lines) %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  count(word, score, sort = TRUE) %>%  # in newer tidytext releases the afinn column is named "value"
  mutate(total = score * n)
sum(spoil_my_night$total)
# total_sentiment is not built in the code shown here; it is assumed to hold the per-song totals in columns title and score
songSentRearranged <- arrange(total_sentiment, desc(score))
sentiment_graph <- ggplot(songSentRearranged, aes(reorder(title, -score), score)) +
  geom_bar(stat = "identity", fill = "#bf80ff")
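The per-song totals can also be computed for every song in one pass instead of one song at a time. The sketch below is an alternative under stated assumptions, not the code I actually ran: `toy_lyrics` and `toy_lexicon` are made-up stand-ins for the lyrics spreadsheet and the afinn lexicon, and grouping by `song` replaces the per-song filter.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the lyrics spreadsheet: one row per lyric line.
toy_lyrics <- tibble(
  song  = c("Song A", "Song A", "Song B"),
  lines = c("good night", "bad bad luck", "love this good life")
)

# Made-up miniature lexicon in the afinn style (a word plus a numeric score).
toy_lexicon <- tibble(
  word  = c("good", "bad", "love", "luck"),
  score = c(3, -3, 3, 3)
)

# unnest_tokens keeps the song column, so every word stays tied to its song.
# Summing the scores per song counts a word once for each time it is sung,
# which matches multiplying score by count and then adding the totals.
song_scores <- toy_lyrics %>%
  unnest_tokens(word, lines) %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(song) %>%
  summarise(score = sum(score)) %>%
  arrange(desc(score))

print(song_scores)
```

A data frame shaped like `song_scores` could be passed straight to the bar chart, with `song` taking the place of `title`.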
The problem I found with this scoring approach is that a lot of the lyrics in rap songs are slang or modified words that are not found in normal speech and, therefore, not in the lexicon. Each song only had about 10 to 20 words that could be measured, so the score for each song might not be very accurate. But I still believe it gives a general indication of the sentiment of each song so you can compare them.
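One way to see how much of each song the lexicon actually covers is to count matched and unmatched words side by side. This is a rough sketch on made-up data (`toy_lyrics` and `toy_lexicon` are invented stand-ins), and it counts every occurrence of a word rather than distinct words, which is an assumption about what “measured” should mean.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the lyrics spreadsheet: one row per lyric line.
toy_lyrics <- tibble(
  song  = c("Song A", "Song A", "Song B"),
  lines = c("good night", "bad bad luck", "love this good life")
)

# Made-up miniature lexicon in the afinn style (a word plus a numeric score).
toy_lexicon <- tibble(
  word  = c("good", "bad", "love", "luck"),
  score = c(3, -3, 3, 3)
)

# left_join keeps every word; words missing from the lexicon get an NA score.
coverage <- toy_lyrics %>%
  unnest_tokens(word, lines) %>%
  left_join(toy_lexicon, by = "word") %>%
  group_by(song) %>%
  summarise(
    words   = n(),                 # all words sung in the song
    matched = sum(!is.na(score)),  # words the lexicon could score
    share   = matched / words      # fraction of the song that was scored
  )

print(coverage)
```

A low `share` for a song is a warning that its total rests on only a small part of its lyrics.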
After adding up the scores for each song, here are the results:
Song | Sentiment Score |
---|---|
Paranoid | -5 |
Spoil My Night | -5 |
Rich & Sad | -28 |
Zack and Codeine | -4 |
Takin' Shots | -4 |
Rockstar | 3 |
Over Now | -13 |
Psycho | -18 |
Better Now | 36 |
Ball For Me | 25 |
Otherside | 0 |
Stay | -39 |
Blame It On Me | -29 |
Same Bitches | -84 |
Jonestown (Interlude) | -6 |
92 Explorer | 10 |
Candy Paint | -33 |
Sugar Wraith | -9 |
I was very surprised to see such negative scores overall in my initial findings. It wasn’t what I was expecting, but I quickly realized how much curse words affect the score, since they all carry a very low sentiment score and are used quite frequently.
After conducting my research, I found that my hypothesis was not supported. Only four of the 18 songs on the album have a positive sentiment score, whereas most of the songs are pretty negative. The breakdown of words by sentiment shows this, and the word cloud gives a general idea of which words are used most frequently throughout the album.
I would be interested to see how much the results would change if I were to adjust the lexicons, because I believe that to be the biggest issue in conducting this research. With swear words carrying such a negative sentiment value and the slang words not included in the lexicon to begin with, I think that my results were skewed from the beginning.
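Adjusting a lexicon could look something like the sketch below, which I have not run on the real lyrics. Everything in it is made up for illustration: the `toy_lyrics`, the `toy_lexicon`, and especially the `custom_words` scores, which are arbitrary placeholders rather than values I would defend.

```r
library(dplyr)
library(tidytext)

# Made-up stand-in for the lyrics spreadsheet: one row per lyric line.
toy_lyrics <- tibble(
  song  = c("Song A", "Song A", "Song B"),
  lines = c("good night ayy", "bad bad luck", "love this good life")
)

# Made-up miniature lexicon in the afinn style (a word plus a numeric score).
toy_lexicon <- tibble(
  word  = c("good", "bad", "love", "luck"),
  score = c(3, -3, 3, 3)
)

# Hypothetical adjustments: neutralize one existing word and add one slang word.
custom_words <- tibble(
  word  = c("bad", "ayy"),
  score = c(0, 1)
)

# Custom entries replace any lexicon entry for the same word.
adjusted_lexicon <- toy_lexicon %>%
  anti_join(custom_words, by = "word") %>%
  bind_rows(custom_words)

# Total score per song for a given lexicon.
score_songs <- function(lexicon) {
  toy_lyrics %>%
    unnest_tokens(word, lines) %>%
    inner_join(lexicon, by = "word") %>%
    group_by(song) %>%
    summarise(score = sum(score))
}

# Side-by-side comparison of the original and adjusted scorings.
comparison <- inner_join(
  score_songs(toy_lexicon),
  score_songs(adjusted_lexicon),
  by = "song",
  suffix = c("_before", "_after")
)

print(comparison)
```

Comparing the two score columns shows how sensitive a song’s total is to a handful of lexicon entries.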
Despite how popular the album is, its fame does not come from songs with very high sentiment scores but rather from fans of Post listening to an artist they enjoy.