Changes in Music Over the Last 50 Years

By: Kate Wallace

Background & Hypothesis

Earlier this semester, I used the Genius and Spotify APIs to determine whether Maroon 5 had gotten more poppy over the course of their 6 studio albums. To define poppy, I used the New World Encyclopedia definition, which states that “pop music is often distinguished… by stylistic traits such as a danceable rhythm or beat, simple melodies, and a repeating structure.” To test this, I looked at the unique word count for each track, the most common words, and the Spotify API’s Danceability measurement. You can look at this project here.

My findings showed fairly conclusively that Maroon 5 had in fact gotten more poppy (more repetitive and more danceable) and that the most common words stayed the same throughout. I decided to take this a step further to see whether it was just Maroon 5 who got more poppy, or whether a shift toward poppiness had occurred in music overall.

Obviously, I can’t look at every album or track ever released, so I turned to a Wikipedia article titled List of best-selling albums by year in the United States. It shows the top album from every year for the last 6 decades, based on Billboard data. Here is a link to Billboard's methodology. I decided to look at only the last five decades because many of the top albums of the 60s were soundtracks. It is important to note that I took out every year in which a soundtrack was the top album, because I felt soundtracks are a very different genre: they are meant to be consumed with a visual. I also took out Josh Groban’s Christmas album because many of those Christmas songs are timeless.

For this project, I looked not only at unique words and danceability, but also at energy, sentiment, and musical key. I hypothesize that over the course of 5 decades, the albums will have fewer unique words, be more danceable, more energetic, and more negative, and will stay mostly in major keys. The fewer unique words, greater danceability, and higher energy all come from the pop definition, as I feel these changes would show a more pop style. I don't think the keys will change because there are only so many to choose from, and certain keys are simpler and more user friendly. I hypothesize that music will be more negative because, as the years have gone by, artists have become more vulnerable and more comfortable sharing their true feelings.

Even if none of my hypotheses are true, I thought it would be interesting to see how music has changed over time, if at all.

Word Count

One adjective used to describe poppiness is “repetitive.” I decided that the average unique word count per album would help show repetitiveness, because songs with fewer unique words are more repetitive. To do this, I first used the Genius API to get the lyrics for each album. Then, I used the count function in R to list each distinct word once. I looked in the environment to see how many observations (unique words) there were for each album and divided that number by the number of tracks on the album. This gave me the average number of unique words per track for the album. Below is the code to get the lyrics and find the count.

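library(genius); library(tidytext); library(dplyr)  # packages assumed for genius_album(), unnest_tokens()/stop_words, and %>%/count()

SimonaandGarfunkel <- genius_album(artist = 'Simon and Garfunkel', album = 'Bridge over Troubled Water')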

SimonaandGarfunkel %>% unnest_tokens (word, lyric) %>% anti_join(stop_words) %>% count(word, sort = TRUE) -> BridgeovertroubledwaterCount

This graph shows the average number of unique words per track for each year's top album. For example, in 1970 the number one album was Bridge over Troubled Water by Simon & Garfunkel. After running the above code, I found that the environment showed 353 unique words, which I divided by 11, the number of tracks on the album. I recorded all of this in an Excel worksheet because I found it easier to keep all of the numbers straight, and because I wanted to create the graphs in Tableau.

353 divided by 11 gave me an average of 32.1, which is shown on the graph above. This graph shows that the average number of unique words actually increases over the years. The huge spikes in the 2000s and 2010s come from rap albums, which don’t follow the typical verse, chorus, verse, repeat chorus, bridge, repeat chorus structure. They also simply have more unique words overall. To see whether these rap albums were skewing the data, I created two graphs: one showing the average per decade with rap and one showing it without.
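The same calculation can also be done entirely in code instead of reading the observation count from the environment. Below is a minimal sketch in base R, assuming a toy lyrics table and a tiny made-up stop word list (toy_lyrics and toy_stop_words are hypothetical stand-ins, not the real genius_album output or tidytext's stop_words).

```r
# Toy stand-in for an album's lyrics: one row per lyric line (hypothetical data).
toy_lyrics <- data.frame(
  track_title = c("Song A", "Song A", "Song B", "Song B"),
  lyric       = c("sail on silver girl", "sail on by",
                  "bye bye love", "bye bye happiness"),
  stringsAsFactors = FALSE
)

# Hypothetical stop word list; the real analysis used tidytext's stop_words.
toy_stop_words <- c("on", "by")

# Split every lyric line into lowercase words and drop the stop words.
words <- unlist(strsplit(tolower(toy_lyrics$lyric), "[^a-z']+"))
words <- words[words != "" & !(words %in% toy_stop_words)]

# Unique words on the album, divided by the number of tracks.
unique_words         <- length(unique(words))
n_tracks             <- length(unique(toy_lyrics$track_title))
avg_unique_per_track <- unique_words / n_tracks

# The Bridge over Troubled Water numbers from the text: 353 words, 11 tracks.
bridge_avg <- round(353 / 11, 1)
```

Note that this divides the album-wide unique word count by the number of tracks, as described above; it is not the same as counting unique words within each track separately and then averaging.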

The rap albums I didn’t include in the ‘without rap’ graph are The Eminem Show by Eminem (2002), Get Rich or Die Tryin’ by 50 Cent (2003), Tha Carter III by Lil Wayne (2008), Recovery by Eminem (2010), Views by Drake (2016), and Scorpion by Drake (2018).

I found the average unique word count by decade by averaging the per-album averages for all albums within the decade. Although taking out the rap albums did bring the unique words per album a bit closer together, both graphs show that the number of unique words per album has increased since the 1970s.
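The decade averages, with and without rap, can be sketched in base R as follows. Everything here is an assumption for illustration: the albums table, its column names, and every value except the 1970 figure of 32.1 are made up.

```r
# Hypothetical per-album averages; only the 1970 value (32.1) comes from the text.
albums <- data.frame(
  year             = c(1970, 1975, 2002, 2005),
  avg_unique_words = c(32.1, 40.0, 90.0, 50.0),
  is_rap           = c(FALSE, FALSE, TRUE, FALSE)
)

# Map each year to its decade, e.g. 1975 -> 1970.
albums$decade <- floor(albums$year / 10) * 10

# Average of the album averages within each decade, with and without rap albums.
with_rap    <- aggregate(avg_unique_words ~ decade, data = albums, FUN = mean)
without_rap <- aggregate(avg_unique_words ~ decade,
                         data = albums[!albums$is_rap, ], FUN = mean)
```

Comparing with_rap and without_rap side by side shows how much a single high-word-count album can pull a decade's average.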

Sentiment

The next variable I wanted to look at was the sentiment of each album’s lyrics over time. To do this, I also used the Genius API. I took out the stop words for each album, and then used get_sentiments to find an AFINN score for the words. AFINN gives each word a score between -5 and 5 based on the word’s sentiment, with negative scores for negative words and positive scores for positive words. Once I had the AFINN scores for every word the lexicon covers, I used R to compute the average score for each album and recorded these averages in Microsoft Excel.

SimonaandGarfunkel %>% unnest_tokens (word, lyric) %>% anti_join(stop_words) -> BridgeOverSent

BridgeoverSentiment <- BridgeOverSent %>% inner_join(get_sentiments("afinn"))

mean(BridgeoverSentiment[["score"]])
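To make the join-then-average step concrete, here is a minimal sketch in base R. It assumes a tiny hand-written lexicon (toy_afinn) in place of the real AFINN list, and its scores are illustrative only. Depending on the tidytext version, the numeric column returned by get_sentiments("afinn") may be named score or value, so the column name used in mean() may need adjusting.

```r
# Words left after removing stop words (hypothetical example).
album_words <- data.frame(
  word = c("love", "troubled", "water", "friend", "pain"),
  stringsAsFactors = FALSE
)

# Toy lexicon standing in for AFINN; scores are illustrative, not authoritative.
toy_afinn <- data.frame(
  word  = c("love", "troubled", "pain"),
  score = c(3, -2, -2),
  stringsAsFactors = FALSE
)

# merge() with default settings behaves like an inner join:
# words without a lexicon entry ("water", "friend") are dropped.
scored <- merge(album_words, toy_afinn, by = "word")

# Average sentiment over the scored words only.
album_sentiment <- mean(scored$score)
```

Because unscored words are dropped by the join, the average reflects only the words the lexicon knows about, not every word on the album.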

The averages are shown in the graph below.

This graph shows the average sentiment per #1 album by year. 0 is neutral, positive numbers indicate positive sentiment, and negative numbers indicate negative sentiment. I found this graph very interesting because not only does average sentiment seem to get more negative over time, but the sentiments of individual albums also seem to become more polarized, meaning most of the albums are either extremely negative or extremely positive. For example, The Eminem Show (2002) had an average sentiment of -1.64 and The 20/20 Experience (2014) had an average sentiment of 1.26, while most of the albums in the earlier decades stayed closer to 0 (between -0.5 and 0.5). I feel like this shift could be because artists felt more comfortable expressing their explicit feelings, especially the negative ones. They didn’t try to play it safe because people clearly wanted music that they could relate to.

For this graph, I took the average of each decade in Excel using the =AVERAGE function over the cells holding the average sentiment for the years within the decade (for example, I chose the sentiment values for 1970-1979 to get the average for the 1970s). It reiterates that music has gotten more negative, especially in the last 20 years.

Danceability

Danceability, as well as the next two variables, Energy and Key, came from the Spotify API.

Spotify defines danceability as “how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.” I thought this would be a good measure of poppiness because the definition states that pop music has a danceable beat. I computed the average danceability per album in R. Here is example code for Simon & Garfunkel's Bridge Over Troubled Water.

library(spotifyr)  # package assumed for get_artist_audio_features()

BridgeOver <- get_artist_audio_features('Simon and Garfunkel')

BridgeOver %>% filter(album_name %in% 'Bridge Over Troubled Water') -> BridgeOverFinal

BridgeOverFinal %>% arrange (-danceability) %>% select (track_name, danceability) -> BridgeOverDanceability

mean(BridgeOverFinal[["danceability"]])

I couldn’t find some albums in the Spotify API through the artist, so I had to call each track individually using its Spotify ID. Here is example code for The 20/20 Experience by Justin Timberlake.

get_track_audio_features('773hekg7UEdbGvv3lJ3CmV', authorization = get_spotify_access_token()) -> pusherlovegirljt
get_track_audio_features('6vt0I1cw1YmAIKDJvHVIM5', authorization = get_spotify_access_token()) -> suitandtiejt
get_track_audio_features('7I7EnQnVJH1uSJ0cSQKPuu', authorization = get_spotify_access_token()) -> dontholstradthewalljt
get_track_audio_features('7z0JDE4w67HXt5lEWsU2Hj', authorization = get_spotify_access_token()) -> strawberrybubblegumjt
get_track_audio_features('79MOydAvZYm8nyyzd6fiVi', authorization = get_spotify_access_token()) -> tunnelvisionjt
get_track_audio_features('7xxEK0MQvkME1LSS2cIW7R', authorization = get_spotify_access_token()) -> spaceshipcoupejt
get_track_audio_features('4CfYxSs4Dr8KWORCmN3hom', authorization = get_spotify_access_token()) -> thatgirljt
get_track_audio_features('2AobDJxjDp5TbxGdR3JGen', authorization = get_spotify_access_token()) -> letthegroovegetinjt
get_track_audio_features('4rHZZAmHpZrA3iH5zx8frV', authorization = get_spotify_access_token()) -> mirrorsjt
get_track_audio_features('06aFFobYGQNlyqXcEmYPSm', authorization = get_spotify_access_token()) -> blueoceanfloorjt
get_track_audio_features('3HaW1TqoBUrTwEYmtguKTa', authorization = get_spotify_access_token()) -> dressonjt
get_track_audio_features('3rPDcjQNtOK0gey9Gu1tZ2', authorization = get_spotify_access_token()) -> bodycountjt
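Once each track has been fetched into its own object, the one-row results need to be stacked before an album average can be taken. Here is a minimal sketch in base R, assuming each result behaves like a one-row data frame with danceability and energy columns; the three toy tracks and their values are hypothetical and stand in for the real API responses.

```r
# Hypothetical one-row results, standing in for get_track_audio_features() output.
track_a <- data.frame(id = "toy_id_a", danceability = 0.60, energy = 0.50,
                      stringsAsFactors = FALSE)
track_b <- data.frame(id = "toy_id_b", danceability = 0.70, energy = 0.60,
                      stringsAsFactors = FALSE)
track_c <- data.frame(id = "toy_id_c", danceability = 0.80, energy = 0.70,
                      stringsAsFactors = FALSE)

# Stack the per-track rows into one table for the album.
album_tracks <- do.call(rbind, list(track_a, track_b, track_c))

# Album-level averages, computed the same way as for albums found by artist.
avg_danceability <- mean(album_tracks$danceability)
avg_energy       <- mean(album_tracks$energy)
```

The spotifyr documentation describes the first argument of get_track_audio_features as a list of track IDs, so passing several IDs in a single call may avoid the stacking step; that is worth checking against the package documentation before relying on it.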

After this, I put the averages into an Excel spreadsheet and created a graph showing the average danceability per year.

Using this graph, it is hard to tell how danceability has changed. Overall, it looks like an upward trend, but I decided to use the averages for each decade to draw better conclusions.

This graph clearly shows that music has gotten more danceable since we hit the 2000s. This is to be expected as music seems to be more poppy.

Energy

Spotify defines energy as a measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. I found the averages the same way I did for danceability, but selected energy instead.

BridgeOverFinal %>% arrange (-energy) %>% select (track_name, energy) -> BridgeOverEnergy

mean(BridgeOverEnergy[["energy"]])

Below are two graphs: the first shows average energy per year, and the second shows average energy by decade.

Both of these graphs show an overall increase in average energy by year until around 2010 when it begins to fall again.

Key

The final variable I looked at was Key. Spotify has data on which key every single track is in. To find this, I used R to sort by key_mode which gives the key pitch class and if it is major or minor.

BridgeOverFinal %>% arrange (key_mode) %>% select (track_name, key_mode) -> BridgeOverKeyMode

BridgeOverKeyMode %>% count(key_mode, sort = TRUE) -> BOKM

I then recorded the number of times a track was in each key by decade to see which ones were most common. So, for example, I looked at all of the keys listed for songs on albums from between 1970 and 1979 and then tallied how many of each there were. Here is a table of the number of tracks in each key overall for the best albums of the last 50 years.

I thought this table was interesting even though it doesn't directly test my hypothesis. It shows that the 5 most used keys are major, which made me start to believe that my hypothesis was correct and that most pop songs are in major keys.

To further check this, I decided to make a couple more decade-focused tables. The first shows what percent of tracks were in major keys in a given decade. This was found by counting the number of tracks in major keys and dividing it by the total number of tracks for that decade. I did this in Excel. The second table shows which key was most popular in each decade. They are all major.

Even though the percent of major keys per decade declines a bit, it still shows that an average of 68% of the songs on the top albums per decade are in major. That's almost 7 out of every 10 tracks.
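The per-decade tallies described above could also be computed in R rather than Excel. This is a minimal sketch in base R under stated assumptions: the tracks table is hypothetical, and it assumes key_mode values are strings ending in "major" or "minor" (for example "C major").

```r
# Hypothetical track-level data: one row per track, with its album's decade.
tracks <- data.frame(
  decade   = c(1970, 1970, 1970, 1980, 1980, 1980, 1980),
  key_mode = c("C major", "C major", "A minor",
               "G major", "E minor", "G major", "D major"),
  stringsAsFactors = FALSE
)

# Flag major-key tracks, assuming the mode is the last word of key_mode.
tracks$is_major <- grepl("major$", tracks$key_mode)

# Percent of tracks in a major key per decade (major tracks / all tracks).
pct_major <- tapply(tracks$is_major, tracks$decade,
                    function(x) 100 * mean(x))

# Most common key per decade.
top_key <- tapply(tracks$key_mode, tracks$decade,
                  function(k) names(which.max(table(k))))
```

If two keys tie within a decade, which.max returns the first one alphabetically, so ties would need to be checked by hand.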

I thought it was interesting that 4 of the 5 most prominent major keys are among the easiest to play on the piano. C major has no sharps or flats, G major has one sharp, D major has two sharps, and A major has three sharps. The 4 most popular minor keys were simply the relative minors of C, G, D, and A major: A minor, E minor, B minor, and F sharp minor. This suggests that artists consistently try to keep things simple.
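The relationship between these keys can be checked with a little pitch-class arithmetic: a relative minor sits 9 semitones above (3 below) its major key, and each step around the circle of fifths adds one sharp. The sketch below is base R; the helper names relative_minor and n_sharps are made up for illustration, and n_sharps is only meant for the sharp-side keys discussed here.

```r
# The 12 pitch classes, in the order Spotify numbers them (0 = C, 1 = C#, ...).
pitch_names <- c("C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B")

# Relative minor: 9 semitones above the major tonic, wrapping around the octave.
relative_minor <- function(major_key) {
  pc <- match(major_key, pitch_names) - 1   # 0-based pitch class
  pitch_names[(pc + 9) %% 12 + 1]
}

# Number of sharps for a sharp-side major key: its position on the circle of
# fifths, found by multiplying the pitch class by 7 modulo 12.
n_sharps <- function(major_key) {
  pc <- match(major_key, pitch_names) - 1
  (pc * 7) %% 12
}

major_keys <- c("C", "G", "D", "A")
minors     <- relative_minor(major_keys)
sharps     <- n_sharps(major_keys)
```

Here minors holds the relative minors of the four major keys, and sharps holds 0, 1, 2, and 3, matching the key signatures listed above.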

Conclusion

It's hard to conclude that my hypothesis was completely right and that music has gotten more poppy, because the results were mixed. As far as word count goes, the songs actually seem less poppy because they have gotten less repetitive over time. Songs have gotten more negative over time, which is the opposite of what one thinks of when thinking of poppiness, but it makes sense that songs may get more negative as they get more poppy, simply because pop songs are characteristically more emotional. The danceability results show that songs have gotten more danceable (more poppy) over time. Energy is a bit contradictory: songs got more energetic at first, but energy has gone down in the last several years. Finally, I was correct in hypothesizing that music would stay in major keys, but I don't think this actually speaks to poppiness, because pop songs can be in any key. Even songs in minor keys can be danceable and fun. The New World Encyclopedia article above says that pop music has characteristically simple melodies, which could be why the music uses simpler keys, but it's hard to make that assumption.

Music has clearly changed so much, but based on this data, I'm not sure it's fair to make the generalization that music has gotten more poppy.