Gender Neutral Baby Names Analysis

By: Laurel Wind

Hypothesis: Baby names that were gender specific for males between 1880 and 1899 are now gender neutral and tend to favor females between 2000 and 2014.

To test this hypothesis, I chose to analyze the gender trends for the three gender-neutral names: Madison, Riley, and Morgan.

While there is no concrete evidence as to why these names shifted from gender specific to gender neutral in the early 1980s, the second wave of feminism in the United States ended around the same time, which suggests the feminist movement may have influenced gender trends in baby names. An in-depth definition of second-wave feminism can be accessed here.

This movement was associated with challenging androcentric beliefs, that is, a male-centered view of the world. Thus, in the early 80s, women may have started giving their daughters traditionally male names to challenge this worldview.

Other names that follow this trend include Harper, Peyton, Aubrey, and Jamie. Some other names that were male specific in the late 1800s also became gender neutral, but did not go on to be used almost exclusively for females. These names include Charlie, Blaine, and Jordan.

Required R Packages

This analysis utilizes R and RStudio as well as three packages that can be installed directly in the program. The required packages are: babynames, ggplot2, and dplyr.
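As a minimal setup sketch, the three packages named above can be installed once and then loaded at the start of each session. The install line is commented out on the assumption that the packages may already be present.

```r
# One-time installation (uncomment if the packages are not yet installed)
# install.packages(c("babynames", "ggplot2", "dplyr"))

library(babynames)  # provides the `babynames` data set
library(ggplot2)    # plotting functions such as ggplot() and geom_line()
library(dplyr)      # mutate(), select(), filter(), and the %>% pipe

# Quick look at the columns used in this analysis: year, sex, name, n, prop
head(babynames)
```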

Analysis and Code

To make the proportions of baby names easier to view and analyze, I created a new data set called babynamesPct. I used the mutate and select functions to convert the proportion to a percentage and to remove the proportion column from the new data set.

babynamesPct <- mutate(babynames, pct = (prop*100))

babynamesPct <- select(babynamesPct, -prop)

To test the hypothesis, I looked at the names Madison, Riley, and Morgan.

To begin my analysis, I created a subset of the data set called madison1800 and used the filter function to select only the name Madison in the years before 1900 (the data begins in 1880). The code can be seen below.

madison1800 <- babynamesPct %>% filter(year < 1900, name == "Madison")

I then used the ggplot2 package to plot the data.

ggplot(madison1800, aes(year, pct)) + geom_line(color='turquoise3') + ggtitle("Madison 1880-1899")

Based on the graph, Madison was used as a male-specific name between 1880 and 1899, as there are no recorded females born in that period with the name Madison.
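A reading taken from a graph can be cross-checked by counting records by sex. The sketch below assumes the babynames and dplyr packages are installed; the name madison_by_sex is introduced here for illustration only. If the result has no row for "F", no female Madisons appear in the data for that period. Note that the data set omits names with only a handful of uses in a given year, so absence from the data does not strictly mean zero births.

```r
library(babynames)
library(dplyr)

# Count, for each sex, how many years have a record and how many babies
# were named Madison before 1900. A missing "F" row means no female records.
madison_by_sex <- babynames %>%
  filter(name == "Madison", year < 1900) %>%
  group_by(sex) %>%
  summarise(years_recorded = n_distinct(year), babies = sum(n))

madison_by_sex
```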

I then created another subset of the data called madison2000 and repeated the process for babies born with the name Madison between 2000 and 2014.

madison2000 <- babynamesPct %>% filter(year >= 2000, year <= 2014, name == "Madison")

ggplot(madison2000, aes(year, pct, color = sex)) + geom_line() + ggtitle("Madison 2000-2014")

Based on the graph, I concluded that while Madison was a male-specific name between 1880 and 1899, it is now gender neutral and tends to favor females, as the percentage of males named Madison is significantly lower than the percentage of females with the same name.

I then repeated the process for both Riley and Morgan. The code and graphs can be seen below.

riley1800 <- babynamesPct %>% filter(year< 1900 , name== "Riley")

ggplot(riley1800, aes(year, pct)) + geom_line(color='turquoise3') + ggtitle("Riley 1880-1899")

riley2000 <- babynamesPct %>% filter(year >= 2000, year <= 2014, name == "Riley")

ggplot(riley2000, aes(year, pct, color = sex)) + geom_line() + ggtitle("Riley 2000-2014")

Based on the 1880-1899 graph, I saw that the name Riley was male specific during that period and was significantly more popular than Madison.

Based on the 2000-2014 graph, I saw that the name Riley was gender neutral during that period. Unlike Madison, which became almost female specific, Riley continued to be popular for both genders in the early 2000s.

morgan1800 <- babynamesPct %>% filter(year< 1900 , name== "Morgan")

ggplot(morgan1800, aes(year, pct)) + geom_line(color='turquoise3') + ggtitle("Morgan 1880-1899")

morgan2000 <- babynamesPct %>% filter(year >= 2000, year <= 2014, name == "Morgan")

ggplot(morgan2000, aes(year, pct, color = sex)) + geom_line() + ggtitle("Morgan 2000-2014")

Based on the 1880-1899 graph, I saw that the name Morgan was male specific during that period and had a popularity pattern similar to that of Madison.

Based on the 2000-2014 graph, I saw that the name Morgan was gender neutral during that period. Like Madison, it became almost exclusively a female name in the early 2000s.
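The filter-and-plot steps repeated for each name and period could be wrapped in a single function. The sketch below is one possible way to do that, not the code used for the graphs in this analysis; plot_name_period and its arguments are hypothetical names introduced for illustration.

```r
library(babynames)
library(dplyr)
library(ggplot2)

# Hypothetical helper: plot the yearly percentage of babies given
# `target_name` between the years `from` and `to` (inclusive),
# with one line per sex.
plot_name_period <- function(target_name, from, to) {
  babynames %>%
    filter(name == target_name, year >= from, year <= to) %>%
    mutate(pct = prop * 100) %>%
    ggplot(aes(year, pct, color = sex)) +
    geom_line() +
    ggtitle(paste0(target_name, " ", from, "-", to))
}

# Example: the same kind of graph as above, for Riley in 2000-2014
plot_name_period("Riley", 2000, 2014)
```

Coloring by sex keeps the male and female series as separate lines, which matters for periods in which both sexes appear in the data.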

I then analyzed the three names collectively to find the period of time when they began the transition from gender-specific to gender-neutral names.

names <- babynames %>% filter(name %in% c("Madison", "Riley", "Morgan"))

ggplot(names, aes(year, prop, color = sex, linetype = name)) + geom_line() + ggtitle("Names Transition Whole")

names2 <- babynames %>% filter(year > 1960, name %in% c("Madison", "Riley", "Morgan"))

ggplot(names2, aes(year, prop, color = sex, linetype = name)) + geom_line() + ggtitle("Names Transition")

The graph on the left displays the full time-series trends for the names Madison, Riley, and Morgan between 1880 and 2014. The graph on the right looks more closely at the period between 1960 and 2014 and highlights the shift from gender-specific to gender-neutral names around 1980. Although I only analyzed three names, it is plausible that this trend extends to other names.
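The timing of the shift can also be estimated numerically rather than read off the graphs. The sketch below computes, for each name and year, the share of babies with that name who were female, then reports the first year in which that share passed one half. The 0.5 threshold and the names female_share and share_f are assumptions made for illustration. Counts (n) are used instead of prop because prop in the babynames data is calculated within each sex.

```r
library(babynames)
library(dplyr)

# Share of babies with each name who were female, per year
female_share <- babynames %>%
  filter(name %in% c("Madison", "Riley", "Morgan")) %>%
  group_by(name, year) %>%
  summarise(share_f = sum(n[sex == "F"]) / sum(n), .groups = "drop")

# First year in which girls made up more than half of the babies
# given each name (assumed threshold: 0.5)
female_share %>%
  filter(share_f > 0.5) %>%
  group_by(name) %>%
  summarise(first_majority_female_year = min(year))
```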

----------------------------------------------------------------------------------------

Conclusion

Based on the three names I analyzed, I concluded that my hypothesis was supported: baby names that were gender specific for males in the late 1800s are now gender neutral and tend to favor females in the early 2000s.