Naloxone Adoption in North Carolina
by Patrick Larsen
Note: arrow symbols are difficult to transport from RStudio into Brackets, so certain assignment operators (particularly ones that point left) have been replaced with a simple =.
With the opioid epidemic tearing through the nation, more states and police departments are adopting the opioid overdose reversal drug naloxone. The drug is used all over the country by emergency response teams and civilians alike. Because I'm interested in the journalistic applications of a language like R, I wanted to see how I could apply it to learn more about naloxone from the data at our disposal. My ultimate goal was to find places where the data didn't follow a pattern or was abnormal. As you will see, finding such a place would by necessity disprove my hypothesis, which assumes that the expected pattern holds.
My basic hypothesis for this specific test is that if a county has a higher rate of opioid overdoses over time, then more naloxone kits will be sent to that county, and it will also show a higher rate of overdose reversals.
I tested this using data from our friends at Guilford County, as well as the North Carolina Harm Reduction Coalition, or NCHRC. I used the R package "leaflet," which was easy to get acquainted with and surprisingly deep. To begin with, we'll load some packages:
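The package list itself does not appear in this copy of the post. Judging only by the functions called further down (read_excel, mutate, str_sub, counties, leaflet), it was probably close to the sketch below. The exact set is an assumption, and "pkgs" and "installed" are names made up for this example.

```r
# Assumed package list, inferred from the functions used later in this post.
pkgs <- c(
  "readxl",   # read_excel()
  "dplyr",    # %>%, mutate(), group_by(), count(), ungroup(), rename()
  "stringr",  # str_sub()
  "tigris",   # counties()
  "leaflet"   # leaflet(), setView(), addMarkers()
)

# Check which packages are installed, attach those, and report the rest.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
for (p in pkgs[installed]) library(p, character.only = TRUE)
if (any(!installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
}
```

Anything reported as not installed can be added with install.packages() before moving on.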
Next, I cleaned one data table (DRncDeaths1416.xlsx) and narrowed down another from Guilford County. The plan was to single out one category of data that would be useful for the analysis - in this instance, the number of deaths per county (total_deaths) within the sample.
overdose.deaths = read_excel("DRncdeaths1416.xlsx",
  col_types = c("text", "date", "text", "text", "text", "text",
    "text", "text", "text", "text", "text", "text", "text", "text",
    "text", "numeric", "numeric", "numeric", "text", "text", "text",
    "numeric", "text", "text", "text", "text", "text", "text",
    "text", "text", "text", "text", "text", "numeric", "numeric",
    "numeric", "numeric", "numeric", "numeric", "numeric", "numeric",
    "numeric", "numeric", "numeric", "text", "text", "text", "text",
    "text", "text", "text", "text", "text", "text", "text"))
# drop the first five characters of each rcounty value
overdose.deaths = overdose.deaths %>%
  mutate(rcounty = str_sub(rcounty, 6))
# we are also going to change a column name
nc3 = counties(37) %>%
  rename(rcounty = NAMELSAD)
# count the rows (deaths) for each county, one step at a time
overdose.deaths %>% group_by(rcounty) -> overdose.deaths2
overdose.deaths2 %>% count() -> overdose.deaths3
overdose.deaths3 %>% ungroup() -> overdose.deaths4
overdose.deaths4 %>% rename(total_deaths = n) -> overdoseDeaths5
I also applied this method to another data table, this one with information on the same statistics, but from earlier years. This brings us to a point where the data is easy to read and usable. Next, we're going to go into Leaflet to see what we can do with some basic data application.
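The maps that follow show only the top five counties, so one step not shown here is ranking the per-county totals. A minimal sketch of that step is below, using base R so it runs on its own. The data frame "toy_deaths" is a made-up stand-in for overdoseDeaths5, and its county labels and counts are invented purely for illustration.

```r
# Toy stand-in for overdoseDeaths5: invented counties and counts, illustration only.
toy_deaths <- data.frame(
  rcounty = paste("County", LETTERS[1:7]),
  total_deaths = c(12, 40, 7, 33, 25, 18, 51),
  stringsAsFactors = FALSE
)

# Sort by total_deaths, largest first, and keep the first five rows.
top5 <- head(toy_deaths[order(-toy_deaths$total_deaths), ], 5)
top5$rcounty
```

With dplyr loaded, the same idea can be written as overdoseDeaths5 %>% arrange(desc(total_deaths)) %>% head(5).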
To begin with, we need to decide on the look we're going for. Because it's a fairly basic analysis, I'm going to use standard markers. To give it some more flair and make it easier on the eyes, I'm also going to add a popup window on each map point that contains information about it. We are going to assign our code to "mapTop5" to signify that this map shows the top five counties in opioid overdose deaths in 2014-2016. This is what it should look like for one map point:
mapTop5 = leaflet() %>%
  setView(lng = -79.019300, lat = 35.759573, zoom = 6) %>%
  addMarkers(lng = -80.854385, lat = 35.263266,
    popup = paste("Mecklenburg County", "<br>",
      "99-13 Mean Deaths: 30.79", "<br>",
      "14-16 Mean Deaths: 114"))
From here, we add a pipe operator (%>%) after the closing parenthesis of each addMarkers() call and follow it with another addMarkers() call, one for each of the remaining top five counties. After running the code, you can run just "mapTop5" and come up with this:
The map is simple, interactive, and readable. It's not the most immediately representative plot that we could get out of R, but proximity is an important factor in determining news value, and a map makes judging proximity a breeze for reporters and news consumers alike.
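As an alternative to chaining one addMarkers() call per county, leaflet can also draw every row of a data frame in a single call. The sketch below is one way to set that up, not the code used for the map above. "make_popup" and "top5_markers" are hypothetical names, the popup lines are joined with the HTML line-break tag, and addTiles() is added here to draw the default basemap. Only Mecklenburg's figures appear in this post, so the table holds just that one row.

```r
# Hypothetical helper: build the popup HTML for one or more counties.
make_popup <- function(county, mean_9913, mean_1416) {
  paste(
    county,
    paste0("99-13 Mean Deaths: ", mean_9913),
    paste0("14-16 Mean Deaths: ", mean_1416),
    sep = "<br>"
  )
}

# One row per marker. The other four counties would be added as further rows.
top5_markers <- data.frame(
  county = "Mecklenburg County",
  lng = -80.854385,
  lat = 35.263266,
  mean_9913 = 30.79,
  mean_1416 = 114,
  stringsAsFactors = FALSE
)
top5_markers$popup <- make_popup(
  top5_markers$county, top5_markers$mean_9913, top5_markers$mean_1416
)

# Draw every row as a marker in one addMarkers() call (skipped if leaflet is absent).
if (requireNamespace("leaflet", quietly = TRUE)) {
  library(leaflet)
  mapTop5 <- leaflet(top5_markers) %>%
    addTiles() %>%
    setView(lng = -79.019300, lat = 35.759573, zoom = 6) %>%
    addMarkers(lng = ~lng, lat = ~lat, popup = ~popup)
}
```

Keeping the marker details in a table means adding a county is a matter of adding a row, and the same table can be checked against the cleaned data before it goes on the map.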
Next, we'll do the same work with data from the NCHRC. We are looking to indicate which counties had the most overdose reversals by law enforcement officials using naloxone, and see how much naloxone had been distributed to these areas. Here's the beginning of it:
mapReversals = leaflet() %>%
  setView(lng = -79.019300, lat = 35.759573, zoom = 6) %>%
  addMarkers(lng = -80.187506, lat = 36.120000,
    popup = paste("Forsyth County", "<br>",
      "Over 100 Reversals Reported", "<br>",
      "Over 1000 Naloxone Kits distributed"))
This follows the same steps as last time, too - keep using pipe operators to add a marker for each of the top counties in reversals. Here's what we get from running "mapReversals":
From here, we can see some discrepancies, and therefore start asking questions. The most notable question, of course, is what is going on with Mecklenburg County. Even though it sits at the top of the overdose deaths category, it doesn't even make it into the top 5 for naloxone reversals. This means that the data does not support my hypothesis, which is very exciting! This leads to questions like: does Mecklenburg County have a problem with stocking naloxone? In this instance, the original data tells us that Mecklenburg has more access to naloxone than most counties. So, what's the problem - has their law enforcement received inadequate instruction or training for naloxone use? Has the problem overwhelmed their resources? These are questions that you could get a potentially eye-opening investigative report out of - thanks to a simple use of a mapping package in R.
Of course, this level of application barely scratches the surface of R's capabilities in data representation. There's so much more you can do, even within the leaflet package, that will be engaging and visually appealing to any audience. You can also get into ggplot2, which is a great graphing package that lets you view the interactions between your data from many perspectives quickly and easily. The potential uses for a language like R in journalism and communications as a whole are wide open. It can tell us stories that words alone might struggle to convey.
For more information on drug overdoses in North Carolina or the United States as a whole, visit the NCHRC website.
I may have entered this project with ulterior motives. I understand that the point was more or less to get an introduction to R's capabilities with data cleaning and representation, and those are certainly the lines along which I worked - but, as I've indicated, I was most interested in how R could be used as a tool to find leads on stories. I was pleased with the results, because I happened to run a test in which my hypothesis failed. This showed me directly that I can quickly sort through data in several different ways and find places where what I know simply cannot account for what the data says. The only good way to start a story is to ask questions, and R is a portal to questions. Personally, I'm excited to continue using R for journalism in the future, because of its ease of use and practicality in a journalistic setting.