Term Project for Tufts University's COMP119 Big Data course, Fall 2019
R API examples adapted from WaPo ARCOS stories.
ARCOS/Opioid Data courtesy Washington Post. Social Vulnerability Index from CDC. GIS shapefiles from Census Bureau.
All Python/PySpark code and R code adaptation/troubleshooting by Michael Rogove. QGIS images produced by Megana Lakshmi Padmanabhan.
Michael Rogove | Megana Lakshmi Padmanabhan | Kevin Hederman
At first glance, the prescription opioid crisis seems most pronounced in southern New Hampshire. Normalizing per capita, however, throws a spotlight on northern, rural New Hampshire.
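To illustrate why per-capita normalization can reorder counties, here is a minimal sketch in plain Python. The county labels and all figures are hypothetical placeholders, not real ARCOS or Census values, and the function name `pills_per_capita` is ours for illustration; it is not taken from the project notebooks.

```python
# Hypothetical placeholder figures, NOT real ARCOS or Census values.
counties = [
    {"county": "County A", "pills": 9_000_000, "population": 400_000},
    {"county": "County B", "pills": 1_500_000, "population": 30_000},
]


def pills_per_capita(rows):
    """Return (county, pills per person) pairs, highest rate first."""
    rates = [(r["county"], r["pills"] / r["population"]) for r in rows]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Ranking by raw pill totals favors the more populous county...
by_total = sorted(counties, key=lambda r: r["pills"], reverse=True)

# ...while ranking by pills per person can surface a smaller county.
by_rate = pills_per_capita(counties)

print("Top by total:", by_total[0]["county"])
print("Top by rate: ", by_rate[0][0])
```

With these made-up numbers, County A leads on raw totals while County B leads per person, which is the same kind of reversal described above.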
Clearly, something is fishy in Coos County, particularly in its Rite Aid pharmacies.
Draft Queries to Dive Deeper into Raw Opioid Data; Python to blend results with Social Vulnerability Index
Here, we show who the most suspicious pharmacy bought from: it doubled its McKesson orders in just 7 years and added a major oxycodone source from Eckerd. We can also see that the flood of prescription opioids correlates with rural, poorer counties with higher disability rates.
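As a rough sketch of how county-level opioid results might be blended with an SVI indicator and checked for correlation, the example below joins two dictionaries on a shared county key and computes a Pearson coefficient in plain Python. The keys, values, and the helper names `blend` and `pearson` are assumptions for illustration only; the project notebooks may do this differently, and none of the numbers are real data.

```python
from math import sqrt

# Hypothetical placeholder values keyed by a made-up county id.
pills_per_person = {"county_a": 20.0, "county_b": 25.0,
                    "county_c": 30.0, "county_d": 50.0}
disability_rate = {"county_a": 0.10, "county_b": 0.12,
                   "county_c": 0.13, "county_d": 0.18}


def blend(left, right):
    """Inner-join two dicts on their keys; return paired value lists."""
    keys = sorted(set(left) & set(right))
    return [left[k] for k in keys], [right[k] for k in keys]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


xs, ys = blend(pills_per_person, disability_rate)
r = pearson(xs, ys)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 indicates that the two measures rise together across counties. With only a handful of counties, as in New Hampshire, such a figure is suggestive rather than conclusive.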
The project files fall into five groups:
- Main Presentation (.pptx file).
- R analysis of API data.
- PySpark analysis of raw ARCOS data.
- SVI data and analysis.
- (Bonus: QGIS and related files)
This is what we will mostly use during the final presentation. START HERE.
- COMP119F19OpioidTermProject.pptx
Jupyter notebook: "OpioidProjectNotebook"
You only need to view one of these files, in order of preference:
- ROpioidProjectNotebook.html
- ROpioidProjectNotebook.ipynb
- .pdf (just in case)
Jupyter notebook.
You only need to view one of these files, in order of preference:
- PySparkOpioidProjectNotebook.html
- PySparkOpioidProjectNotebook.ipynb
- .pdf (just in case)
Jupyter notebook and an Excel file.
Excel file contains pivot tables and comparisons: NewHampshireSVI_analysis.xlsx
How we generated the plot and correlation matrix:
- NH_county_summary.html
- NH_county_summary.pdf (just in case)
- zipped separately
If you are a journalist or researcher in New Hampshire, or you want to expand this research nationwide, let us know and we can explain in more detail how to do this quickly and cheaply using Google Cloud Platform.