rnair7163/Clustering-and-Regression-Analysis-of-Gerrymandering

 
 

Clustering and Regression Analysis of Gerrymandering

This project has two parts:

  • Clustering
  • Regression

Clustering

The main goal of the clustering part was to create a districting plan for Pennsylvania (PA) with reasonably compact districts and a well-balanced distribution of population across them. To that end, we chose weighted K-means clustering, in which population is used to compute the weights that steer the clusters toward a more even population distribution.
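As a rough illustration of the idea, here is a minimal sketch of one common form of weighted K-means, in which each centroid is the population-weighted mean of its members. It is not taken from the project notebooks: the function name `weighted_kmeans`, the toy coordinates, the populations, and the starting centroids are all assumptions made for this example, and the project may apply the weights differently.

```python
from math import dist  # Euclidean distance, Python 3.8+


def weighted_kmeans(points, weights, centroids, iterations=20):
    """Lloyd-style K-means with weight-averaged centroids.

    `weights` stands in for population. All names here are hypothetical.
    """
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the weighted mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [(p, w) for p, w, lab in zip(points, weights, labels) if lab == k]
            total = sum(w for _, w in members)
            if total == 0:
                new_centroids.append(old)  # empty cluster: keep the old centroid
                continue
            new_centroids.append(
                tuple(sum(w * p[d] for p, w in members) / total for d in range(len(old)))
            )
        if new_centroids == centroids:
            break  # converged: the assignments no longer move the centroids
        centroids = new_centroids
    return labels, centroids


# Toy data (made up): six locations with populations, split into two districts.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
population = [100, 300, 100, 200, 200, 100]
labels, centroids = weighted_kmeans(points, population, [(0, 0), (10, 10)])

# Total population per cluster, to check how balanced the plan is.
cluster_population = [
    sum(w for w, lab in zip(population, labels) if lab == k)
    for k in range(len(centroids))
]
print(labels, centroids, cluster_population)
```

Note that weighting the centroids pulls each district center toward dense areas but does not by itself force equal populations, so checking the per-cluster totals, as in the last step, is a sensible companion to the clustering.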

Regression

We use the 2012 VTD (voting tabulation district) data, grouped by county. The target is the proportion of each county that voted D in the 2012 presidential election (67 counties in total). We use the following regression techniques:

  • Stepwise Regression
  • Best Subset Selection Regression
  • Multiple Linear Regression
  • XGBoost
  • Ridge Regression
  • LASSO Regression
  • Random Forest
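To make the difference between three of the listed techniques concrete, the sketch below fits multiple linear regression, ridge, and LASSO with a single coordinate-descent routine. It is an illustration only, not the project's code: the function names, the penalty values, and the five-row data set are invented, and the objective is assumed to be (1/(2n)) * RSS + l1 * sum|b| + (l2/2) * sum(b^2). Stepwise, best subset, XGBoost, and random forest are not covered here.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_penalized(X, y, l1=0.0, l2=0.0, sweeps=200):
    """Coordinate descent for a linear model with optional L1 and L2 penalties.

    l1 = l2 = 0 gives ordinary least squares, l2 > 0 gives ridge,
    l1 > 0 gives LASSO. Assumes no column of X is constant.
    """
    n, p = len(X), len(X[0])
    # Center the data so the intercept can be recovered afterwards.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                Xc[i][j] * (yc[i] - sum(Xc[i][k] * coefs[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(Xc[i][j] ** 2 for i in range(n)) / n
            coefs[j] = soft_threshold(rho, l1) / (z + l2)
    intercept = y_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs


# Made-up data: a vote share driven mostly by the first feature.
X = [[-2, 1], [-1, -2], [0, 0], [1, 2], [2, -1]]
y = [0.31, 0.38, 0.50, 0.62, 0.69]

ols_intercept, ols = fit_penalized(X, y)
ridge_intercept, ridge = fit_penalized(X, y, l2=1.0)
lasso_intercept, lasso = fit_penalized(X, y, l1=0.05)
print("OLS:  ", ols)
print("Ridge:", ridge)
print("LASSO:", lasso)
```

On this toy data ridge shrinks both coefficients but keeps them nonzero, while LASSO sets the weak second coefficient to exactly zero, which is why LASSO doubles as a variable-selection method alongside stepwise and best subset selection.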

Contributors:
