You can find my final project here.
This problem is drawn from an analysis of housing data for 506 census tracts of Boston from the 1970 census.
We will use exploratory data analysis to compare the full data set with random samples drawn from it. This will show what we can expect from random samples when the full source data set (the population) is not available to us, as is usually the case.
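As a rough illustration of the comparison described above, the following sketch draws repeated random samples from a population and compares their means with the population mean. It is only a sketch under stated assumptions: the population is synthetic (506 made-up values, not the Boston housing data), and the function name `sample_means`, the sample size of 50, and the 200 repetitions are hypothetical choices made for illustration.

```python
import random
import statistics

# Synthetic stand-in for a 506-row population.
# ASSUMPTION: these are made-up values, NOT the Boston housing data.
rng = random.Random(0)
population = [rng.gauss(22.0, 9.0) for _ in range(506)]


def sample_means(data, sample_size, n_samples, seed=1):
    """Return the mean of each of n_samples random samples.

    Each sample is drawn without replacement from data.
    """
    sampler = random.Random(seed)
    return [
        statistics.mean(sampler.sample(data, sample_size))
        for _ in range(n_samples)
    ]


pop_mean = statistics.mean(population)
means = sample_means(population, sample_size=50, n_samples=200)

print(f"population mean:       {pop_mean:.2f}")
print(f"mean of sample means:  {statistics.mean(means):.2f}")
print(f"std dev of sample means: {statistics.stdev(means):.2f}")
```

The sample means should cluster around the population mean, with a spread much narrower than that of the individual values. This is the kind of generalization about sampling that the analysis aims to make.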
We will examine the original Boston housing dataset from the 1970 census, which consists of 506 observations and 14 attributes. I will also include one of the attributes added later: the census tract code, a unique identifier used by the U.S. Census Bureau. This attribute identifies each tract and can act as an index.
The solution for this project will be data visualization, which gives us a sense of the data overall. We are only looking to make generalizations about sampling.
As this is exploratory data analysis and not modeling, there is no benchmark model.
None, because this is exploratory data analysis.