This repository contains an in-depth analysis of the Titanic dataset, which is widely used for exploring predictive modeling and statistical analysis techniques. The project focuses on the factors that influenced passenger survival rates aboard the Titanic.
- Project Overview
- Methodology
- Installation
- Usage
- Key Visualizations
- Results and Findings
This project aims to analyze the Titanic dataset to uncover insights into the survival rates of passengers. The analysis involves data preprocessing, exploratory data analysis (EDA), statistical testing, and the use of visualizations to illustrate findings.
The analysis was conducted using the following steps:
- Data Preprocessing: Handling missing values, data reduction, and feature selection.
- Exploratory Data Analysis (EDA): Visualizing distributions and relationships between variables.
- Statistical Testing: Applying tests like t-tests and chi-square tests to validate hypotheses.
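As a rough illustration of the preprocessing and statistical testing steps listed above, here is a minimal standard-library sketch. It is not the code from `analysis.py`: the choice of median imputation, the function names `fill_missing_with_median` and `chi_square_independence`, and the toy counts are all assumptions made for the example. In practice a library routine such as `scipy.stats.chi2_contingency` would typically replace the hand-written test.

```python
from math import exp
from statistics import median


def fill_missing_with_median(values):
    """Replace None entries with the median of the known values.

    Median imputation is an assumption here; the project may handle
    missing values differently.
    """
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows, e.g. one row per passenger class and one
    column per outcome (survived / did not survive).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Illustrative values only; these are NOT taken from the Titanic dataset.
toy_ages = [22.0, None, 30.0, 40.0]
toy_table = [[30, 10], [20, 20], [10, 30]]

filled_ages = fill_missing_with_median(toy_ages)
statistic, dof = chi_square_independence(toy_table)

# For exactly 2 degrees of freedom the chi-square p-value has the
# closed form exp(-statistic / 2); other cases need a stats library.
p_value = exp(-statistic / 2) if dof == 2 else None
```

A small p-value would suggest that the row variable (for example, passenger class) and the outcome are not independent.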
To run the analysis on your local machine, follow these steps:
- Clone the repository:
git clone https://github.com/tobibiggest/titanic-analysis.git
- Navigate to the project directory:
cd titanic-analysis
- Install the required dependencies:
pip install -r requirements.txt
To start the analysis, run the following command:
python analysis.py
Before running it, make sure the required libraries are installed and the Titanic dataset file is in your working directory.
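If the script fails at startup, a quick check of the dataset file can help narrow down the cause. The sketch below is a hypothetical helper, not part of this repository: the function name `missing_columns`, the file name `titanic.csv` in the usage comment, and the Kaggle-style column names are assumptions, so adjust them to match your copy of the data.

```python
import csv
from pathlib import Path

# Assumed Kaggle-style column names; change these if your file differs.
REQUIRED_COLUMNS = {"Survived", "Pclass", "Age", "Fare"}


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required columns that are absent from the CSV header.

    Raises FileNotFoundError if the file does not exist.
    """
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found: {path.resolve()}")

    with path.open(newline="", encoding="utf-8") as handle:
        # Read only the first row, which is expected to be the header.
        header = next(csv.reader(handle), [])

    return sorted(set(required) - set(header))


# Example usage (hypothetical file name):
#   missing = missing_columns("titanic.csv")
#   if missing:
#       print("Missing columns:", ", ".join(missing))
```

An empty list means every expected column was found in the header.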
Below are some key visualizations from the analysis. These visualizations provide insights into the survival factors:
![Survival Rate by Passenger Class]
![Age Distribution]
![Correlation Heatmap]
To view all the visualizations, check the images folder in the repository or run the notebook titanic_analysis.ipynb.
The analysis revealed several key factors that contributed to passenger survival, including:
- Passenger Class: Higher-class passengers had better survival rates.
- Age: Younger passengers were more likely to survive.
- Fare: Higher fares were associated with higher survival rates.
For detailed results, refer to the notebook in the repository.
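Findings such as the passenger class effect come down to comparing survival rates between groups. The following is a minimal standard-library sketch of that comparison, not the notebook's actual code: the helper name `survival_rate_by`, the Kaggle-style keys `Pclass` and `Survived`, and the toy records are assumptions for illustration and do not reproduce the project's results.

```python
from collections import defaultdict


def survival_rate_by(records, key):
    """Return {group: fraction of survivors} for the given grouping key."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        survivors[group] += int(record["Survived"])
    return {group: survivors[group] / totals[group] for group in totals}


# Illustrative records only; these are NOT rows from the Titanic dataset.
toy_records = [
    {"Pclass": 1, "Survived": 1},
    {"Pclass": 1, "Survived": 1},
    {"Pclass": 1, "Survived": 0},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 1},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 0},
]

rates = survival_rate_by(toy_records, "Pclass")
for pclass in sorted(rates):
    print(f"Class {pclass}: {rates[pclass]:.2f}")
```

The same helper works for any categorical column. A continuous variable such as age or fare would need to be binned into ranges first.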
If you would like to contribute to this project, please fork the repository and submit a pull request. We welcome any improvements or additional analyses.