This repository contains an in-depth analysis of the Titanic dataset, a popular dataset used for exploring predictive modeling and statistical analysis techniques. The project focuses on understanding the factors that influenced the survival rates of passengers aboard the Titanic.

Tobibiggest/Titanic-Dataset-Analysis

Titanic Dataset Analysis

This repository contains an in-depth analysis of the Titanic dataset, a popular dataset used for exploring predictive modeling and statistical analysis techniques. The project focuses on understanding the factors that influenced the survival rates of passengers aboard the Titanic.

Table of Contents

  • Project Overview
  • Methodology
  • Installation
  • Usage
  • Key Visualizations
  • Results and Findings
  • Contributing

Project Overview

This project aims to analyze the Titanic dataset to uncover insights into the survival rates of passengers. The analysis involves data preprocessing, exploratory data analysis (EDA), statistical testing, and the use of visualizations to illustrate findings.

Methodology

The analysis was conducted using the following steps:

  1. Data Preprocessing: Handling missing values, data reduction, and feature selection.
  2. Exploratory Data Analysis (EDA): Visualizing distributions and relationships between variables.
  3. Statistical Testing: Applying tests like t-tests and chi-square tests to validate hypotheses.
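The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the column names follow the common Kaggle Titanic schema (`Age`, `Cabin`, `Embarked`, and so on), the rows are invented, and the imputation choices (median and mode) are assumptions.

```python
import pandas as pd

# Toy rows invented for illustration; the column names are an assumption
# about the dataset's schema, not taken from this repository.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass":   [3, 1, 3, 2, 1, 3],
    "Age":      [22.0, 38.0, None, 35.0, None, 54.0],
    "Fare":     [7.25, 71.28, 7.92, 13.00, 53.10, 8.05],
    "Cabin":    [None, "C85", None, None, "C123", None],
    "Embarked": ["S", "C", "S", None, "S", "Q"],
})


def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: impute Age and Embarked, drop the sparse Cabin column."""
    out = frame.copy()
    # Missing values: median for numeric Age, most frequent value for Embarked.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Data reduction: drop a column that is mostly empty.
    return out.drop(columns=["Cabin"])


clean = preprocess(df)
print(clean.isna().sum())  # per-column count of remaining missing values
```

Working on a copy keeps the raw frame available for comparison, which helps when checking how much imputation changed a distribution.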

Installation

To run the analysis on your local machine, follow these steps:

  1. Clone the repository:
    git clone https://github.com/Tobibiggest/Titanic-Dataset-Analysis.git
  2. Navigate to the project directory:
    cd Titanic-Dataset-Analysis
  3. Install the required dependencies:
    pip install -r requirements.txt

Usage

To start the analysis, run the following command:

    python analysis.py

Before running it, make sure the dependencies from requirements.txt are installed and the Titanic dataset file is in your working directory.
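A quick way to confirm the dataset is in place before running the analysis is a small check like the one below. The file name `titanic.csv` and the helper `locate_dataset` are hypothetical; this README does not state what the dataset file is called, so adjust the name to match your copy.

```python
from pathlib import Path

# Hypothetical file name: the README does not say what the dataset file is called.
DATASET_NAME = "titanic.csv"


def locate_dataset(directory: str = ".", name: str = DATASET_NAME) -> Path:
    """Return the dataset path, or raise a clear error if the file is missing."""
    path = Path(directory) / name
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected {name!r} in {path.parent.resolve()}; "
            "download the Titanic dataset and place it there."
        )
    return path
```

Calling `locate_dataset()` from the project directory either returns the path or fails early with a message that says where the file was expected.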

Key Visualizations

The following visualizations from the analysis highlight the factors most closely tied to survival:

1. Survival Rate by Passenger Class

[Figure: Survival Rate by Passenger Class (CountPlot1)]
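A chart of this kind could be produced along the following lines. This is a sketch under assumptions: the rows are invented, the column names `Pclass` and `Survived` come from the common Kaggle schema, the output file name is hypothetical, and it draws a plain bar chart of survival rates rather than the count plot the original image name suggests.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy rows invented for illustration, not the real dataset.
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
    "Survived": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
})

# The mean of a 0/1 column is the share of survivors in each class.
rate_by_class = df.groupby("Pclass")["Survived"].mean()

fig, ax = plt.subplots()
rate_by_class.plot.bar(ax=ax)
ax.set_xlabel("Passenger class")
ax.set_ylabel("Survival rate")
ax.set_title("Survival Rate by Passenger Class")
fig.savefig("survival_by_class.png")  # hypothetical output file name
```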

2. Age Distribution of Passengers

[Figure: Age Distribution (Univariate Vizz4)]

3. Correlation Heatmap

[Figure: Correlation Heatmap (Correlation map)]
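For readers who want to reproduce a heatmap like this, one possible approach is shown below. It is an illustrative sketch only: the rows are invented, the numeric columns are assumed from the common Kaggle schema, and it uses plain matplotlib, which may differ from the plotting library used in this repository.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy rows invented for illustration, not the real dataset.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass":   [3, 1, 3, 2, 1, 3],
    "Age":      [22.0, 38.0, 26.0, 35.0, 29.0, 54.0],
    "Fare":     [7.25, 71.28, 7.92, 13.00, 53.10, 8.05],
})

corr = df.corr()  # pairwise Pearson correlation between the numeric columns

fig, ax = plt.subplots()
# Fix the color scale to [-1, 1] so that zero correlation sits at the midpoint.
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(image, ax=ax, label="Pearson r")
ax.set_title("Correlation Heatmap")
fig.tight_layout()
```

Pinning `vmin` and `vmax` keeps colors comparable across different subsets of the data, since the scale no longer depends on the strongest correlation present.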

4. Other Key Visualizations

  • Visualizing Missing Values
  • Andrews Curve
  • FacetGrid
  • DonutChart
  • Univariate Vizz5
  • newplot

To view all the visualizations, check the images folder in the repository or run the notebook titanic_analysis.ipynb.

Results and Findings

The analysis revealed several key factors that contributed to passenger survival, including:

  • Passenger Class: Higher-class passengers had better survival rates.
  • Age: Younger passengers were more likely to survive.
  • Fare: Higher fares were associated with higher survival rates.

For detailed results, refer to the notebook in the repository.
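As a rough illustration of how findings like these can be checked with the chi-square and t-tests named in the methodology, consider the sketch below. The rows are invented, so the resulting statistics say nothing about the real dataset. The choice of variables (class against survival, fare against survival) and the use of Welch's unequal-variance t-test are assumptions, not a description of the notebook.

```python
import pandas as pd
from scipy import stats

# Toy rows invented for illustration, not the real dataset.
df = pd.DataFrame({
    "Pclass":   [1] * 6 + [2] * 6 + [3] * 8,
    "Survived": [1, 1, 1, 1, 0, 1,
                 1, 0, 1, 0, 0, 1,
                 0, 0, 1, 0, 0, 0, 0, 1],
    "Fare":     [80, 71, 52, 90, 30, 60,
                 26, 13, 21, 12, 10, 30,
                 7, 8, 9, 7, 8, 7, 8, 12],
})

# Chi-square test of independence: is survival associated with class?
table = pd.crosstab(df["Pclass"], df["Survived"])
chi2, p_class, dof, expected = stats.chi2_contingency(table)

# Welch's t-test: do survivors and non-survivors differ in mean fare?
survived_fares = df.loc[df["Survived"] == 1, "Fare"]
died_fares = df.loc[df["Survived"] == 0, "Fare"]
t_stat, p_fare = stats.ttest_ind(survived_fares, died_fares, equal_var=False)

print(f"class vs survival: chi2={chi2:.2f}, dof={dof}, p={p_class:.3f}")
print(f"fare by survival:  t={t_stat:.2f}, p={p_fare:.3f}")
```

With a sample this small the expected cell counts are low, so the chi-square approximation is unreliable here; on the full dataset that concern largely goes away.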

Contributing

If you would like to contribute to this project, please fork the repository and submit a pull request. We welcome any improvements or additional analyses.
