Titanic

Machine learning from disaster

This repository contains the code and related files for the "Titanic: Machine Learning from Disaster" project on Kaggle. The goal of the project is to create a machine learning model that predicts whether a passenger survived the Titanic shipwreck or not.

Competition Description

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during its maiden voyage, the Titanic sank after striking an iceberg, killing 1502 of the 2224 passengers and crew aboard. The disaster shocked the international community and led to stricter safety regulations for ships.

Goal

The challenge is to analyze the characteristics of the passengers and build a machine learning model that predicts which of them survived the tragedy.

Data

The repository provides two datasets: a training set (train.csv) and a test set (test.csv).

Variable Notes

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower

age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored)

parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.

Embarked: Port of embarkation: C = Cherbourg, Q = Queenstown, S = Southampton

Survival: 0 = No, 1 = Yes
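
The notes above can be turned into small lookup tables and checks. The following is only a minimal sketch: the names PCLASS_LABELS, PORT_NAMES, is_estimated_age and family_aboard are hypothetical and are not taken from the notebook, and treating every age of the form xx.5 (at or above 1) as an estimate is an assumption drawn directly from the age note.

```python
# Lookup tables transcribed from the variable notes.
PCLASS_LABELS = {1: "Upper", 2: "Middle", 3: "Lower"}
PORT_NAMES = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def is_estimated_age(age):
    """Return True when an age follows the xx.5 convention for estimates.

    Ages below 1 are fractional because they describe infants, so this
    sketch never treats them as estimates. Missing ages return False.
    """
    if age is None or age < 1:
        return False
    return age % 1 == 0.5


def family_aboard(sibsp, parch):
    """Hypothetical helper: relatives aboard = siblings/spouses + parents/children."""
    return sibsp + parch


print(PCLASS_LABELS[3], PORT_NAMES["S"])  # Lower Southampton
print(is_estimated_age(28.5))             # True
print(family_aboard(1, 2))                # 3
```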

Evaluation

The evaluation metric for this competition is accuracy: the percentage of passengers whose outcome is predicted correctly.
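
As an illustration of the metric, accuracy is the number of correct predictions divided by the total number of predictions. The sketch below uses plain Python and made-up labels; the function name accuracy and the sample lists are assumptions for demonstration, not code from the notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration: 3 of the 4 predictions match.
actual = [0, 1, 1, 0]
predicted = [0, 1, 0, 0]
print(accuracy(actual, predicted))  # 0.75
```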

Submission File Format

You should submit a CSV file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or rows.

The file should have exactly 2 columns:

PassengerId (sorted in any order)
Survived (contains your binary predictions: 1 for survived, 0 for deceased)

PassengerId,Survived
892,0
893,1
894,0
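
Before uploading, the format rules above can be checked locally. This is a rough sketch using only the standard library; check_submission is a hypothetical helper, and it checks only the header, the row count, the column count and the 0/1 values described in this section.

```python
import csv

EXPECTED_HEADER = ["PassengerId", "Survived"]
EXPECTED_ROWS = 418  # number of entries required by the competition


def check_submission(path, expected_rows=EXPECTED_ROWS):
    """Return a list of problems found in a submission file (empty if none)."""
    problems = []
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows or rows[0] != EXPECTED_HEADER:
        problems.append("header must be exactly PassengerId,Survived")
        return problems
    body = rows[1:]
    if len(body) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(body)}")
    # Data rows start on line 2 of the file, right after the header.
    for line_no, row in enumerate(body, start=2):
        if len(row) != 2:
            problems.append(f"line {line_no}: expected 2 columns, found {len(row)}")
        elif row[1].strip() not in ("0", "1"):
            problems.append(f"line {line_no}: Survived must be 0 or 1")
    return problems
```

Calling check_submission("submission.csv") returns an empty list when the file passes these checks.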

Usage

Clone the repository to your local environment:

git clone https://github.com/g3rley/titanic.git

Install the project dependencies:

pip install -r requirements.txt

Run the solution.ipynb Jupyter Notebook to explore the data and to create and evaluate the model.

Export the final model predictions to the submission.csv file.
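
The export step might look like the following sketch, which assumes the predictions are already available as a sequence of 0/1 values aligned with the test-set passenger IDs. The helper export_submission and its arguments are hypothetical and not taken from the notebook; only the column names and the submission.csv file name come from this README.

```python
import pandas as pd


def export_submission(passenger_ids, predictions, path="submission.csv"):
    """Write PassengerId/Survived pairs to a CSV file without the index column."""
    passenger_ids = list(passenger_ids)
    predictions = [int(p) for p in predictions]
    if len(passenger_ids) != len(predictions):
        raise ValueError("passenger_ids and predictions must have the same length")
    submission = pd.DataFrame(
        {"PassengerId": passenger_ids, "Survived": predictions}
    )
    # index=False keeps the file at exactly two columns.
    submission.to_csv(path, index=False)
    return submission
```

For example, export_submission(test_ids, model_predictions) would write submission.csv to the current directory, where test_ids and model_predictions stand in for whatever the notebook produces.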

Built With

Python - Programming language
Jupyter Notebook - Web application for creating and sharing documents that contain live code, equations, visualizations and narrative text
Pandas - Data analysis and manipulation tool
NumPy - Library for working with arrays
Matplotlib - Library for creating static, animated, and interactive visualizations
Seaborn - Data visualization library based on matplotlib
Scikit-learn - Machine learning library for the Python programming language

Authors

Gerley Adriano - g3rley

Acknowledgments

License

This project is licensed under the MIT License. See the LICENSE file for more information.
