This is the Python section of the Spotify capstone. Its main purpose is to explore data analysis techniques and presentation styles in Python.
It serves as the basis for the dashboards created.
The data visualization and analysis are done with Python in a Jupyter notebook, in the file named Spotify.ipynb.
The "data.csv" file contains more than 170,000 songs collected from the Spotify Web API. The other datasets contain the same data grouped by artist, year, or genre.
The dataset was uploaded by Kaggle user Yamaç Eren Ay; the original is available at https://www.kaggle.com/yamaerenay/spotify-dataset-19212020-160k-tracks
The dataset includes the following fields:
- id (ID of the track, generated by Spotify)
- acousticness (Ranges from 0 to 1)
- danceability (Ranges from 0 to 1)
- energy (Ranges from 0 to 1)
- duration_ms (Integer typically ranging from 200k to 300k)
- instrumentalness (Ranges from 0 to 1)
- valence (Ranges from 0 to 1)
- popularity (Ranges from 0 to 100)
- tempo (Float typically ranging from 50 to 150)
- liveness (Ranges from 0 to 1)
- loudness (Float typically ranging from -60 to 0)
- speechiness (Ranges from 0 to 1)
- year (Ranges from 1921 to 2020)
- mode (0 = Minor, 1 = Major)
- explicit (0 = No explicit content, 1 = Explicit content)
- key (All keys of the octave, encoded as values from 0 to 11, starting with C as 0, C# as 1, and so on…)
- artists (List of artists mentioned)
- release_date (Date of release, mostly in yyyy-mm-dd format; the precision of the date may vary)
- name (Name of the song)
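
The field descriptions above can be turned into a quick sanity check. The following is a minimal sketch, not the code used in Spotify.ipynb: it uses only the Python standard library, checks the columns that have a hard documented range (the "typically ranging" columns are left out), and decodes `key` and `mode` into a readable label. The helper names `out_of_range` and `describe_key`, the use of sharps for every key name, and the two sample rows are assumptions made for illustration; the rows are made up and do not come from the dataset.

```python
import csv
import io

# Hard ranges taken from the field list above.
# Columns described as "typically ranging" (duration_ms, tempo, loudness) are omitted.
BOUNDED_RANGES = {
    "acousticness": (0.0, 1.0),
    "danceability": (0.0, 1.0),
    "energy": (0.0, 1.0),
    "instrumentalness": (0.0, 1.0),
    "valence": (0.0, 1.0),
    "liveness": (0.0, 1.0),
    "speechiness": (0.0, 1.0),
    "popularity": (0.0, 100.0),
    "year": (1921.0, 2020.0),
    "mode": (0.0, 1.0),
    "explicit": (0.0, 1.0),
    "key": (0.0, 11.0),
}

# Key names for the `key` column: 0 = C, 1 = C#, and so on.
# Using sharps throughout is an assumption; the field list only names C and C#.
KEY_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def out_of_range(row):
    """Return the bounded columns whose value lies outside its documented range."""
    bad = []
    for column, (low, high) in BOUNDED_RANGES.items():
        if column not in row:
            continue  # tolerate rows that do not carry every column
        value = float(row[column])
        if not low <= value <= high:
            bad.append(column)
    return bad


def describe_key(row):
    """Combine `key` and `mode` into a label such as 'C major'."""
    key_name = KEY_NAMES[int(row["key"])]
    mode_name = "major" if int(row["mode"]) == 1 else "minor"
    return f"{key_name} {mode_name}"


# Two made-up rows for illustration only; the second has an invalid energy value.
SAMPLE_CSV = """name,key,mode,energy,popularity,year
Hypothetical Song A,0,1,0.73,55,1999
Hypothetical Song B,9,0,1.40,12,2005
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print(row["name"], "|", describe_key(row), "|", out_of_range(row))
```

To run the same check against the real file, the `io.StringIO(SAMPLE_CSV)` source could be replaced with an opened "data.csv", assuming its header row uses the field names listed above.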