This project explores European greenhouse gas (GHG) emissions data, which can be found here: https://ec.europa.eu/eurostat/databrowser/view/env_air_gge/default/table?lang=en
This dataset includes data on the "greenhouse gas emissions (GHG) inventory", as reported by each EU country to the European Environment Agency (EEA) between 2009 and 2018, and was last updated in June 2020. The GHG inventory contains data on:
- amounts of various air pollutants and greenhouse gases emitted (in total for the EU and by country): carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O), perfluorocarbons (PFCs), hydrofluorocarbons (HFCs), sulphur hexafluoride (SF6) and nitrogen trifluoride (NF3); and
- emissions by five key GHG "source sectors": Energy, Industrial processes, Agriculture, Land use and forestry, and Waste management (by country and EU total).
In this project I analysed sheet 1 (out of 99 Excel sheets), which includes the consolidated total GHG emissions for EU countries across all air pollutants and sectors (including international aviation).
The project showcases various pandas data manipulation techniques for pre-processing and wrangling the original data, and shows how to conduct exploratory data analysis and visualise some of the findings in matplotlib, seaborn and Tableau.
Specifically, this project will demonstrate pandas best practices and the following data manipulation topics:
- Sorting and subsetting dataframes
- Aggregation and summary statistics
- Slicing and indexing data
- Pivoting tables
- Handling missing data
- Plotting/visualising data in matplotlib, seaborn & Tableau.
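As a rough illustration of the topics listed above, here is a minimal pandas sketch. It uses a small made-up table rather than the Eurostat sheet, so the country labels, column names and values are all assumptions for demonstration only and are not the project's actual code or data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data in wide format: one row per country, one column per year.
# Values are invented for illustration and are NOT taken from the Eurostat dataset.
wide = pd.DataFrame(
    {
        "country": ["A", "B", "C"],
        "2016": [100.0, 50.0, np.nan],
        "2017": [110.0, np.nan, 20.0],
        "2018": [120.0, 40.0, 30.0],
    }
)

# Reshape to long format (one row per country-year pair) for easier analysis.
long = wide.melt(id_vars="country", var_name="year", value_name="emissions")
long["year"] = long["year"].astype(int)

# Handling missing data: count the gaps, then drop the incomplete rows.
n_missing = int(long["emissions"].isna().sum())
clean = long.dropna(subset=["emissions"])

# Sorting and subsetting: keep 2018 only, largest emitters first.
top_2018 = clean[clean["year"] == 2018].sort_values("emissions", ascending=False)

# Aggregation and summary statistics: mean and max emissions per country.
summary = clean.groupby("country")["emissions"].agg(["mean", "max"])

# Slicing and indexing: look up one value through a (country, year) index.
indexed = clean.set_index(["country", "year"]).sort_index()
a_2017 = indexed.loc[("A", 2017), "emissions"]

# Pivoting: back to a country-by-year table; dropped gaps reappear as NaN.
pivot = clean.pivot_table(index="country", columns="year", values="emissions")
```

Dropping rows with missing values is only one option here; depending on the analysis, filling or interpolating the gaps may be more appropriate.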
In the process, we'll gain some interesting insights about GHG emissions in the EU and Australia.