Hatim Alshehri
Aqar is a website specializing in Saudi real estate. The project's goal is to extract data on apartments for rent in the Riyadh region from the Aqar website and to build a regression model that predicts apartment prices.
The data was scraped from the Aqar website using Selenium.
The scraped data is described by 14 features, as follows:
- Field Description:
Field Name | Description |
---|---|
District | Apartment districts/neighborhoods |
Category | Apartment category (e.g., single/family) |
Bedrooms | Number of bedrooms |
Livingrooms | Number of living rooms |
Bathrooms | Number of bathrooms |
Furnished | Whether the apartment is furnished (e.g., yes/no) |
Kitchen | Whether the apartment has a kitchen (e.g., yes/no) |
Garage | Whether the apartment has a garage (e.g., yes/no) |
Elevator | Whether the apartment has an elevator (e.g., yes/no) |
AC | Whether the apartment has AC (e.g., yes/no) |
Region | Apartment region in Riyadh (e.g., west/north) |
floor_number | Apartment floor number |
AGE | Property age |
Price | Apartment rent price per year |
A regression analysis was conducted over many features, with the apartment's price among them. To gather the apartment data, I scraped Aqar.com, one of the top and most visited online real estate agencies in KSA. I fitted several regression models and compared them to find the best fit and so obtain the best predictor.
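As an illustration of testing several regression models for the best fit, the following is a minimal sketch using sklearn. The source does not say which models were tried, so the three candidates here (`LinearRegression`, `Ridge`, `RandomForestRegressor`), the 5-fold cross-validation, the R² metric, and the synthetic data are all assumptions made for the example, not the project's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the prepared features and yearly rent (not real Aqar data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.5, 0.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Hypothetical candidate models; the project's real list is not given in the source.
candidates = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean cross-validated R^2 for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

# Keep the model with the highest mean score.
best_name = max(scores, key=scores.get)
print(scores)
print("best:", best_name)
```

Scoring every candidate with the same cross-validation folds keeps the comparison fair. On the real dataset, `X` and `y` would be the cleaned features and the `Price` column.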
- Clean: The dataset had duplicate observations, NaN values, and stray spaces in the categorical features, so the pandas library was used to prepare the data for the regression model.
- Preprocessing: Transformation methods were applied to standardize the values to a common scale and to linearize some of the features that are not linear.
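The cleaning and preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The toy rows are made up. The source does not name the transformations used, so `StandardScaler` for standardization and a log transform of `Price` for linearization are assumptions. The helper names `clean` and `preprocess` and the `LogPrice` column are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy rows standing in for the scraped data (values invented for illustration).
raw = pd.DataFrame({
    "District": [" Al Malqa ", "Al Yasmin", " Al Malqa ", "Al Olaya", "Al Yasmin"],
    "Furnished": ["yes ", " no", "yes ", "no", None],
    "Bedrooms": [2, 3, 2, 1, 3],
    "Price": [30000, 45000, 30000, 22000, 41000],
})

TEXT_COLS = ["District", "Furnished"]  # categorical columns in this toy example
NUM_COLS = ["Bedrooms"]                # numeric features to standardize


def clean(df):
    """Strip stray spaces, then drop rows with NaN values and duplicates."""
    out = df.copy()
    for col in TEXT_COLS:
        out[col] = out[col].str.strip()
    return out.dropna().drop_duplicates().reset_index(drop=True)


def preprocess(df):
    """Standardize numeric features and add a log-transformed price (assumed choice)."""
    out = df.copy()
    scaled = StandardScaler().fit_transform(out[NUM_COLS])
    for i, col in enumerate(NUM_COLS):
        out[col] = scaled[:, i]
    out["LogPrice"] = np.log1p(out["Price"])
    return out


cleaned = clean(raw)
prepared = preprocess(cleaned)
print(prepared)
```

In a full pipeline, the scaler would be fitted on the training split only and then reused on the test split, so that no test-set information leaks into the model.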
- Language: Python
- Data scraping libraries: Selenium
- EDA libraries: pandas, NumPy, seaborn, Matplotlib, missingno
- Model building and model testing libraries: sklearn