Recommendation-System

Google Colab

Building and comparing recommendation systems at scale using scikit-surprise (the Surprise library)

Recommender systems are one of the most commonly used and easily understood applications of data science. Much work has been done on the topic, yet interest and demand remain very high because of the rapid growth of the internet and the resulting information overload. Online businesses now need to help users cope with that overload by providing personalized recommendations, content, and services.

Two of the most popular approaches to recommender systems are collaborative filtering and content-based recommendation. This project focuses on collaborative filtering: a user is recommended items that people with similar tastes and preferences liked in the past. In other words, the method predicts unknown ratings from the similarities between users.
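To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. The toy ratings, the user and movie names, and the choice of cosine similarity are all assumptions made for illustration; they are not taken from this repository's notebook.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"Heat": 5.0, "Up": 3.0, "Alien": 4.0},
    "bob":   {"Heat": 4.0, "Up": 2.0, "Alien": 5.0, "Big": 4.0},
    "carol": {"Heat": 1.0, "Up": 5.0, "Big": 2.0},
}

def cosine_sim(a, b):
    """Cosine similarity computed over the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine_sim(ratings[user], theirs)
        num += sim * theirs[item]
        den += sim
    # None signals that no similar user has rated the item.
    return num / den if den else None

print(predict("alice", "Big", ratings))
```

Because alice's ratings resemble bob's more than carol's, her predicted rating for "Big" lands closer to bob's 4.0 than to carol's 2.0.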

Dataset

GroupLens Research has collected and made available rating data sets from the MovieLens web site (http://movielens.org). The data sets were collected over various periods of time, depending on the size of the set.

We are using the Small dataset (ml-latest-small): 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users. Last updated 9/2018.

Download: ml-latest-small.zip (size: 1 MB)
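As a rough sketch of how the ratings could be read straight from the downloaded archive, the snippet below uses only the standard library. It assumes the archive contains `ml-latest-small/ratings.csv` with the columns `userId`, `movieId`, `rating`, and `timestamp`; the function name `load_ratings` is hypothetical.

```python
import csv
import io
import zipfile

def load_ratings(zip_path, member="ml-latest-small/ratings.csv"):
    """Return a list of (user_id, movie_id, rating) tuples from the archive.

    `zip_path` may be a filesystem path or a binary file-like object.
    The member name and column names are assumptions about the archive layout.
    """
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(member) as raw:
            reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8"))
            return [
                (row["userId"], row["movieId"], float(row["rating"]))
                for row in reader
            ]
```

From there, one plausible route into Surprise is to put the tuples in a pandas DataFrame and pass it to `Dataset.load_from_df` together with a `Reader` whose `rating_scale` matches the data (MovieLens ratings run from 0.5 to 5.0).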

Algorithm Comparisons

Algorithm         test_rmse   fit_time (s)   test_time (s)
SVDpp             0.849133    224.427830     10.164514
KNNBaseline       0.855728      0.198152      2.959032
BaselineOnly      0.861078      0.141793      0.198635
SVD               0.863973      3.470947      0.207932
KNNWithZScore     0.866793      0.142699      2.660879
KNNWithMeans      0.870065      0.101380      2.389334
SlopeOne          0.872713      1.340127      7.466537
NMF               0.901370      3.766373      0.215193
CoClustering      0.920521      1.404656      0.216376
KNNBasic          0.923332      0.088885      2.163818
NormalPredictor   1.401411      0.086856      0.249340
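A comparison table of this shape could be produced along the following lines. This is a sketch, not the notebook's actual code: the helper names `summarize` and `run_benchmark`, the subset of algorithms, and the fold count `cv=3` are assumptions. It relies on Surprise's `cross_validate`, which returns per-fold `test_rmse`, `fit_time`, and `test_time` values.

```python
def summarize(name, cv_results):
    """Average the per-fold values returned by cross_validate."""
    def mean(values):
        return sum(values) / len(values)

    return {
        "Algorithm": name,
        "test_rmse": mean(cv_results["test_rmse"]),
        "fit_time": mean(cv_results["fit_time"]),
        "test_time": mean(cv_results["test_time"]),
    }

def run_benchmark(data, cv=3):
    """Cross-validate several Surprise algorithms on `data`, best RMSE first.

    `data` is expected to be a surprise.Dataset; cv=3 is an assumed fold count.
    """
    # Imported here so that summarize() stays usable without Surprise installed.
    from surprise import SVD, KNNBaseline, BaselineOnly, NormalPredictor
    from surprise.model_selection import cross_validate

    rows = []
    for algo in (SVD(), KNNBaseline(), BaselineOnly(), NormalPredictor()):
        results = cross_validate(algo, data, measures=["RMSE"], cv=cv, verbose=False)
        rows.append(summarize(type(algo).__name__, results))
    return sorted(rows, key=lambda row: row["test_rmse"])
```

Sorting by `test_rmse` puts the most accurate algorithm first, which is the ordering used in the table above.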

Algorithms Used

NormalPredictor

  • NormalPredictor predicts a random rating drawn from the distribution of the training set, which is assumed to be normal. It is one of the most basic algorithms and does very little work, so it is mainly useful as a reference point.
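The behaviour described above can be sketched in a few lines: estimate the mean and standard deviation of the training ratings, then sample from that normal distribution. The function names and the clipping to a 0.5 to 5.0 rating scale are assumptions for illustration.

```python
import random
from math import sqrt

def fit_normal(train_ratings):
    """Maximum-likelihood mean and standard deviation of the training ratings."""
    mu = sum(train_ratings) / len(train_ratings)
    sigma = sqrt(sum((r - mu) ** 2 for r in train_ratings) / len(train_ratings))
    return mu, sigma

def predict_random(mu, sigma, low=0.5, high=5.0, rng=random):
    """Draw a rating from N(mu, sigma^2) and clip it to the rating scale."""
    return min(high, max(low, rng.gauss(mu, sigma)))

mu, sigma = fit_normal([1.0, 2.0, 3.0, 4.0, 5.0])
print(predict_random(mu, sigma))
```

The prediction ignores both the user and the item, which is consistent with this algorithm having by far the highest RMSE in the comparison.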

BaselineOnly

  • BaselineOnly predicts the baseline estimate for a given user and item.
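A baseline estimate is commonly written as the global mean plus a user bias plus an item bias. The sketch below fits those biases with a simple alternating least squares loop; the function names, the regularization values, and the epoch count are assumptions chosen for illustration rather than settings confirmed by this project.

```python
from collections import defaultdict

def fit_baselines(ratings, n_epochs=10, reg_u=15.0, reg_i=10.0):
    """Fit biases from a list of (user, item, rating); returns (mu, b_u, b_i)."""
    mu = sum(r for _, _, r in ratings) / len(ratings)
    b_u, b_i = defaultdict(float), defaultdict(float)
    for _ in range(n_epochs):
        # Update item biases while user biases are held fixed.
        sums, counts = defaultdict(float), defaultdict(int)
        for u, i, r in ratings:
            sums[i] += r - mu - b_u[u]
            counts[i] += 1
        for i in sums:
            b_i[i] = sums[i] / (reg_i + counts[i])
        # Update user biases while item biases are held fixed.
        sums, counts = defaultdict(float), defaultdict(int)
        for u, i, r in ratings:
            sums[u] += r - mu - b_i[i]
            counts[u] += 1
        for u in sums:
            b_u[u] = sums[u] / (reg_u + counts[u])
    return mu, dict(b_u), dict(b_i)

def baseline(mu, b_u, b_i, user, item):
    """Baseline estimate; unknown users or items contribute a zero bias."""
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)
```

The regularization terms shrink the biases of users and items with few ratings toward zero, so sparse entries fall back toward the global mean.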

k-NN algorithms

KNNBasic

  • KNNBasic is a basic collaborative filtering algorithm.

KNNWithMeans

  • KNNWithMeans is a basic collaborative filtering algorithm, taking into account the mean ratings of each user.

KNNWithZScore

  • KNNWithZScore is a basic collaborative filtering algorithm, taking into account the z-score normalization of each user.

KNNBaseline

  • KNNBaseline is a basic collaborative filtering algorithm taking into account a baseline rating.
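The four k-NN variants differ mainly in how each neighbour's rating is normalized before the similarity-weighted average is taken. The sketch below shows that difference for a single prediction, assuming the neighbours and their similarities have already been found; the `Neighbor` type, the `variant` names, and the parameter names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Neighbor:
    sim: float       # similarity to the target user
    rating: float    # neighbour's rating of the target item
    mean: float      # neighbour's mean rating
    std: float       # neighbour's rating standard deviation
    baseline: float  # baseline estimate for (neighbour, item)

def knn_predict(neighbors, variant="basic", user_mean=0.0, user_std=1.0, user_baseline=0.0):
    """Similarity-weighted prediction for one (user, item) pair."""
    total_sim = sum(n.sim for n in neighbors)
    if total_sim == 0:
        return None
    if variant == "basic":      # KNNBasic: plain weighted average
        return sum(n.sim * n.rating for n in neighbors) / total_sim
    if variant == "means":      # KNNWithMeans: mean-centred ratings
        return user_mean + sum(n.sim * (n.rating - n.mean) for n in neighbors) / total_sim
    if variant == "zscore":     # KNNWithZScore: z-score normalized ratings
        z = sum(n.sim * (n.rating - n.mean) / n.std for n in neighbors) / total_sim
        return user_mean + user_std * z
    if variant == "baseline":   # KNNBaseline: deviations from baseline estimates
        return user_baseline + sum(n.sim * (n.rating - n.baseline) for n in neighbors) / total_sim
    raise ValueError(f"unknown variant: {variant}")
```

Centring on means, z-scores, or baselines compensates for users who rate systematically high or low, which may explain why those variants score better than KNNBasic in the comparison.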

Matrix Factorization-based algorithms

SVD

  • SVD is the matrix factorization algorithm popularized by Simon Funk during the Netflix Prize.

SVDpp

  • The SVDpp algorithm is an extension of SVD that takes into account implicit ratings.
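To illustrate how implicit ratings enter the prediction, the sketch below computes a single SVD-style estimate and, optionally, the SVD++ extension. It assumes the usual formulation in which an "implicit rating" is simply the fact that the user rated an item, whatever the value; the function and parameter names are hypothetical, and training the factors is out of scope here.

```python
from math import sqrt

def svd_predict(mu, b_u, b_i, p_u, q_i, implicit_factors=None):
    """Estimate mu + b_u + b_i + q_i . (p_u + |I_u|^-1/2 * sum of y_j).

    p_u, q_i: latent factor vectors for the user and the item.
    implicit_factors: one vector y_j per item the user has rated (SVD++ only).
    """
    user_vec = list(p_u)
    if implicit_factors:
        scale = 1.0 / sqrt(len(implicit_factors))
        for k in range(len(user_vec)):
            user_vec[k] += scale * sum(y[k] for y in implicit_factors)
    return mu + b_u + b_i + sum(q * p for q, p in zip(q_i, user_vec))
```

With `implicit_factors` left as `None` the function reduces to the plain SVD estimate. The extra per-item vectors help account for the much longer fit time SVDpp shows in the comparison.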

NMF

  • NMF is a collaborative filtering algorithm based on Non-negative Matrix Factorization. It is very similar to SVD.

Slope One
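Slope One predicts a rating from the average difference between the target item and each item the user has already rated. The following is a minimal sketch of that idea on hypothetical toy data; the function name and data layout are assumptions, and predictions are not clipped to a rating scale.

```python
def slope_one_predict(ratings, user, item):
    """Slope One estimate; `ratings` maps user -> {item: rating}."""
    user_ratings = ratings[user]
    user_mean = sum(user_ratings.values()) / len(user_ratings)
    deviations = []
    for j in user_ratings:
        if j == item:
            continue
        # Rating differences from every user who rated both `item` and `j`.
        diffs = [r[item] - r[j] for r in ratings.values() if item in r and j in r]
        if diffs:
            deviations.append(sum(diffs) / len(diffs))
    if not deviations:
        return user_mean  # no overlapping users: fall back to the user's mean
    return user_mean + sum(deviations) / len(deviations)

toy = {
    "a": {"x": 5.0, "y": 3.0},
    "b": {"x": 4.0, "y": 2.0},
    "c": {"y": 2.5},
}
print(slope_one_predict(toy, "c", "x"))
```

Users a and b both rate x two points above y, so the estimate for c on x is c's mean rating plus two.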

Co-clustering

  • CoClustering is a collaborative filtering algorithm based on co-clustering.

We use RMSE (root mean squared error) as the accuracy metric for the predictions; lower is better.
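For reference, the metric can be computed as in this small sketch (the function name `rmse` and its argument names are illustrative):

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(sum(squared_errors) / len(actual))
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones.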

CREDITS

Kuldeep Singh Sidhu

Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/