gregwchase/amazon-mxnet-content-recsys

Content-Based Recommendation Systems with Apache MXNet

This is the repository for the Content-Based Recommendation Systems with Apache MXNet article created for the MXNet Medium blog, in partnership with Amazon.

Prerequisites

Repo Tree & File Descriptions

.
├── data
│   └── articles_first_1000.csv
├── LICENSE
├── README.md
└── src
    ├── Content-Based Recommendation System with MXNet.ipynb - Notebook for the first 1000 articles.
    ├── download_articles.sh - Download all article data with the Kaggle API.
    ├── recsys_all_articles.py - Create recommendation system with 50,000 articles.
    └── recsys.py - Create recommendation system with first 1,000 articles.

Instructions - Library Installation

If the prerequisites aren't already installed, run the following commands.

Libraries

Kaggle API - pip install kaggle

MXNet - pip install mxnet

spaCy
pip install -U spacy

python -m spacy download en

python -m spacy download en_core_web_md
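
To confirm the installation succeeded before moving on, a quick check along the following lines may help. It is only a sketch: the helper name `missing_libraries` is hypothetical, and it assumes each library's import name matches the pip package name listed above (spaCy models are installed as regular Python packages, so `en_core_web_md` can be probed the same way).

```python
from importlib.util import find_spec

# Import names for the libraries installed above. Assumption: the import
# name equals the pip/model name used in the install commands.
REQUIRED = ("kaggle", "mxnet", "spacy", "en_core_web_md")


def missing_libraries(names=REQUIRED):
    """Return the names that cannot be located in the current environment.

    find_spec only looks the module up; it does not import it, so this is
    safe to run even before Kaggle credentials are configured.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All libraries found.")
```

Any name reported as missing can be installed with the matching command from the list above.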

Data

  • All news article data is downloaded via download_articles.sh.

Download Data

You must have your Kaggle API credentials to proceed with the download!

bash download_articles.sh
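
If the download fails, missing credentials are a likely cause. The Kaggle API reads credentials either from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables or from a `kaggle.json` file, by default in `~/.kaggle` (or the directory named by `KAGGLE_CONFIG_DIR`). The sketch below checks for both; the function name `kaggle_credentials_present` is hypothetical, and it only tests that credentials exist, not that they are valid.

```python
import os
from pathlib import Path


def kaggle_credentials_present(env=None, home=None):
    """Return True if Kaggle credentials appear to be configured.

    env and home are injectable for testing; by default the real
    environment and home directory are used.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: a kaggle.json file in the config directory.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR", home / ".kaggle"))
    return (config_dir / "kaggle.json").is_file()


if __name__ == "__main__":
    if kaggle_credentials_present():
        print("Kaggle credentials found.")
    else:
        print("No Kaggle credentials found; create an API token first.")
```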

Run the Notebook

Launch the Content-Based Recommendation System with MXNet notebook to build a recommendation system with the first 1,000 articles.
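
The notebook's code is not reproduced in this README. As a rough illustration of the general idea, and going by the TF-IDF and cosine similarity entries under References, a content-based recommender can be sketched in plain Python as below. Everything here is an assumption made for illustration: the toy headlines, the function names, whitespace tokenisation, and the smoothed IDF formula. The actual notebook presumably uses the libraries installed above instead.

```python
import math
from collections import Counter

# Toy stand-ins for article text; not taken from the real data set.
SAMPLE_DOCS = [
    "stock markets fall as investors worry about inflation",
    "investors sell stocks amid inflation fears",
    "local team wins the championship game",
    "championship game draws record crowd for local team",
]


def tfidf_vectors(docs):
    """Map each document to a sparse {term: tf-idf weight} dictionary."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(docs, index, top_n=3):
    """Return indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_n]]


if __name__ == "__main__":
    print(recommend(SAMPLE_DOCS, 0, top_n=1))
```

Storing each vector as a dictionary keeps only the non-zero terms, which is the same motivation behind sparse array formats when the vocabulary grows to thousands of articles.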

References

Amazon - Item to Item Collaborative Filtering

Create Custom Stop-Word List

Kaggle - All The News Data Set

Kaggle API

Keras shoot-out: TensorFlow vs MXNet

Machine Learning :: Cosine Similarity for Vector Space Models (Part III)

MXNet - Sparse NDArray API

MXNet - Tutorials

Netflix Prize

Scikit-Learn: TF-IDF Vectorizer

spaCy

Stemming and Lemmatization
