Recommendation System - RBM

A Restricted Boltzmann Machine (RBM) is a generative model that is useful for collaborative filtering in recommendation systems. An RBM can be more robust and make more accurate predictions than other models such as Singular Value Decomposition (SVD). This implementation parallelizes RBM training, which helps on large datasets because the Markov Chain Monte Carlo (MCMC) steps in the learning algorithm are computationally expensive.
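To make the MCMC cost concrete, here is a minimal NumPy sketch of one contrastive divergence (CD-1) update for a binary RBM. It is an illustration only and is not taken from train.py: the function name `cd1_step`, the binary visible units, the array shapes and the learning rate are all assumptions. In a data-parallel run, each worker would presumably compute these gradient estimates on its own shard and average them across workers, but how this repository does that is not stated here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, bv, bh, lr, rng):
    """One CD-1 update on a batch v0 of shape (batch, n_visible). Hypothetical helper."""
    # Positive phase: hidden probabilities and a binary sample given the data.
    ph0 = sigmoid(v0 @ W + bh)
    h0 = (rng.random(ph0.shape) < ph0).astype(v0.dtype)
    # Negative phase: a single Gibbs step, which is the MCMC part of training.
    pv1 = sigmoid(h0 @ W.T + bv)
    v1 = (rng.random(pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + bh)
    # Gradient estimates: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    dW = (v0.T @ ph0 - v1.T @ ph1) / n
    dbv = (v0 - v1).mean(axis=0)
    dbh = (ph0 - ph1).mean(axis=0)
    return W + lr * dW, bv + lr * dbv, bh + lr * dbh

# Toy usage with assumed sizes: 8 samples, 6 visible units, 4 hidden units.
rng = np.random.default_rng(0)
v0 = (rng.random((8, 6)) < 0.5).astype(np.float64)
W = 0.01 * rng.standard_normal((6, 4))
bv = np.zeros(6)
bh = np.zeros(4)
W_new, bv_new, bh_new = cd1_step(v0, W, bv, bh, lr=0.1, rng=rng)
```

Every training batch needs at least one such Gibbs step, which is why the sampling dominates the run time on large datasets.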

Getting Started

Clone the files to a local directory, or to a mounted directory when using Nauta.

TODO - Add requirements.txt for bare metal

General Instructions

Ensure that the rbm-recommendation directory is appended to the PYTHONPATH correctly.

This can be done by updating the sys.path in train.py and eval.py.

sys.path.append('/path/to/rbm-recommendation')

Using Nauta:

sys.path.append('/mnt/output/home/rbm-recommendation')
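As an alternative to editing the hard-coded path in each script, the path could be read from the environment. This is only a sketch: the helper `add_repo_to_path` and the environment variable `RBM_REPO_DIR` are hypothetical names and do not exist in train.py or eval.py.

```python
import os
import sys

def add_repo_to_path(repo_dir):
    """Append repo_dir to sys.path once and return the absolute path used."""
    repo_dir = os.path.abspath(repo_dir)
    if repo_dir not in sys.path:
        sys.path.append(repo_dir)
    return repo_dir

# RBM_REPO_DIR is a hypothetical variable; the fallback is the placeholder path.
repo = add_repo_to_path(os.environ.get("RBM_REPO_DIR", "/path/to/rbm-recommendation"))
```

With this in place, the same script could run on bare metal and on Nauta by exporting a different value for the variable.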

Data

The MovieLens dataset is used for training and evaluation. The full dataset consists of 27,000,000 ratings and 1,100,000 tag applications applied to 58,000 movies by 280,000 users.
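The sketch below shows one way ratings could be turned into a user-by-movie matrix of the kind an RBM takes as visible input. It assumes the CSV has `userId`, `movieId` and `rating` columns, as in the standard MovieLens ratings file; the actual layout of movielens_full.csv is not described here, and `load_ratings` is a hypothetical helper. A dense matrix is only practical for small samples; at the full 280,000 by 58,000 size a sparse representation would be needed.

```python
import csv
import io
import numpy as np

def load_ratings(fileobj):
    """Read rating rows into a dense user x movie matrix, with 0 meaning unrated."""
    rows = [(r["userId"], r["movieId"], float(r["rating"]))
            for r in csv.DictReader(fileobj)]
    # Map raw IDs to contiguous row and column indices.
    users = {u: i for i, u in enumerate(sorted({r[0] for r in rows}))}
    movies = {m: j for j, m in enumerate(sorted({r[1] for r in rows}))}
    mat = np.zeros((len(users), len(movies)), dtype=np.float32)
    for u, m, rating in rows:
        mat[users[u], movies[m]] = rating
    return mat, users, movies

# Tiny in-memory sample standing in for the real file (assumed column names).
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,10,4.0,0\n"
    "1,20,3.5,0\n"
    "2,10,5.0,0\n"
)
ratings, users, movies = load_ratings(sample)
```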

Train

Train using 4 nodes with 4 processes each (16 processes in total).

Export the required environment variables. When using Nauta, these need to be updated in the template's values.yaml instead.

export OMP_NUM_THREADS=9
export KMP_BLOCKTIME=0

Update the train.py, data_dir and output_dir paths in the command below accordingly.

mpiexec --hostfile /path/to/hostfile --map-by ppr:4:node --oversubscribe -n 16 -x OMP_NUM_THREADS -x KMP_BLOCKTIME python /path/to/train.py --hidden=100 --epochs=1 --gbz=512 --data_dir="/path/to/data/movielens_full.csv" --output_dir="/path/to/output_dir"

Using Nauta:

nctl exp submit --name rec-sys-test -t multinode-tf-training-horovod /path/to/train.py -- --hidden 100 --epochs 1 --gbz 512 --data_dir "/path/to/data/movielens_full.csv" --output_dir "/mnt/output/experiment"

Eval

Evaluate the model by loading the weights, hidden-bias and visible-bias files produced by training.

mpiexec --hostfile /path/to/hostfile --map-by ppr:4:node --oversubscribe -n 16 -x OMP_NUM_THREADS -x KMP_BLOCKTIME python /path/to/eval.py --data_dir="/path/to/data/movielens_full.csv" --weights_file="/path/to/rbm_w_file.txt" --bias_hidden "/path/to/rbm_bh_file.txt" --bias_visible "/path/to/rbm_bv_file.txt" --output_dir="/path/to/output_dir"

Using Nauta:

nctl exp submit --name rec-sys-eval-test -t multinode-tf-training-horovod /path/to/eval.py -- --data_dir "/path/to/data/movielens_full.csv" --weights_file "/path/to/rbm_w_file.txt" --bias_hidden "/path/to/rbm_bh_file.txt" --bias_visible "/path/to/rbm_bv_file.txt" --output_dir "/mnt/output/experiment"
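For orientation, the following sketch shows how saved RBM parameters could be loaded and used to score items for a user. It assumes the three files are plain-text arrays readable with `np.loadtxt` and that the visible units are binary; neither is confirmed by this README, and `reconstruct` is a hypothetical helper, not the logic in eval.py. The demo writes small random files to a temporary directory so that it runs on its own.

```python
import os
import tempfile
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, W, bh, bv):
    """Visible -> hidden -> visible pass; returns per-item scores in (0, 1)."""
    h = sigmoid(v @ W + bh)
    return sigmoid(h @ W.T + bv)

rng = np.random.default_rng(0)
with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the weight and bias files from training (assumed text format).
    w_path = os.path.join(tmp, "rbm_w_file.txt")
    bh_path = os.path.join(tmp, "rbm_bh_file.txt")
    bv_path = os.path.join(tmp, "rbm_bv_file.txt")
    np.savetxt(w_path, 0.01 * rng.standard_normal((6, 4)))
    np.savetxt(bh_path, np.zeros(4))
    np.savetxt(bv_path, np.zeros(6))
    W = np.loadtxt(w_path)    # shape (n_visible, n_hidden)
    bh = np.loadtxt(bh_path)  # shape (n_hidden,)
    bv = np.loadtxt(bv_path)  # shape (n_visible,)

# Three toy users with binary interaction vectors over six items.
v = (rng.random((3, 6)) < 0.5).astype(np.float64)
scores = reconstruct(v, W, bh, bv)
```

Items a user has not interacted with can then be ranked by their reconstructed score.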
