An implementation of the paper "Neural Word Embedding as Implicit Matrix Factorization"

igorsterner/paper-word2vec

Paper Implementation

This repo contains an implementation of "Neural Word Embedding as Implicit Matrix Factorization" by Omer Levy and Yoav Goldberg (NIPS 2014).
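For orientation, the central quantity in the paper is the shifted positive PMI (SPPMI) matrix, whose entries are max(PMI(w, c) - log k, 0), where k is the number of negative samples. The snippet below is a minimal sketch of that formula over a toy list of (word, context) pairs. It is not taken from this repo's code: the function name `sppmi`, the `pairs` list, and the sparse dictionary layout are assumptions made for illustration only.

```python
import math
from collections import Counter


def sppmi(pairs, k=1):
    """Return a sparse SPPMI matrix as {(word, context): value}.

    pairs: list of (word, context) tuples (hypothetical input format).
    k: shift parameter, i.e. the number of negative samples in SGNS.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)

    matrix = {}
    for (w, c), n_wc in pair_counts.items():
        # PMI(w, c) = log( #(w, c) * |D| / (#(w) * #(c)) )
        pmi = math.log(n_wc * total / (word_counts[w] * ctx_counts[c]))
        shifted = pmi - math.log(k)
        # Keep only strictly positive cells; everything else is an implicit zero.
        if shifted > 0:
            matrix[(w, c)] = shifted
    return matrix


# Toy data, invented for this example.
pairs = [
    ("money", "spend"), ("money", "spend"), ("money", "save"),
    ("time", "spend"), ("time", "waste"), ("time", "waste"),
]
toy_matrix = sppmi(pairs, k=1)
for cell, value in sorted(toy_matrix.items()):
    print(cell, round(value, 3))
```

In the paper, dense embeddings can then be obtained by a truncated SVD of this matrix; that step is omitted here to keep the sketch free of dependencies.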

Citation

@inproceedings{NIPS2014_feab05aa,
 author = {Levy, Omer and Goldberg, Yoav},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {Z. Ghahramani and M. Welling and C. Cortes and N. Lawrence and K.Q. Weinberger},
 pages = {},
 publisher = {Curran Associates, Inc.},
 title = {Neural Word Embedding as Implicit Matrix Factorization},
 url = {https://proceedings.neurips.cc/paper_files/paper/2014/file/feab05aa91085b7a8012516bc3533958-Paper.pdf},
 volume = {27},
 year = {2014}
}

Dataset

mkdir data
cd data
wget https://downloads.wortschatz-leipzig.de/corpora/eng_news-typical_2016_1M.tar.gz 
tar -xvzf eng_news-typical_2016_1M.tar.gz

Installation

pip install -r requirements.txt

Usage

Preparations:

python src/tools/dependency-parser.py
python src/tools/pair-data.py

Main script:

python src/main.py

Visualisation

python src/tools/theasarus.py --word money
python src/tools/theasarus.py --find_rnns 1

Examples

Words most similar to money:

[('resources', 0.12),
 ('capital', 0.11),
 ('paper', 0.11),
 ('property', 0.1),
 ('total', 0.1),
 ('value', 0.09),
 ('cards', 0.09),
 ('time', 0.09),
 ('knowledge', 0.09),
 ('products', 0.09)]
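
The list above pairs each neighbour with a similarity score. The README does not say which similarity measure the script uses, so the following is only a sketch of one common choice, cosine similarity over sparse word vectors. The names `cosine`, `most_similar`, and the toy `vectors` are hypothetical and do not come from this repo.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {context: weight}."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=10):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:topn]


# Toy vectors, invented for this example (e.g. rows of a sparse PMI-style matrix).
vectors = {
    "money": {"spend": 1.0, "save": 1.0},
    "capital": {"save": 1.0, "invest": 1.0},
    "time": {"waste": 1.0},
}
neighbours = most_similar("money", vectors, topn=2)
print([(w, round(s, 2)) for w, s in neighbours])
```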
