revisiting nlp with emojis (a twitter analysis)

Threynaud/emojis

emojis

In this project, I chose to analyse the use of emojis on Twitter.

Over one week I gathered more than 3 million tweets, excluding retweets, and stored them in a MongoDB database.

After cleaning them by removing punctuation, converting uppercase to lowercase, and lemmatizing, I used the gensim implementation of Word2Vec to analyze the hidden semantics of emojis. Word2Vec is a very powerful tool that allows us to find the words most similar to a given word or sentence. This is how we can create a dictionary of emojis by finding their "word synonyms"!
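The cleaning and training steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `tokenize` and `train_emoji_model`, the emoji character ranges, and the hyperparameters other than the 100 dimensions are all assumptions. Lemmatization is left out to keep the sketch dependency-free, and the `vector_size` keyword assumes gensim 4.x (older releases call it `size`).

```python
import re
import string

# Rough emoji code point ranges (an assumption, not a complete Unicode emoji list).
EMOJI_RE = re.compile("([\U0001F300-\U0001FAFF\u2600-\u27BF])")

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def tokenize(tweet):
    """Lowercase a tweet, drop ASCII punctuation, and make each emoji its own token."""
    text = tweet.lower().translate(PUNCT_TABLE)
    # Pad emojis with spaces so that "😂😂" becomes two separate tokens.
    text = EMOJI_RE.sub(r" \1 ", text)
    return text.split()


def train_emoji_model(tokenized_tweets, dim=100):
    """Train Word2Vec on tokenized tweets (hypothetical helper, gensim 4.x assumed)."""
    # Imported lazily so that tokenize() stays usable without gensim installed.
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences=tokenized_tweets,
        vector_size=dim,  # 100 dimensions, as in the t-SNE description
        window=5,         # assumed context window
        min_count=5,      # assumed frequency cutoff
        workers=4,
    )
```

With a trained model, the "word synonyms" of an emoji could then be looked up with something like `model.wv.most_similar("😂", topn=10)`, which returns the tokens whose vectors are closest to the emoji's vector by cosine similarity.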

I show in the presentation that emojis can be considered a new language and are a powerful tool for sentiment analysis and topic modeling on Twitter.

The map of emojis is computed with t-SNE, which projects the 100-dimensional Word2Vec vectors onto a two-dimensional plane.
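One way such a projection might be computed is sketched below with scikit-learn's `TSNE`. The choice of scikit-learn, the helper names `select_emoji_vectors` and `project_2d`, the `is_emoji` predicate, and the perplexity and seed values are assumptions made for illustration; the original map may have been produced differently.

```python
def select_emoji_vectors(vectors_by_token, is_emoji):
    """Split a {token: vector} mapping into parallel lists of emoji labels and vectors."""
    labels = [tok for tok in vectors_by_token if is_emoji(tok)]
    return labels, [vectors_by_token[tok] for tok in labels]


def project_2d(vectors, perplexity=30.0, seed=0):
    """Project high-dimensional vectors onto a plane with t-SNE (hypothetical helper)."""
    # Imported lazily so that select_emoji_vectors() works without these packages.
    import numpy as np
    from sklearn.manifold import TSNE

    X = np.asarray(vectors, dtype=float)
    # scikit-learn requires the perplexity to be smaller than the number of samples.
    perplexity = min(perplexity, len(X) - 1)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(X)  # array of shape (n_samples, 2)
```

Each row of the returned array gives the (x, y) position of the emoji at the same index in `labels`, so the two can be passed together to a scatter plot with text annotations. t-SNE preserves local neighborhoods rather than global distances, so nearby emojis on the map are meaningful but the distances between far-apart clusters are not.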
