Tweet Sentiment Analysis:

This is a sentiment analysis project based on a Twitter dataset available on Kaggle: https://www.kaggle.com/kazanova/sentiment140. The models are built in Python with Word2Vec and Logistic Regression, and the web application is built with Flask.

To Run the project:

To run the project, we recommend using a virtual environment. We use Pipenv to create the environment and install the requirements; see the documentation here: https://pypi.org/project/pipenv/
To create a shell inside your project directory, use:
pipenv shell
To install the requirements from the requirements.txt file, use:
pipenv install -r requirements.txt
Then start the application with:
python app.py

Data processing:

This section describes the data processing applied to the dataset to produce the model used in the application. All the details are in our Data_processing.ipynb notebook.

After importing the data, we cleaned it by removing abbreviations, stop words, and punctuation, and we lemmatized the tweets (using nltk).
We took all the words present in the dataset and trained our Word2Vec model on them with a window size of 5 and a vector size of 300.
To use the model in the application, we called the save method on the model, which created three files: one .model file and two .npy files.
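
The cleaning step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code: the `ABBREVIATIONS` and `STOP_WORDS` tables and the `clean_tweet` function are hypothetical names, lowercasing and expanding (rather than dropping) abbreviations are assumptions, and the nltk lemmatization step is left out.

```python
import string

# Hypothetical, deliberately tiny lookup tables for illustration only.
# The notebook's real abbreviation handling and nltk stop-word list are not shown here.
ABBREVIATIONS = {"u": "you", "r": "are", "idk": "i do not know"}
STOP_WORDS = {"a", "an", "the", "is", "are", "i", "you", "to", "and"}

# Translation table that deletes every ASCII punctuation character.
PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return a list of cleaned tokens for one tweet.

    Steps: lowercase (assumption), strip punctuation, expand abbreviations
    (assumption), drop stop words. Lemmatization is intentionally omitted.
    """
    text = text.lower().translate(PUNCTUATION_TABLE)
    tokens = []
    for token in text.split():
        # An abbreviation may expand to several words, so split the expansion.
        tokens.extend(ABBREVIATIONS.get(token, token).split())
    return [token for token in tokens if token not in STOP_WORDS]


cleaned = clean_tweet("U r the best, idk why!")
```

Token lists of this shape are what a Word2Vec trainer typically consumes, one list per tweet.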

Creating Logistic Regression model:

After completing our data processing, we moved on to training our Logistic Regression model. These were the steps taken:

We represented each tweet by the sum of the vectors of the words in the tweet and stored these sums in a matrix that served as our feature matrix.
We split the dataset into training and testing sets, with the training set containing 30% of the data.
We trained our Logistic Regression model using the scikit-learn library and saved it as a .pkl file using the pickle library. The results of the model on the testing set were as follows (see model_res.png):
To use the model, we simply call pickle.load to restore it and analyze the user's input.
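
The steps above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the notebook's code: `toy_vectors` is a hypothetical stand-in for the trained Word2Vec vectors, the tweets and labels are made up, `tweet_to_vector` is a hypothetical helper, and the `stratify`, `random_state`, and `max_iter` settings are assumptions chosen only to keep the toy example stable.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

VECTOR_SIZE = 300  # vector size stated in the data processing section
rng = np.random.default_rng(0)

# Hypothetical stand-in for the Word2Vec vectors: word -> 300-dimensional array.
toy_vectors = {
    word: rng.normal(size=VECTOR_SIZE)
    for word in ["good", "great", "bad", "awful", "day"]
}


def tweet_to_vector(tokens, vectors, size=VECTOR_SIZE):
    """Sum the vectors of the known words in a tweet; unknown words are skipped."""
    total = np.zeros(size)
    for token in tokens:
        if token in vectors:
            total += vectors[token]
    return total


# Made-up toy data: 1 = positive, 0 = negative.
base_tweets = [["good", "day"], ["great", "day"], ["good", "great"], ["great"],
               ["bad", "day"], ["awful", "day"], ["bad", "awful"], ["awful"]]
tweets = base_tweets * 5
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0] * 5)

# Feature matrix: one summed vector per tweet.
X = np.vstack([tweet_to_vector(tokens, toy_vectors) for tokens in tweets])

# train_size=0.3 mirrors the split described above; stratify is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, train_size=0.3, random_state=0, stratify=labels
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # mean accuracy on the held-out tweets

# Save and restore with pickle, as the application does with its .pkl file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sentiment.pkl")
    with open(path, "wb") as handle:
        pickle.dump(model, handle)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)

# Classify new input the same way: tokens -> summed vector -> predict.
user_vector = tweet_to_vector(["good", "day"], toy_vectors).reshape(1, -1)
prediction = restored.predict(user_vector)
```

Summing vectors keeps the feature size fixed at 300 regardless of tweet length, which is what lets a plain Logistic Regression consume tweets of varying length. Note that pickle.load should only be used on files you trust.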

