Generative AI Detection

Overview

This project is a web application built using ReactJS for the frontend and Python for the backend. It aims to detect whether a given text or prompt is authored by a human or generated by an AI model. The application participates in the Voight-Kampff Generative AI Authorship Verification 2024 challenge.

Features

AI vs Human Detection: Detects whether a text is written by a human or generated by an AI.
Machine Learning Models: Utilizes various machine learning models for classification.
ReactJS Frontend: User-friendly interface built with ReactJS.
Python Backend: Backend processing with Python to handle AI detection logic.

Modules and Libraries:

1. Flask:

A lightweight Python web framework used to build web applications. It provides features for routing HTTP requests and generating responses.

Flask-CORS:

A Flask extension that handles Cross-Origin Resource Sharing (CORS), enabling the server to respond to requests from different origins.

2. PyTorch:

A deep learning library used to load and run models. It is used here to run the GPT-2 model and perform computations such as calculating Perplexity.

3. Transformers (from Hugging Face):

The GPT2LMHeadModel and GPT2TokenizerFast classes from the transformers library are used to load the GPT-2 language model and tokenizer. GPT-2 is a pre-trained language model that can generate and analyze text.

4. Regular Expressions (re):

Used for text processing tasks such as extracting valid characters or splitting sentences into lines.
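
As a rough illustration of this kind of preprocessing, the sketch below splits text into sentence-like lines and strips unexpected characters. The function names and the exact patterns are assumptions made for this example, not the project's actual code.

```python
import re


def split_into_lines(text):
    """Split text after sentence-ending punctuation, dropping empty pieces."""
    # Hypothetical pattern: break on whitespace that follows '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def keep_valid_characters(text):
    """Remove everything except word characters, whitespace and basic punctuation."""
    # Hypothetical definition of "valid"; the real project may use another set.
    return re.sub(r"[^\w\s.,!?-]", "", text)


lines = split_into_lines("Hello world. How are you? Fine!")
print(lines)
print(keep_valid_characters("a#b"))
```

Splitting first matters for the later steps, because each resulting line can be scored on its own.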

5. OrderedDict (from collections):

Used to maintain the order of results in a dictionary, making it easier to track and display the analysis steps.

Algorithm and Use Case:

The core functionality of the code revolves around using the GPT-2 model to calculate Perplexity and Burstiness of a given text. Here is how it works:

1. Perplexity:

A measure of how well a language model predicts a given text. Lower perplexity means the text is more predictable to the model, which this project treats as a sign of AI generation; higher perplexity suggests the text was written by a human. The algorithm uses the GPT-2 model to evaluate the negative log likelihood (NLL) of each token in the input text, and Perplexity is the exponential of the average NLL.
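
The relationship between NLL and Perplexity can be sketched in a few lines. This is a minimal illustration that assumes the per-token NLL values (natural log) are already available; in the project they would come from GPT-2, and the function name here is hypothetical.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean NLL) for a list of per-token negative log likelihoods."""
    if not token_nlls:
        raise ValueError("need at least one token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Made-up NLL values for three tokens; the mean is 3.0.
print(perplexity([2.0, 4.0, 3.0]))  # exp(3.0), about 20.09
```

Because the score is an exponential of the mean, a few very surprising tokens can raise the Perplexity of a short text considerably.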

2. Burstiness:

This measures the variation in Perplexity across different lines in the input text. Human writing tends to mix predictable and surprising lines, which gives higher burstiness, while AI-generated text is usually more uniform.
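
One simple way to quantify this variation is the standard deviation of the per-line Perplexity scores. The sketch below assumes that definition; the project may compute Burstiness differently (for example as the maximum per-line Perplexity), and the function name is hypothetical.

```python
import statistics


def burstiness(line_perplexities):
    """Population standard deviation of per-line perplexity (assumed definition)."""
    if len(line_perplexities) < 2:
        # A single line has no variation to measure.
        return 0.0
    return statistics.pstdev(line_perplexities)


print(burstiness([20.0, 20.0, 20.0]))  # 0.0
print(burstiness([10.0, 30.0]))        # 10.0
```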

3. Thresholding for AI vs. Human:

The code uses a threshold-based decision system to categorize the text by its Perplexity score:

Below 60: The text is likely AI-generated.
Between 60 and 80: The text is deemed "most likely AI," but more text is needed for a better judgment.
Above 80: The text is likely human-written.
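
The decision rule can be sketched as a small function. The thresholds 60 and 80 come from the description above; the handling of the exact boundary values and the wording of the middle label are assumptions made for this example.

```python
AI_THRESHOLD = 60.0     # below this, the text is labeled as likely AI-generated
HUMAN_THRESHOLD = 80.0  # above this, the text is labeled as likely human-written


def label_from_perplexity(ppl):
    """Map a Perplexity score to a label using the two thresholds."""
    if ppl < AI_THRESHOLD:
        return "AI-generated"
    if ppl <= HUMAN_THRESHOLD:
        # Assumption: both boundary values (60 and 80) fall in the middle band.
        return "Most likely AI (more text needed)"
    return "Human-written"


for score in (35.0, 70.0, 120.0):
    print(score, label_from_perplexity(score))
```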

4. API Usage:

The code exposes a POST route ('/') where a user can submit a JSON payload with a text field. The system processes the text using the GPT2PPL class, calculates Perplexity and Burstiness, and then returns a response with a label ("AI-generated" or "Human-written") and the computed values.
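
A client call against this route might look like the sketch below, which uses only the Python standard library. The base URL (Flask's default port 5000) and the helper names are assumptions; the JSON payload with a text field and the POST route '/' follow the description above.

```python
import json
import urllib.request


def build_request(text, base_url="http://localhost:5000"):
    """Build a POST request to '/' carrying a JSON body with a 'text' field."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text, base_url="http://localhost:5000"):
    """Send the text to a running backend and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# With the backend running locally, one could call:
#   result = detect("Some text to analyse.")
#   print(result)
```

The exact keys of the JSON response depend on the backend, so inspect the returned dictionary rather than relying on specific field names.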

5. Use Case:

The code can be used in scenarios where there is a need to determine whether a given text was written by a human or generated by an AI model. This can be useful for:

Content authenticity checks: To identify whether an article, blog post, or essay was written by a human or AI.
AI detection in education: To detect if students have submitted AI-generated text as their own work.
Content moderation: To flag AI-generated content in social media or online forums.

Summary:

Modules used: Flask, Flask-CORS, PyTorch, transformers, re (regular expressions), and OrderedDict.
Algorithms used: Perplexity (text predictability), Burstiness (variation in sentence predictability), and thresholding for labeling the text as AI or human-written.
Use case: AI vs. human text detection, content authenticity verification.

Installation

To set up the project locally, follow these steps:

Clone the repository:

```
git clone https://github.com/AkashKobal/Generative-AI-Detection.git
cd Generative-AI-Detection
```

Install frontend dependencies:
```
cd frontend
npm install
```

Install backend dependencies:

```
cd ../server
pip install -r requirements.txt
```

Start the backend server:
```
python app.py
```

Start the frontend server (in a separate terminal, from the frontend directory):
```
npm start
```

Usage

Open your browser and go to http://localhost:3000.
Upload a pair of texts (one human-written and one AI-generated).
Click on the "Analyse" button to see the results.

Data

The dataset used in this project consists of a collection of human-written and AI-generated texts.
Texts are analyzed to calculate metrics like Perplexity and Burstiness to determine their likely origin (AI or human).

Evaluation Metrics

Perplexity: Measures how well the language model predicts the next word in a text. Lower perplexity suggests AI generation, while higher perplexity suggests human authorship.
Burstiness: Measures the variation in perplexity across different lines of text. Higher burstiness is more typical of human writing, while AI-generated text tends to be more uniform.

Contributing

Contributions are welcome! If you have suggestions for improvements or new features, please fork the repository and submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
