Added Mood & Enhanced Speech Classifications
hoangsonww committed Mar 30, 2024
1 parent da79550 commit 98dd2bc
Showing 20 changed files with 1,718 additions and 8 deletions.
139 changes: 139 additions & 0 deletions AI_Multitask_Classifiers.ipynb
@@ -0,0 +1,139 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# AI Multitask Classifiers: From Objects to Emotions\n",
"\n",
"## Introduction\n",
"This Jupyter notebook showcases the AI Multitask Classifiers project, which includes various classifiers for object detection, face detection, character recognition, and more, using frameworks like OpenCV, TensorFlow, and PyTorch.\n",
"\n",
"## Setup\n",
"First, ensure you have all required libraries installed:\n",
"```python\n",
"!pip install numpy opencv-python tensorflow torch pytesseract\n",
"```\n",
"\n",
"### Loading Models\n",
"Here, we load the models for each classification task. Replace `'model_path'` with the actual paths to your models.\n",
"\n",
"#### Object Detection (YOLO)\n",
"```python\n",
"import cv2\n",
"yolo_model = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')\n",
"```\n",
"\n",
"#### Face Detection (TensorFlow)\n",
"```python\n",
"import tensorflow as tf\n",
"tf_model = tf.keras.models.load_model('model_path')\n",
"```\n",
"\n",
"#### Mood Classification (PyTorch)\n",
"```python\n",
"import torch\n",
"torch_model = torch.load('model_path')\n",
"```\n",
"\n",
"### Running Classifications\n",
"The functions below outline how each model is used to classify an image.\n",
"\n",
"#### Object Detection with YOLO\n",
"```python\n",
"def detect_objects_yolo(image_path, model):\n",
"    # Image loading and processing for YOLO\n",
"    image = cv2.imread(image_path)  # None if the file cannot be read\n",
"    return image\n",
"\n",
"yolo_result = detect_objects_yolo('path/to/object/image.jpg', yolo_model)\n",
"```\n",
"\n",
"#### Face Detection with TensorFlow\n",
"```python\n",
"def detect_faces_tf(image_path, model):\n",
"    # Image loading and processing for TensorFlow\n",
"    predictions = None  # placeholder until preprocessing and model.predict(...) are added\n",
"    return predictions\n",
"\n",
"tf_result = detect_faces_tf('path/to/face/image.jpg', tf_model)\n",
"```\n",
"\n",
"#### Mood Classification with PyTorch\n",
"```python\n",
"def classify_mood_torch(image_path, model):\n",
"    # Image loading and processing for PyTorch\n",
"    predictions = None  # placeholder until preprocessing and the forward pass are added\n",
"    return predictions\n",
"\n",
"torch_result = classify_mood_torch('path/to/mood/image.jpg', torch_model)\n",
"```\n",
"\n",
"### Results and Analysis\n",
"Discuss and display the results.\n",
"\n",
"#### YOLO Object Detection\n",
"```python\n",
"# Displaying YOLO results\n",
"```\n",
"\n",
"#### TensorFlow Face Detection\n",
"```python\n",
"# Displaying TensorFlow results\n",
"```\n",
"\n",
"#### PyTorch Mood Classification\n",
"```python\n",
"# Displaying PyTorch results\n",
"```\n",
"\n",
"### More Classifiers\n",
"For each additional classifier mentioned in the README, add similar sections as above. For example, Character Recognition (OCR), Animal Classification, etc.\n",
"\n",
"#### Character Recognition (OCR)\n",
"```python\n",
"# Load OCR model, demonstrate character recognition\n",
"```\n",
"\n",
"#### Animal Classification\n",
"```python\n",
"# Load animal classification model, demonstrate usage\n",
"```\n",
"\n",
"### Speech Recognition\n",
"```python\n",
"# Discuss the speech recognition module, demonstrate usage\n",
"```\n",
"\n",
"## Conclusion\n",
"Summarize the capabilities and findings of using these classifiers.\n",
"\n",
"## Future Work\n",
"- Enhance the models and expand the notebook with more detailed examples and analyses.\n",
"- Incorporate feedback and new classifiers as the project evolves.\n"
],
"metadata": {
"collapsed": false
},
"id": "e6141e55235e57c3"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
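
The notebook leaves the YOLO "processing" step as a comment. One step that YOLO-style detectors typically need after the forward pass is non-maximum suppression, which drops overlapping boxes that describe the same object. The following is a minimal pure-Python sketch of that idea, not the project's implementation; the `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes overlap heavily.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In a real pipeline the boxes and scores would come from the network's output layers, and OpenCV's own `cv2.dnn.NMSBoxes` could replace the hand-written loop.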
Binary file modified Animals-Classification/animal-classi.png
18 changes: 14 additions & 4 deletions Animals-Classification/animal_classification.py
@@ -17,20 +17,26 @@ def classify_image(model, image):
     image_array = preprocess_input(image_array)

     predictions = model.predict(image_array)
-    decoded_predictions = decode_predictions(predictions, top=1)[0]
+    decoded_predictions = decode_predictions(predictions, top=5)[0]

     return decoded_predictions


 def annotate_image(image, predictions):
     draw = ImageDraw.Draw(image)
-    font = ImageFont.load_default()
+    font_size = 20 # Increase font size
+    font = ImageFont.truetype("arial.ttf", font_size)
     text_y = 10

     for i, (id, label, prob) in enumerate(predictions):
         text = f"{label} ({prob * 100:.2f}%)"
         draw.text((10, text_y), text, fill="red", font=font)
-        text_y += 20
+        text_y += font_size + 10
+
+        # Draw rectangle for the object
+        # Since we're using MobileNetV2 without specific object localization, we draw a generic box
+        draw.rectangle([10, text_y, 200, text_y + font_size], outline="red", width=2)
+        text_y += font_size + 10 # Update y coordinate for text

     return image

@@ -56,7 +62,7 @@ def process_input(source, model):
         key = cv2.waitKey(1) & 0xFF
         if key in [27, ord('q')]: # ESC or Q key to exit
             break
-        if cv2.getWindowProperty("Image", cv2.WND_PROP_VISIBLE) < 1: # Check if the window is closed
+        if cv2.getWindowProperty("Image", cv2.WND_PROP_VISIBLE) < 1:
             break

     cap.release()
@@ -71,10 +77,14 @@ def process_input(source, model):
     choice = input("Enter 'image', 'video', or 'webcam': ").lower()
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for detailed results.")
         image = Image.open(image_path)
         predictions = classify_image(model, image)
         annotated_image = annotate_image(image, predictions)
         annotated_image.show()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for detailed results.")
         process_input(source, model)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
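
Two details of the new `annotate_image` are worth spelling out. First, `text_y` advances twice per prediction (once after the label, once after the rectangle), so each of the five predictions takes `2 * (font_size + 10)` pixels of height. Second, `ImageFont.truetype("arial.ttf", ...)` raises `OSError` when the font file cannot be found, which is likely on systems that do not ship Arial. The sketch below illustrates both points; it assumes the rectangle is drawn once per label inside the loop, and the helper names `label_layout` and `load_label_font` are hypothetical, not part of the commit.

```python
def label_layout(num_labels, font_size=20, gap=10, top=10):
    """Return (text_y, box_y) pairs following the spacing used in annotate_image."""
    positions = []
    text_y = top
    for _ in range(num_labels):
        label_y = text_y
        text_y += font_size + gap   # move below the label text
        box_y = text_y              # the rectangle is drawn at this y
        text_y += font_size + gap   # move below the rectangle
        positions.append((label_y, box_y))
    return positions


def load_label_font(font_size, font_path="arial.ttf"):
    """Try a TrueType font and fall back to Pillow's built-in default font."""
    # Imported lazily so label_layout can be used without Pillow installed.
    from PIL import ImageFont
    try:
        return ImageFont.truetype(font_path, font_size)
    except OSError:
        # Raised when the font file cannot be opened on this system.
        return ImageFont.load_default()


layout = label_layout(3)
print(layout)
```

With `top=5` predictions and the defaults above, the last rectangle ends at y = 300, so small input images may not have room for every label.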
Binary file added Animals-Classification/coyotes.mp4
Binary file not shown.
Binary file added Character-Recognition/chars.jpg
Binary file added Character-Recognition/chars.mp4
Binary file not shown.
Binary file added Character-Recognition/letters.mp4
Binary file not shown.
5 changes: 4 additions & 1 deletion Character-Recognition/ocr.py
@@ -57,15 +57,18 @@ def main():

     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for detailed results.")
         image = cv2.imread(image_path)
         detected_text, annotated_image = perform_ocr_on_image(image)
         print("Detected text:", detected_text)
         cv2.imshow('Annotated Image', annotated_image)
         cv2.waitKey(0)
         cv2.destroyAllWindows()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for detailed results.")
         process_input(source)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")


 if __name__ == "__main__":
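
In the image branch above, `cv2.imread` returns `None` rather than raising when the path cannot be read, so a mistyped path only fails later inside `perform_ocr_on_image`. A small check before loading could catch this earlier. This is a sketch only; `check_image_path` and the suffix set are assumptions for illustration and do not appear in the commit.

```python
from pathlib import Path

# Assumed set of accepted extensions; adjust to what the project supports.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_image_path(raw_path):
    """Return a Path if raw_path points to an existing image file, else None."""
    path = Path(raw_path.strip()).expanduser()
    if not path.is_file():
        return None
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        return None
    return path


result = check_image_path("no/such/dir/chars.jpg")
print(result)
```

The caller can then print a clear message and return when the result is `None` instead of passing an empty image to the OCR step.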
20 changes: 20 additions & 0 deletions Dockerfile
@@ -0,0 +1,20 @@
# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory in the container
WORKDIR /usr/src/app

# Copy the current directory contents into the container at /usr/src/app
COPY . .

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make port 80 available to the world outside this container
EXPOSE 80

# Define environment variable
ENV NAME=AI-Classifiers

# Run main.py when the container launches
CMD ["python", "./main.py"]
5 changes: 5 additions & 0 deletions Flowers-Classification/flower_classification.py
@@ -81,12 +81,17 @@ def process_input(source, model):
     model = load_model()

     choice = input("Enter 'image', 'video', or 'webcam': ").lower()
+
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for the results.")
         image = Image.open(image_path)
         predictions = classify_image(model, image)
         annotated_image = annotate_image(image, predictions)
         annotated_image.show()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_input(source, model)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
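
The same choice handling (webcam maps to device index `0`, video asks for a path, anything else is rejected) is repeated in several scripts touched by this commit. A shared helper along the following lines might reduce the duplication. It is a sketch under assumptions: `resolve_source` and its `ask` parameter are hypothetical names, and raising `ValueError` is one possible way to signal an invalid choice.

```python
def resolve_source(choice, ask=input):
    """Map a menu choice to a capture source: 0 for the webcam, else a video path."""
    choice = choice.strip().lower()
    if choice == "webcam":
        return 0  # default camera index, as used in the scripts
    if choice == "video":
        return ask("Enter the video path: ")
    raise ValueError("Invalid choice. Please enter 'image', 'video', or 'webcam'.")


# The prompt function is injectable so the helper can be exercised without a terminal.
source = resolve_source("video", ask=lambda prompt: "coyotes.mp4")
print(source)
```

Passing `ask` explicitly keeps the helper testable; the scripts would call it with the default `input`.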
Original file line number Diff line number Diff line change
@@ -112,12 +112,15 @@ def main():
         "Do you want to use the webcam, classify a video, or classify an image? (webcam/video/image): ").strip().lower()

     if choice == 'webcam':
+        print("Check the popup window for the results.")
         annotate_video(None, face_net, age_net, gender_net, use_webcam=True)
     elif choice == 'video':
         video_path = input("Enter the path to the video file: ")
+        print("Check the popup window for the results.")
         annotate_video(video_path, face_net, age_net, gender_net)
     elif choice == 'image':
         image_path = input("Enter the path to the image file: ")
+        print("Check the popup window for the results.")
         classify_image(image_path, face_net, age_net, gender_net)
     else:
         print("Invalid choice. Exiting.")
15 changes: 15 additions & 0 deletions Makefile
@@ -0,0 +1,15 @@
setup:
	python3 -m venv venv
	. venv/bin/activate; pip install -r requirements.txt

run:
	python3 main.py

test:
	python3 -m unittest discover -s tests

docker-build:
	docker build -t ai-classifier .

docker-run:
	docker run -p 5000:80 ai-classifier
4 changes: 4 additions & 0 deletions Mood-Classification/mood_classifier.py
@@ -59,10 +59,14 @@ def process_input(source):

     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for the results.")
         image = cv2.imread(image_path)
         analyze_and_show(image)
         cv2.waitKey(0)
         cv2.destroyAllWindows()
     elif choice in ['video', 'webcam']:
         source = 'webcam' if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_input(source)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
Binary file modified Object-Classification/object-classi.png
3 changes: 3 additions & 0 deletions Object-Classification/object_classification.py
@@ -93,7 +93,10 @@ def main():
             print(f"{i + 1}: {label} ({prob * 100:.2f}%)")
     elif choice in ['video', 'webcam']:
         video_source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_video(model, video_source)
+    else:
+        print("Invalid choice. Please choose 'image', 'video', or 'webcam'.")


 if __name__ == "__main__":
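
The image branch of this script prints each prediction as a rank, a label, and a percentage. The formatting can be isolated into a small function, which makes it easy to reuse for console output and image annotation alike. The sketch assumes predictions arrive as `(id, label, probability)` tuples, as in the animal classifier; `format_predictions` is a hypothetical name and the sample values are made up.

```python
def format_predictions(predictions):
    """Format (id, label, probability) tuples as ranked percentage strings."""
    lines = []
    for i, (_, label, prob) in enumerate(predictions):
        lines.append(f"{i + 1}: {label} ({prob * 100:.2f}%)")
    return lines


# Made-up sample values for illustration only.
sample = [("id_a", "coyote", 0.8731), ("id_b", "red_wolf", 0.0612)]
formatted = format_predictions(sample)
for line in formatted:
    print(line)
```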
23 changes: 23 additions & 0 deletions Pipfile
@@ -0,0 +1,23 @@
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
opencv-python = "*"
tensorflow = "*"
torch = "*"
torchvision = "*"
numpy = "*"
scipy = "*"
matplotlib = "*"
pandas = "*"
scikit-learn = "*"
pytesseract = "*"
speechrecognition = "*"
pyaudio = "*"

[requires]
python_version = "3.8"
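
Several packages in this Pipfile are imported under a different name than the one they are installed with (for example `opencv-python` is imported as `cv2` and `scikit-learn` as `sklearn`). A quick check like the one below can report which packages are not importable in the current environment. It is a sketch: the mapping reflects the usual import names for these packages, and `missing_packages` is a hypothetical helper, not part of the project.

```python
import importlib.util

# Pipfile package name -> module name used in import statements.
PACKAGE_IMPORTS = {
    "opencv-python": "cv2",
    "tensorflow": "tensorflow",
    "torch": "torch",
    "torchvision": "torchvision",
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "pytesseract": "pytesseract",
    "speechrecognition": "speech_recognition",
    "pyaudio": "pyaudio",
}


def missing_packages(package_imports):
    """Return the package names whose import module cannot be found."""
    return sorted(
        package
        for package, module in package_imports.items()
        if importlib.util.find_spec(module) is None
    )


print(missing_packages(PACKAGE_IMPORTS))
```

Note that `pytesseract` is only a wrapper: it also needs the Tesseract binary installed on the system, which this check does not cover.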