Merge pull request #2 from hoangsonww/bug-fixes

Bug Fixes & Classifiers Enhancements
hoangsonww · Mar 30, 2024 · 7efca9b
2 parents fb53090 + 98dd2bc
commit 7efca9b
Showing 23 changed files with 1,754 additions and 25 deletions.
diff --git a/AI_Multitask_Classifiers.ipynb b/AI_Multitask_Classifiers.ipynb
@@ -0,0 +1,139 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# AI Multitask Classifiers: From Objects to Emotions\n",
+    "\n",
+    "## Introduction\n",
+    "This Jupyter notebook showcases the AI Multitask Classifiers project, which includes various classifiers for object detection, face detection, character recognition, and more, using frameworks like OpenCV, TensorFlow, and PyTorch.\n",
+    "\n",
+    "## Setup\n",
+    "First, ensure you have all required libraries installed:\n",
+    "\n",
+    "```python\n",
+    "!pip install numpy opencv-python tensorflow torch pytesseract\n",
+    "```\n",
+    "\n",
+    "### Loading Models\n",
+    "Here, we load the models for each classification task. Replace `'model_path'` with the actual paths to your models.\n",
+    "\n",
+    "#### Object Detection (YOLO)\n",
+    "```python\n",
+    "import cv2\n",
+    "yolo_model = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')\n",
+    "```\n",
+    "\n",
+    "#### Face Detection (TensorFlow)\n",
+    "```python\n",
+    "import tensorflow as tf\n",
+    "tf_model = tf.keras.models.load_model('model_path')\n",
+    "```\n",
+    "\n",
+    "#### Mood Classification (PyTorch)\n",
+    "```python\n",
+    "import torch\n",
+    "torch_model = torch.load('model_path')\n",
+    "```\n",
+    "\n",
+    "### Running Classifications\n",
+    "The stubs below outline how each model is called on an image. Fill in the image loading and preprocessing steps for your own data.\n",
+    "\n",
+    "#### Object Detection with YOLO\n",
+    "```python\n",
+    "def detect_objects_yolo(image_path, model):\n",
+    "    image = cv2.imread(image_path)  # TODO: build a blob, run the model, and draw detections\n",
+    "    return image\n",
+    "\n",
+    "yolo_result = detect_objects_yolo('path/to/object/image.jpg', yolo_model)\n",
+    "```\n",
+    "\n",
+    "#### Face Detection with TensorFlow\n",
+    "```python\n",
+    "def detect_faces_tf(image_path, model):\n",
+    "    predictions = None  # TODO: load and preprocess the image, then run the TensorFlow model\n",
+    "    return predictions\n",
+    "\n",
+    "tf_result = detect_faces_tf('path/to/face/image.jpg', tf_model)\n",
+    "```\n",
+    "\n",
+    "#### Mood Classification with PyTorch\n",
+    "```python\n",
+    "def classify_mood_torch(image_path, model):\n",
+    "    predictions = None  # TODO: load and preprocess the image, then run the PyTorch model\n",
+    "    return predictions\n",
+    "\n",
+    "torch_result = classify_mood_torch('path/to/mood/image.jpg', torch_model)\n",
+    "```\n",
+    "\n",
+    "### Results and Analysis\n",
+    "Display the output of each model and discuss what it shows.\n",
+    "\n",
+    "#### YOLO Object Detection\n",
+    "```python\n",
+    "# Displaying YOLO results\n",
+    "```\n",
+    "\n",
+    "#### TensorFlow Face Detection\n",
+    "```python\n",
+    "# Displaying TensorFlow results\n",
+    "```\n",
+    "\n",
+    "#### PyTorch Mood Classification\n",
+    "```python\n",
+    "# Displaying PyTorch results\n",
+    "```\n",
+    "\n",
+    "### More Classifiers\n",
+    "Add a similar section for each additional classifier listed in the README, such as Character Recognition (OCR) and Animal Classification.\n",
+    "\n",
+    "#### Character Recognition (OCR)\n",
+    "```python\n",
+    "# Load OCR model, demonstrate character recognition\n",
+    "```\n",
+    "\n",
+    "#### Animal Classification\n",
+    "```python\n",
+    "# Load animal classification model, demonstrate usage\n",
+    "```\n",
+    "\n",
+    "### Speech Recognition\n",
+    "```python\n",
+    "# Discuss the speech recognition module, demonstrate usage\n",
+    "```\n",
+    "\n",
+    "## Conclusion\n",
+    "Summarize the capabilities and findings of using these classifiers.\n",
+    "\n",
+    "## Future Work\n",
+    "- Enhance the models and expand the notebook with more detailed examples and analyses.\n",
+    "- Incorporate feedback and new classifiers as the project evolves.\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "e6141e55235e57c3"
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 2
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython2",
+   "version": "2.7.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
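
Review note on the notebook's YOLO stub: the "image loading and processing" step is left open, so here is a minimal sketch of one piece of it, decoding a single output row into a pixel-space box. It assumes the usual Darknet YOLOv3 row layout produced through `cv2.dnn` (`[center_x, center_y, width, height, objectness, class scores...]`, with box values normalized to 0..1). The name `decode_yolo_row`, its parameters, and the 0.5 threshold are hypothetical and not part of this PR.

```python
from typing import Optional, Sequence, Tuple

# (class_id, confidence, x, y, width, height), all box values in pixels
Box = Tuple[int, float, int, int, int, int]


def decode_yolo_row(
    row: Sequence[float],
    image_width: int,
    image_height: int,
    conf_threshold: float = 0.5,
) -> Optional[Box]:
    """Convert one YOLO output row to a pixel-space box.

    Assumed layout (not confirmed by the PR):
    [center_x, center_y, width, height, objectness, class_score_0, ...]
    Returns None when the best class score is below the threshold.
    """
    class_scores = row[5:]
    if len(class_scores) == 0:
        return None

    # Pick the class with the highest score and use that score as confidence.
    class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
    confidence = float(class_scores[class_id])
    if confidence < conf_threshold:
        return None

    # Scale the normalized box to pixels, then convert center to top-left.
    box_w = row[2] * image_width
    box_h = row[3] * image_height
    x = round(row[0] * image_width - box_w / 2)
    y = round(row[1] * image_height - box_h / 2)
    return class_id, confidence, x, y, round(box_w), round(box_h)
```

In a full pipeline the rows would typically come from `cv2.dnn.blobFromImage` followed by the network's `forward` call, and overlapping boxes would still need non-maximum suppression before drawing.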
diff --git a/Animals-Classification/animal-classi.png b/Animals-Classification/animal-classi.png
diff --git a/Animals-Classification/animal_classification.py b/Animals-Classification/animal_classification.py
@@ -17,20 +17,26 @@ def classify_image(model, image):
     image_array = preprocess_input(image_array)
 
     predictions = model.predict(image_array)
-    decoded_predictions = decode_predictions(predictions, top=1)[0]
+    decoded_predictions = decode_predictions(predictions, top=5)[0]
 
     return decoded_predictions
 
 
 def annotate_image(image, predictions):
     draw = ImageDraw.Draw(image)
-    font = ImageFont.load_default()
+    font_size = 20  # Increase font size
+    font = ImageFont.truetype("arial.ttf", font_size)
     text_y = 10
 
     for i, (id, label, prob) in enumerate(predictions):
         text = f"{label} ({prob * 100:.2f}%)"
         draw.text((10, text_y), text, fill="red", font=font)
-        text_y += 20
+        text_y += font_size + 10
+
+        # Draw rectangle for the object
+        # Since we're using MobileNetV2 without specific object localization, we draw a generic box
+        draw.rectangle([10, text_y, 200, text_y + font_size], outline="red", width=2)
+        text_y += font_size + 10  # Update y coordinate for text
 
     return image
 
@@ -56,7 +62,7 @@ def process_input(source, model):
         key = cv2.waitKey(1) & 0xFF
         if key in [27, ord('q')]:  # ESC or Q key to exit
             break
-        if cv2.getWindowProperty("Image", cv2.WND_PROP_VISIBLE) < 1:  # Check if the window is closed
+        if cv2.getWindowProperty("Image", cv2.WND_PROP_VISIBLE) < 1:
             break
 
     cap.release()
@@ -71,10 +77,14 @@ def process_input(source, model):
     choice = input("Enter 'image', 'video', or 'webcam': ").lower()
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for detailed results.")
         image = Image.open(image_path)
         predictions = classify_image(model, image)
         annotated_image = annotate_image(image, predictions)
         annotated_image.show()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for detailed results.")
         process_input(source, model)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
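
Review note on `annotate_image`: `ImageFont.truetype("arial.ttf", font_size)` raises `OSError` when the font file cannot be found, which is likely on images such as `python:3.8-slim` that ship without Arial. A possible fallback is sketched below. The helper name `load_font_with_fallback` and the candidate list are hypothetical; the loaders are passed in as arguments so the sketch has no dependency of its own.

```python
from typing import Any, Callable, Iterable


def load_font_with_fallback(
    size: int,
    candidates: Iterable[str],
    truetype: Callable[[str, int], Any],
    load_default: Callable[[], Any],
) -> Any:
    """Return the first TrueType font that loads, else the default font."""
    for name in candidates:
        try:
            return truetype(name, size)
        except OSError:
            # Pillow raises OSError when the font file cannot be opened.
            continue
    return load_default()


# Intended use with Pillow (candidate font names are assumptions):
#   from PIL import ImageFont
#   font = load_font_with_fallback(
#       20, ["arial.ttf", "DejaVuSans.ttf"],
#       ImageFont.truetype, ImageFont.load_default,
#   )
```

When the fallback is taken, the default font may render smaller than `font_size`, so the `text_y` spacing could look loose.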
diff --git a/Animals-Classification/coyotes.mp4 b/Animals-Classification/coyotes.mp4
diff --git a/Character-Recognition/chars.jpg b/Character-Recognition/chars.jpg
diff --git a/Character-Recognition/chars.mp4 b/Character-Recognition/chars.mp4
diff --git a/Character-Recognition/letters.mp4 b/Character-Recognition/letters.mp4
diff --git a/Character-Recognition/ocr.py b/Character-Recognition/ocr.py
@@ -57,15 +57,18 @@ def main():
 
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for detailed results.")
         image = cv2.imread(image_path)
         detected_text, annotated_image = perform_ocr_on_image(image)
-        print("Detected text:", detected_text)
         cv2.imshow('Annotated Image', annotated_image)
         cv2.waitKey(0)
         cv2.destroyAllWindows()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for detailed results.")
         process_input(source)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
 
 
 if __name__ == "__main__":

diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,20 @@
+# Use an official Python runtime as a parent image
+FROM python:3.8-slim
+
+# Set the working directory in the container
+WORKDIR /usr/src/app
+
+# Copy the current directory contents into the container at /usr/src/app
+COPY . .
+
+# Install any needed packages specified in requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Make port 80 available to the world outside this container
+EXPOSE 80
+
+# Define environment variable
+ENV NAME=AI-Classifiers
+
+# Run main.py when the container launches
+CMD ["python", "./main.py"]
diff --git a/Flowers-Classification/flower_classification.py b/Flowers-Classification/flower_classification.py
@@ -81,12 +81,17 @@ def process_input(source, model):
     model = load_model()
 
     choice = input("Enter 'image', 'video', or 'webcam': ").lower()
+
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for the results.")
         image = Image.open(image_path)
         predictions = classify_image(model, image)
         annotated_image = annotate_image(image, predictions)
         annotated_image.show()
     elif choice in ['video', 'webcam']:
         source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_input(source, model)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
diff --git a/...ce-Classification/faces-classification.py → ...ce-Classification/faces_classification.py b/...ce-Classification/faces-classification.py → ...ce-Classification/faces_classification.py
@@ -112,12 +112,15 @@ def main():
         "Do you want to use the webcam, classify a video, or classify an image? (webcam/video/image): ").strip().lower()
 
     if choice == 'webcam':
+        print("Check the popup window for the results.")
         annotate_video(None, face_net, age_net, gender_net, use_webcam=True)
     elif choice == 'video':
         video_path = input("Enter the path to the video file: ")
+        print("Check the popup window for the results.")
         annotate_video(video_path, face_net, age_net, gender_net)
     elif choice == 'image':
         image_path = input("Enter the path to the image file: ")
+        print("Check the popup window for the results.")
         classify_image(image_path, face_net, age_net, gender_net)
     else:
         print("Invalid choice. Exiting.")

diff --git a/Makefile b/Makefile
@@ -0,0 +1,15 @@
+setup:
+	python3 -m venv venv
+	. venv/bin/activate && pip install -r requirements.txt
+
+run:
+	python3 main.py
+
+test:
+	python3 -m unittest discover -s tests
+
+docker-build:
+	docker build -t ai-classifier .
+
+docker-run:
+	docker run -p 5000:80 ai-classifier
diff --git a/Mood-Classification/mood_classifier.py b/Mood-Classification/mood_classifier.py
@@ -59,10 +59,14 @@ def process_input(source):
 
     if choice == 'image':
         image_path = input("Enter the image path: ")
+        print("Check the popup window for the results.")
         image = cv2.imread(image_path)
         analyze_and_show(image)
         cv2.waitKey(0)
         cv2.destroyAllWindows()
     elif choice in ['video', 'webcam']:
         source = 'webcam' if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_input(source)
+    else:
+        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
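
Review note: the same choice-handling branch (prompt, `'image'` / `'video'` / `'webcam'`, then an "Invalid choice" message) is now repeated across several scripts in this PR. One way it could be factored out is sketched below. `resolve_choice`, `VALID_CHOICES`, and the `ask` parameter are hypothetical names, not something the PR defines.

```python
from typing import Callable, Optional, Tuple, Union

VALID_CHOICES = ("image", "video", "webcam")


def resolve_choice(
    raw_choice: str,
    ask: Callable[[str], str] = input,
) -> Optional[Tuple[str, Union[int, str]]]:
    """Normalize the user's choice and return (choice, source), or None.

    For 'webcam' the source is device index 0; for 'image' and 'video'
    the path is requested through `ask` (defaults to the built-in input).
    """
    choice = raw_choice.strip().lower()
    if choice not in VALID_CHOICES:
        print("Invalid choice. Please enter 'image', 'video', or 'webcam'.")
        return None
    if choice == "webcam":
        return choice, 0
    prompt = "Enter the image path: " if choice == "image" else "Enter the video path: "
    return choice, ask(prompt)
```

Note that `mood_classifier.py` passes the string `'webcam'` rather than `0` as its source, so that script would need to map the returned value before calling `process_input`.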
diff --git a/Object-Classification/balls.mp4 b/Object-Classification/balls.mp4
diff --git a/Object-Classification/object-classi.png b/Object-Classification/object-classi.png
diff --git a/Object-Classification/object_classification.py b/Object-Classification/object_classification.py
@@ -81,6 +81,9 @@ def main():
     choice = input("Choose 'image', 'video', or 'webcam': ").lower()
     if choice == 'image':
         image_path = input("Enter the image path: ")
+
+        print("Check the popup window for the results.")
+
         results, image = classify_image(model, image_path)
         annotated_image = annotate_image(image, results)
         annotated_image.show()
@@ -90,7 +93,10 @@ def main():
             print(f"{i + 1}: {label} ({prob * 100:.2f}%)")
     elif choice in ['video', 'webcam']:
         video_source = 0 if choice == 'webcam' else input("Enter the video path: ")
+        print("Check the popup window for the results.")
         process_video(model, video_source)
+    else:
+        print("Invalid choice. Please choose 'image', 'video', or 'webcam'.")
 
 
 if __name__ == "__main__":

diff --git a/Pipfile b/Pipfile
@@ -0,0 +1,23 @@
+[[source]]
+name = "pypi"
+url = "https://pypi.org/simple"
+verify_ssl = true
+
+[dev-packages]
+
+[packages]
+opencv-python = "*"
+tensorflow = "*"
+torch = "*"
+torchvision = "*"
+numpy = "*"
+scipy = "*"
+matplotlib = "*"
+pandas = "*"
+scikit-learn = "*"
+pytesseract = "*"
+speechrecognition = "*"
+pyaudio = "*"
+
+[requires]
+python_version = "3.8"