A Python-based project that trains Machine Learning models to detect different hand shapes in real time, with multi-threading and Computer Vision, to control the PC.
Run `main.py` to use it. Wait until "CALIBRATED" is shown.
It uses a K-Nearest Neighbors model with 1 nearest neighbor, a Support Vector Machine model with a linear kernel, and a Random Forest model with 55 estimators as classifiers in conjunction to make predictions. The mode of the three classifiers' predictions is taken as the final prediction.
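A minimal sketch of this majority vote, assuming the three saved models described below (`KNN.pkl`, `svm_lin.pkl`, `rf.pkl`) were trained on flattened pixel vectors:

```python
import joblib
from collections import Counter

# load the three pre-trained classifiers (file names as described below)
knn = joblib.load("KNN.pkl")
svm = joblib.load("svm_lin.pkl")
rf = joblib.load("rf.pkl")

def predict_gesture(img):
    """Return the mode of the three models' predictions for one image."""
    x = img.reshape(1, -1)  # assumption: models were fit on flattened pixels
    preds = [int(m.predict(x)[0]) for m in (knn, svm, rf)]
    return Counter(preds).most_common(1)[0][0]
```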
Interpretation when the hand is in the relevant area (marked by the green box):
- No Hand: `0`
- High Five: `1`
- Middle Finger: `2`
- V Sign: `3`
Use 64-bit Python in case of a MemoryError.
- Can detect 3 different hand gestures and the lack thereof. Train the model for more.
- Run `main.py` to use, with the model stored in `KNN.pkl`.
- Use fewer images during training (with `Detector(value)` where value ~ 80) to resolve out-of-memory issues.
- Can detect 3 different hand gestures and the lack thereof. Train the model for more.
- Run `main.py` to use, with the model stored in `svm_lin.pkl`.
- Use fewer images during training (with `Detector(value)` where value ~ 80) to resolve out-of-memory issues.
- Can detect 3 different hand gestures and the lack thereof. Train the model for more.
- Run `main.py` to use, with the model stored in `rf.pkl`.
- Perfectly predicts a hand doing the High Five gesture, or the lack thereof.
- Run `main_logistic_regress.py` to use.
- Model parameters stored in `lregression_parameters.npy`.
`dataset_manips.py`:
Python script containing functions to build new datasets, clear existing datasets, arrange existing dataset files for more serialized naming, etc., in the `Datasets` folder.

`directgameinp.py`:
Best solution to translate the models' predictions into useful input. Its functions can be used to send input to games (or other applications) via functions like KeyDown() and KeyUp(). This can also be replaced with the `PyDirectInput` Python library.
Note: Input might still not be noticed by games incorporating DirectInput protection. I haven't found a working alternative for those, other than programming a custom keyboard driver or simulating a virtual controller with keyboard key bindings. Feel free to suggest alternatives.
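For example, a short sketch with `PyDirectInput`; the key names here are illustrative, not the project's actual bindings:

```python
import pydirectinput

# hold and release a key, e.g. while a gesture is being detected
pydirectinput.keyDown('w')
pydirectinput.keyUp('w')

# or a single tap
pydirectinput.press('space')
```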
`gesture_ML.py`:
Trains a KNN model and/or SVM model and/or Random Forest model, depending on the functions called, with the image datasets present in the different folders in the `Datasets` folder, enumerated based on the order they have been trained from. Then, it saves the resulting model as `KNN.pkl` and/or `svm_lin.pkl` and/or `rf.pkl`.
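A minimal sketch of what this training and saving could look like with scikit-learn and joblib, using toy stand-in data in place of the real `Datasets` images:

```python
import joblib
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# toy stand-ins for flattened 240x255x3 black/white images and class labels
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(40, 240 * 255 * 3)) * 255
y = rng.integers(0, 4, size=40)

# hyperparameters as described above: 1 neighbor, linear kernel, 55 estimators
knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
svm = SVC(kernel="linear").fit(X, y)
rf = RandomForestClassifier(n_estimators=55).fit(X, y)

joblib.dump(knn, "KNN.pkl")
joblib.dump(svm, "svm_lin.pkl")
joblib.dump(rf, "rf.pkl")
```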
`gesture_ML_logistic_regress.py`:
Trains a Logistic Regression model with the image datasets present in the different folders in the `Datasets` folder, enumerated based on the order they have been trained from. For the best possible accuracy using the Sigmoid, it only supports 2 possible classes, enumerated as `0` and `1` based on the order of training. Then, it saves the resulting model's parameters as `lregression_parameters.npy`.
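A minimal sketch of sigmoid-based logistic regression trained with plain batch gradient descent and saved in that two-parameter format; the toy data, learning rate and iteration count are assumptions, not the script's actual settings:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy stand-ins for flattened images (rows) and binary labels
rng = np.random.default_rng(0)
X = rng.random((40, 240 * 255 * 3))
y = rng.integers(0, 2, size=40)

W = np.zeros(X.shape[1])
b = 0.0
lr = 0.1
for _ in range(100):                      # plain batch gradient descent
    p = sigmoid(X @ W + b)
    W -= lr * X.T @ (p - y) / len(y)
    b -= lr * float(np.mean(p - y))

# a length-2 array holding W and b, loadable with allow_pickle=True
np.save("lregression_parameters.npy", np.array([W, b], dtype=object))
```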
`main.py`:
Incorporates the pre-trained SVM, KNN and Random Forest models together to detect No Hand `0`, High Five `1`, Middle Finger `2` and V Sign `3` with OpenCV using the device camera.

`main_logistic_regress.py`:
Incorporates the pre-trained Logistic Regression model parameters with the Sigmoid function to detect No Hand `0` or Hand `1` with OpenCV using the device camera.

`presskey.ahk` or `presskey.exe`:
Alternative solution to translate the models' predictions into useful input. It uses AutoHotkey scripting to accomplish this. AutoHotkey may be downloaded to edit the .ahk script as needed. Then simply call the AutoHotkey script (with AutoHotkey installed) or the executable from `main.py` or `main_logistic_regress.py`.
Using this is equivalent to simply using Python libraries like `pynput`, `pyautogui` or `keyboard`, which would be much simpler (see the sketch below).
Note: This method will fail in almost all DirectX-based and DirectInput games and applications.
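For instance, a minimal `pynput` equivalent (the key chosen here is just an example):

```python
from pynput.keyboard import Controller, Key

kb = Controller()
kb.press(Key.right)    # e.g. mapped to a detected gesture
kb.release(Key.right)
```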
`visualizer.py`:
Heart of the software, called by all other scripts. It isolates the hand from the background through OpenCV contour detection using the device camera, then uses the result either to build datasets (which will then be used to train models) or to classify gestures.
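A minimal sketch of that kind of contour-based isolation; the region coordinates and thresholding choices are assumptions, not necessarily the script's exact pipeline:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("camera read failed")

roi = frame[100:340, 100:355]  # hypothetical stand-in for the green-box region
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
_, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# keep only the largest contour, assumed to be the hand
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
mask = np.zeros_like(thresh)
if contours:
    hand = max(contours, key=cv2.contourArea)
    cv2.drawContours(mask, [hand], -1, 255, cv2.FILLED)  # black(0)/white(255) image
```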
`KNN.pkl`:
K-Nearest Neighbors Classifier model object stored as a binary joblib pickle dump. Use `joblib.load` to load it into your scripts and use its predict() method to classify 240x255x3 Black(0) and White(255) images into the classes mentioned below.

`lregression_parameters.npy`:
Contains the W and b parameters for a Sigmoid-based Logistic Regression model, to accurately predict whether a High Five gesture is present in a Black(0) and White(255) 240x255x3 image. It has been stored as a NumPy save dump using `numpy.save`. Use `numpy.load` with the `allow_pickle=True` parameter to load the parameters into your scripts as a length-2 NumPy array. Feed the linear equation formed from a suitable image X into a Sigmoid function for classification.
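A minimal sketch of that usage; the shapes of W and b and the flattening of the image are assumptions:

```python
import numpy as np

W, b = np.load("lregression_parameters.npy", allow_pickle=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

img = np.zeros((240, 255, 3))                 # stand-in for a black/white input image
x = img.reshape(-1)                           # assumption: flattened pixel vector
p = sigmoid(float(np.dot(W, x)) + float(b))   # linear equation fed into the sigmoid
label = int(p > 0.5)                          # 1 = High Five present, 0 = absent
```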
`rf.pkl`:
Random Forest Classifier model object stored as a binary joblib pickle dump. Use `joblib.load` to load it into your scripts and use its predict() method to classify 240x255x3 Black(0) and White(255) images into the classes mentioned below.

`svm_lin.pkl`:
Linear-kernel Support Vector Machine model object stored as a binary joblib pickle dump. Use `joblib.load` to load it into your scripts and use its predict() method to classify 240x255x3 Black(0) and White(255) images into the classes mentioned below.
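A minimal sketch of loading one of these pickles and classifying a single image; the file path is hypothetical and the flattening step is an assumption about how the models were trained:

```python
import cv2
import joblib

model = joblib.load("rf.pkl")               # same pattern for KNN.pkl and svm_lin.pkl
img = cv2.imread("some_gesture_image.png")  # hypothetical 240x255x3 black/white image
pred = model.predict(img.reshape(1, -1))[0]
print(pred)                                 # 0-3 per the class list below
```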
All image datasets stored in the `Datasets` folder are self-created using `dataset_manips.py`, incorporating `visualizer.py`. They currently contain 4 different types of hand gestures that are ready to train models on:
- No hand
- High Five
- Middle Finger
- V Sign
- Ok Sign (Not trained by models)
The `Examples` folder contains two video examples of `main.py` in action, in a game and for Spotify song changing, all through key presses:
- Changing songs in Spotify: Substitute Mouse Input
- Usage in Games (Game Used: Orcs Must Die 2): Substitute Keyboard and Mouse Input