A Python-based project that trains Machine Learning models to detect different hand shapes in real time, with multi-threading and Computer Vision, in order to control the PC.
Run `main.py` to use it. Wait until "CALIBRATED" is shown.
It uses a K-Nearest Neighbors model with 1 nearest neighbor, a Support Vector Machine model with a Linear kernel, and a Random Forest model with 55 estimators as classifiers in conjunction to make predictions. The mode of the three classifiers' predictions is taken as the final prediction (see the voting sketch after the label list below).
Interpretation when the hand is in the relevant area (marked by the green box):
- No Hand - `0`
- High Five - `1`
- Middle Finger - `2`
- V Sign - `3`
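For illustration, here is a minimal sketch of that majority-vote step, assuming the three models have already been loaded from their `.pkl` files (loading is shown further below) and that `x` is a flattened frame from `visualizer.py`; the helper name `ensemble_predict` is hypothetical:

```python
from collections import Counter

LABELS = {0: "No Hand", 1: "High Five", 2: "Middle Finger", 3: "V Sign"}

def ensemble_predict(knn, svm, rf, x):
    # Collect one vote per classifier for the single sample x.
    votes = [int(model.predict(x)[0]) for model in (knn, svm, rf)]
    # The mode (most common vote) is the final prediction;
    # a 1-1-1 tie falls back to the first model's vote.
    label, _ = Counter(votes).most_common(1)[0]
    return label, LABELS[label]
```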
Use 64-bit Python in case of a MemoryError.
**KNN Model:**
- Can detect 3 different Hand Gestures and the lack thereof. Train the Model for more.
- Run `main.py` to use, with the Model stored in `KNN.pkl`.
- Use a smaller number of images during training (with `Detector(value)` where value ~ 80) to resolve out-of-memory issues.
**SVM Model:**
- Can detect 3 different Hand Gestures and the lack thereof. Train the Model for more.
- Run `main.py` to use, with the Model stored in `svm_lin.pkl`.
- Use a smaller number of images during training (with `Detector(value)` where value ~ 80) to resolve out-of-memory issues.
**Random Forest Model:**
- Can detect 3 different Hand Gestures and the lack thereof. Train the Model for more.
- Run `main.py` to use, with the Model stored in `rf.pkl`.
**Logistic Regression Model:**
- Perfectly predicts a hand doing the High Five gesture, or the lack thereof.
- Run `main_logistic_regress.py` to use.
- Model parameters are stored in `lregression_parameters.npy`.
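As a rough sketch of how those saved parameters could be used outside `main_logistic_regress.py` (the flatten-and-scale step and the shapes of `W` and `b` are assumptions, not confirmed by the repo):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# lregression_parameters.npy holds the [W, b] pair as a length-2 array.
W, b = np.load("lregression_parameters.npy", allow_pickle=True)

frame = np.zeros((240, 255, 3), dtype=np.uint8)  # placeholder 240x255x3 frame
x = frame.reshape(-1, 1) / 255.0                 # assumed flatten + 0-1 scaling

# Linear step followed by Sigmoid: probability that a High Five (1) is present.
p = sigmoid(np.asarray(W).reshape(1, -1) @ x + b)
print(1 if p.item() > 0.5 else 0)                # 1 - Hand, 0 - No Hand
```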
- `dataset_manips.py`: Python script containing functions to build new datasets, clear existing datasets, arrange existing dataset files for more serialized naming, etc., in the `Datasets` folder.
- `directgameinp.py`: Best solution to translate the Models' predictions into useful input. Its functions can send input to games (or other applications) using functions like KeyDown() and KeyUp(). It can also be replaced with the PyDirectInput Python Library (a minimal sketch follows this file list).
  Note: Input might still not be noticed by Games incorporating DirectInput protection. No working alternative has been found for them, other than programming a custom Keyboard Driver or a Virtual Controller simulation with keyboard key binding. Feel free to suggest alternatives.
- `gesture_ML.py`: Trains a KNN model and/or SVM model and/or Random Forest model, depending on the functions called, with the image Datasets present in the different folders of the `Datasets` folder, enumerated in the order they were trained from. It then saves the resulting models as `KNN.pkl`, `svm_lin.pkl` and/or `rf.pkl`.
- `gesture_ML_logistic_regress.py`: Trains a Logistic Regression model based on the image Datasets present in the different folders of the `Datasets` folder, enumerated in the order they were trained from. For the best possible accuracy using Sigmoid, it only supports 2 classes, enumerated as `0` and `1` based on the order of training. It then saves the resulting model's parameters as `lregression_parameters.npy`.
- `main.py`: Incorporates the pre-trained SVM, KNN and Random Forest Models together to detect No Hand (`0`), High Five (`1`), Middle Finger (`2`) and V Sign (`3`) with OpenCV using the device camera.
- `main_logistic_regress.py`: Incorporates the pre-trained Logistic Regression Model parameters with the Sigmoid function to detect No Hand (`0`) or Hand (`1`) with OpenCV using the device camera.
- `presskey.ahk` / `presskey.exe`: Alternative solution to translate the Models' predictions into useful input, using AutoHotkey scripting. AutoHotkey may be downloaded to edit the `.ahk` script as needed; then simply call the AutoHotkey script (with AutoHotkey installed) or the executable from `main.py` or `main_logistic_regress.py`. Using this is equivalent to simply using Python Libraries like `pynput`, `pyautogui` or `keyboard`, which would be much simpler.
  Note: This method will fail in almost all DirectX-based and DirectInput Games and Applications.
- `visualizer.py`: Heart of the software, called by all other scripts to isolate the hand from the background through OpenCV contour detection using the device camera, then use it to build datasets (which are then used to train models) or classify gestures.
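For example, a minimal sketch of the PyDirectInput route (the gesture-to-key bindings here are made up for illustration; the real ones live in `directgameinp.py` / `presskey.ahk`):

```python
import pydirectinput  # pip install pydirectinput

# Hypothetical bindings from predicted labels to keys.
GESTURE_KEYS = {1: "w", 2: "s", 3: "space"}  # High Five, Middle Finger, V Sign

def send_input(label):
    key = GESTURE_KEYS.get(label)
    if key is None:                  # 0 - No Hand: release any held keys
        for k in GESTURE_KEYS.values():
            pydirectinput.keyUp(k)
        return
    pydirectinput.keyDown(key)       # scan-code equivalents of KeyDown()
    pydirectinput.keyUp(key)         # and KeyUp()
```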
- `KNN.pkl`: K-Nearest Neighbors Classifier Model Object stored as a Binary joblib Pickle Dump. Use `joblib.load` to load it into your scripts and use its `predict()` method to classify 240x255x3 Black (0) and White (255) images into the below-mentioned classes.
- `lregression_parameters.npy`: Contains the W and b parameters for a Sigmoid-based Logistic Regression model, to accurately predict whether a High Five gesture is present in a Black (0) and White (255) 240x255x3 Image. It is stored as a Numpy save Dump using `numpy.save`. Use `numpy.load` with the `allow_pickle=True` parameter to load the parameters into your scripts as a length-2 numpy array. Feed the resulting Linear equation, formed from X as a suitable image, into a Sigmoid function for classification.
- `rf.pkl`: Random Forest Classifier Model Object stored as a Binary joblib Pickle Dump. Use `joblib.load` to load it into your scripts and use its `predict()` method to classify 240x255x3 Black (0) and White (255) images into the below-mentioned classes.
- `svm_lin.pkl`: Linear kernel Support Vector Machine Model Object stored as a Binary joblib Pickle Dump. Use `joblib.load` to load it into your scripts and use its `predict()` method to classify 240x255x3 Black (0) and White (255) images into the below-mentioned classes.
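A minimal loading sketch, assuming the image is flattened into a single feature row (the exact preprocessing is defined by `visualizer.py` and is assumed here):

```python
import joblib
import numpy as np

model = joblib.load("rf.pkl")  # or "KNN.pkl" / "svm_lin.pkl"

frame = np.zeros((240, 255, 3), dtype=np.uint8)  # placeholder 240x255x3 frame
x = frame.reshape(1, -1)                         # assumed flattening to one row

print(model.predict(x)[0])  # 0-3, per the class list below
```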
All Image Datasets stored in the `Datasets` folder are self-created using `dataset_manips.py`, incorporating `visualizer.py`. They currently contain 4 different types of Hand Gestures ready to train models on (a training sketch follows this list):
- No hand
- High Five
- Middle Finger
- V Sign
- Ok Sign (Not trained by models)
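A hedged sketch of the training step `gesture_ML.py` describes, assuming `X` (flattened dataset images) and `y` (class indices in training order) have already been built from the `Datasets` folder:

```python
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def train_and_save(X, y):
    # Classifier settings as described above:
    # 1 nearest neighbor, Linear kernel, 55 estimators.
    models = {
        "KNN.pkl": KNeighborsClassifier(n_neighbors=1),
        "svm_lin.pkl": SVC(kernel="linear"),
        "rf.pkl": RandomForestClassifier(n_estimators=55),
    }
    for path, model in models.items():
        model.fit(X, y)
        joblib.dump(model, path)
```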
The `Examples` folder contains two video examples of `main.py` in action, in a Game and changing songs in Spotify, all through key presses:
- Changing songs in Spotify
- Usage in Games (Game Used: Orcs Must Die 2):
  - Substitute Mouse Input
  - Substitute Keyboard and Mouse Input