Removed on-screen score text from the videos for cleaner shot analysis.
Applied horizontal flips to represent different batting styles.
Divided the dataset into training, validation, and testing sets with a 70-20-10 split for effective model evaluation.
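The 70-20-10 split can be sketched in plain Python (the file names and seed are illustrative, not the project's actual paths):

```python
import random

def split_dataset(video_paths, seed=42):
    """Shuffle and split video paths into train/val/test (70/20/10)."""
    paths = list(video_paths)
    random.Random(seed).shuffle(paths)  # fixed seed for a reproducible split
    n = len(paths)
    n_train = int(0.7 * n)
    n_val = int(0.2 * n)
    train = paths[:n_train]
    val = paths[n_train:n_train + n_val]
    test = paths[n_train + n_val:]
    return train, val, test

# Hypothetical clip names for illustration.
train, val, test = split_dataset([f"clip_{i:03d}.mp4" for i in range(100)])
```

With 100 clips this yields 70/20/10 disjoint subsets, so validation and test data never leak into training.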
✯ Model Architecture
Utilizes a CNN-based feature extractor from the EfficientNet family, applied in a time-distributed block to maintain temporal information across video frames.
Incorporates a Global Average Pooling layer within the time-distributed block to condense features.
Employs GRU units to capture and analyze temporal dependencies between frames, enhancing the model's understanding of motion and sequence.
Concludes with dense layers topped with a softmax activation for classifying shots into distinct categories based on learned features.
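The architecture above can be sketched in Keras as follows. The GRU width, dense width, class count, and input resolution are assumptions for illustration; `weights=None` keeps the sketch self-contained (the project may use pretrained weights):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed sizes: 30 frames per clip (per the training section), 224x224 RGB
# frames, and a placeholder number of shot classes.
NUM_FRAMES, HEIGHT, WIDTH, NUM_CLASSES = 30, 224, 224, 4

def build_model():
    # EfficientNetB0 backbone used as a per-frame feature extractor.
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=None, input_shape=(HEIGHT, WIDTH, 3)
    )
    return tf.keras.Sequential([
        tf.keras.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, 3)),
        layers.TimeDistributed(backbone),                         # CNN features per frame
        layers.TimeDistributed(layers.GlobalAveragePooling2D()),  # condense each frame to a vector
        layers.GRU(64),                                           # temporal dependencies across frames
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),          # shot category probabilities
    ])
```

Wrapping the backbone in `TimeDistributed` applies the same CNN to every frame independently, so the GRU receives one feature vector per frame in temporal order.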
✯ Model Training
| Model            | Training Accuracy | Validation Accuracy |
|------------------|-------------------|---------------------|
| EfficientNetB0   | 100%              | 85.80%              |
| EfficientNetV2B0 | 100%              | 77.01%              |
| EfficientNetB4   | 100%              | 72.86%              |
Built three model variants, each with a distinct feature extractor head to evaluate performance variations.
Trained all models for 20 epochs using batch sizes of 16, processing 30 frames per video to capture temporal dynamics.
Utilized the Adam optimizer, configured with a learning rate of 0.001, to efficiently converge to optimal weights.
Employed sparse categorical cross-entropy as the loss function to handle class labels as integers.
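The training configuration can be sketched as follows. A tiny stand-in model replaces the video network so the snippet stays self-contained; only the optimizer, loss, batch size, and epoch count come from the text, and the class count is a placeholder:

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 4  # placeholder number of shot categories

# Stand-in model; the real network is the TimeDistributed EfficientNet + GRU
# architecture described above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Settings from the text: Adam with learning rate 0.001 and sparse
# categorical cross-entropy over integer class labels.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# The text trains for 20 epochs with batch size 16; dummy data and a single
# epoch are used here so the sketch runs quickly.
x = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=32)
history = model.fit(x, y, batch_size=16, epochs=1, verbose=0)
```

Because the labels are plain integers rather than one-hot vectors, the sparse variant of categorical cross-entropy is the natural fit.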
✯ Optimizing Performance with Genetic Algorithm-Based Hyperparameter Tuning
Each individual in the population represents a set of model hyperparameters, such as learning rate and epochs.
Individuals are assessed based on the validation accuracy of the model trained with their hyperparameters.
Selection draws small random tournaments of individuals; the best performer in each group advances to the next generation.
Crossover and mutation combine and perturb the selected individuals' hyperparameters to explore new candidate solutions and improve model performance.
The stagnation limit is set to 10, meaning the genetic algorithm halts if there's no improvement in the best fitness score after 10 consecutive generations.
The learning rate ranges between 0.0001 and 0.02, and the epochs range from 1 to 20, ensuring a comprehensive exploration of the hyperparameter space.
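The loop described above can be sketched in plain Python. The population size, tournament size, and mutation rate are assumptions; the fitness function here is a toy stand-in for training a model and measuring its validation accuracy:

```python
import random

LR_RANGE = (0.0001, 0.02)   # learning-rate bounds from the text
EPOCH_RANGE = (1, 20)       # epoch bounds from the text
STAGNATION_LIMIT = 10       # stop after 10 generations without improvement

def random_individual(rng):
    return {"lr": rng.uniform(*LR_RANGE), "epochs": rng.randint(*EPOCH_RANGE)}

def tournament_select(pop, fits, rng, k=3):
    # Pick a small random group; the fittest member wins.
    group = rng.sample(range(len(pop)), k)
    return pop[max(group, key=lambda i: fits[i])]

def crossover(a, b, rng):
    # Each hyperparameter is inherited from one parent at random.
    return {key: rng.choice([a[key], b[key]]) for key in a}

def mutate(ind, rng, rate=0.2):
    child = dict(ind)
    if rng.random() < rate:
        child["lr"] = rng.uniform(*LR_RANGE)
    if rng.random() < rate:
        child["epochs"] = rng.randint(*EPOCH_RANGE)
    return child

def evolve(fitness_fn, pop_size=10, max_gens=50, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_fit, stagnant = None, float("-inf"), 0
    for _ in range(max_gens):
        fits = [fitness_fn(ind) for ind in pop]
        gen_best = max(fits)
        if gen_best > best_fit:
            best, best_fit, stagnant = pop[fits.index(gen_best)], gen_best, 0
        else:
            stagnant += 1
            if stagnant >= STAGNATION_LIMIT:  # halt on stagnation
                break
        pop = [mutate(crossover(tournament_select(pop, fits, rng),
                                tournament_select(pop, fits, rng), rng), rng)
               for _ in range(pop_size)]
    return best, best_fit

# Toy fitness: prefer learning rates near 0.001 (stands in for val accuracy).
best, best_fit = evolve(lambda ind: -abs(ind["lr"] - 0.001), seed=1)
```

In the real tuner, `fitness_fn` would train the classifier with the candidate hyperparameters and return its validation accuracy.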
✯ Model Evaluation
| Model            | Testing Accuracy | Precision | Recall | F1 Score |
|------------------|------------------|-----------|--------|----------|
| EfficientNetB0   | 94%              | 94%       | 94%    | 94%      |
| EfficientNetV2B0 | 81%              | 82%       | 81%    | 81%      |
| EfficientNetB4   | 74%              | 75%       | 74%    | 74%      |
All three models were evaluated on the test set using accuracy, precision, recall, and F1-score.
The model with the EfficientNetB0 backbone outperformed the other two.
✯ Analyzing and Assessing Cricket Shot Similarities
Extracted features from the convolutional block of the EfficientNet backbone, mapping them into a concise vector representation.
Calculated cosine distance between feature vectors to assess similarities across different video inputs.
Utilized this distance metric to determine the degree of similarity between two cricket shot videos.
As a sanity check, identical input videos produced a 100% similarity score, validating the feature extraction and comparison pipeline.
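The comparison step can be sketched with NumPy (cosine similarity = 1 − cosine distance; the vectors here stand in for the pooled EfficientNet feature vectors):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Feeding the same video through the pipeline twice yields identical feature vectors, hence a similarity of 1.0 (the 100% score noted above), while unrelated shots score lower.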
✯ Contributors
This project is a collaborative effort between Ritik Bompilwar and Pratheesh. Equal contributions were made in developing and optimizing the model to enhance accuracy and performance.
✯ References
A. Sen, K. Deb, P. K. Dhar, and T. Koshiba, "CricShotClassify: An Approach to Classifying Batting Shots from Cricket Videos Using a Convolutional Neural Network and Gated Recurrent Unit," Sensors, vol. 21, no. 8, Art. no. 2846, 2021. [Online]. Available: https://doi.org/10.3390/s21082846
M. Tan and Q. V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," in Proc. 36th Int. Conf. Mach. Learn., Long Beach, CA, USA, 2019, vol. 97, pp. 6105–6114. [Online]. Available: http://proceedings.mlr.press/v97/tan19a.html
K. Cho et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014. [Online]. Available: https://arxiv.org/abs/1406.1078
M. Abadi et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," 2016. [Online]. Software available: https://www.tensorflow.org/
"Streamlit: The fastest way to build custom ML tools," Streamlit. Accessed: Apr. 17, 2024. [Online]. Available: https://www.streamlit.io/