This project implements an audio emotion recognition system using a Random Forest Classifier to classify audio files based on their emotional content. The system extracts features from audio files and builds a model that can predict the emotion represented in new audio samples.
- Audio Feature Extraction: Computes Mel-frequency cepstral coefficients (MFCCs) from each audio file to capture its spectral and timbral characteristics.
- Random Forest Classifier: Uses an ensemble of decision trees for classification, which is robust to noisy features and needs little tuning to perform well.
- Model Evaluation: Produces a classification report and a confusion matrix to assess model performance (a minimal end-to-end sketch follows this list).
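The pipeline below is a minimal sketch of how these pieces could fit together, assuming `librosa` for audio loading and MFCC extraction and scikit-learn's `RandomForestClassifier`. The data folder, filename parsing, and parameter values are illustrative, not the project's exact code.

```python
import glob
import os

import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def extract_mfcc(path, n_mfcc=40):
    """Load one audio file and return its mean MFCC vector as a fixed-length feature."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.mean(mfcc, axis=1)


# Build the feature matrix and label vector from a folder of .wav files
# named like sound_happy.wav (emotion assumed to be the last underscore-separated token).
paths = glob.glob("data/*.wav")
X = np.array([extract_mfcc(p) for p in paths])
y = [os.path.splitext(os.path.basename(p))[0].split("_")[-1] for p in paths]

# Hold out a test set, then fit the Random Forest on the training portion.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
```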
The dataset should consist of `.wav` audio files containing emotional speech or sounds. Each filename should include the emotion label, e.g., `sound_happy.wav`.
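As an illustration, assuming the emotion is the last underscore-separated part of the filename (as in the example above), the label can be recovered like this:

```python
import os


def label_from_filename(path):
    """Map a filename such as sound_happy.wav to its emotion label ('happy')."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.split("_")[-1]


print(label_from_filename("sound_happy.wav"))  # -> happy
```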
To get this project running locally, follow these steps:
Ensure you have the following installed on your machine:
- Python 3.x
- Required Python packages (can be installed via pip; see below)
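As an example, assuming the project depends on `librosa`, `scikit-learn`, `numpy`, and `matplotlib` (the project's own requirements file, if any, takes precedence):

```
pip install librosa scikit-learn numpy matplotlib
```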
## Example Output

The output will include label counts from the dataset, a detailed classification report, a confusion matrix, and a bar chart visualizing the predicted distribution of emotions.
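A sketch of how that output could be produced, reusing `clf`, `X_test`, and `y_test` from the training sketch above (names and plotting details are illustrative):

```python
from collections import Counter

import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix

# Predict on the held-out files and print the standard evaluation reports.
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

# Bar chart of how often each emotion was predicted.
counts = Counter(y_pred)
plt.bar(list(counts.keys()), list(counts.values()))
plt.xlabel("Emotion")
plt.ylabel("Predicted count")
plt.title("Predicted emotion distribution")
plt.show()
```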
## Contributing

Contributions to the project are welcome. Feel free to fork the repository and submit a pull request with your enhancements.