This project implements a deep learning-based approach for removing noise from audio files. It uses a Convolutional Neural Network (CNN) model to process the spectrogram of noisy audio and generate a cleaned version.
Contents:
- Features
- Requirements
- Installation
- Usage
- Model Architecture
- Training
- Evaluation
- Results
- Contributing
- License
Features:
- Remove background noise from audio files
- Process multiple audio files in batch
- Generate spectrograms and waveforms for input and output audio
- Support for various audio formats (WAV, MP3, etc.)
Requirements:
- Python 3.7+
- TensorFlow 2.x
- NumPy
- SciPy
- Matplotlib
- Librosa
- tqdm
- Clone this repository:
git clone https://github.com/yourusername/audio-noise-removal.git
cd audio-noise-removal
- Install the required packages:
pip install -r requirements.txt
- Download the pre-trained model:
# Add instructions for downloading the model file
To process a single audio file:
from modules import process_audio
input_file = "path/to/your/noisy_audio.wav"
output_dir = "path/to/output/directory"
process_audio(input_file, output_dir)
To process multiple audio files:
import os
from modules import process_audio
input_dir = "path/to/noisy/audio/files"
output_dir = "path/to/output/directory"
for file_name in os.listdir(input_dir):
    if file_name.endswith(".wav"):
        input_file = os.path.join(input_dir, file_name)
        process_audio(input_file, output_dir)
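The loop above only picks up files ending in lowercase ".wav" and stops at the first file that raises. A slightly more defensive variant might look like the sketch below. The helper names (`find_audio_files`, `process_directory`) and the extension set are hypothetical and not part of this project, and whether `process_audio` can actually read each listed format is an assumption to verify.

```python
from pathlib import Path

# Hypothetical extension set; trim it to what process_audio can really read.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def find_audio_files(input_dir, extensions=AUDIO_EXTENSIONS):
    """Return audio files directly inside input_dir, matched case-insensitively."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def process_directory(input_dir, output_dir, process_fn):
    """Call process_fn(input_file, output_dir) for each audio file.

    Failures are collected and returned instead of aborting the whole batch.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failures = []
    for path in find_audio_files(input_dir):
        try:
            process_fn(str(path), str(output_dir))
        except Exception as exc:  # keep going; report the failure afterwards
            failures.append((path.name, exc))
    return failures
```

With the project's function this would be called as `process_directory(input_dir, output_dir, process_audio)`; the returned list holds `(file_name, exception)` pairs for any files that could not be processed.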
The noise removal model is a Convolutional Neural Network (CNN) with the following architecture:
- Input layer: 128x128x1 (spectrogram)
- Multiple convolutional and pooling layers
- Skip connections for better feature preservation
- Output layer: 128x128x1 (cleaned spectrogram)
For more details, refer to the CNNmodel() function in modules.py.
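To make the description above concrete, here is a minimal Keras sketch of an encoder-decoder CNN with skip connections that maps a 128x128x1 spectrogram to a 128x128x1 output. It is not the project's CNNmodel(): the function name, layer count, filter sizes, and activations are all assumptions chosen for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_denoiser_sketch(input_shape=(128, 128, 1)):
    """Illustrative encoder-decoder CNN; all hyperparameters are assumed."""
    inputs = layers.Input(shape=input_shape)

    # Encoder: convolution followed by pooling halves the resolution each time.
    e1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)  # 128x128
    p1 = layers.MaxPooling2D(2)(e1)                                       # 64x64
    e2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)      # 64x64
    p2 = layers.MaxPooling2D(2)(e2)                                       # 32x32

    # Bottleneck at the lowest resolution.
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)

    # Decoder: upsample, then concatenate the matching encoder features
    # (the skip connections) so fine spectral detail is not lost.
    u2 = layers.UpSampling2D(2)(b)                                        # 64x64
    u2 = layers.Concatenate()([u2, e2])
    d2 = layers.Conv2D(32, 3, padding="same", activation="relu")(u2)
    u1 = layers.UpSampling2D(2)(d2)                                       # 128x128
    u1 = layers.Concatenate()([u1, e1])
    d1 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)

    # A 1x1 convolution collapses the features back to a single channel.
    outputs = layers.Conv2D(1, 1, padding="same", activation="linear")(d1)
    return tf.keras.Model(inputs, outputs, name="denoiser_sketch")
```

Calling `build_denoiser_sketch().summary()` prints the layer shapes, which is a quick way to confirm that the output resolution matches the input.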
The model is trained on pairs of noisy and clean audio samples. To train the model:
- Prepare your dataset of noisy and clean audio pairs.
- Update the data loading and preprocessing steps in modules.py.
- Run the training script:
python train.py
Training parameters can be adjusted in the train.py file.
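As an illustration of the preprocessing step, the sketch below turns a waveform into 128x128x1 log-magnitude spectrogram patches matching the model's input size. It is an assumption about how the data could be prepared, not the code in modules.py: the function name and STFT settings are hypothetical, and it uses SciPy rather than Librosa to keep the example small. Applying the same function to a noisy clip and its clean counterpart would yield aligned input and target patches, provided both clips have the same length.

```python
import numpy as np
from scipy import signal


def spectrogram_patches(audio, sample_rate, patch_size=128):
    """Split a log-magnitude STFT into (n, patch_size, patch_size, 1) patches."""
    # A one-sided STFT has nperseg // 2 + 1 frequency bins, so this window
    # length gives exactly patch_size bins (254 samples -> 128 bins).
    nperseg = 2 * (patch_size - 1)
    _, _, stft = signal.stft(audio, fs=sample_rate, nperseg=nperseg)

    # Compress the dynamic range; phase is discarded in this sketch.
    spec = np.log1p(np.abs(stft))  # shape: (patch_size, n_frames)

    # Cut the time axis into non-overlapping patches, dropping the remainder.
    n_patches = spec.shape[1] // patch_size
    spec = spec[:, : n_patches * patch_size]
    patches = spec.reshape(patch_size, n_patches, patch_size).transpose(1, 0, 2)
    return patches[..., np.newaxis].astype(np.float32)
```

Because only the magnitude is kept, turning a cleaned spectrogram back into audio needs a phase estimate from somewhere else; reusing the phase of the noisy input is one common choice.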
To evaluate the model's performance:
- Prepare a test set of noisy audio files.
- Run the evaluation script:
python evaluate.py --test_dir path/to/test/files --output_dir path/to/output
This processes the test files and generates cleaned versions, along with spectrogram and waveform visualizations.
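For readers who want to produce a similar comparison by hand, the following sketch saves a figure with the waveform and spectrogram of a noisy signal next to its denoised version. The function name and figure layout are assumptions; evaluate.py may render its visualizations differently.

```python
import matplotlib

matplotlib.use("Agg")  # render straight to a file; no display needed

import matplotlib.pyplot as plt
import numpy as np


def save_comparison_plot(noisy, denoised, sample_rate, out_path):
    """Save a 2x2 figure: waveforms on top, spectrograms underneath."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 6), sharex="col")
    columns = [("Noisy input", noisy), ("Denoised output", denoised)]
    for col, (label, samples) in enumerate(columns):
        samples = np.asarray(samples, dtype=float)
        seconds = np.arange(len(samples)) / sample_rate

        axes[0, col].plot(seconds, samples, linewidth=0.5)
        axes[0, col].set_title(label)
        axes[0, col].set_ylabel("Amplitude")

        axes[1, col].specgram(samples, NFFT=256, Fs=sample_rate, noverlap=128)
        axes[1, col].set_xlabel("Time (s)")
        axes[1, col].set_ylabel("Frequency (Hz)")

    fig.tight_layout()
    fig.savefig(out_path, dpi=150)
    plt.close(fig)  # free the figure when plotting many files
    return out_path
```

Both signals are expected as one-dimensional arrays sampled at the same rate.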
The model's performance can be assessed by:
- Listening to the original noisy audio and the processed clean audio.
- Comparing the spectrograms of the input and output.
- Calculating objective metrics such as Signal-to-Noise Ratio (SNR) improvement.
Example results and visualizations can be found in the results directory.
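The SNR improvement mentioned above can be computed as in the sketch below. The function names are hypothetical. Note that this metric needs a clean reference recording for each test file, and that all three signals are assumed to be time-aligned arrays of equal length.

```python
import numpy as np


def snr_db(clean, estimate):
    """SNR of `estimate` relative to the clean reference, in decibels."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_power = np.sum(clean ** 2)
    noise_power = np.sum((clean - estimate) ** 2)  # everything that is not the reference
    if noise_power == 0:
        return float("inf")  # estimate matches the reference exactly
    return 10.0 * np.log10(signal_power / noise_power)


def snr_improvement_db(clean, noisy, denoised):
    """Decibels of SNR gained by denoising; positive means the output is cleaner."""
    return snr_db(clean, denoised) - snr_db(clean, noisy)
```

A positive result indicates that the denoised output is closer to the clean reference than the noisy input was; a negative result indicates that processing made things worse.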
Contributions to this project are welcome! Please follow these steps:
- Fork the repository
- Create a new branch for your feature
- Commit your changes
- Push to your fork
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.