A repository for the USTH Digital Signal Processing 2020 Group 3 project.

What is digital signal processing?
This project harnesses the `mfcc` function from `python_speech_features` and the `GaussianMixture` model from `sklearn.mixture`.

Read more about Mel-frequency cepstral coefficients and Gaussian mixture models.
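The two building blocks combine roughly like this. This is a minimal sketch, not the project's actual code: the synthetic 13-dimensional feature matrix stands in for the real MFCC frames that `python_speech_features.mfcc` would extract from a `.wav` file, and the `n_components` choice is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in for MFCC features: in the real project these come from
# python_speech_features.mfcc(signal, samplerate), which returns one
# 13-dimensional coefficient vector per audio frame.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 13))

# Fit one GMM on all frames from one speaker's training files.
gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
gmm.fit(features)

# score() returns the average per-frame log-likelihood: higher means
# the frames look more like this speaker's voice.
print(gmm.score(features))
```

Frames that resemble the training data score much higher than frames that do not, which is what makes per-speaker models usable for recognition.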
These are the datasets. Remember to read `AudioInfo.txt` in the Sunday datasets before processing.

The 135 `.wav` files of each person correspond to the 135 lines in `transcripts/random_sentences.txt`.

Note that the Friday datasets are just an archive of the Sunday datasets. Please use the Sunday datasets.

For each of `Sunday_datasets/mix`, `Sunday_datasets/low`, and `Sunday_datasets/high`, 100 of each person's 135 `.wav` files are used to fit a model representing that person's unique voice features. The remaining 35 `.wav` files per person are used to test the system of models.

The 100 `.wav` files are shuffled to show that the order of the files does not matter.
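The shuffle-then-split step can be sketched as follows (the filenames here are hypothetical; the project's actual naming may differ):

```python
import random

# Hypothetical list of one person's 135 recordings.
files = [f"person_A/sentence_{i:03d}.wav" for i in range(1, 136)]

rng = random.Random(42)       # fixed seed so the split is reproducible
shuffled = files[:]           # copy, so the original list stays intact
rng.shuffle(shuffled)

train_files = shuffled[:100]  # fit this person's GMM on these
test_files = shuffled[100:]   # 35 files held out to test the system
```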
Plan:
- Train models with the `Sunday_datasets/mix` folder.
- Train models with the `Sunday_datasets/low` folder.
- Train models with the `Sunday_datasets/high` folder.
- Then test each system of models on the `Sunday_datasets/mix`, `Sunday_datasets/low`, and `Sunday_datasets/high` folders.
Read our report for more details.
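At test time, an utterance can be attributed to whichever speaker's model gives the highest average log-likelihood. The following is a minimal sketch with synthetic features; the project's actual `try_models.py` presumably loads the saved models and real MFCC frames instead, and `n_components` is again an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Synthetic 13-dim "MFCC" frames for two speakers with distinct voices.
train = {"speaker_a": rng.normal(0.0, 1.0, size=(600, 13)),
         "speaker_b": rng.normal(3.0, 1.0, size=(600, 13))}

# One GMM per speaker, fitted on that speaker's training frames.
models = {name: GaussianMixture(n_components=4, covariance_type="diag",
                                random_state=0).fit(frames)
          for name, frames in train.items()}

def identify(frames):
    """Return the speaker whose model scores the frames highest."""
    return max(models, key=lambda name: models[name].score(frames))

# An unseen utterance drawn from speaker_b's distribution.
print(identify(rng.normal(3.0, 1.0, size=(200, 13))))
```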
For a clear view of the folders and files:
```
+--venv/
|
+--transcripts/
|  +--usth.txt
|  +--random_sentences.txt
|
+--datasets/
|  +--mix/
|  |  +--AudioInfo.txt
|  |
|  +--low/
|  |  +--AudioInfo.txt
|  |
|  +--high/
|     +--AudioInfo.txt
|
+--source_code/
|  +--Friday_script_models/  # Ignorable
|  +--models/                # Where models are saved as binary files
|  +--mfcc_gmm_func.py       # Functions wrapping the mfcc and GMM calls
|  +--requirements.txt       # pip install -r requirements.txt
|  +--train_models.py
|  +--try_models.py
|
+--LICENSE
+--README.md
+--.gitignore
```