opus/dnn/torch/neural-pitch
Jean-Marc Valin ddd5669e79
Pitch and fargan model updates
Removing one of the 2d conv layers for pitch estimation reduces
complexity without noticeable degradation. FARGAN model has more
adversarial training.
Also, no need for the double precision in the low-pass filter.
2023-10-28 23:33:47 -04:00
data_augmentation.py Python code for neural pitch 2023-09-26 12:12:47 -04:00
download_demand.sh Python code for neural pitch 2023-09-26 12:12:47 -04:00
evaluation.py refactoring and cleanup 2023-09-29 15:31:45 +02:00
experiments.py Python code for neural pitch 2023-09-26 12:12:47 -04:00
export_neuralpitch_weights.py Pitch and fargan model updates 2023-10-28 23:33:47 -04:00
models.py Pitch and fargan model updates 2023-10-28 23:33:47 -04:00
neural_pitch_update.py refactoring and cleanup 2023-09-29 15:31:45 +02:00
ptdb_process.sh Python code for neural pitch 2023-09-26 12:12:47 -04:00
README.md Python code for neural pitch 2023-09-26 12:12:47 -04:00
run_crepe.py Script to compute the groundtruth data using CREPE 2023-09-27 13:00:12 -04:00
training.py Switching to neural pitch estimator 2023-10-06 03:14:56 -04:00
utils.py Python code for neural pitch 2023-09-26 12:12:47 -04:00

Neural Pitch Estimation

  • Dataset Installation

    1. Download and unzip the PTDB dataset:
       wget https://www2.spsc.tugraz.at/databases/PTDB-TUG/SPEECH_DATA_ZIPPED.zip
       unzip SPEECH_DATA_ZIPPED.zip

    2. Inside the extracted "SPEECH DATA" directory, run ptdb_process.sh to combine the male and female data

    3. To download and combine the DEMAND noise dataset, run download_demand.sh
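
The three installation steps above can be sketched as a small Python driver. This is a minimal sketch, not part of the repository: the function names `dataset_steps` and `run_steps`, the use of `bash` to launch the scripts, and the choice of working directory for download_demand.sh are all assumptions. By default it only prints the commands (dry run), so nothing is downloaded until `dry_run=False` is passed.

```python
import subprocess
from pathlib import Path

# URL taken from step 1 of the README.
PTDB_URL = "https://www2.spsc.tugraz.at/databases/PTDB-TUG/SPEECH_DATA_ZIPPED.zip"


def dataset_steps(script_dir, work_dir):
    """Return (command, working directory) pairs for the installation steps.

    script_dir: directory holding ptdb_process.sh and download_demand.sh.
    work_dir:   directory where the datasets are downloaded (assumed layout).
    """
    script_dir = Path(script_dir)
    work_dir = Path(work_dir)
    archive = PTDB_URL.rsplit("/", 1)[-1]
    # The README names the extracted directory "SPEECH DATA".
    speech_dir = work_dir / "SPEECH DATA"
    return [
        (["wget", PTDB_URL], work_dir),
        (["unzip", archive], work_dir),
        # Assumption: the scripts are launched with bash.
        (["bash", str(script_dir / "ptdb_process.sh")], speech_dir),
        # Assumption: download_demand.sh is run from the download directory.
        (["bash", str(script_dir / "download_demand.sh")], work_dir),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd, cwd in steps:
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_steps(dataset_steps("opus/dnn/torch/neural-pitch", "data"))` lists the commands in order without running them. If the scripts expect absolute paths or a different working directory, adjust the paths accordingly.
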

  • LPCNet preparation

    1. To extract the cross-correlation (xcorr) features, add lpcnet_extractor.c and the relevant functions to lpcnet_enc.c, add the new header and C source files to Makefile.am, and compile to generate ./lpcnet_xcorr_extractor

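
For readers unfamiliar with the term, the following is a minimal sketch of what a normalized cross-correlation over candidate pitch lags looks like. It is an illustration only, not the LPCNet code: the function name `normalized_xcorr`, its parameters, and the frame layout are hypothetical, and the features written by ./lpcnet_xcorr_extractor may be computed and stored differently.

```python
import math


def normalized_xcorr(signal, frame_start, frame_len, min_lag, max_lag):
    """Correlate one frame with copies of the signal delayed by each lag.

    Returns one value in [-1, 1] per lag in min_lag..max_lag (inclusive).
    A peak near 1 suggests the signal repeats with that period.
    """
    if frame_start < max_lag or frame_start + frame_len > len(signal):
        raise ValueError("frame and lag range must fit inside the signal")
    frame = signal[frame_start:frame_start + frame_len]
    frame_energy = sum(s * s for s in frame)
    xcorr = []
    for lag in range(min_lag, max_lag + 1):
        lagged = signal[frame_start - lag:frame_start - lag + frame_len]
        lagged_energy = sum(s * s for s in lagged)
        dot = sum(a * b for a, b in zip(frame, lagged))
        denom = math.sqrt(frame_energy * lagged_energy)
        # Guard against silent frames, where the energy is zero.
        xcorr.append(dot / denom if denom > 0 else 0.0)
    return xcorr
```

For a periodic input, the lag with the largest value corresponds to the period in samples, which is the information a pitch estimator can build on.
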
  • Dataset augmentation and training (check the arguments of each script below)

    1. Run data_augmentation.py
    2. Run training.py using augmented data
    3. Run experiments.py
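
Since the README asks you to check the arguments of each script, one way to do that is to print each script's help text before running the pipeline. The sketch below assumes the three scripts accept a `--help` flag (as argparse-based scripts do); that is an assumption, not something the README states, and the helper names `help_commands` and `show_arguments` are hypothetical. No actual arguments are filled in, because the README does not list them.

```python
import subprocess
import sys

# Scripts in the order given by the README.
PIPELINE = ["data_augmentation.py", "training.py", "experiments.py"]


def help_commands(python=sys.executable):
    """Build one '<python> <script> --help' command per pipeline script."""
    return [[python, script, "--help"] for script in PIPELINE]


def show_arguments(dry_run=True):
    """Print the help commands; run them only when dry_run is False."""
    for cmd in help_commands():
        print(" ".join(cmd))
        if not dry_run:
            # Assumption: the scripts implement --help; a failure is not fatal.
            subprocess.run(cmd, check=False)
```

Run `show_arguments(dry_run=False)` from the neural-pitch directory to see the options each script exposes, then launch the scripts in the listed order with the arguments that match your data paths.
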