
Opus Speech Coding Enhancement

This folder hosts models for enhancing Opus SILK.

Environment setup

The code has been tested with Python 3.11. Set up a conda environment via

conda create -n osce python=3.11

conda activate osce

python -m pip install -r requirements.txt

Generating training data

The first step is to convert all training items to 16 kHz, 16-bit PCM and then concatenate them. A convenient way to do this is to create a file list and run

python scripts/concatenator.py filelist 16000 dataset/clean.s16 --db_min -40 --db_max 0

which additionally applies random gain scaling in the range from -40 to 0 dB. Data is taken from the datasets listed in dnn/datasets.txt, and the exact list of items used for training and validation is located in dnn/torch/osce/resources.
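
If your corpora are stored as individual audio files, a small helper like the following can generate the file list consumed by scripts/concatenator.py. This is a hypothetical sketch; the directory layout and file extensions are assumptions, so adapt them to wherever your training data lives:

```python
# Sketch: collect audio files under a data root and write one path per
# line, producing a file list for scripts/concatenator.py.
# (Hypothetical helper; extensions and layout are assumptions.)
from pathlib import Path

def write_filelist(data_root, out_path, extensions=(".wav", ".flac")):
    """Recursively collect audio files and write one path per line."""
    items = sorted(
        str(p) for p in Path(data_root).rglob("*")
        if p.suffix.lower() in extensions
    )
    Path(out_path).write_text("\n".join(items) + "\n")
    return items
```

The resulting file can then be passed as the filelist argument of the concatenator command above.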

The second step is to run a patched version of opus_demo in the dataset folder, which produces the coded output and the corresponding feature files. To build the patched opus_demo binary, check out the exp-neural-silk-enhancement branch and build opus_demo in the usual way. Then run

cd dataset && <path_to_patched_opus_demo>/opus_demo voip 16000 1 9000 -silk_random_switching 249 clean.s16 coded.s16

The argument to -silk_random_switching specifies the number of frames after which encoder parameters are switched randomly.
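
The switching behavior can be illustrated with a conceptual sketch. This is not the actual opus_demo patch; the parameter names and value ranges below are made up purely for illustration:

```python
# Conceptual sketch of random parameter switching (NOT the real
# opus_demo implementation): every `switch_period` frames, a fresh
# random parameter set is drawn and held until the next switch point.
import random

def random_switching_schedule(num_frames, switch_period, seed=0):
    """Return one (shared) parameter dict per frame."""
    rng = random.Random(seed)
    schedule = []
    params = None
    for frame in range(num_frames):
        if frame % switch_period == 0:
            # Hypothetical parameter set; values are illustrative only.
            params = {
                "bitrate": rng.choice([6000, 9000, 12000, 15000]),
                "complexity": rng.randint(0, 10),
            }
        schedule.append(params)
    return schedule
```

With switch_period=249 as in the command above, frames 0 through 248 share one parameter set, frames 249 through 497 the next, and so on.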

Regression loss based training

Create a default setup for LACE or NoLACE via

python make_default_setup.py model.yml --model lace/nolace --path2dataset <path2dataset>

Then run

python train_model.py model.yml <output folder> --no-redirect

to run the training script in the foreground, or

nohup python train_model.py model.yml <output folder> &

to run it in the background. In the latter case the output is written to <output folder>/out.txt.

Adversarial training (NoLACE only)

Create a default setup for NoLACE via

python make_default_setup.py nolace_adv.yml --model nolace --adversarial --path2dataset <path2dataset>

Then run

python adv_train_model.py nolace_adv.yml <output folder> --no-redirect

to run the training script in the foreground, or

nohup python adv_train_model.py nolace_adv.yml <output folder> &

to run it in the background. In the latter case the output is written to <output folder>/out.txt.

Inference

Generating inference data is analogous to generating training data. Given an item item1.wav, run

mkdir item1.se && sox item1.wav -r 16000 -e signed-integer -b 16 item1.raw && cd item1.se && <path_to_patched_opus_demo>/opus_demo voip 16000 1 <bitrate> ../item1.raw noisy.s16

The folder item1.se then serves as input for the test_model.py script, or for the --testdata argument of train_model.py and adv_train_model.py, respectively.
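
When preparing many items, the per-item steps above can be scripted. The sketch below only builds the command lists rather than executing them; the helper name is hypothetical, and the opus_demo path and bitrate are placeholders to be filled in. Note that the opus_demo step should be run from inside the .se folder so the feature files are written there:

```python
# Sketch: build the per-item shell steps from the inference section.
# (Hypothetical helper; opus_demo path and bitrate are placeholders.)
from pathlib import Path

def prepare_item_cmds(wav_path, opus_demo="opus_demo", bitrate=9000):
    """Return the three command lists for one item. The last command
    is meant to be run with the item's .se folder as working directory."""
    wav = Path(wav_path)
    raw = wav.with_suffix(".raw").name     # e.g. item1.raw
    se_dir = wav.with_suffix(".se").name   # e.g. item1.se
    return [
        ["sox", wav.name, "-r", "16000", "-e", "signed-integer",
         "-b", "16", raw],
        ["mkdir", "-p", se_dir],
        # Run from inside se_dir, hence the ../ prefix on the input.
        [opus_demo, "voip", "16000", "1", str(bitrate),
         f"../{raw}", "noisy.s16"],
    ]
```

Each returned list can be passed to subprocess.run (with the appropriate working directory for the opus_demo step).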

autogen.sh downloads pre-trained model weights to the subfolder dnn/models of the main repo.