From 0563d71b255c2ef0cb65aab706ecbd44e0328c8d Mon Sep 17 00:00:00 2001
From: Jan Buethe
Date: Sat, 7 Oct 2023 18:52:38 +0200
Subject: [PATCH] updated osce readme

---
 dnn/torch/osce/README.md | 53 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

diff --git a/dnn/torch/osce/README.md b/dnn/torch/osce/README.md
index b1475d91..40cf72f8 100644
--- a/dnn/torch/osce/README.md
+++ b/dnn/torch/osce/README.md
@@ -1,7 +1,6 @@
 # Opus Speech Coding Enhancement
 
-This folder hosts models for enhancing Opus SILK. See related Opus repo https://gitlab.xiph.org/xiph/opus/-/tree/exp-neural-silk-enhancement
-for feature generation.
+This folder hosts models for enhancing Opus SILK.
 
 ## Environment setup
 The code is tested with python 3.11. Conda setup is done via
@@ -12,3 +11,53 @@ The code is tested with python 3.11. Conda setup is done via
 `conda activate osce`
 
 `python -m pip install -r requirements.txt`
+
+
+## Generating training data
+The first step is to convert all training items to 16 kHz, 16-bit PCM and concatenate them. A convenient way to do this is to create a file list and then run
+
+`python scripts/concatenator.py filelist 16000 dataset/clean.s16 --db_min -40 --db_max 0`
+
+which additionally applies random scaling to the items.
+
+The second step is to run a patched version of opus_demo in the dataset folder, which produces the coded output and the feature files. To build the patched opus_demo binary, check out the exp-neural-silk-enhancement branch and build opus_demo the usual way. Then run
+
+`cd dataset && /opus_demo voip 16000 1 9000 -silk_random_switching 249 clean.s16 coded.s16`
+
+The argument to -silk_random_switching specifies the number of frames after which the SILK parameters are switched randomly.
+
+## Generating inference data
+Generating inference data is analogous to generating training data.
+Given an item 'item1.wav' run
+
+`mkdir item1.se && sox item1.wav -r 16000 -e signed-integer -b 16 item1.raw && cd item1.se && /opus_demo voip 16000 1 ../item1.raw noisy.s16`
+
+The folder item1.se then serves as input for the test_model.py script or for the --testdata argument of train_model.py or adv_train_model.py.
+
+## Regression loss based training
+Create a default setup for LACE or NoLACE via
+
+`python make_default_setup.py model.yml --model lace/nolace --path2dataset `
+
+Then run
+
+`python train_model.py model.yml --no-redirect`
+
+to run the training script in the foreground, or
+
+`nohup python train_model.py model.yml &`
+
+to run it in the background. In the latter case the output is written to `/out.txt`.
+
+## Adversarial training (NoLACE only)
+Create a default setup for NoLACE via
+
+`python make_default_setup.py nolace_adv.yml --model nolace --adversarial --path2dataset `
+
+Then run
+
+`python adv_train_model.py nolace_adv.yml --no-redirect`
+
+to run the training script in the foreground, or
+
+`nohup python adv_train_model.py nolace_adv.yml &`
+
+to run it in the background. In the latter case the output is written to `/out.txt`.
\ No newline at end of file
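The `.s16` files produced in the steps above are headerless 16 kHz, 16-bit signed PCM. A minimal sketch for loading such a file in Python, for inspection or listening tests — the helper name is ours, and little-endian byte order is assumed (what sox produces on typical x86/ARM hosts):

```python
import numpy as np

def load_s16(path):
    """Load a headerless 16 kHz, 16-bit signed little-endian PCM file.

    Matches the layout of the clean.s16 / coded.s16 / noisy.s16 files
    described above, assuming a little-endian host. Returns float32
    samples scaled to [-1, 1).
    """
    pcm = np.fromfile(path, dtype="<i2")  # raw int16 samples, no header
    return pcm.astype(np.float32) / 32768.0
```

The reverse direction, `(x * 32768.0).clip(-32768, 32767).astype("<i2").tofile(path)`, writes a file in the same format.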