In the src/ directory, run ./compile.sh to compile the data processing program.
Then, run the resulting executable:
./dump_data input.s16 exc.s8 features.f32 pred.s16 pcm.s16

where the first file contains 16 kHz, 16-bit raw PCM audio (no header)
and the other files are output files. The input file I'm currently using
is 6 hours long, but you may be able to get away with less (and you can
always augment your data with ±5% or 10% resampling).
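Any resampling tool will do for that augmentation. As a rough sketch (the function name is mine, not part of this repo), linear interpolation over the raw samples is good enough for such small rate changes:

```python
import numpy as np

def resample_augment(pcm, factor):
    """Resample int16 PCM by `factor` using linear interpolation.
    factor > 1 yields a shorter (faster) signal, factor < 1 a longer one.
    Hypothetical helper for data augmentation, not part of this repo."""
    n_out = int(len(pcm) / factor)
    t = np.arange(n_out) * factor  # fractional positions in the input
    out = np.interp(t, np.arange(len(pcm)), pcm.astype(np.float64))
    return np.round(out).astype(np.int16)

# e.g. read raw s16 input, write a 5%-faster copy:
# pcm = np.fromfile('input.s16', dtype=np.int16)
# resample_augment(pcm, 1.05).tofile('input_105.s16')
```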

Now that you have your files, you can run the training with:
./train_wavenet_audio.py exc.s8 features.f32 pred.s16 pcm.s16
which will generate a wavenet*.h5 model file for each iteration.

You can do the synthesis with:
./test_wavenet_audio.py features.f32 > pcm.txt

If you're lucky, you may be able to get the current model at:
https://jmvalin.ca/misc_stuff/lpcnet_models/