In the src/ directory, run ./compile.sh to compile the data processing program.
Then, run the resulting executable:
./dump_data input.s16 exc.s8 features.f32 pred.s16 pcm.s16

where the first file contains 16 kHz, 16-bit raw PCM audio (no header)
and the other files are output files. The input file I'm currently using
is 6 hours long, but you may be able to get away with less (and you can
always augment your data with ±5% or 10% resampling).
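Any resampling tool will do for that augmentation. As a rough sketch (the function name is mine, not part of this repo), linear interpolation over the raw samples is good enough for such small rate changes:

```python
import numpy as np

def resample_augment(pcm, factor):
    """Resample int16 PCM by `factor` using linear interpolation.
    factor > 1 yields a shorter (faster) signal, factor < 1 a longer one.
    Hypothetical helper for data augmentation, not part of this repo."""
    n_out = int(len(pcm) / factor)
    t = np.arange(n_out) * factor  # fractional positions in the input
    out = np.interp(t, np.arange(len(pcm)), pcm.astype(np.float64))
    return np.round(out).astype(np.int16)

# e.g. read raw s16 input, write a 5%-faster copy:
# pcm = np.fromfile('input.s16', dtype=np.int16)
# resample_augment(pcm, 1.05).tofile('input_105.s16')
```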

Now that you have your files, you can run the training with:
./train_wavenet_audio.py exc.s8 features.f32 pred.s16 pcm.s16
which will generate a wavenet*.h5 model file for each iteration.

You can do the synthesis with:
./test_wavenet_audio.py features.f32 > pcm.txt

If you're lucky, you may be able to get the current model at:
https://jmvalin.ca/misc_stuff/lpcnet_models/