In the src/ directory, run ./compile.sh to compile the data processing
program. Then run the resulting executable:

  ./dump_data input.s16 exc.s8 features.f32 pred.s16 pcm.s16

where the first file contains 16 kHz, 16-bit raw PCM audio (no header) and
the other files are output files. The input file I'm currently using is
6 hours long, but you may be able to get away with less (and you can always
augment your data by resampling it by ±5% or 10%).

Now that you have your files, you can run the training with:

  ./train_wavenet_audio.py exc.s8 features.f32 pred.s16 pcm.s16

which will generate a wavenet*.h5 model file for each iteration. You can
then run the synthesis with:

  ./test_wavenet_audio.py features.f32 > pcm.txt

If you're lucky, you may be able to get the current model at:
https://jmvalin.ca/misc_stuff/lpcnet_models/
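
Since dump_data expects headerless 16 kHz, 16-bit PCM, any other source
material has to be converted first. Below is a minimal sketch of one way to
do that with numpy and scipy; it is not part of this repository, and the
file names are placeholders.

  # Sketch: convert a WAV file to the headerless 16 kHz, 16-bit PCM that
  # dump_data expects. Assumes numpy and scipy are installed.
  import numpy as np
  from scipy.io import wavfile
  from scipy.signal import resample_poly

  rate, audio = wavfile.read('input.wav')          # placeholder file name
  audio = audio.astype(np.float64)
  if audio.ndim > 1:                               # mix down to mono
      audio = audio.mean(axis=1)
  if rate != 16000:                                # resample to 16 kHz
      audio = resample_poly(audio, 16000, rate)
  audio = np.clip(audio, -32768, 32767).astype(np.int16)
  audio.tofile('input.s16')                        # raw samples, no header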
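
The ±5% or 10% resampling augmentation mentioned above can be done the same
way. The sketch below only illustrates the idea; the exact factors and
output file names are up to you.

  # Sketch: augment a 16 kHz raw PCM file by resampling it slightly faster
  # and slower (roughly ±5% and ±10% speed changes).
  import numpy as np
  from scipy.signal import resample_poly

  pcm = np.fromfile('input.s16', dtype=np.int16).astype(np.float64)
  for up, down in [(19, 20), (21, 20), (9, 10), (11, 10)]:
      out = resample_poly(pcm, up, down)
      out = np.clip(out, -32768, 32767).astype(np.int16)
      out.tofile('input_%d_%d.s16' % (up, down))   # placeholder naming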
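
If you want to inspect what dump_data produced, the extensions indicate the
element types (.s8 = signed 8-bit, .s16 = signed 16-bit, .f32 = 32-bit
float). A minimal loading sketch follows; note that the per-frame feature
count is defined by the code in this repository, so the value below is only
a placeholder to illustrate the reshape.

  # Sketch: load the raw training files into numpy arrays. The dtypes follow
  # the file extensions; NB_FEATURES is an assumption -- check the training
  # script for the real per-frame feature count.
  import numpy as np

  NB_FEATURES = 55  # placeholder, not taken from this repository

  exc = np.fromfile('exc.s8', dtype=np.int8)
  pred = np.fromfile('pred.s16', dtype=np.int16)
  pcm = np.fromfile('pcm.s16', dtype=np.int16)
  features = np.fromfile('features.f32', dtype=np.float32)
  features = features[:len(features) // NB_FEATURES * NB_FEATURES]
  features = features.reshape(-1, NB_FEATURES)
  print(exc.shape, pred.shape, pcm.shape, features.shape)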
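
The synthesis step writes its output as text, so to listen to it you will
want raw PCM again. The sketch below assumes pcm.txt contains one sample
value per line; verify that against test_wavenet_audio.py before relying
on it.

  # Sketch: turn the text output of test_wavenet_audio.py back into raw
  # 16-bit PCM (play as 16 kHz, mono, headerless).
  import numpy as np

  samples = np.loadtxt('pcm.txt')                  # one value per line (assumed)
  samples = np.clip(samples, -32768, 32767).astype(np.int16)
  samples.tofile('synth.s16')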