Jean-Marc Valin
|
2f5b51c94a
|
Avoiding symbol clashes with Opus
|
2022-01-24 23:21:31 -05:00 |
|
Jean-Marc Valin
|
805fed733a
|
Fix warnings
|
2022-01-24 16:33:32 -05:00 |
|
Jean-Marc Valin
|
57f5681987
|
Add swish activation support
|
2022-01-24 16:22:29 -05:00 |
|
Jean-Marc Valin
|
60a009b457
|
Making codebase C90-compliant
|
2022-01-19 18:10:44 -05:00 |
|
Jean-Marc Valin
|
3a47548536
|
Using KISS99 (taken from Daala) as RNG
|
2021-11-10 17:58:51 -05:00 |
|
Jean-Marc Valin
|
e4b4613d05
|
Fix signed-unsigned biases
|
2021-09-02 02:34:08 -04:00 |
|
Jean-Marc Valin
|
51ef273e06
|
Using 8-bit recurrent weights for GRU B
|
2021-09-02 02:33:55 -04:00 |
|
Jean-Marc Valin
|
8bdbbfa18d
|
Support for sparse GRU B input matrices
Only on the C side, no sparse GRU B training yet
|
2021-07-16 03:07:26 -04:00 |
|
Jean-Marc Valin
|
c74330e850
|
Pre-compute GRU B conditioning
Adapted from PR: https://github.com/mozilla/LPCNet/pull/134
by zhuxiaoxu <zhuxiaoxu@ainirobot.com>
but had to be reworked due to previous weight quantization changes.
|
2021-07-15 16:06:56 -04:00 |
|
Jean-Marc Valin
|
7d8b00f11d
|
Sampling directly from the logit
Avoids having to compute a sigmoid
|
2021-07-10 01:59:49 -04:00 |
|
Jean-Marc Valin
|
7cef98ec8c
|
Minor optimization: merging all 3 embeddings
|
2021-07-10 01:59:49 -04:00 |
|
Jean-Marc Valin
|
006556036a
|
Cleaning up the sparse GRU
It no longer overwrites its input vector
|
2021-07-10 01:59:49 -04:00 |
|
Jean-Marc Valin
|
d332100808
|
Representing output pdf as binary probability tree
Saves on the MDense/softmax computation since we only need to compute
8 values instead of 256.
|
2021-07-10 01:59:49 -04:00 |
|
Jean-Marc Valin
|
8c4b88cfab
|
Using a bisection search for sampling
|
2021-06-30 18:14:12 -04:00 |
|
Jean-Marc Valin
|
e35441f2cc
|
Faster activation functions for AVX
Using rational function approximation for tanh() and sigmoid().
|
2021-06-29 04:05:48 -04:00 |
|
Jean-Marc Valin
|
5571ef1b8e
|
minor optimization: removing some copying
|
2021-06-26 01:27:03 -04:00 |
|
Jean-Marc Valin
|
83657d0e43
|
Dot product AVX2 code for non-sparse multiply
|
2021-01-16 02:11:21 -05:00 |
|
Jean-Marc Valin
|
40b309d92b
|
WIP: 8-bit SIMD for GRU B
|
2021-01-16 02:11:21 -05:00 |
|
Jean-Marc Valin
|
06489b42dd
|
oops, fix number of columns
|
2021-01-16 02:11:20 -05:00 |
|
Jean-Marc Valin
|
bce779886d
|
WIP: signed*unsigned arithmetic
|
2021-01-16 02:11:20 -05:00 |
|
Jean-Marc Valin
|
73a05f55c7
|
wip 8x4
|
2021-01-16 02:11:19 -05:00 |
|
Jean-Marc Valin
|
14fb264a0f
|
Fix sampling bug for 16-bit rand()
According to David Rowe, when rand() returns RAND_MAX (which is likely
for 16-bit output), we end up producing a click.
|
2020-06-20 23:47:04 -04:00 |
|
David Rowe
|
7dc696b9a4
|
refactored for different machines, sgemv_accum16 using NEON intrinsics
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
|
2018-12-10 21:28:29 -05:00 |
|
Jean-Marc Valin
|
771cc7868a
|
Support for plain AVX with no FMA
|
2018-12-04 07:58:13 -05:00 |
|
Jean-Marc Valin
|
b05f950e38
|
Using the right name: s/gemm/sgemv/
|
2018-11-30 10:56:44 -05:00 |
|
Jean-Marc Valin
|
c395a68b7d
|
moving code around
|
2018-11-30 10:46:32 -05:00 |
|
Jean-Marc Valin
|
05f4851dcd
|
Making the code work even without AVX2/FMA
|
2018-11-30 10:32:04 -05:00 |
|
Jean-Marc Valin
|
d7f0abcd19
|
Delaying the softmax() to avoid the pow()
Now at 5x real-time, with all the low-hanging fruit done.
|
2018-11-29 20:09:36 -05:00 |
|
Jean-Marc Valin
|
faf3fe3d24
|
gemm_accum16() doesn't need a multiple of 16 columns (just rows).
|
2018-11-29 19:50:09 -05:00 |
|
Jean-Marc Valin
|
7ee79b63df
|
Add AVX versions of exp(), tanh() and sigmoid()
Now 3x faster than real-time
|
2018-11-29 19:43:59 -05:00 |
|
Jean-Marc Valin
|
4de3e53a73
|
Adding some sparse GRU support
Still need to properly dump as sparse.
|
2018-11-28 18:49:19 -05:00 |
|
Jean-Marc Valin
|
ec671ed90e
|
Quick and dirty AVX2 implementation of gemm_accum
Brings us very close to real-time
|
2018-11-28 14:57:22 -05:00 |
|
Jean-Marc Valin
|
732fce9ab2
|
Pre-computing GRU_A's input contribution.
|
2018-11-28 14:05:36 -05:00 |
|
Jean-Marc Valin
|
040aa437c3
|
Simpler GRU implementation just for reset_after.
|
2018-11-28 12:37:18 -05:00 |
|
Jean-Marc Valin
|
36a0bf8c75
|
Wow, managed two bugs in a 25-character line
|
2018-11-27 14:50:38 -05:00 |
|
Jean-Marc Valin
|
c7b978b923
|
Fix reset_after GRU
|
2018-11-27 14:37:10 -05:00 |
|
Jean-Marc Valin
|
4ccfbdff04
|
Frame network seems to be working
|
2018-11-26 18:41:54 -05:00 |
|
Jean-Marc Valin
|
538f25565a
|
Starting to actually test this -- fix a few OOB reads
|
2018-11-26 16:02:49 -05:00 |
|
Jean-Marc Valin
|
575d8d6fa4
|
Adding sampling
|
2018-11-26 11:04:41 -05:00 |
|
Jean-Marc Valin
|
7119eaf33b
|
Plumbing for the frame rate network
|
2018-11-25 17:20:24 -05:00 |
|
Jean-Marc Valin
|
141830ce5a
|
Fixing includes
|
2018-11-24 16:00:30 -05:00 |
|
Jean-Marc Valin
|
37fbcaee0b
|
mdense max size
|
2018-11-24 15:51:08 -05:00 |
|
Jean-Marc Valin
|
94ac0841df
|
Precomputing sizes
|
2018-11-24 15:47:48 -05:00 |
|
Jean-Marc Valin
|
c025744e34
|
Fix conv1d, default to size 384
|
2018-11-24 15:30:17 -05:00 |
|
Jean-Marc Valin
|
66486004ba
|
Implement MDense
|
2018-11-24 12:23:11 -05:00 |
|
Jean-Marc Valin
|
d4046036a9
|
Dump Conv1D (didn't check weight ordering at all)
|
2018-11-24 11:32:01 -05:00 |
|
Jean-Marc Valin
|
b9cd61be8b
|
Work in progress translation to C
|
2018-11-23 19:43:58 -05:00 |
|