Jean-Marc Valin
e9f8402a71
Handle float matrices with multiple of 8 rows
2023-08-01 19:16:27 -04:00
Jean-Marc Valin
4710bdf712
Add SSE2 support
Not so much for old machines, as for getting decent performance
when not setting -march= (SSE2 is part of the amd64 ABI).
2023-07-22 14:56:05 -04:00
Jean-Marc Valin
62cd1c963b
Transition to LinearLayer and remove unused code
2023-07-20 01:01:34 -04:00
Jean-Marc Valin
f5a68a41b0
Add generic linear layer
Should be able to handle all previous GRU variants and more.
2023-07-20 01:01:32 -04:00
xnorpx
7122abde59
Rename celt_exp to lpcnet_exp
Depending on which defines are set, there are collisions with the ones
in Opus. To avoid these errors, we rename the exp functions and
macros.
Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>
2023-05-24 00:46:20 -04:00
Jan Buethe
2112f3dd76
some fixes
2022-10-19 10:58:24 +02:00
Jean-Marc Valin
60a009b457
Making codebase C90-compliant
2022-01-19 18:10:44 -05:00
Jean-Marc Valin
4298f2f9e1
Adding support for SSE2 and SSSE3
2021-07-11 03:36:20 -04:00
Jean-Marc Valin
116bcb38fb
Adding SSE 4.1 for older platforms
AVX without AVX2 should now work again too.
2021-07-10 14:08:01 -04:00
Jean-Marc Valin
54abdb6f5d
Sparse matrix indexing optimization
The 4* is now stored in the table to avoid computing it in the loop
2021-07-10 01:59:49 -04:00
Jean-Marc Valin
d332100808
Representing output pdf as binary probability tree
Saves on the MDense/softmax computation since we only need to compute
8 values instead of 256.
2021-07-10 01:59:49 -04:00
Jean-Marc Valin
c1535c8ccf
Adding option to disable int8 dot products
2021-06-24 17:31:05 -04:00
Jean-Marc Valin
b214e684c1
Neon WIP: Compiles but very slow
2021-01-16 02:11:21 -05:00
Jean-Marc Valin
8c3fe6f31d
Cleaning up float version
2021-01-16 02:11:21 -05:00
Jean-Marc Valin
83657d0e43
Dot product AVX2 code for non-sparse multiply
2021-01-16 02:11:21 -05:00
Jean-Marc Valin
1707b960de
cleanup, add signed-unsigned biases
2021-01-16 02:11:21 -05:00
Jean-Marc Valin
40b309d92b
WIP: 8-bit SIMD for GRU B
2021-01-16 02:11:21 -05:00
Jean-Marc Valin
e695355ba5
some cleanup
2021-01-16 02:11:20 -05:00
Jean-Marc Valin
be392e3857
WIP: Got some AVX2 code working
2021-01-16 02:11:20 -05:00
Jean-Marc Valin
bce779886d
WIP: signed*unsigned arithmetic
2021-01-16 02:11:20 -05:00
Jean-Marc Valin
11736ca9e3
WIP: 8-bit mul
2021-01-16 02:11:19 -05:00
Jean-Marc Valin
73a05f55c7
wip 8x4
2021-01-16 02:11:19 -05:00
Jean-Marc Valin
a8fb25f11c
Remove NaN checks
2019-03-20 13:36:42 -04:00
David Rowe
7dc696b9a4
refactored for different machines, sgemv_accum16 using NEON intrinsics
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2018-12-10 21:28:29 -05:00