25 commits

Author SHA1 Message Date
David Rowe
7dc696b9a4 refactored for different machines, sgemv_accum16 using NEON intrinsics
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
2018-12-10 21:28:29 -05:00
Jean-Marc Valin
771cc7868a Support for plain AVX with no FMA 2018-12-04 07:58:13 -05:00
Jean-Marc Valin
b05f950e38 Using the right name: s/gemm/sgemv/ 2018-11-30 10:56:44 -05:00
Jean-Marc Valin
c395a68b7d moving code around 2018-11-30 10:46:32 -05:00
Jean-Marc Valin
05f4851dcd Making the code work even without AVX2/FMA 2018-11-30 10:32:04 -05:00
Jean-Marc Valin
d7f0abcd19 Delaying the softmax() to avoid the pow()
Now at 5x real-time, with all the low-hanging fruit done.
2018-11-29 20:09:36 -05:00
Jean-Marc Valin
faf3fe3d24 gemm_accum16() doesn't need a multiple of 16 columns (just lines). 2018-11-29 19:50:09 -05:00
Jean-Marc Valin
7ee79b63df Add AVX versions of exp(), tanh() and sigmoid()
Now 3x faster than real-time
2018-11-29 19:43:59 -05:00
Jean-Marc Valin
4de3e53a73 Adding some sparse GRU support
Still need to properly dump as sparse.
2018-11-28 18:49:19 -05:00
Jean-Marc Valin
ec671ed90e Quick and dirty AVX2 implementation of gemm_accum
Brings us very close to real-time
2018-11-28 14:57:22 -05:00
Jean-Marc Valin
732fce9ab2 Pre-computing GRU_A's input contribution. 2018-11-28 14:05:36 -05:00
Jean-Marc Valin
040aa437c3 Simpler GRU implementation just for reset_after.
Jean-Marc Valin
36a0bf8c75 Wow, managed two bugs in a 25-character line 2018-11-27 14:50:38 -05:00
Jean-Marc Valin
c7b978b923 Fix reset_after GRU 2018-11-27 14:37:10 -05:00
Jean-Marc Valin
4ccfbdff04 Frame network seems to be working 2018-11-26 18:41:54 -05:00
Jean-Marc Valin
538f25565a Starting to actually test this -- fix a few OOB reads 2018-11-26 16:02:49 -05:00
Jean-Marc Valin
575d8d6fa4 Adding sampling 2018-11-26 11:04:41 -05:00
Jean-Marc Valin
7119eaf33b Plumbing for the frame rate network 2018-11-25 17:20:24 -05:00
Jean-Marc Valin
141830ce5a Fixing includes 2018-11-24 16:00:30 -05:00
Jean-Marc Valin
37fbcaee0b mdense max size 2018-11-24 15:51:08 -05:00
Jean-Marc Valin
94ac0841df Precomputing sizes 2018-11-24 15:47:48 -05:00
Jean-Marc Valin
c025744e34 Fix conv1d, default to size 384 2018-11-24 15:30:17 -05:00
Jean-Marc Valin
66486004ba Implement MDense 2018-11-24 12:23:11 -05:00
Jean-Marc Valin
d4046036a9 Dump Conv1D (didn't check weight ordering at all) 2018-11-24 11:32:01 -05:00
Jean-Marc Valin
b9cd61be8b Work in progress translation to C 2018-11-23 19:43:58 -05:00