eden-emu/opus - Eden Git: A Community Emulator

Author	SHA1	Message	Date
Timothy B. Terriberry	59dc75fa97	Rework 32-bit SSE loads yet again. The existing code in vec_avx.h produced warning: dereferencing type-punned pointer will break strict-aliasing rules with gcc 6.4.0. We already had a macro to work around this within the rules of the C standard, but trying to use that here does not get optimized into a single MOVD like we were hoping. Replacing it with memcpy() instead does get optimized correctly, but requires switching from a macro to an inline function in order to be able to declare a local variable and return a value. We already have such an inline function in NSQ_del_dec_avx2.c, so hoist that out and use it everywhere, and then convert vec_avx.h to use it also.	2024-02-23 02:23:37 -05:00
Jean-Marc Valin	2e034f6f31	Adding RTCD for DNN code Starting with compute_linear()	2023-11-15 23:45:32 -05:00
Jean-Marc Valin	58923f61c2	Fix non-AVX builds	2023-11-11 03:24:21 -05:00
Jean-Marc Valin	1ada7d4d6f	Vectorizing sgemv for multiples of 4 with SSE	2023-11-03 02:48:38 -04:00
Jean-Marc Valin	62b546436f	Speed up general case for float matrix multiply	2023-10-30 00:08:53 -04:00
Jean-Marc Valin	88c58cfaf3	nnet.h no longer needs to #include "vec.h"	2023-10-20 17:25:27 -04:00
Jean-Marc Valin	81624caf9c	Silencing alignment warnings on x86 intrinsics Those intrinsics don't actually require alignment so we're OK	2023-10-07 17:45:39 -04:00
Michael Klingbeil	d431c321f1	Fixes vnni macro redefinition with clang	2023-09-01 23:18:21 -04:00
Jean-Marc Valin	e9f8402a71	Handle float matrices with multiple of 8 rows	2023-08-01 19:16:27 -04:00
Jean-Marc Valin	8f7c72a662	Always define USE_SU_BIAS in vec_avx.h	2023-07-22 14:56:05 -04:00
Jean-Marc Valin	4710bdf712	Add SSE2 support Not so much for old machines, as for getting decent performance when not setting -march= (SSE2 is part of the amd64 ABI).	2023-07-22 14:56:05 -04:00
Jean-Marc Valin	9261eb5c37	Refactoring to make VNNI and SSE2 easier	2023-07-22 14:56:04 -04:00
Jean-Marc Valin	62cd1c963b	Transition to LinearLayer and remove unused code	2023-07-20 01:01:34 -04:00
Jean-Marc Valin	f5a68a41b0	Add generic linear layer Should be able to handle all previous GRU variants and more.	2023-07-20 01:01:32 -04:00
xnorpx	7122abde59	Rename celt_exp to lpcnet_exp Depending on which defines are set, there are collisions with the ones in Opus. To avoid these errors we rename the exp functions and macros. Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>	2023-05-24 00:46:20 -04:00
xnorpx	879084f6f0	Fix some of C4244 double to float warnings	2023-05-24 00:30:19 -04:00
xnorpx	702fffb70a	Include math.h to make header self-contained. Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>	2023-05-23 11:24:35 -04:00
xnorpx	5b96946277	Use pragma message instead of warning on MSVC Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>	2023-05-23 02:31:09 -04:00
Jan Buethe	d80f99f78b	added void to shut up missing prototype warning	2022-10-21 15:33:41 +00:00
Jean-Marc Valin	60a009b457	Making codebase C90-compliant	2022-01-19 18:10:44 -05:00
Jean-Marc Valin	4298f2f9e1	Adding support for SSE2 and SSSE3	2021-07-11 03:36:20 -04:00
Jean-Marc Valin	116bcb38fb	Adding SSE 4.1 for older platforms AVX without AVX2 should now work again too.	2021-07-10 14:08:01 -04:00
Jean-Marc Valin	e8f70128d5	same conversion cleanup as 3206cec for sgemv_accum8x4()	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	714380e71b	More manual unrolling	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	44fe055682	cleanup float<->int conversions	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	60d6eab63d	Doing a bit of unrolling to speed things up	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	54abdb6f5d	Sparse matrix indexing optimization The 4* is now stored in the table to avoid computing it in the loop	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	d332100808	Representing output pdf as binary probability tree Saves on the MDense/softmax computation since we only need to compute 8 values instead of 256.	2021-07-10 01:59:49 -04:00
Jean-Marc Valin	e35441f2cc	Faster activation functions for AVX Using rational function approximation for tanh() and sigmoid.	2021-06-29 04:05:48 -04:00
Jean-Marc Valin	c1535c8ccf	Adding option to disable int8 dot products	2021-06-24 17:31:05 -04:00
Jean-Marc Valin	0b9f6bab81	Remove unnecessary mask in exp() approximation This isn't necessary since valid exponents can't flip the sign bit	2021-06-21 01:34:38 -04:00
Jean-Marc Valin	ae2ae5ead6	Remove useless multiply by one See `bffdcee95 (commitcomment-46372726)`	2021-06-21 01:30:51 -04:00
Jean-Marc Valin	8c3fe6f31d	Cleaning up float version	2021-01-16 02:11:21 -05:00
Jean-Marc Valin	83657d0e43	Dot product AVX2 code for non-sparse multiply	2021-01-16 02:11:21 -05:00
Jean-Marc Valin	e695355ba5	some cleanup	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	d87f974431	Vectorizing conversion	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	6b582edbed	WIP: remove scalar code from AVX2 code	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	be392e3857	WIP: Got some AVX2 code working	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	2b4652f9f6	WIP: cleanup	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	bce779886d	WIP: signed*unsigned arithmetic	2021-01-16 02:11:20 -05:00
Jean-Marc Valin	11736ca9e3	WIP: 8-bit mul	2021-01-16 02:11:19 -05:00
Jean-Marc Valin	c045702e51	Add non-dot-product AVX code	2021-01-16 02:11:19 -05:00
Jean-Marc Valin	8e405b44e0	Improve accuracy of AVX sigmoid Reciprocal approximation could cause the sigmoid output to be greater than 1.0.	2021-01-16 01:51:39 -05:00
David Rowe	7dc696b9a4	refactored for different machines, sgemv_accum16 using NEON intrinsics Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>	2018-12-10 21:28:29 -05:00

44 commits
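
Commit 59dc75fa97 describes replacing a type-punned pointer dereference with memcpy() inside an inline function so that a 32-bit load stays within the C aliasing rules. The sketch below illustrates that general pattern only. The helper names load_i32_unaligned and load_i32_to_m128i are hypothetical and are not taken from the Opus tree, and the SSE2 wrapper is an assumption about how such a helper might feed an XMM register.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (illustrative name, not from the Opus sources):
   read 32 bits from a possibly unaligned address without type punning.
   An inline function is used instead of a macro because it needs a
   local variable and a return value. */
static inline int32_t load_i32_unaligned(const void *p)
{
    int32_t v;
    /* memcpy() is well defined for any alignment and any source type,
       so no strict-aliasing warning is triggered. */
    memcpy(&v, p, sizeof(v));
    return v;
}

#if defined(__SSE2__)
/* Hypothetical SSE2 wrapper: put those 32 bits in the low lane of an
   XMM register and zero the upper lanes. */
static inline __m128i load_i32_to_m128i(const void *p)
{
    return _mm_cvtsi32_si128(load_i32_unaligned(p));
}
#endif
```

Whether a given compiler reduces the memcpy() to a single MOVD depends on the compiler and optimization level; the commit message reports that it did for the code in question.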