Jean-Marc Valin
ee1bb69f2d
Only force auto-vectorization for GCC >= 5.1
2023-11-27 17:55:27 -05:00
Jean-Marc Valin
7cc30ec681
Force vectorization for DNN primitives
Avoids having to write intrinsics for simple loops
2023-11-27 16:44:11 -05:00
Jean-Marc Valin
db6dad446c
Fix ARMv7 optimizations for DNN code
2023-11-26 22:21:29 -05:00
Jean-Marc Valin
cc11c078cd
First step towards DNN optimization for ARMv7 Neon
Still missing some intrinsics
2023-11-26 03:36:46 -05:00
Jean-Marc Valin
c9af8f80f7
Fix potential read out of bounds in fargan
2023-11-26 03:16:34 -05:00
Jean-Marc Valin
5c3795b287
Adding dotprod instruction to ARM rtcd
Used for DNN matrix multiplies
2023-11-25 03:15:51 -05:00
Jean-Marc Valin
984f35b313
Speed up cross-correlation normalization
2023-11-24 18:28:08 -05:00
Jean-Marc Valin
d65b7de3c5
Use arch-specific celt_inner_prod() for features
2023-11-24 18:08:01 -05:00
Jean-Marc Valin
ddbdbec444
Optimize biquad() to reduce dependency chains
2023-11-24 18:02:35 -05:00
Jean-Marc Valin
176507e4fc
Remove process_single_frame()
Code moved to compute_frame_features()
2023-11-24 13:33:04 -05:00
Jean-Marc Valin
9d0425d88b
Remove feature writing (fwrite()) from libopus
2023-11-24 13:23:52 -05:00
Jean-Marc Valin
f42940bef9
Make sure weights files are marked as modified
2023-11-20 14:13:23 -05:00
Jean-Marc Valin
a93b09e241
Adding RTCD for compute_conv2d()
2023-11-17 14:20:09 -05:00
Jean-Marc Valin
7f7b2a1c66
Smaller version of fargan
800k parameters, 600 MFLOPS, with a receptive field of 3 feature vectors
2023-11-16 02:06:14 -05:00
Jean-Marc Valin
4bfc0f8555
Adding RTCD for compute_activation()
2023-11-15 23:46:01 -05:00
Jean-Marc Valin
2e034f6f31
Adding RTCD for DNN code
Starting with compute_linear()
2023-11-15 23:45:32 -05:00
Jean-Marc Valin
b0620c0bf9
Using sparse GRUs in DRED decoder
Saves ~270 kB of weights in the decoder
2023-11-15 04:08:50 -05:00
Jean-Marc Valin
58923f61c2
Fix non-AVX builds
2023-11-11 03:24:21 -05:00
Jean-Marc Valin
77594bf158
Dumping RDOVAE stats from XML
2023-11-08 17:32:43 -05:00
Jean-Marc Valin
222662dac8
DRED: quantize scale and dead zone to 8 bits
2023-11-07 18:10:50 -05:00
Jan Buethe
4e104555e9
added weight export script for LACE/NoLACE
2023-11-07 15:12:12 +01:00
Jan Buethe
8af5c6b4a1
added transposed 1d convolutions to wexchange
2023-11-07 11:54:22 +01:00
Jean-Marc Valin
b6095cf22d
DRED code cleanup
Removing some indirections
2023-11-07 02:52:40 -05:00
Jean-Marc Valin
0ab0640d4a
Split stats in two and remove useless dimensions
2023-11-07 00:07:14 -05:00
Jan Buethe
2386a60ec6
updated moc to match results in ietf118 presentation
2023-11-06 17:50:48 +01:00
Jean-Marc Valin
544b3e576c
DRED: quantize r and p0 parameters with 8 bits
Only code non-degenerate symbols, which makes the encoder faster
2023-11-06 03:16:43 -05:00
Jean-Marc Valin
1ada7d4d6f
Vectorizing sgemv for multiples of 4 with SSE
2023-11-03 02:48:38 -04:00
Jan Buethe
da60266f6e
updated moc method
2023-11-02 16:52:50 +01:00
Jean-Marc Valin
feb3282887
Don't try to use models that aren't loaded
2023-10-30 14:08:07 -04:00
Jean-Marc Valin
62b546436f
Speed up general case for float matrix multiply
2023-10-30 00:08:53 -04:00
Jean-Marc Valin
61fb3b1689
Don't use reserved identifiers for include guards
2023-10-29 21:19:51 -04:00
Jean-Marc Valin
d53531d0bd
Update blob loading code
2023-10-29 18:06:18 -04:00
Jean-Marc Valin
0b75501270
Use log approximation when possible
2023-10-29 02:38:21 -04:00
Jean-Marc Valin
4259d354df
Reusing already-optimized celt_fir()
2023-10-29 02:20:35 -04:00
Jean-Marc Valin
b22b11a412
Silence some warnings
Including removing useless code
2023-10-29 00:12:58 -04:00
Jean-Marc Valin
ddd5669e79
Pitch and fargan model updates
Removing one of the 2d conv layers for pitch estimation reduces
complexity without noticeable degradation. FARGAN model has more
adversarial training.
Also, no need for the double precision in the low-pass filter.
2023-10-28 23:33:47 -04:00
Jean-Marc Valin
ccb244a732
cleanup
2023-10-24 09:27:31 -04:00
Jean-Marc Valin
bc102f5fab
Slightly more continuous analysis
2023-10-24 09:19:51 -04:00
Jean-Marc Valin
64236e5201
Removing more useless code
2023-10-21 02:26:44 -04:00
Jean-Marc Valin
ef8115bd9a
Stop using tansig_table.h (both copies)
2023-10-20 22:07:58 -04:00
Jean-Marc Valin
88c58cfaf3
nnet.h no longer needs to #include "vec.h"
2023-10-20 17:25:27 -04:00
Jean-Marc Valin
1032e47d3f
more cleanup
2023-10-20 15:13:43 -04:00
Jean-Marc Valin
7f0d456c4b
Remove unneeded functions in nnet.c
2023-10-20 15:05:14 -04:00
Jean-Marc Valin
4598fe5409
Quantizing pitchdnn and rdovae weights
2023-10-20 12:54:13 -04:00
Jan Buethe
290be25b98
added 16kHz version of opus_compare in python
2023-10-20 14:24:27 +02:00
Jan Buethe
1accd2472e
finalized quantization option in export_rdovae_weights.py
2023-10-20 14:14:31 +02:00
Jean-Marc Valin
88c8b30785
Doing some unrolling on ARM/Neon
2023-10-20 03:28:17 -04:00
Jean-Marc Valin
f512c9206b
Unroll the 3x3 convolution case
Gets us about 2x speedup on x86
2023-10-20 01:33:49 -04:00
Jean-Marc Valin
d720955d61
Marking RDOVAE layers to quantize
2023-10-19 16:06:52 -04:00
Jan Buethe
60ac1c6c99
prepared quantization implementation for DRED
2023-10-19 21:54:39 +02:00