Newer versions of MSVC reject a build environment that redefines
"inline" (even though they don't support the actual keyword).
Instead, we define OPUS_INLINE to the appropriate compiler-specific
spelling in opus_defines.h.
This is the same approach we use for restrict.
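
A minimal sketch of the kind of definition described above, not the
actual contents of opus_defines.h. The exact conditions and their order
are assumptions, and clamp_int() is a hypothetical helper that only
shows the macro in use:

```c
#include <assert.h>

/* Sketch only: choose a spelling of "inline" that the compiler accepts,
   instead of redefining the keyword itself from the build system.
   The conditions below are illustrative assumptions. */
#if defined(__cplusplus) || \
    (defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L))
# define OPUS_INLINE inline      /* C++ or C99 and later */
#elif defined(_MSC_VER)
# define OPUS_INLINE __inline    /* MSVC compiling pre-C99 C */
#elif defined(__GNUC__)
# define OPUS_INLINE __inline__  /* GNU alternate keyword */
#else
# define OPUS_INLINE             /* unknown compiler: no hint */
#endif

/* Hypothetical helper using the macro in place of a bare "inline". */
static OPUS_INLINE int clamp_int(int lo, int x, int hi)
{
    assert(lo <= hi);
    return x < lo ? lo : (x > hi ? hi : x);
}
```

Code then writes "static OPUS_INLINE" everywhere it would otherwise
write "static inline", so the keyword itself is never touched.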
This lets us cut out a bunch of work in the large _n, small _k case
where most of the dimensions won't have any pulses.
It also gets rid of all remaining uses of CELT_PVQ_U() in cwrsi(),
leaving just a single test instead of lots of mins and maxes, and
makes a bunch of the jump threading more obvious.
This is a 1.6% decoder speedup on a 96 kbps comp48-stereo encode on
a Cortex A8.
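
To illustrate why one test per dimension helps when _n is large and _k
is small, here is a simplified sketch of decoding a pulse vector from a
combinatorial index. It is not the cwrsi() code and does not use the
libopus codeword ordering: the table v_table, the ordering (zero first,
then +p, then -p) and the names build_v_table() and decode_pulses() are
all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 16
#define MAX_K 8

/* v_table[n][k]: number of length-n integer vectors whose absolute
   values sum to k (hypothetical table, built by brute recurrence). */
static unsigned long v_table[MAX_N + 1][MAX_K + 1];

static void build_v_table(void)
{
    int n, k;
    memset(v_table, 0, sizeof(v_table));
    v_table[0][0] = 1;
    for (n = 1; n <= MAX_N; n++) {
        v_table[n][0] = 1;
        for (k = 1; k <= MAX_K; k++)
            v_table[n][k] = v_table[n - 1][k] + v_table[n][k - 1]
                          + v_table[n - 1][k - 1];
    }
}

/* Decode index i into x[0..n-1] with sum of |x[j]| equal to k.
   Illustrative ordering only: value 0 first, then +1, -1, +2, -2, ... */
static void decode_pulses(unsigned long i, int n, int k, int *x)
{
    int j;
    assert(n >= 1 && n <= MAX_N && k >= 0 && k <= MAX_K);
    assert(i < v_table[n][k]);
    memset(x, 0, (size_t)n * sizeof(*x));
    /* Stop as soon as no pulses remain: the rest is already zero. */
    for (j = 0; j < n && k > 0; j++) {
        unsigned long c = v_table[n - j - 1][k];
        int p;
        if (i < c)
            continue;   /* single test: no pulse in this dimension */
        i -= c;
        for (p = 1; p <= k; p++) {
            c = v_table[n - j - 1][k - p];
            if (i < c) { x[j] = p; break; }
            i -= c;
            if (i < c) { x[j] = -p; break; }
            i -= c;
        }
        k -= p;         /* p pulses were consumed by this dimension */
    }
}
```

With n = 16 and k = 1, for example, almost every dimension is handled
by the single "i < c" comparison, and the inner loop runs only for the
one dimension that holds the pulse.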
58.4% reduction in run time (2.4x faster) on test_unit_cwrs32 (no
custom modes).
Gives a 3.2% speedup on
./opus_demo restricted-lowdelay 48000 2 96000 comp48-stereo.sw /dev/null
on a 600 MHz Cortex A8.