58.4% reduction in run time (2.4x faster) on test_unit_cwrs32 (no custom modes). Gives a 3.2% speedup on ./opus_demo restricted-lowdelay 48000 2 96000 comp48-stereo.sw /dev/null on a 600 MHz Cortex A8.
Also added the 3rd clause to the "master" COPYING file.