* Move back to C++ for OpenCL
* Refactor OpenCL code to work more like the CUDA code, add missing functions
* Deduplicate dequant kernels
* Add OpenCL compile options
* Use compile args for preprocessing constants
* Restore default platform and device-selection-by-ID behavior
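
For reviewers unfamiliar with the compile-args change above, here is a minimal sketch of the general idea: host-side constants are turned into `-D NAME=value` entries in the OpenCL build options, so the kernel source sees them as preprocessor macros. The helper name `build_compile_options` and the constant names in the usage note are hypothetical and chosen for illustration; they are not taken from this change. `-cl-mad-enable` and `-D name=definition` are standard OpenCL build options.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from this change): builds the options string
// that would be handed to clBuildProgram().
std::string build_compile_options(
        const std::vector<std::pair<std::string, int>>& defines) {
    // Start with a standard OpenCL optimization option.
    std::string opts = "-cl-mad-enable";
    // Append one "-D NAME=value" entry per preprocessing constant.
    for (const auto& d : defines) {
        opts += " -D " + d.first + "=" + std::to_string(d.second);
    }
    return opts;
}
```

Assuming illustrative constants such as `{"QK4_0", 32}`, the resulting string's `c_str()` would be passed as the `options` argument of `clBuildProgram`, which keeps the kernel source free of hard-coded values.
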
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Henri Vasserman <henv@hot.ee>