I want to deploy a neural network model on a Samsung smartphone and run it on the NPU or DSP backend, because the GPU and CPU do not have enough compute and model inference is very slow on them. I have looked into several solutions, such as the Samsung Neural SDK, ONE, ENN, and NNAPI, but none of them seem to run the model on the NPU. Is there any way to run model inference on the NPU or DSP?
You can check here.

Which accelerator you can reach depends on the SoC in your phone. On Exynos-based Samsung devices, the ENN SDK is Samsung's official route to the NPU/DSP: you convert a trained model (for example a TensorFlow Lite `.tflite` file) to the NNC format with the ENN converter and execute it through the ENN framework. On Snapdragon-based devices, Qualcomm's SNPE/QNN SDKs target the Hexagon DSP/NPU instead. NNAPI can also dispatch work to the NPU, but only if the device ships a vendor NNAPI driver for it; without such a driver, execution typically falls back to NNAPI's CPU reference implementation, so benchmark to confirm where your model actually runs.
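If your model is already in TensorFlow Lite format, the quickest experiment is TFLite's NNAPI delegate, which hands the graph to whatever accelerators the vendor driver exposes. Below is a minimal Kotlin sketch assuming a TFLite model; `"my-npu-driver"` and `"model.tflite"` are placeholders, not real Samsung driver or asset names. You can enumerate the actual accelerator names on your device with the NNAPI NDK calls `ANeuralNetworks_getDeviceCount` / `ANeuralNetworksDevice_getName`.

```kotlin
import android.content.Context
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import org.tensorflow.lite.support.common.FileUtil

// Build a TFLite interpreter that asks NNAPI to schedule the model
// on a specific accelerator (e.g., an NPU) instead of the CPU/GPU.
fun createNpuInterpreter(context: Context): Interpreter {
    val delegateOptions = NnApiDelegate.Options()
        // Prefer sustained throughput over single-shot latency.
        .setExecutionPreference(
            NnApiDelegate.Options.EXECUTION_PREFERENCE_SUSTAINED_SPEED
        )
        // Placeholder: replace with an accelerator name reported by
        // ANeuralNetworksDevice_getName on your device.
        .setAcceleratorName("my-npu-driver")

    val nnApiDelegate = NnApiDelegate(delegateOptions)
    val interpreterOptions = Interpreter.Options().addDelegate(nnApiDelegate)

    // Placeholder asset name; FileUtil comes from the tensorflow-lite-support library.
    val modelBuffer = FileUtil.loadMappedFile(context, "model.tflite")
    return Interpreter(modelBuffer, interpreterOptions)
}
```

Two caveats: if the named accelerator does not exist, interpreter creation fails rather than silently falling back, which is useful for verifying the NPU is actually being used; and NNAPI is deprecated as of Android 15, so for a long-lived app the vendor SDKs (ENN on Exynos, SNPE/QNN on Snapdragon) are the more future-proof path.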