Hi Salvatore, we spoke at SYCLcon about your work with SYCL, and you mentioned running into register pressure problems with the CUDA backend.
I found this compiler flag which may be of interest:
-mllvm -nvptx-sched4reg=true
It asks the NVPTX backend to schedule instructions with register pressure in mind, so the generated PTX may be reordered in an attempt to lower register usage. Let me know if this helps at all.
All the best!