PyTorch backend device mismatch problem #426

@KDH136

Checklist

  • I have searched the existing issues and discussions for a similar question or problem.
  • I have read the documentation and tried to find an answer there.
  • I am using the latest version of Optiland (if not, please update and retry).
  • I have tried to reproduce or debug the issue myself before opening this.
  • I have included all necessary context, such as version info, error messages, or minimal reproducible examples.

Thanks for taking the time to go through this — it really helps us help you!

Bug Report

Describe the bug
Hi everyone.
Thank you for building such a great project!
I'm not an experienced Python user, so I'm not sure whether this bug is caused by my own mistake.
Also, I'm not a native English speaker, so there may be some awkward sentences.
I apologize for that in advance.

Recently, I ran into the same issue as the one described in discussion #409.

If I set the backend to 'torch' and the device to 'cuda', some methods such as "Optic.trace()" and "Optic.draw()" fail with the error message "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!"
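To show the error in isolation, here is a minimal sketch in plain PyTorch (this is not Optiland code, and the helper name `try_add` is made up for illustration). It combines a CPU tensor with a tensor on the selected device and captures the error message instead of crashing. On a machine without CUDA both tensors are on the CPU, so no error occurs.

```python
import torch


def try_add(a: torch.Tensor, b: torch.Tensor):
    """Return (result, None) on success, or (None, message) if PyTorch raises."""
    try:
        return a + b, None
    except RuntimeError as exc:
        return None, str(exc)


# One tensor is created on the CPU (the default device).
cpu_tensor = torch.ones(3)

# The other tensor goes to the GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
other = torch.ones(3, device=device)

result, error = try_add(cpu_tensor, other)
if error is not None:
    print("Device mismatch:", error)
else:
    print("No mismatch, result:", result)
```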

For example, if I run the example code "Gradient-based Optimization" from the example gallery, the "optic.draw(num_rays=10)" command raises the device mismatch error.

I tried to find which instance variables are on 'cpu', and I figured out that the instance variables y, u, x, z, L, M, N, intensity, aoi, and opd are on 'cpu' instead of 'cuda' (these are the instance variables of Optic.surface_group.surfaces[i]).
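A check like this can be sketched with a small helper that lists every tensor-valued instance attribute of an object together with its device. The helper `tensor_devices` and the `FakeSurface` class are hypothetical stand-ins for illustration and are not part of Optiland.

```python
import torch


def tensor_devices(obj) -> dict:
    """Map each tensor-valued instance attribute of obj to its device string."""
    return {
        name: str(value.device)
        for name, value in vars(obj).items()
        if isinstance(value, torch.Tensor)
    }


class FakeSurface:
    """Hypothetical stand-in for a surface object holding ray data."""

    def __init__(self):
        self.y = torch.zeros(4)        # created on the default device (CPU)
        self.L = torch.zeros(4)
        self.comment = "not a tensor"  # non-tensor attributes are skipped


report = tensor_devices(FakeSurface())
print(report)
```

Assuming the surfaces store these values as plain instance attributes, the same helper could be called on each surface in the surface group to find the ones that stayed on 'cpu'.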

These variables are also not affected by "be.set_precision()" and "be.grad_mode.enable()".

Since all the other variables are on the GPU and are affected by the precision control / gradient control methods, I tried to find the difference between those variables and the others.

The variables listed above seem to be initialized by "be.array()" and "be.empty()".

Inside the "torch_backend.py" file, I found that if I pass a tensor x to the be.array(x) function, it returns x without modifying its device, dtype, or requires_grad.

Also, an empty() function (a wrapper around torch.empty()) is not defined in the torch backend.

I modified the array function and added an empty function inside "torch_backend.py".
After that, I no longer see the error I mentioned above.

I attached screenshots showing how I modified the code inside "torch_backend.py".

I'm not sure whether this problem was caused by a mistake on my side.

Please examine my solution.

Environment

  • Optiland Version: 0.5.8
  • Python Version: 3.12.9
  • OS: Windows

[Screenshots: modified code in torch_backend.py]
