@Ambar-13 commented Oct 30, 2025

This PR fixes the "InvalidIRError: unsupported use of double value" crashes encountered when using EnsembleGPUKernel on the Metal.jl backend, originally reported in Issue #379.

The fixes fall into three parts:

1. nlsolve stiff solver bug (affects Metal and AMDGPU):
* In src/ensemblegpukernel/nlsolve/utils.jl, the unsupported dynamic function invocation error, which was failing on AMDGPU and Metal, is fixed by replacing norm() with the GPU-safe diffeqgpunorm(..., t).
* The Metal and AMDGPU builds (and OneAPI, Julia v1) now succeed, which indicates that the previous issues have been resolved.

2. saveat type instability (fixes #379):
* In src/ensemblegpukernel/lowerlevel_solve.jl, a saveat argument given as a Number or AbstractRange was being converted into a StepRangeLen with internal Float64 fields, which crashed the Float32 Metal kernel.
* This PR changes the logic to convert saveat into a Vector{Tt} (where Tt is the problem's time type) before passing it to the kernel.
* It also fixes an UndefVarError (by initializing saveat_converted = nothing) and adds the missing adapt(backend, ...) call to the SDE solver.

3. Numeric literal instability:
* In src/ensemblegpukernel/perform_step/gpu_tsit5_perform_step.jl, hardcoded numeric literals (e.g., 1.0f-14) were used.
* These are now wrapped with T(...) (e.g., T(1.0e-14)) so that they match the kernel's float type, preventing type instability.
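
As a rough illustration of why wrapping literals matters, the sketch below shows how a bare Float64 literal promotes a Float32 computation to Float64, and how wrapping it in T(...) keeps the result in the input's type. The function names scale_unwrapped and scale_wrapped are hypothetical and are not taken from the package source.

```julia
# Illustration only; these helpers are hypothetical and not the PR's code.

# A bare literal such as 1e-7 is a Float64, so Float32 * Float64 promotes to Float64.
scale_unwrapped(x) = x * 1e-7

# Wrapping the literal in T(...) keeps the arithmetic in the input's float type.
scale_wrapped(x::T) where {T <: AbstractFloat} = x * T(1e-7)

println(typeof(scale_unwrapped(1.0f0)))  # Float64: would need double support on the device
println(typeof(scale_wrapped(1.0f0)))    # Float32
println(typeof(scale_wrapped(1.0)))      # Float64: standard Float64 usage is unchanged
```

On a backend without Float64 support, the promoted Float64 value in the first case is the kind of operation that can fail to compile.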

These changes ensure robust type handling for Float32 problems on backends without Float64 support, while remaining compatible with standard Float64 usage on CUDA and CPU.

Note for Maintainers: I do not have access to Apple Metal hardware, so I was unable to test this fix locally. Requesting verification from a maintainer or user with a Metal-capable device.

Fixes type instability errors (`InvalidIRError: unsupported use of double value`) encountered when using `EnsembleGPUKernel` on the `Metal.jl` backend (Apple M-series GPUs), specifically:

1.  Issue SciML#379: Corrects the handling of the `saveat` argument in `src/ensemblegpukernel/lowerlevel_solve.jl`. Converts `saveat` (whether a `Number`, `AbstractRange`, or `AbstractArray`) into a `Vector{Tt}` matching the problem's time type (`Tt = eltype(prob.tspan)`) before passing it to the GPU kernel. This prevents the internal `Float64` fields within `StepRangeLen` from causing compilation errors. Includes edge case handling for `saveat=0.0`.

2.  Related Literal Fix: Wraps hardcoded numeric literals (e.g., `1e-7`, `1e-14`) with `Tt(...)` in various `src/ensemblegpukernel/perform_step/` files (like `gpu_em_perform_step.jl`, `gpu_tsit5_perform_step.jl`, etc.) to ensure type consistency within the GPU kernels, addressing issues similar to those discussed on Zulip/Slack.
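
The `saveat` conversion described in item 1 can be sketched roughly as follows. The helper name `to_time_vector` is hypothetical, and the sketch assumes a positive scalar step; it does not reproduce the `saveat=0.0` edge-case handling or the actual code in `lowerlevel_solve.jl`.

```julia
# Hypothetical helper for illustration; the PR's actual code may differ.
function to_time_vector(saveat, tspan::Tuple{Tt, Tt}) where {Tt}
    if saveat isa Number
        # A scalar saveat is treated as a step across tspan
        # (assumes saveat > 0; the saveat=0.0 edge case is not covered here).
        return collect(Tt, tspan[1]:Tt(saveat):tspan[2])
    else
        # Ranges and arrays are materialized as a plain Vector{Tt}.
        return collect(Tt, saveat)
    end
end

# Even a Float32 range stores its reference and step as Float64 internally.
r = 0.0f0:0.1f0:1.0f0
println(typeof(r))

ts = to_time_vector(r, (0.0f0, 1.0f0))
println(typeof(ts))  # Vector{Float32}
```

A plain `Vector{Tt}` carries no hidden `Float64` fields, which is why it is a safer thing to hand to a `Float32`-only kernel than the range object itself.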

These changes ensure robust type handling for `Float32` problems on backends without `Float64` support, while remaining compatible with standard `Float64` usage on CUDA and CPU.
Fix: Resolve Metal.jl type instability for saveat and literals
Fixes SciML#379
Initializes `saveat_converted = nothing` in both `vectorized_solve` functions (for ODEs and SDEs) to fix an `UndefVarError`.

Also adds the missing `adapt(backend, ...)` call in the SDE function to move the converted `saveat` vector to the GPU.
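
A minimal sketch of the `UndefVarError` pattern, assuming the variable was previously assigned only inside a conditional branch. The function names `convert_broken` and `convert_fixed` are hypothetical; only the variable name `saveat_converted` comes from the commit message, and the `adapt(backend, ...)` step is indicated in a comment rather than executed.

```julia
# Illustration only; function names are hypothetical.

# The variable is assigned only when saveat is provided, so reading it
# afterwards throws UndefVarError when saveat === nothing.
function convert_broken(saveat, ::Type{Tt}) where {Tt}
    if saveat !== nothing
        saveat_converted = collect(Tt, saveat)
    end
    return saveat_converted
end

# Initializing the variable up front makes every code path well defined.
function convert_fixed(saveat, ::Type{Tt}) where {Tt}
    saveat_converted = nothing
    if saveat !== nothing
        saveat_converted = collect(Tt, saveat)
        # In the PR, this is roughly where adapt(backend, ...) would move
        # the vector to the GPU (not executed in this sketch).
    end
    return saveat_converted
end

println(convert_fixed(nothing, Float32))
println(convert_fixed(0.0:0.5:1.0, Float32))
```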
@Ambar-13 (Author) commented

Hello!
I’ve pushed a few additional commits to address the remaining issues.

The latest CI run shows that Metal and AMDGPU tests are now passing, confirming that the fixes in this branch successfully resolve the backend-related issues.
The remaining failures on CUDA and oneAPI (Julia v1.1) appear to be pre-existing CI issues (cuMemFreeAsync and OpenSSL_jll errors).

Thank you for your time and for maintaining this package!

— Ambar

@Ambar-13 changed the title from "Resolve Metal.jl type instability for saveat and literals" to "Resolve Metal.jl type instability for saveat and literals; Metal and AMDGPU build fix" on Oct 31, 2025
Successfully merging this pull request may close these issues.

saveat option does not work with Metal
