Resolve Metal.jl type instability for saveat and literals; Metal and AMDGPU build fix #381
+144
−143
This PR fixes `InvalidIRError: unsupported use of double value` crashes encountered when using `EnsembleGPUKernel` on the `Metal.jl` backend, originally reported in Issue #379.

The fixes fall into the following parts:

1. `src/ensemblegpukernel/nlsolve/utils.jl`: changes to resolve issues related to Metal and AMDGPU.
   * nlsolve Stiff Solver Bug: The `unsupported dynamic function invocation` error (which was failing AMDGPU and Metal) is fixed by replacing `norm()` with the GPU-safe `diffeqgpunorm(..., t)`. The successful Metal, AMDGPU (and OneAPI Julia v1) builds show that the previous issues have been resolved.

2. `saveat` Type Instability (Fixes #379):
   * In `src/ensemblegpukernel/lowerlevel_solve.jl`, the `saveat` argument (when a `Number` or `AbstractRange`) was being converted into a `StepRangeLen` with internal `Float64` fields, which crashed the `Float32` Metal kernel.
   * This PR changes the logic to convert `saveat` into a `Vector{Tt}` (where `Tt` is the problem's time type) before passing it to the kernel.
   * This also includes fixes for `UndefVarError` (by initializing `saveat_converted = nothing`) and adds the missing `adapt(backend, ...)` call to the SDE solver.

Numeric Literal Instability:
   * In the `src/ensemblegpukernel/perform_step/gpu_tsit5_perform_step` file, hardcoded numeric literals (e.g., `1.0f-14`) were used.
   * These have been wrapped with `T(...)` (e.g., `T(1.0e-14)`) to ensure they match the kernel's float type, preventing type instability.

These changes ensure robust type handling for `Float32` problems on backends without `Float64` support, while remaining compatible with standard `Float64` usage on CUDA and CPU.

Note for Maintainers: I do not have access to Apple Metal hardware, so I was unable to test this fix locally. Requesting verification from a maintainer or user with a Metal-capable device.
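
For reviewers, here is a minimal CPU-only sketch of the two type-handling ideas in this description: converting `saveat` to a `Vector{Tt}` and wrapping literals in `T(...)`. The names `convert_saveat` and `tolerance_floor` are hypothetical and do not appear in the PR, and treating a scalar `saveat` as a step across `tspan` is an assumption made for illustration; the actual code in `lowerlevel_solve.jl` may differ.

```julia
# Hypothetical helper (not the package's code): turn a scalar or range `saveat`
# into a Vector{Tt} so that no Float64 fields reach a Float32 kernel.
function convert_saveat(saveat, tspan::Tuple{Tt, Tt}) where {Tt}
    saveat === nothing && return nothing
    if saveat isa Number
        # Assumption: a scalar means "save every `saveat` time units over tspan".
        return Tt.(collect(tspan[1]:saveat:tspan[2]))
    elseif saveat isa AbstractRange
        # Materialize the range, then convert each element to the time type.
        return Tt.(collect(saveat))
    else
        # Already a collection of time points; only convert the element type.
        return Tt.(saveat)
    end
end

# Hypothetical illustration of literal wrapping: T(1.0e-14) follows the element
# type of the state, whereas a bare 1.0e-14 literal is always a Float64.
tolerance_floor(u::AbstractArray{T}) where {T} = T(1.0e-14)

tspan = (0.0f0, 1.0f0)
@show typeof(tspan[1]:0.1:tspan[2])              # range built from a Float64 step
@show typeof(convert_saveat(0.1, tspan))         # converted to the time type
@show typeof(tolerance_floor(Float32[1, 2]))     # literal follows the state type
```

Mixing a `Float32` time span with a `Float64` step promotes the range to `Float64`, which is the kind of hidden double-precision value that a backend without `Float64` support cannot compile.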