Releases: ROCm/rocFFT

rocFFT 1.0.34 for ROCm 7.1.0

30 Oct 05:52

rocFFT code for ROCm 7.1.0 did not change. The library was rebuilt for the updated ROCm 7.1.0 stack.

rocFFT 1.0.34 for ROCm 7.0.2

10 Oct 12:12

rocFFT code for ROCm 7.0.2 did not change. The library was rebuilt for the updated ROCm 7.0.2 stack.

rocFFT 1.0.34 for ROCm 7.0.1

17 Sep 16:37

rocFFT code for ROCm 7.0.1 did not change. The library was rebuilt for the updated ROCm 7.0.1 stack.

rocFFT 1.0.34 for ROCm 7.0.0

16 Sep 06:32

Added

  • Added gfx950 support.

Removed

  • Removed rocfft-rider legacy compatibility from clients.
  • Removed support for the gfx940 and gfx941 targets from the client programs.
  • Removed backward compatibility symlink for include directories.

Optimized

  • Removed unnecessary HIP event/stream allocation and synchronization during MPI transforms.
  • Implemented single-precision 1D kernels for lengths:
    • 4704
    • 5488
    • 6144
    • 6561
    • 8192
  • Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS.
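
The lengths listed above all factor into the small primes 2, 3, and 7. The following minimal sketch verifies this and also works out how many single-precision complex values fit in 160 KiB, assuming 8 bytes per value. The helper `prime_factors` and the constant names are illustrative only, and these notes do not state how rocFFT actually selects single-kernel plans, so the capacity figure is just an upper bound on what could be held in LDS at once.

```python
def prime_factors(n):
    """Return the prime factorization of n as a {prime: exponent} dict."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


# Lengths taken from the release notes above.
NEW_LENGTHS = [4704, 5488, 6144, 6561, 8192]
factorizations = {n: prime_factors(n) for n in NEW_LENGTHS}

# Assumption: one single-precision complex value is two 4-byte floats.
LDS_BYTES = 160 * 1024
BYTES_PER_COMPLEX_FLOAT = 8
lds_capacity = LDS_BYTES // BYTES_PER_COMPLEX_FLOAT

for length, factors in factorizations.items():
    print(length, factors)
print("single-precision complex values per 160 KiB:", lds_capacity)
```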

Resolved issues

  • Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not
    contiguous.

rocFFT 1.0.32 for ROCm 6.4.4

24 Sep 14:02
d790d3e

rocFFT code for ROCm 6.4.4 did not change. The library was rebuilt for the updated ROCm 6.4.4 stack.

rocFFT 1.0.32 for ROCm 6.4.3

07 Aug 14:20
d790d3e

rocFFT code for ROCm 6.4.3 did not change. The library was rebuilt for the updated ROCm 6.4.3 stack.

rocFFT 1.0.32 for ROCm 6.4.2

21 Jul 16:54
d790d3e

rocFFT code for ROCm 6.4.2 did not change. The library was rebuilt for the updated ROCm 6.4.2 stack.

rocFFT 1.0.32 for ROCm 6.4.1

20 May 13:16
058ba87

rocFFT code for ROCm 6.4.1 did not change. The library was rebuilt for the updated ROCm 6.4.1 stack.

rocFFT 1.0.32 for ROCm 6.4.0

11 Apr 13:35
058ba87

Changed

  • Building with the address sanitizer option sets xnack+ on relevant GPU
    architectures and adds address-sanitizer support to runtime-compiled
    kernels.
  • The AMDGPU_TARGETS build variable is deprecated. Use GPU_TARGETS instead.
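
For build scripts that still pass the deprecated variable, the rename can be applied mechanically. The sketch below is a hypothetical helper (`migrate_cmake_args` is not part of rocFFT or CMake) that rewrites a CMake argument list; the architecture names and other arguments are example values only.

```python
def migrate_cmake_args(args):
    """Rewrite -DAMDGPU_TARGETS=... to -DGPU_TARGETS=..., leaving other arguments untouched."""
    old_prefix = "-DAMDGPU_TARGETS="
    new_prefix = "-DGPU_TARGETS="
    return [
        new_prefix + arg[len(old_prefix):] if arg.startswith(old_prefix) else arg
        for arg in args
    ]


# Example invocation as a build script might assemble it (values are illustrative).
legacy_args = ["cmake", "-DCMAKE_BUILD_TYPE=Release", "-DAMDGPU_TARGETS=gfx90a;gfx942", ".."]
migrated_args = migrate_cmake_args(legacy_args)
print(" ".join(migrated_args))
```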

Removed

  • Removed ahead-of-time compiled kernels for the gfx906, gfx940, and gfx941 architectures. These architectures still
    function the same, but kernels for them are now compiled at runtime.
  • Removed consumer GPU architectures from the precompiled kernel cache that ships with
    rocFFT. rocFFT continues to ship with a cache of precompiled RTC kernels for data-center
    and workstation architectures. As before, user-level caches can be enabled by setting the
    environment variable ROCFFT_RTC_CACHE_PATH to a writeable file location.
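
As a minimal sketch of enabling a user-level cache, the snippet below prepares an environment with ROCFFT_RTC_CACHE_PATH set before launching an application. The cache file name and the application path in the commented-out line are hypothetical; only the environment variable name comes from the notes above.

```python
import os
import tempfile

# Hypothetical cache location; any writeable file path should work.
cache_dir = tempfile.gettempdir()
cache_path = os.path.join(cache_dir, "rocfft_kernel_cache.db")

# The notes require a writeable location, so check the directory up front.
cache_dir_writeable = os.access(cache_dir, os.W_OK)

# Copy the current environment and add the cache path for the child process.
child_env = dict(os.environ, ROCFFT_RTC_CACHE_PATH=cache_path)

# Launching the application is left commented out because the binary is hypothetical:
# import subprocess
# subprocess.run(["./my_fft_app"], env=child_env, check=True)
print("ROCFFT_RTC_CACHE_PATH =", child_env["ROCFFT_RTC_CACHE_PATH"])
```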

Optimized

  • Improved MPI transform performance by using all-to-all communication for global transpose operations.
    Point-to-point communications are still used when all-to-all is not possible.
  • Improved the performance of unit-strided, complex interleaved, forward and inverse, length (64,64,64) FFTs.

Resolved issues

  • Fixed incorrect results from 2-kernel 3D FFT plans that used non-default output strides. For more information, see the rocFFT GitHub issue.
  • Plan descriptions can now be reused with different strides for different plans. For more information, see the rocFFT GitHub issue.
  • Fixed client packages to depend on hipRAND instead of rocRAND.
  • Fixed potential integer overflows during large MPI transforms.
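
These notes do not say where the overflows occurred, but the general hazard is easy to illustrate: element counts for large transforms can exceed what a signed 32-bit integer holds. The sketch below is a generic illustration with a hypothetical problem size, not a reproduction of the rocFFT bug; `wrap_int32` emulates two's-complement wraparound.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Emulate storing value in a signed 32-bit integer with two's-complement wraparound."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical 3D transform size; the total element count no longer fits in 32 bits.
dims = (1500, 1500, 1500)
total_elements = dims[0] * dims[1] * dims[2]

overflows = total_elements > INT32_MAX
wrapped = wrap_int32(total_elements)

print("total elements:", total_elements)
print("exceeds int32:", overflows)
print("value after 32-bit wraparound:", wrapped)
```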

rocFFT 1.0.31 for ROCm 6.3.3

19 Feb 17:47
3806d68

rocFFT code for ROCm 6.3.3 did not change. The library was rebuilt for the updated ROCm 6.3.3 stack.