[release/2.8][ROCm][inductor] Improved fast_tanh code generation #2803

naromero77amd · 2025-11-13T01:44:45Z

In the ROCm fork of PyTorch 2.8, Inductor currently has codegen support for fast_tanhf. However, the original Triton implementation of fast_tanhf had some NaN issues.

Upstream Triton now has an improved fast_tanhf that fixes the NaN issues. This upstream commit has been backported to the ROCm fork of Triton (see code comments).

A bump in the Triton commit is also needed.

Other notes:

- In support of SWDEV-560271
- Triton 3.4 backport of upstream Triton commit "[AMD] reimplement fast_tanhf() to avoid overflow (#8551)": triton#900
- Similar to [release/2.7][ROCm][inductor] Improved fast_tanh code generation #2802, [release/2.9][ROCm][inductor] Improved fast_tanh code generation #2804
- Related to [ROCm][inductor] Codegen support for fast_tanhf pytorch/pytorch#162052

(cherry picked from commit 7c5277f)
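
For readers unfamiliar with how a fast tanh can produce NaNs, the following is a minimal sketch of one common overflow pattern and one way to avoid it. It is not the Triton implementation: the function names are hypothetical, the formulas are textbook ones chosen for illustration, and the `exp_inf` helper only emulates hardware-style overflow to infinity, which plain Python otherwise reports as an exception.

```python
import math


def exp_inf(x: float) -> float:
    """exp() that saturates to +inf on overflow, as GPU hardware would."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def tanh_naive(x: float) -> float:
    """Illustrative only: (e^{2x} - 1) / (e^{2x} + 1).

    For large positive x, e^{2x} overflows to inf and inf / inf is NaN.
    """
    e = exp_inf(2.0 * x)
    return (e - 1.0) / (e + 1.0)


def tanh_stable(x: float) -> float:
    """Illustrative only: the exponent is never positive, so exp() cannot overflow."""
    t = math.exp(-2.0 * abs(x))  # t is in [0, 1]
    return math.copysign((1.0 - t) / (1.0 + t), x)


for value in (0.5, 1000.0, -1000.0):
    print(value, tanh_naive(value), tanh_stable(value))
```

The second form keeps the intermediate value bounded, so large-magnitude inputs saturate to ±1 instead of producing NaN.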

naromero77amd · 2025-11-13T01:45:13Z

I have confirmed that it resolves the reproducer in the Jira.

rocm-repo-management-api · 2025-11-13T01:49:27Z

Jenkins build for 084d7b39ee03b12ab04873ab83bd5d270e241f5a commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

jataylo

Let's just conditionalise on >= (3, 3) here too

(cherry picked from commit f416c71)
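
The version gate requested in the review comment above could look roughly like the sketch below. This is an assumption-laden illustration, not the code in this PR: `parse_major_minor` and `tanh_expr` are hypothetical helpers, and the emitted strings `libdevice.fast_tanhf(...)` and `libdevice.tanh(...)` are placeholders for whatever Inductor actually generates.

```python
def parse_major_minor(version: str) -> tuple:
    """Turn a version string such as '3.4.0' into the tuple (3, 4)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def tanh_expr(x: str, is_rocm: bool, triton_version: str) -> str:
    """Hypothetical codegen helper: pick the tanh call to emit for operand x.

    The fast path is used only on ROCm and only when Triton is >= (3, 3);
    everything else falls back to the regular tanh.
    """
    if is_rocm and parse_major_minor(triton_version) >= (3, 3):
        return f"libdevice.fast_tanhf({x})"  # assumed name of the fast path
    return f"libdevice.tanh({x})"


print(tanh_expr("tmp0", True, "3.4.0"))
print(tanh_expr("tmp0", True, "3.2.0"))
print(tanh_expr("tmp0", False, "3.4.0"))
```

Comparing integer tuples rather than raw strings matters once minor versions reach two digits: as strings, "3.10" sorts before "3.3", while (3, 10) >= (3, 3) holds as expected.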

rocm-repo-management-api · 2025-11-14T20:47:14Z

Jenkins build for 7cc238e2838296552a9075e186cdbafb4d519346 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

In the ROCm fork of PyTorch 2.7, Inductor currently has codegen support for fast_tanhf. However, it is currently guarded by the `TORCHINDUCTOR_USE_FAST_MATH` environment variable due to some NaN issues in the original Triton implementation of fast_tanhf.

Upstream Triton has an improved fast_tanhf where the NaN issues are now fixed. This upstream commit has been backported to the ROCm fork of Triton (see code comments). Thus, I have removed the conditionalization on Triton versions as well.

A bump in the Triton commit is also needed.

Other notes:

- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.3 backport of upstream Triton commit ROCm/triton#902
- Similar to #2803, #2804
- Related to pytorch#162052

In the ROCm fork of PyTorch 2.9, Inductor currently has codegen support for fast_tanhf. However, there were some NaN issues in the original Triton implementation of fast_tanhf.

Upstream Triton has an improved fast_tanhf where the NaN issues are now fixed. This upstream commit has been backported to the ROCm fork of Triton (see code comments).

A bump in the Triton commit is also needed.

Other notes:

- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.5 backport of upstream Triton commit ROCm/triton#901
- Similar to #2802, #2803
- Related to pytorch#162052

naromero77amd added 2 commits November 13, 2025 00:44

On ROCm, always use fast_tanhf for triton codegen.

78f604a

(cherry picked from commit 7c5277f)

Bump up Triton commit to support fast_tanhf.

084d7b3

naromero77amd requested review from jataylo, jeffdaily, jithunnair-amd and pruthvistony as code owners November 13, 2025 01:44

This was referenced Nov 13, 2025

[release/2.7][ROCm][inductor] Improved fast_tanh code generation #2802

Merged

[release/2.9][ROCm][inductor] Improved fast_tanh code generation #2804

Merged

jataylo requested changes Nov 14, 2025

Conditionalize fast_tanhf on triton_version.

7cc238e

(cherry picked from commit f416c71)

pruthvistony approved these changes Nov 17, 2025

pruthvistony merged commit cba8b9d into release/2.8 Nov 17, 2025
6 of 8 checks passed

pruthvistony deleted the release_/2.8_new_fast_tanh branch November 17, 2025 18:31
