Commit a6a3069
[release/2.9][ROCm][inductor] Improved fast_tanh code generation (#2804)
In the ROCm fork of PyTorch 2.9, Inductor already has codegen support
for fast_tanhf. However, the original Triton implementation of
fast_tanhf had NaN issues.
Upstream Triton has an improved fast_tanhf that fixes these NaN issues.
That upstream commit has been backported to the ROCm fork of Triton
(see code comments).
A bump of the pinned Triton commit is also needed.
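
The commit message does not say what caused the NaNs. As a purely illustrative sketch (not the Triton or Inductor code), the snippet below shows one common way a fast tanh can produce NaN: the textbook form `(e^{2x} - 1) / (e^{2x} + 1)` evaluates to `inf / inf` once the exponential overflows. The helper names `exp_ieee`, `naive_tanh`, and `stable_tanh` are hypothetical, and the stable rewrite is a standard alternative, not necessarily the upstream fix.

```python
import math


def exp_ieee(x: float) -> float:
    """exp() that saturates to +inf on overflow (assumed hardware-like behavior)."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def naive_tanh(x: float) -> float:
    """Textbook formula; yields NaN for large positive x (inf / inf)."""
    e = exp_ieee(2.0 * x)
    return (e - 1.0) / (e + 1.0)


def stable_tanh(x: float) -> float:
    """Rewrite that saturates to +/-1 instead of producing NaN."""
    e = exp_ieee(2.0 * abs(x))
    t = 1.0 - 2.0 / (e + 1.0)  # 2 / inf == 0, so t == 1 for large |x|
    return math.copysign(t, x)


if __name__ == "__main__":
    for x in (0.5, 1000.0, -1000.0):
        print(x, naive_tanh(x), stable_tanh(x), math.tanh(x))
```

The stable form divides a finite constant by the exponential, so overflow collapses to zero rather than to an indeterminate `inf / inf`.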
Other notes:
- In support of [SWDEV-560271](https://ontrack-internal.amd.com/browse/SWDEV-560271)
- Triton 3.5 backport of upstream Triton commit: ROCm/triton#901
- Similar to #2802, #2803
- Related to pytorch#1620521

Parent a2b0fd7, commit a6a3069
File tree
2 files changed (+8, -3 lines):
- .ci/docker/ci_commit_pins
- torch/_inductor/codegen