Contributor

@StareAtYou StareAtYou commented Nov 7, 2025

Motivation

1. Support wint4 quantization on the maca platform.
2. When wint4 quantization is enabled on the maca platform, allow the attention quantization method to be wint4 as well.

Modifications

1. Adapt wint4 quantization to the maca platform.
2. Support switching the attention quantization mode to wint4 on the maca platform.

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are none, state the reason in this PR.
  • Provide accuracy results.
  • If this PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot bot commented Nov 7, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Nov 7, 2025
@StareAtYou StareAtYou force-pushed the develop branch 4 times, most recently from 3a30005 to fb4cc9e Compare November 10, 2025 07:10
@Kane2011 Kane2011 self-requested a review November 10, 2025 07:15
Collaborator

@Kane2011 Kane2011 left a comment

Looks good to me.

quant_config_name = "mix_quant"

if current_platform.is_maca():
    metax_dense_quant_type = os.getenv("FD_METAX_DENSE_QUANT_TYPE")
Collaborator

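New environment variables must be registered centrally in envs.py; adding ad hoc env vars is not allowed. Even so, for this case, controlling dense_quant_type through an environment variable is not recommended.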
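
The idea of central registration can be sketched as follows. This is a minimal illustration, not FastDeploy's actual envs.py: the registry name `ENVIRONMENT_VARIABLES`, the helper `get_env`, and the variable `FD_EXAMPLE_LOG_LEVEL` with its default are all hypothetical.

```python
import os
from typing import Any, Callable, Dict

# Hypothetical registry: every supported variable is declared once here,
# together with its default, instead of calling os.getenv ad hoc in feature code.
ENVIRONMENT_VARIABLES: Dict[str, Callable[[], Any]] = {
    # Illustrative entry only; the name and default are assumptions.
    "FD_EXAMPLE_LOG_LEVEL": lambda: os.getenv("FD_EXAMPLE_LOG_LEVEL", "INFO"),
}


def get_env(name: str) -> Any:
    """Return a registered variable and reject names that were never registered."""
    try:
        getter = ENVIRONMENT_VARIABLES[name]
    except KeyError:
        raise AttributeError(f"unregistered environment variable: {name}") from None
    # The getter runs at lookup time, so later changes to os.environ are seen.
    return getter()
```

With a registry like this, an unregistered lookup fails loudly, which makes privately added variables easy to catch in review.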

Collaborator

The model's config.json can already configure moe_quant_type and dense_quant_type.
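
A rough sketch of reading the two settings from a model's config.json instead of an environment variable. The key names `moe_quant_type` and `dense_quant_type` come from the comment above; placing them under a `quantization_config` object and the helper name `read_quant_types` are assumptions made for illustration.

```python
import json
from typing import Dict, Optional


def read_quant_types(config_text: str) -> Dict[str, Optional[str]]:
    """Extract the dense and MoE quantization types from config.json text."""
    config = json.loads(config_text)
    # Assumed layout: both keys sit under a "quantization_config" object.
    quant_cfg = config.get("quantization_config", {})
    return {
        # None means the model config does not specify the type.
        "dense_quant_type": quant_cfg.get("dense_quant_type"),
        "moe_quant_type": quant_cfg.get("moe_quant_type"),
    }


example = '{"quantization_config": {"dense_quant_type": "wint4", "moe_quant_type": "wint4"}}'
print(read_quant_types(example))
```

Keeping the choice in the model config means the quantization type travels with the checkpoint rather than with the shell environment of whoever launches it.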

Contributor Author

Understood. I have reworked the change as requested and opened a new PR, linked here:
#4946
