Conversation

@thpereir thpereir commented Dec 3, 2025

Add Quark GPT-OSS support

  • General support for QMoE zero point/asymmetric quantization
  • New layers used by Quark quantized models
  • Packing used for gate_up_proj and down_proj inside Experts
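The zero-point/asymmetric quantization mentioned in the first bullet can be sketched in plain Python. This is a minimal per-tensor illustration with hypothetical helper names, not Quark's actual implementation (which quantizes per group and packs the results):

```python
def quantize_asymmetric(values, num_bits=4):
    """Asymmetric (zero-point) quantization: map the float range
    [min, max] onto the unsigned integer grid [0, 2**num_bits - 1]."""
    qmin, qmax = 0, (1 << num_bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a degenerate all-equal tensor (scale would be 0).
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # The zero point is the integer that represents float 0.0.
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]
```

Because the zero point shifts the grid, an asymmetric range like [-1, 2] uses all 16 levels of int4 instead of clipping or wasting half the range as symmetric quantization would.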

gate_up_proj_transposed = mlp.experts.gate_up_proj.transpose(-1, -2)
down_proj_transposed = mlp.experts.down_proj.transpose(-1, -2)
if has_quark_experts:
    gate_up_proj_transposed = torch.empty(0)

Check notice — Code scanning / CodeQL: Unused local variable

Variable gate_up_proj_transposed is not used.
down_proj_transposed = mlp.experts.down_proj.transpose(-1, -2)
if has_quark_experts:
    gate_up_proj_transposed = torch.empty(0)
    down_proj_transposed = torch.empty(0)

Check notice — Code scanning / CodeQL: Unused local variable

Variable down_proj_transposed is not used.
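Both notices have the same cause: the transposed tensors are computed unconditionally and then overwritten for Quark experts. One way to resolve them is to branch before transposing, a sketch of which is below (plain-Python stand-ins for the torch calls; `expert_weights_for_export` and its arguments are hypothetical names, not the PR's actual function):

```python
def transpose_last_two(mat):
    # Plain-Python stand-in for tensor.transpose(-1, -2) on a 2-D weight.
    return [list(col) for col in zip(*mat)]


def expert_weights_for_export(gate_up_proj, down_proj, has_quark_experts):
    # Branch first, so nothing is computed and then discarded --
    # this is the shape of fix the CodeQL notices suggest.
    if has_quark_experts:
        # Quark experts keep their packed quantized weights; the dense
        # transposed copies are never used, so return empty placeholders
        # (torch.empty(0) in the original diff).
        return [], []
    return transpose_last_two(gate_up_proj), transpose_last_two(down_proj)
```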
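The "packing" the PR description refers to can be illustrated generically as int4 nibble packing, two 4-bit values per byte. This is a sketch only; Quark's actual layout, group sizes, and axis ordering may differ:

```python
def pack_int4(nibbles):
    """Pack unsigned 4-bit values (0..15) two per byte, low nibble first."""
    assert len(nibbles) % 2 == 0, "pad to an even count before packing"
    return bytes(
        (nibbles[i] & 0xF) | ((nibbles[i + 1] & 0xF) << 4)
        for i in range(0, len(nibbles), 2)
    )


def unpack_int4(packed):
    """Inverse of pack_int4: split each byte back into two 4-bit values."""
    out = []
    for b in packed:
        out.extend((b & 0xF, b >> 4))
    return out
```

Packing halves the storage of int4 weights relative to keeping one value per byte, which is why the quantized gate_up_proj and down_proj weights inside the Experts block are stored this way.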
@thpereir thpereir (Author) commented Dec 9, 2025

@microsoft-github-policy-service agree company="

@microsoft-github-policy-service agree company="AMD"
