Optimal p* Value Across Layers
(log scale)
p → 1 = SGD-like · p → ∞ = Muon-like · p_max = 650