Super HN
Towards Greater Leverage: Scaling Laws for Efficient MoE Language Models
(arxiv.org)
4 points by Anon84 22 hours ago