Super HN

Towards Greater Leverage: Scaling Laws for Efficient MoE Language Models (arxiv.org)