Letian Ruan
Blogs
2026
An Open-Source Implementation of Seconds-Level, Trillion-Parameter Weight Transfer from Training to Inference
May 01, 2026
SGLang
RL Infra
Network
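I started working on RL infra last year. In the system we had at the time, updating the parameters on the inference side meant going through disk. As you can imagine, update speed was painful: the inference units fought over disk bandwidth until other parts of the system broke down and stopped working. Around then I read Lequn's post "跨机秒传RL模型参数更新的一些探索" (roughly, "Some Explorations of Cross-Machine, Seconds-Level RL Model Parameter Updates") and was blown away. Since I am not good at building wheels myself (I cannot hand-write a new RDMA communication library), I figured that fitting the wheel onto the car and giving the open-source community a push would still count as a contribution.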
Updating 1T parameters in seconds — P2P weight transfer in Large-Scale Distributed RL
April 29, 2026
SGLang
RL Infra
Network
A study from the SGLang-RL team on peer-to-peer weight transfer that updates trillion-parameter models in seconds during large-scale distributed RL training.
Forge: Scalable Agent RL Framework and Algorithm
February 01, 2026
MiniMax
Agent RL
Post Training
Forge is a scalable Agent RL framework and algorithm released by MiniMax, powering breakthrough capabilities in the M2~M3 series models.