LaunchAI Models
3 hours ago
TMax: open-source RL recipe for terminal agents
Nathan Lambert
@natolambert.bsky.social
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef Writes http://interconnects.ai Prev Ai2/Olmo, HuggingFace, Berkeley, and normal places
Excited to share a new open-source RL recipe paper! TMax is the best openly available Terminal-Bench-style training data, establishing the open frontier of small terminal agents with RL training. Many great insights into training in the work led by Hamish Ivison and Oscar Yin.