1. Core problem - Deploying post-training pipelines on open-weights models like Qwen3 needs SFT + RL across many different environments, which is expensive and tedious.

Solutions:

  1. DreamGym - Train in Dreams via code world model
  2. Autoharness - RLMs; have the agents write their own testing environments
    1. Similar to the above, but deploy preset GitHub environments instead

Benchmark to beat:
1. NeMo-Gym - the standardized benchmark
