Each experiment has a run command that OpenResearch executes when you click Run. This can be anything: a one-liner, a script path, a full pipeline. The command runs in the experiment's working directory on your compute instance.
Your run command should write an EVAL.md file to the working directory root. This is how OpenResearch captures your results. Whatever you write there shows up in the Results tab for that run, and is what the agent reads when analyzing experiments. Without it, OpenResearch has no way to compare results.
Include whatever matters: metrics, accuracy numbers, loss curves, error breakdowns. The more structured and consistent your EVAL.md is across runs, the better the analysis will be.
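As a rough illustration of a consistent layout, an evaluation script could render its metrics as a Markdown table with a fixed ordering. This is only a sketch: the function name `write_eval_md`, the section titles, and the table layout are assumptions made for this example, not a format OpenResearch requires.

```python
from pathlib import Path


def write_eval_md(metrics: dict, notes: str = "", path: str = "EVAL.md") -> str:
    """Render metrics as a Markdown table, write it to `path`, and return the text.

    Hypothetical helper: the layout below is one possible convention.
    """
    lines = ["# Evaluation", "", "| Metric | Value |", "| --- | --- |"]
    # Sort metric names so the table has the same row order in every run.
    for name in sorted(metrics):
        lines.append(f"| {name} | {metrics[name]:.4f} |")
    if notes:
        lines += ["", "## Notes", "", notes]
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text, encoding="utf-8")
    return text
```

Calling `write_eval_md({"accuracy": acc, "loss": loss})` at the end of an evaluation script, with `acc` and `loss` computed by your own code, would leave EVAL.md in the current directory. Keeping the row order and number formatting fixed is what makes runs easy to compare side by side.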
A simple one-liner run command:
python train.py --epochs 10 && python eval.py --output EVAL.md
A script:
bash speedrun.sh
Example of what that script might look like:
#!/bin/bash
set -e
python train.py --epochs 10
python eval.py --output EVAL.md
When you set up an experiment on an instance, OpenResearch checks for common script paths and pre-fills the run command if one is found. The paths checked, in order:
.openresearch/speedrun.sh
speedrun.sh
scripts/speedrun.sh
bin/speedrun.sh
runs/speedrun.sh
You can always change the command before running.
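To make the lookup order concrete, a first-match search over those paths could look like the sketch below. The function name `find_speedrun_script` is hypothetical, and this is not OpenResearch's actual implementation; it only shows that the first existing path in the list wins.

```python
from pathlib import Path
from typing import Optional

# Candidate paths, in the order listed above.
CANDIDATE_PATHS = [
    ".openresearch/speedrun.sh",
    "speedrun.sh",
    "scripts/speedrun.sh",
    "bin/speedrun.sh",
    "runs/speedrun.sh",
]


def find_speedrun_script(workdir: str) -> Optional[str]:
    """Return the first candidate path that exists under `workdir`, or None."""
    root = Path(workdir)
    for rel in CANDIDATE_PATHS:
        if (root / rel).is_file():
            return rel
    return None
```

Under this sketch, a repository containing both `speedrun.sh` and `scripts/speedrun.sh` would resolve to `speedrun.sh`, because it appears earlier in the list.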