OpenResearch

Run Commands

Each experiment has a run command that OpenResearch executes when you click Run. This can be anything: a one-liner, a script path, a full pipeline. The command runs in the experiment's working directory on your compute instance.

EVAL.md

Your run command should write an EVAL.md file to the working directory root. This is how OpenResearch captures your results. Whatever you write there shows up in the Results tab for that run, and is what the agent reads when analyzing experiments. Without it, OpenResearch has no way to compare results.

Include whatever matters for comparing runs: metrics, accuracy numbers, loss curves, error breakdowns. The more structured and consistent your EVAL.md is across runs, the better the analysis will be.
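One way to keep EVAL.md consistent across runs is to generate it from a single template. The sketch below is only an illustration: the `write_eval` helper, the metric names, and the values passed to it are placeholders, not something OpenResearch requires or provides.

```shell
#!/bin/bash
# Hypothetical helper: write EVAL.md with the same layout on every run.
# Metric names and values are placeholders, not real results.
write_eval() {
  local accuracy="$1" loss="$2"
  cat > EVAL.md <<EOF
# Evaluation

| Metric   | Value |
|----------|-------|
| accuracy | ${accuracy} |
| loss     | ${loss} |
EOF
}

# In a real run these arguments would come from your evaluation step.
write_eval "0.91" "0.27"
```

Keeping the same headings and metric names in every run makes the files easier to compare side by side.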

How it works

  • The run command is set per experiment and appears in the experiment sidebar as an editable input
  • Child experiments inherit their parent's run command
  • Exit code 0 = success, non-zero = failure
  • Runs have a maximum duration of 7 days
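Because the exit code decides whether a run counts as a success, it is worth checking how a multi-step command reports failure. The sketch below uses `false` as a stand-in for a failing step (for example, a crashed training script) and compares three ways of chaining steps; it illustrates general shell behavior, not anything specific to OpenResearch.

```shell
#!/bin/bash
# Sketch: `false` stands in for a failing step in a run command.

# Without `set -e`, the script's exit code is that of the LAST command,
# so an earlier failure is hidden and the run would look successful.
plain_status=0
bash -c 'false; true' || plain_status=$?
echo "plain script exited with: $plain_status"

# With `set -e`, the script stops at the first failure and exits non-zero.
strict_status=0
bash -c 'set -e; false; echo "eval step (never reached)"' || strict_status=$?
echo "script with set -e exited with: $strict_status"

# With `&&`, later steps are skipped and the chain exits non-zero.
chain_status=0
bash -c 'false && echo "eval step (never reached)"' || chain_status=$?
echo "&& chain exited with: $chain_status"
```

This is why the examples in this section either chain steps with `&&` or start the script with `set -e`.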

Examples

A simple one-liner:

bash
python train.py --epochs 10 && python eval.py --output EVAL.md

A script:

bash
bash speedrun.sh

For example, speedrun.sh might look like this:

bash
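#!/bin/bash
# Stop at the first failing command so the run exits non-zero.
set -e

# Train, then write the results that OpenResearch reads to EVAL.md.
python train.py --epochs 10
python eval.py --output EVAL.md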

Auto-detection

When you set up an experiment on an instance, OpenResearch checks for common script paths and pre-fills the run command if one is found. The paths checked, in order:

  1. .openresearch/speedrun.sh
  2. speedrun.sh
  3. scripts/speedrun.sh
  4. bin/speedrun.sh
  5. runs/speedrun.sh

You can always change the command before running.
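To check which script would be picked up in a given directory, the lookup order above can be reproduced locally. This is a minimal sketch of a first-match search over the listed paths; the `find_run_script` function is hypothetical and is not OpenResearch's actual implementation.

```shell
#!/bin/bash
# Sketch of the documented lookup order; the function name is hypothetical.
# Prints the first matching path relative to the given root, or returns 1.
find_run_script() {
  local root="${1:-.}" candidate
  for candidate in \
    .openresearch/speedrun.sh \
    speedrun.sh \
    scripts/speedrun.sh \
    bin/speedrun.sh \
    runs/speedrun.sh
  do
    if [ -f "$root/$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: report which script, if any, is found in the current directory.
if script="$(find_run_script .)"; then
  echo "detected: $script"
else
  echo "no speedrun.sh found"
fi
```

Because the search stops at the first match, a script in .openresearch/ takes precedence over one at the repository root.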