evaluation_pi0_pi05

Evaluation code for π0 and π0.5 (OpenPI) on ManiSkill robotic manipulation tasks.

This repository contains two branches, one per model family:

| Branch | Model | Source |
| --- | --- | --- |
| `pi0.5` | π0.5 (OpenPI inference) | yqi19/Openpi_inference |
| `pi0` | π0 (OpenPI finetune) | yqi19/pi0_finetune |

Supported Evaluation Tasks

Both branches contain inference/evaluation code for the following ManiSkill experiments:

| Directory | Description |
| --- | --- |
| `examples/maniskill/` | General random-attribute ManiSkill evaluation |
| `examples/maniskill_conflict/` | OOD conflict experiment evaluation (factor dominance) |
| `examples/maniskill_attribute/` | Single-attribute pairwise factor evaluations |
| `examples/maniskill_full_factor/` | Full-factor (all_factor) evaluation |
| `examples/maniskill_verb_color_object/` | Verb × color × object 3-factor evaluation |
| `examples/maniskill_verb_size/` | Verb × size 2-factor evaluation |

Branch: `pi0.5`

π0.5 evaluation code using the OpenPI inference server.

Quickstart

# 1. Start the policy server (pi0.5)
uv run scripts/serve_policy.py --env MANISKILL \
    policy:checkpoint \
    --policy.config pi05_maniskill \
    --policy.dir <checkpoint_dir>

# 2. Run conflict experiment evaluation
bash examples/maniskill_conflict/run_ood_experiment_inference.sh \
    <num_episodes> <results_file>

# 3. Run verb×color×object evaluation
bash examples/maniskill_verb_color_object/run_verb_color_object_inference.sh \
    <num_episodes> <results_file>

# 4. Run full-factor evaluation
bash examples/maniskill_full_factor/run_full_factor_inference.sh

# 5. Run all factor evaluations (multi-seed)
bash run_multi_seed_evals.sh
bash scripts/eval_openpi_port8000.sh   # port 8000 worker
bash scripts/eval_openpi_port8001.sh   # port 8001 worker
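
The README does not say whether the two port-specific workers in step 5 are meant to run at the same time. Assuming they are independent and that each talks to its own policy server (one on port 8000, one on port 8001, as the script names suggest), a minimal sketch for launching them concurrently and collecting their exit codes could look like this. `run_workers` is a hypothetical helper, not part of the repository.

```shell
# run_workers SCRIPT [SCRIPT...]
# Hypothetical helper: start every script in the background with bash,
# wait for all of them, and return non-zero if any worker failed.
run_workers() {
    pids=""
    for script in "$@"; do
        bash "$script" &
        pids="$pids $!"          # remember the PID of the worker just started
    done
    rc=0
    for pid in $pids; do
        wait "$pid" || rc=1      # a single failing worker marks the whole run as failed
    done
    return "$rc"
}
```

With the scripts from step 5 this would be called as `run_workers scripts/eval_openpi_port8000.sh scripts/eval_openpi_port8001.sh`. If the workers turn out to share state or a single server, running them one after the other as listed above is the safer choice.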

VLM Evaluation

python scripts/vlm_eval_all_conflicts.py

Branch: `pi0`

π0 evaluation code.

Quickstart

# 1. Start the policy server (pi0)
uv run scripts/serve_policy.py --env MANISKILL \
    policy:checkpoint \
    --policy.config pi0_maniskill \
    --policy.dir <checkpoint_dir>

# 2. Run conflict experiment evaluation
bash run_pi0_conflict_eval.sh

# 3. Run all-factor evaluation scripts
bash eval_all_9_ckpts.sh       # all 9 checkpoints
bash eval_all_ckpts.sh         # all available checkpoints
bash run_all_3seeds.sh         # 3-seed evaluation

# 4. Run full-factor evaluation
bash examples/maniskill_full_factor/run_full_factor_inference.sh
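
In both quickstarts the policy server from step 1 has to be accepting connections before any evaluation script is started, and `serve_policy.py` presumably keeps running in the foreground, so the later steps are typically launched from a second shell. A small retry helper can make that hand-off explicit. This is a sketch under assumptions: `wait_until` is a hypothetical name, the port (8000) is inferred from the worker script names rather than stated in this README, and the probe in the usage comment relies on `nc` being installed.

```shell
# wait_until MAX_TRIES CMD [ARGS...]
# Hypothetical helper: re-run CMD once per second until it succeeds
# or MAX_TRIES attempts have failed. Returns 0 on success, 1 on giving up.
wait_until() {
    tries="$1"
    shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage (assumed port; requires nc):
#   wait_until 120 nc -z localhost 8000 && bash run_pi0_conflict_eval.sh
```

The probe command is passed in rather than hard-coded, so the same helper works with any check that exits 0 once the server is ready.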

VLM Evaluation

python vlm_eval_pi0_conflicts.py

Notes

Checkpoint paths must be specified at runtime; no model weights are included in this repo.
Both branches share the same core `src/openpi/` library and `examples/` structure.
Evaluation results are written to a path you specify; no results data is committed here.
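
Because both the checkpoint directory and the results path are supplied by the user, it can help to validate them before starting a long run. The following is a minimal sketch, not part of the repository: `prepare_run` is a hypothetical helper, and it assumes the results path is a regular file whose parent directory may not exist yet.

```shell
# prepare_run CHECKPOINT_DIR RESULTS_FILE
# Hypothetical pre-flight check: fail early if the checkpoint directory is
# missing, and create the parent directory of the results file if needed.
prepare_run() {
    checkpoint_dir="$1"
    results_file="$2"
    if [ -z "$checkpoint_dir" ] || [ ! -d "$checkpoint_dir" ]; then
        echo "checkpoint directory not found: '$checkpoint_dir'" >&2
        return 1
    fi
    if [ -z "$results_file" ]; then
        echo "no results file given" >&2
        return 1
    fi
    mkdir -p "$(dirname "$results_file")"
}
```

A call such as `prepare_run "$CHECKPOINT_DIR" "$RESULTS_FILE"` could then precede the `serve_policy.py` and evaluation commands, with both variables set to your own paths.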
