PolaRiS
Scalable Real-to-Sim Evaluations for Generalist Robot Policies
Real-to-Sim Reconstruction | Simulation Co-training | Zero-shot Evaluation
Arhan Jain1*, Mingtong Zhang2*, Kanav Arora1, William Chen3, Marcel Torne4,
Muhammad Zubair Irshad5, Sergey Zakharov5, Yue Wang6, Sergey Levine3,8, Chelsea Finn4,8, Wei-Chiu Ma7,
Dhruv Shah2, Abhishek Gupta1,5‡, Karl Pertsch3,4,8‡
*Equal contribution, ‡Equal advising
1University of Washington, 2Princeton University, 3University of California, Berkeley, 4Stanford University, 5Toyota Research Institute, 6University of Southern California, 7Cornell University, 8Physical Intelligence
PolaRiS teaser figure

PolaRiS is a real-to-sim approach for constructing high-fidelity simulated environments for scalable evaluation. PolaRiS’s 3D Gaussian splatting–based framework quickly turns a short video of a real-world scene into a simulation environment. We use PolaRiS to create a diverse suite of simulated environments and demonstrate strong correlations to real-world evaluations for generalist robot policies.

The PolaRiS Pipeline: From Video to Eval
Step 1 — Environment Scan

A 2–5 minute camera scan captures the scene's geometry and appearance. A ChArUco board in view provides metric scale, and the scan requires minimal user effort.
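A reconstruction from video alone is typically defined only up to an unknown scale, which is why a calibration board of known size is useful. The following is a minimal sketch of that idea, not the PolaRiS implementation: it assumes two board corners have already been located in the reconstruction and that their true separation is known. The function names and the numbers are hypothetical.

```python
import math

def metric_scale(p_a, p_b, true_distance_m):
    """Scale factor that maps reconstruction units to meters.

    p_a, p_b: 3D positions of two board corners in reconstruction units.
    true_distance_m: their known physical separation in meters.
    """
    recon_distance = math.dist(p_a, p_b)
    if recon_distance == 0:
        raise ValueError("corner points coincide; cannot recover scale")
    return true_distance_m / recon_distance

def rescale(points, scale):
    """Apply a uniform scale to a list of 3D points."""
    return [tuple(scale * c for c in p) for p in points]

# Hypothetical values: two adjacent corners that are 4 cm apart in reality
# but 0.5 units apart in the unscaled reconstruction.
corner_a = (0.0, 0.0, 0.0)
corner_b = (0.5, 0.0, 0.0)
scale = metric_scale(corner_a, corner_b, true_distance_m=0.04)
scene_points = rescale([(1.0, 2.0, 0.5)], scale)
```

In practice one would likely average the estimate over many corner pairs to reduce the effect of detection noise.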

Step 2 — Neural Reconstruction

2D Gaussian Splatting (2DGS) recovers photorealistic visuals. We extract meshes for collision and contact-aware simulation.

Step 3 — Object & Robot Insertion

Robots are articulated with kinematics-aware splats. Objects are generated from multiview images (TRELLIS) and made physics-ready.
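One plausible reading of "kinematics-aware splats" is that each Gaussian is attached to a robot link and moves rigidly with that link's pose. The sketch below illustrates that reading under stated assumptions: Gaussians are stored in their link's local frame, and forward kinematics supplies a 4x4 world-from-link transform per link. The function name and array layout are hypothetical and are not taken from the PolaRiS code.

```python
import numpy as np

def pose_splats(means_local, covs_local, link_ids, link_poses):
    """Move link-attached Gaussians into the world frame.

    means_local: (N, 3) Gaussian centers in their link's frame
    covs_local:  (N, 3, 3) Gaussian covariances in their link's frame
    link_ids:    (N,) index of the link each Gaussian is attached to
    link_poses:  (L, 4, 4) world-from-link transforms from forward kinematics
    """
    R = link_poses[link_ids, :3, :3]   # per-Gaussian rotation, (N, 3, 3)
    t = link_poses[link_ids, :3, 3]    # per-Gaussian translation, (N, 3)
    # Centers transform as rigid points: x_world = R x_local + t
    means_world = np.einsum("nij,nj->ni", R, means_local) + t
    # Covariances transform as R Sigma R^T (translation does not affect them)
    covs_world = R @ covs_local @ np.swapaxes(R, 1, 2)
    return means_world, covs_world

# Toy example: link 0 at the origin, link 1 rotated 90 degrees about z
# and shifted by 1 along x.
link1_pose = np.array([[0.0, -1.0, 0.0, 1.0],
                       [1.0,  0.0, 0.0, 0.0],
                       [0.0,  0.0, 1.0, 0.0],
                       [0.0,  0.0, 0.0, 1.0]])
poses = np.stack([np.eye(4), link1_pose])
means = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
covs = np.stack([np.diag([4.0, 1.0, 1.0])] * 2)
ids = np.array([0, 1])
means_w, covs_w = pose_splats(means, covs, ids, poses)
```

Re-running this transform at every simulation step with fresh link poses keeps the rendered robot consistent with its joint state.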

Interactive Reconstruction Viewer

Explore Gaussian Splat reconstructions and TRELLIS-generated object meshes side by side.

Object Mesh Generation
Results

All performance metrics are based on normalized task progress scores, with 20 rollouts in real and 50 rollouts in sim per policy-task pair.
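As an illustration of how such scores might be aggregated, the sketch below averages per-rollout progress values for one policy-task pair and reports a standard error, which shrinks as the rollout count grows. It assumes each rollout has already been scored in [0, 1]. The function name is hypothetical and the rollout values are placeholders, not results from the paper.

```python
import math
from statistics import mean, stdev

def summarize(progress_scores):
    """Mean normalized progress and its standard error for one policy-task pair."""
    if any(not 0.0 <= s <= 1.0 for s in progress_scores):
        raise ValueError("progress scores must be normalized to [0, 1]")
    n = len(progress_scores)
    avg = mean(progress_scores)
    # Standard error of the mean; undefined for a single rollout, so report 0.
    sem = stdev(progress_scores) / math.sqrt(n) if n > 1 else 0.0
    return avg, sem

# Placeholder data only: 20 real rollouts and 50 simulated rollouts.
real_rollouts = [1.0, 0.5, 0.0, 0.5] * 5
sim_rollouts = [1.0, 0.5, 0.0, 0.5, 1.0] * 10

real_avg, real_sem = summarize(real_rollouts)
sim_avg, sim_sem = summarize(sim_rollouts)
```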

Real-Sim Correlation

PolaRiS evaluations correlate strongly with real-world performance. See comparison rollouts below for PolaRiS and Ctrl-World.
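To make "correlate" concrete, the sketch below computes two quantities over per-policy scores: the Pearson correlation coefficient, and a mean maximum rank violation (MMRV) of the kind used in prior real-to-sim evaluation work. This page does not state which metrics are reported, so treat both choices, and the function names, as assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(vx * vy)

def mmrv(real, sim):
    """Mean over policies of the largest real-score gap that sim ranks wrongly."""
    if len(real) != len(sim):
        raise ValueError("need one real and one sim score per policy")
    n = len(real)
    worst = []
    for i in range(n):
        # A pair (i, j) is violated when sim and real disagree on their order.
        violations = [abs(real[i] - real[j]) for j in range(n)
                      if (sim[i] < sim[j]) != (real[i] < real[j])]
        worst.append(max(violations, default=0.0))
    return sum(worst) / n
```

A high Pearson value indicates that simulated scores track real scores linearly, while a low MMRV indicates that simulation rarely misorders policies whose real performance differs substantially.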

RoboArena Correlation

PolaRiS evaluation scores also correlate with policy scores from RoboArena.

This validates that PolaRiS rankings also align with real-world human judgments of policy progress and quality.
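Agreement between two rankings is often summarized with a rank correlation. The sketch below computes Spearman's rho with average ranks for ties. It is a generic illustration with hypothetical function names; the page does not say which statistic was used for the RoboArena comparison.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx = sum(rx) / len(rx)
    my = sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant scores")
    return cov / math.sqrt(vx * vy)
```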

RoboArena paper figure
RoboArena correlation with PolaRiS evaluations
Evaluation Videos

Sample rollouts for the same policies and environments.

Key Insight: Simulation Data Co-Training
Data Mix Ablation

Including simulation data when co-training the policies improves the correlation between simulated and real-world evaluation results.
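A data mix of this kind can be sketched as a batch sampler that draws each training example from the simulated pool with some probability and from the real pool otherwise. The sampler below is a generic illustration: the name `mixed_batches`, the `sim_fraction` parameter, and the example values are assumptions, and the mix ratios studied in the ablation are not given on this page.

```python
import random

def mixed_batches(real_data, sim_data, sim_fraction, batch_size, num_batches, seed=0):
    """Yield batches whose examples come from sim with probability sim_fraction."""
    if not 0.0 <= sim_fraction <= 1.0:
        raise ValueError("sim_fraction must lie in [0, 1]")
    rng = random.Random(seed)  # seeded for reproducible batches
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            source = sim_data if rng.random() < sim_fraction else real_data
            batch.append(rng.choice(source))
        yield batch

# Hypothetical usage with placeholder episode identifiers.
real_episodes = ["real_0", "real_1", "real_2"]
sim_episodes = ["sim_0", "sim_1"]
batches = list(mixed_batches(real_episodes, sim_episodes,
                             sim_fraction=0.25, batch_size=8, num_batches=4))
```

Setting `sim_fraction` to 0 recovers real-only training, which makes the sampler convenient for an ablation over mix ratios.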

Co-Training Samples

Samples of the diverse simulated rollouts used as co-training data for visual alignment.

Resources

Compose Environments

An interactive tool for composing and configuring PolaRiS simulation environments from reconstructed scenes.

Launch Tool

PolaRiS Hub

Download our evaluation environments or contribute your own. See the contribution guide for more details.

Hugging Face
BibTeX
Reference
@misc{jain2025polarisscalablerealtosimevaluations,
      title={PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies}, 
      author={Arhan Jain and Mingtong Zhang and Kanav Arora and William Chen and Marcel Torne and Muhammad Zubair Irshad and Sergey Zakharov and Yue Wang and Sergey Levine and Chelsea Finn and Wei-Chiu Ma and Dhruv Shah and Abhishek Gupta and Karl Pertsch},
      year={2025},
      eprint={2512.16881},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2512.16881}, 
}