Learning Getting-Up Policies for Real-World Humanoid Robots




This project presents an RL-based approach that, for the first time, enables real-world humanoid robots to get up from arbitrary lying postures on varied terrains.

Abstract

Automatic fall recovery is a crucial prerequisite before humanoid robots can be reliably deployed. Hand-designing controllers for getting up is difficult because of the varied configurations a humanoid can end up in after a fall and the challenging terrains humanoid robots are expected to operate on. This paper develops a learning framework to produce controllers that enable humanoid robots to get up from varying configurations on varying terrains. Unlike previous successful applications of humanoid locomotion learning, the getting-up task involves complex contact patterns (which necessitate accurately modeling the collision geometry) and sparser rewards. We address these challenges through a two-phase approach that follows a curriculum. The first stage focuses on discovering a good getting-up trajectory under minimal constraints on smoothness or speed/torque limits. The second stage then refines the discovered motions into deployable (i.e., smooth and slow) motions that are robust to variations in initial configuration and terrain. We find that these innovations enable a real-world G1 humanoid robot to get up from the two main situations we considered: a) lying face up and b) lying face down, both tested on flat, deformable, and slippery surfaces as well as slopes (e.g., muddy grass and snowfield). To the best of our knowledge, this is the first successful demonstration of learned getting-up policies for human-sized humanoid robots in the real world.
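
To make the two-phase idea concrete, the sketch below shows how the curriculum could be expressed as two reward configurations: Stage I keeps only the task terms so exploration toward a successful getting-up motion is unconstrained, while Stage II turns on smoothness, speed, and torque penalties plus a term for tracking the Stage I trajectory. The term names and weights here are illustrative assumptions, not the exact reward design used in the paper.

from dataclasses import dataclass

@dataclass
class RewardWeights:
    # Task terms: raise the head/torso and end up upright.
    head_height: float = 1.0
    upright: float = 1.0
    # Regularization terms: discourage jerky, fast, high-torque motion.
    action_rate: float = 0.0
    joint_velocity: float = 0.0
    torque: float = 0.0
    # Tracking term: follow a reference trajectory (Stage II only).
    track_reference: float = 0.0

# Stage I (discovery): regularization is essentially off, so the policy is
# free to find *any* motion that gets the robot up, easing the sparse-reward
# exploration problem.
stage1 = RewardWeights()

# Stage II (refinement): track the trajectory discovered in Stage I while
# penalizing rough or fast actions, yielding a slow, smooth, deployable motion.
stage2 = RewardWeights(action_rate=0.5, joint_velocity=0.5, torque=0.2,
                       track_reference=2.0)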

Getting Up from Supine Poses (Lying Face Up)

Concrete Path

Brick Surfaces

Stone Tile

Muddy Grass

Grass Slope (~10°)

Rolling Over from Prone Poses (Lying Face Down)

Concrete Path

Brick Surfaces

Stone Tile

Muddy Grass

Grass Slope (~10°)

Snow Field

Getting Up from Prone Poses (Lying Face Down)

HumanUP: Sim-to-Real Humanoid Getting-Up Policy Learning

Our getting-up policy is trained in simulation using two-stage RL training, after which it is deployed directly in the real world. (a) Stage I learns a discovery policy f that finds a getting-up trajectory under minimal deployment constraints. (b) Stage II converts the trajectory discovered in Stage I into a policy π that is deployable, robust, and generalizable. (c) The two-stage training induces a curriculum.
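
At a pseudocode level, the pipeline in the figure could look like the sketch below; the trainer and environment interfaces (ppo_train, rollout, randomize) are placeholders assumed for illustration, not the released training code.

# Illustrative outline of the two-stage pipeline; all APIs are placeholders.

def ppo_train(env, reward_weights, steps):
    """Stand-in for an on-policy RL trainer; returns a trained policy."""
    raise NotImplementedError

def rollout(env, policy, horizon):
    """Stand-in for executing a policy once and recording its joint-target trajectory."""
    raise NotImplementedError

def train_humanup(sim):
    # Stage I: discovery policy f, trained with task-focused reward under
    # minimal deployment constraints.
    f = ppo_train(sim, reward_weights="task_only", steps=1_000_000)
    reference = rollout(sim, f, horizon=500)

    # Stage II: deployable policy pi tracks the discovered trajectory while
    # smoothness/speed penalties and randomized initial configurations and
    # terrains are enabled for robustness and generalization.
    sim.randomize(initial_pose=True, terrain=True)
    pi = ppo_train(sim, reward_weights={"track": reference, "regularize": True},
                   steps=1_000_000)
    return pi  # deployed zero-shot on the real G1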


Motion Comparison between Stage I and Stage II

The Stage I policy's motion is fast and unsafe, while the Stage II policy's motion is slower and safer.

Stage I Getting Up

Stage II Getting Up

Stage I Rolling Over

Stage II Rolling Over

Sim2Sim Transfer to MuJoCo

Getting Up

Rolling Over

Comparison with Baseline G1 Handcrafted Motion

(a), (b), and (c) record the mean motor temperature of the upper body, lower body, and waist, respectively. Executing the G1's default controller causes the arm motors to heat up significantly, whereas our policy makes more use of the leg motors, which are larger (torque limit of 83 N·m as opposed to 25 N·m for the arm motors) and thus able to take more load.

G1 Controller Baseline

HumanUP (Ours)



Failure Modes Analysis

The G1 controller baseline's handcrafted motion cannot get up from lying face up on the grass slope: the sloping ground prevents it from reaching the full squatting pose, owing to high friction and insufficient waist torque to work against the tendency to tip over.

G1 Controller Baseline

HumanUP (Ours)

On the most challenging terrain, the snowfield, both the G1 controller baseline's handcrafted motion and our policy may fail due to the slippery, deformable ground.

G1 Controller Baseline

HumanUP (Ours)

BibTeX

@article{humanup25,
  title={Learning Getting-Up Policies for Real-World Humanoid Robots},
  author={He, Xialin and Dong, Runpei and Chen, Zixuan and Gupta, Saurabh},
  journal={arXiv preprint arXiv:2502.12152},
  year={2025}
}