TOP: Time Optimization Policy for Robust and Accurate Humanoid Standing Manipulation

Abstract

Humanoid robots are expected to perform diverse manipulation tasks, yet this relies on a standing controller that is both robust and precise. Existing methods often fail to accurately control high-dimensional upper-body motions or to guarantee stability when these motions are fast. We propose a Time Optimization Policy (TOP) that enables humanoid robots to achieve balance, precision, and time efficiency simultaneously by optimizing the time trajectories of upper-body motions, rather than solely enhancing lower-body disturbance rejection. Our framework integrates three key components: (i) a variational autoencoder (VAE) to encode motion priors and improve coordination between upper and lower body, (ii) a decoupled controller consisting of an upper-body PD controller for precision and a lower-body RL controller for robustness, and (iii) joint training of the controller with TOP to mitigate destabilization caused by fast upper-body motions. We validate our approach in both simulation and real-world experiments, demonstrating superior stability and accuracy in humanoid standing manipulation tasks.
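To make the time-optimization idea concrete, here is a minimal sketch under our own illustrative assumptions (not the paper's exact formulation): an upper-body PD controller tracks a reference motion whose playback speed is rescaled by a speed factor, standing in for the output a Time Optimization Policy would predict from the robot state. The gains, the toy 2-DoF reference motion, and the fixed speed factor `alpha` are all hypothetical.

```python
import numpy as np

def pd_torques(q, dq, q_ref, dq_ref, kp=80.0, kd=4.0):
    """Upper-body PD controller tracking a reference pose (hypothetical gains)."""
    return kp * (q_ref - q) + kd * (dq_ref - dq)

def retime(t, alpha):
    """Map wall-clock time t to motion phase via a speed factor alpha.

    alpha < 1 slows the reference motion down; in the actual framework this
    factor would be produced by the learned Time Optimization Policy so that
    fast, destabilizing upper-body segments are stretched in time.
    """
    return alpha * t

# Toy 2-DoF reference motion (an assumption for illustration only).
motion = lambda phase: np.array([np.sin(phase), np.cos(phase)])

# Example: track the reference at 70% speed to ease the balance burden.
alpha, t = 0.7, 1.0
q_ref = motion(retime(t, alpha))
tau = pd_torques(q=np.zeros(2), dq=np.zeros(2), q_ref=q_ref, dq_ref=np.zeros(2))
```

In the full system the lower-body RL controller would run alongside this loop, and the speed factor would be optimized jointly with it rather than fixed.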

Real World Manipulation

Robots performing loco-manipulation tasks with our TOP method.


Robots standing in place and manipulating objects with our TOP method.



Additional precise manipulation tasks performed with our method.




Ablation Experiment of TOP

Without TOP, the robot may take a step or fall due to the impact of momentum changes.


With TOP, the balance burden and unpredictable interference on the lower body are reduced, enhancing both balance stability and the precision of upper-body motions.


Robustness of RL


External Disturbance Tests


Generalization of Motion Priors

Motions in training Dataset M
Motions in unseen Dataset T

Simulation

Conducted with TOP
Conducted without TOP

RL robustness