Locomotion & Legged Robots
Why Legs?
Wheels are efficient on flat surfaces but fail on stairs, rubble, forest floors, and the interiors of human buildings. The built environment — with its steps, curbs, narrow doorways, and uneven surfaces — was designed for bipeds. If robots are to operate where humans live and work, legs aren't a luxury; they're a requirement.
The engineering challenge is formidable. A wheeled robot is statically stable — it won't tip over when it stops. A biped is dynamically stable at best, meaning it must constantly adjust to stay upright, much like a human standing still is actually making hundreds of micro-corrections per second. A quadruped has it somewhat easier (four contact points allow static gaits), but dynamic quadrupedal locomotion — galloping, bounding, recovering from shoves — remains deeply complex.
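The static-stability test described above reduces to a geometric check: is the ground projection of the center of mass inside the support polygon formed by the foot contacts? A minimal sketch, using the standard ray-casting point-in-polygon test (the function and stance values are illustrative):

```python
def com_inside_support_polygon(com_xy, contact_points):
    """Static stability test: is the ground projection of the center of
    mass inside the support polygon spanned by the foot contacts?
    Uses the ray-casting point-in-polygon test; contact_points must be
    ordered around the polygon boundary."""
    x, y = com_xy
    inside = False
    n = len(contact_points)
    for i in range(n):
        x1, y1 = contact_points[i]
        x2, y2 = contact_points[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross the edge (p1, p2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# A quadruped with one foot lifted: the three planted feet form a triangle.
stance = [(0.0, 0.0), (0.4, 0.0), (0.2, 0.6)]
print(com_inside_support_polygon((0.2, 0.2), stance))  # True: COM over tripod
print(com_inside_support_polygon((0.5, 0.5), stance))  # False: robot tips
```

A quadruped using a static gait keeps this test true at every instant; a dynamic gait or a standing biped cannot, which is why continuous corrective control is required.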
The Control Theory Era
Early legged robots relied on hand-crafted control: engineers specified exactly when each joint should move, how much torque to apply, and how to shift the robot's center of mass during each gait phase. Marc Raibert's work at MIT in the 1980s — which led to the founding of Boston Dynamics — established the foundational framework of hopping, balancing, and running with simplified models (the spring-loaded inverted pendulum). Honda's ASIMO (2000) demonstrated bipedal walking using Zero Moment Point (ZMP) control, keeping the robot's projected center of gravity within its support polygon at all times.
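The spring-loaded inverted pendulum mentioned above models the robot as a point mass on a massless spring leg. A sketch of one stance phase under that model, integrated with semi-implicit Euler; the mass, stiffness, and touchdown state are illustrative numbers, not taken from any real robot:

```python
import math

def slip_stance(x, z, vx, vz, k=20000.0, l0=1.0, m=80.0, g=9.81, dt=1e-4):
    """Integrate one stance phase of the spring-loaded inverted pendulum
    (SLIP): point mass m on a massless spring leg pinned at the origin.
    Returns the state at takeoff, when the compressed leg re-extends to
    its rest length l0. Parameter values are illustrative."""
    compressed = False
    for _ in range(200_000):                 # safety bound on steps
        l = math.hypot(x, z)
        if l < l0:
            compressed = True
        elif compressed:                     # spring back at rest length
            return x, z, vx, vz              # -> takeoff into flight phase
        f = k * (l0 - l)                     # radial spring force
        vx += (f * x / l / m) * dt           # force acts along the leg
        vz += (f * z / l / m - g) * dt
        x += vx * dt
        z += vz * dt
    raise RuntimeError("no takeoff: leg never re-extended")

# Touch down with the foot ahead of the hip, moving forward and downward.
x, z, vx, vz = slip_stance(-0.20, 0.98, 1.5, -1.0)
print(round(math.hypot(x, z), 3))  # 1.0: leg at rest length at takeoff
```

Raibert's insight was that controlling just this reduced model (touchdown angle for speed, thrust for hopping height) stabilizes running without reasoning about the full joint-level dynamics.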
These approaches worked but were brittle. ZMP walking is slow and shuffling because the robot must always maintain static stability. Model Predictive Control (MPC) improved things by optimizing over a short future horizon, but hand-tuned locomotion controllers required months of engineering per terrain type and couldn't generalize to novel surfaces.
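The short-horizon optimization behind MPC can be sketched on a toy model: sample candidate action sequences, roll each through a simple dynamics model, apply only the first action of the best sequence, and repeat. This random-shooting variant on a 1-D double integrator is purely illustrative; real locomotion MPC uses centroidal or whole-body models and gradient-based solvers:

```python
import random

def rollout_cost(x, v, actions, dt=0.05):
    """Roll a 1-D double-integrator model forward under an action
    sequence and accumulate a quadratic cost (state + effort)."""
    cost = 0.0
    for a in actions:
        v += a * dt
        x += v * dt
        cost += x * x + 0.1 * v * v + 0.01 * a * a
    return cost

def mpc_step(x, v, horizon=12, samples=256, rng=random.Random(0)):
    """One receding-horizon step: score sampled action sequences over a
    short future horizon, return only the first action of the best one."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(samples):
        seq = [rng.uniform(-2.0, 2.0) for _ in range(horizon)]
        c = rollout_cost(x, v, seq)
        if c < best_cost:
            best_cost, best_first = c, seq[0]
    return best_first

# Closed loop: drive the state toward the origin from an initial offset.
x, v, dt = 1.0, 0.0, 0.05
for _ in range(100):
    a = mpc_step(x, v)
    v += a * dt
    x += v * dt
```

Replanning every step is what lets MPC absorb disturbances that a precomputed trajectory cannot, but the model inside the rollout still has to be hand-built per robot, which is the brittleness the text describes.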
The Reinforcement Learning Revolution
The breakthrough came from training locomotion policies in physics simulation using deep reinforcement learning (RL). The workflow: build a simulated version of the robot in a physics engine like NVIDIA Isaac Sim, MuJoCo, or PyBullet; randomize terrain, friction, mass properties, and sensor noise (domain randomization); train a neural network policy that maps proprioceptive observations to joint commands; then deploy the learned policy on the real robot via sim-to-real transfer.
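The domain-randomization step can be sketched as sampling a fresh set of physics parameters per training episode, so the policy never overfits to one simulated world. The parameter names and ranges below are illustrative, not from any specific pipeline:

```python
import random

def sample_domain(rng):
    """Sample one randomized simulation domain. Ranges are illustrative;
    real pipelines (e.g. in Isaac Sim or MuJoCo) randomize many more
    fields, including terrain geometry and control latency."""
    return {
        "friction":   rng.uniform(0.4, 1.2),   # ground friction coefficient
        "mass_scale": rng.uniform(0.8, 1.2),   # +/-20% link-mass error
        "motor_gain": rng.uniform(0.9, 1.1),   # actuator-model error
        "obs_noise":  rng.uniform(0.0, 0.05),  # proprioceptive sensor noise
        "push_force": rng.uniform(0.0, 50.0),  # random external shoves (N)
    }

rng = random.Random(7)
for episode in range(3):
    domain = sample_domain(rng)
    # In a real pipeline: reset the simulator with these physics
    # parameters, collect a rollout, then run one RL update.
    print(episode, {k: round(v, 3) for k, v in domain.items()})
```

Because the real robot's friction, masses, and motor response all fall somewhere inside these randomized ranges, a policy that works across the whole distribution tends to transfer without ever seeing real-world data.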
The results have been transformative. In 2023, researchers at ETH Zurich demonstrated ANYmal navigating hiking trails, climbing over rubble, and recovering from falls — all using a policy trained entirely in simulation. Boston Dynamics' Atlas earned its parkour and backflip fame on the original hydraulic platform; its electric successor, unveiled in 2024, combines RL-trained locomotion with classical trajectory optimization for dynamic manipulation while walking. Unitree's Go2 quadruped ships with an RL locomotion stack that lets a $1,600 robot traverse terrain that would have required a $100K research platform a decade ago.
Key to these advances is the concept of a locomotion policy — a neural network that takes in sensor readings (joint positions, angular velocity, IMU data, and sometimes depth camera input) and outputs joint-level torque or position targets at 50–500 Hz. The policy implicitly learns terrain estimation, balance recovery, and gait selection without being explicitly programmed for any of them.
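Structurally, such a policy is often just a small feedforward network evaluated at the control rate. A minimal sketch with random weights standing in for trained parameters; the observation and joint dimensions are illustrative (roughly a 12-joint quadruped):

```python
import numpy as np

rng = np.random.default_rng(0)

class LocomotionPolicy:
    """Minimal sketch of a locomotion policy: a two-layer MLP mapping a
    proprioceptive observation vector to joint-position targets. A real
    deployment would load trained weights instead of random ones."""
    def __init__(self, obs_dim=48, hidden=128, num_joints=12):
        self.w1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, num_joints))
        self.b2 = np.zeros(num_joints)

    def act(self, obs):
        h = np.tanh(obs @ self.w1 + self.b1)   # hidden layer
        return np.tanh(h @ self.w2 + self.b2)  # bounded targets in [-1, 1]

policy = LocomotionPolicy()
# obs would pack joint positions/velocities, base angular velocity,
# gravity direction, and the previous action.
obs = rng.normal(0.0, 1.0, 48)
targets = policy.act(obs)   # evaluated at 50-500 Hz on the robot
print(targets.shape)        # (12,)
```

The bounded tanh output is then typically scaled to joint-angle offsets around a nominal standing pose, with a lower-level PD loop converting position targets to torques.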
Bipedal vs. Quadrupedal
Quadrupeds currently outperform bipeds in robustness and real-world deployment. Boston Dynamics' Spot has logged tens of thousands of hours in industrial inspection, construction monitoring, and nuclear decommissioning. Unitree's Go2 and B2 are used in research labs worldwide. The four-legged configuration offers inherent stability — three legs can support the body while one repositions — making quadrupeds practical for outdoor and industrial use today.
Bipedal locomotion remains harder but strategically important. Humanoid form factors are needed to navigate human spaces (stairs designed for human stride length, doors designed for human width), operate human tools, and eventually enter the consumer market. Agility Robotics' Digit is the furthest along in commercial bipedal deployment, operating in Amazon warehouses. Tesla's Optimus, Figure's 02, and Apptronik's Apollo are all racing to solve robust bipedal locomotion at consumer price points.
The gap is closing. Recent RL-based bipedal controllers from UC Berkeley, Oregon State (whose Cassie biped completed an untethered 5K run in 2021), Agility, and several Chinese labs (including Unitree's H1 humanoid) demonstrate outdoor walking, stair climbing, and even jogging that would have been inconceivable with classical controllers alone.
Terrain Perception and Adaptation
Pure proprioceptive locomotion — controlling legs using only joint sensors and an IMU, with no vision — is surprisingly effective. RL policies learn to probe terrain by feel, adjusting gait when they sense unexpected resistance or softness. This "blind" locomotion works on moderately uneven terrain and has the advantage of not requiring expensive or fragile vision systems.
For extreme terrain (stairs, gaps, debris fields), exteroceptive locomotion adds depth cameras or LiDAR to build a local elevation map. The locomotion policy then plans footsteps based on perceived terrain geometry. NVIDIA's Isaac Lab provides a standard pipeline for training vision-conditioned locomotion policies, and ETH Zurich's elevation-mapping stack has become a de facto standard for quadrupedal terrain perception.
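The local elevation map at the heart of this pipeline is a simple 2.5-D structure: bin the 3-D points from the depth sensor into a horizontal grid and keep one height per cell. A minimal sketch (grid size, resolution, and the max-height rule are illustrative choices; production stacks also fuse over time and handle uncertainty):

```python
def elevation_map(points, cell=0.05, size=2.0):
    """Build a 2.5-D elevation map: bin 3-D points (e.g. from a depth
    camera, expressed in the robot frame) into a horizontal grid and
    keep the maximum height seen in each cell. The grid covers
    [0, size) x [0, size) metres at `cell` resolution."""
    n = int(size / cell)
    grid = [[None] * n for _ in range(n)]   # None = no return in cell
    for x, y, z in points:
        i, j = int(x // cell), int(y // cell)
        if 0 <= i < n and 0 <= j < n:
            if grid[i][j] is None or z > grid[i][j]:
                grid[i][j] = z
    return grid

# Synthetic cloud: flat ground with a 10 cm step edge at x = 1.0 m.
cloud = [(x * 0.01, 1.0, 0.0 if x * 0.01 < 1.0 else 0.10)
         for x in range(200)]
emap = elevation_map(cloud)
heights = sorted({v for row in emap for v in row if v is not None})
print(heights)  # [0.0, 0.1]
```

A vision-conditioned policy then consumes a crop of this grid around each foot, which is how it distinguishes a climbable step from a gap it must place feet around.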
Whole-Body Control and Loco-Manipulation
The frontier is loco-manipulation: simultaneously walking and using arms/hands to interact with the environment. A humanoid carrying a box up stairs, a quadruped opening a door while maintaining balance, or a construction robot walking across a beam while welding — these tasks require tight coordination between locomotion and manipulation controllers.
Current approaches split the problem into a locomotion layer (keeping the robot upright and moving) and a manipulation layer (controlling arms and hands), with a whole-body controller mediating between them. Hierarchical RL — where a high-level policy commands locomotion direction and arm targets while low-level policies execute — is showing promise. Boston Dynamics' Atlas and Figure AI's Figure 02 both demonstrate early loco-manipulation capabilities.
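The layered split described above can be sketched as a simple composition: a high-level policy emits a base-velocity command and an arm target, and dedicated low-level controllers turn each into joint commands. All class and field names here are hypothetical stand-ins for trained or model-based components:

```python
class HighLevelPolicy:
    """Hypothetical high-level policy: maps task state to a base-velocity
    command and an arm end-effector target. A trained network would infer
    these; fixed values are used here for illustration."""
    def plan(self, task_state):
        return {"base_vel": (0.5, 0.0, 0.0),    # vx, vy, yaw rate
                "ee_target": (0.6, 0.0, 0.9)}   # gripper x, y, z

class LocomotionController:
    """Low-level leg controller (stand-in for an RL policy or MPC)."""
    def joint_commands(self, base_vel):
        return {"legs": base_vel}

class ArmController:
    """Low-level arm controller (stand-in for inverse kinematics)."""
    def joint_commands(self, ee_target):
        return {"arm": ee_target}

def whole_body_step(task_state, hl, loco, arm):
    """One control tick: the high level plans at a low rate, and both
    low-level controllers execute its commands at the control rate."""
    cmd = hl.plan(task_state)
    return {**loco.joint_commands(cmd["base_vel"]),
            **arm.joint_commands(cmd["ee_target"])}

out = whole_body_step({}, HighLevelPolicy(),
                      LocomotionController(), ArmController())
print(sorted(out))  # ['arm', 'legs']
```

The hard part, which this sketch hides, is the coupling: a reaching arm shifts the center of mass, so a real whole-body controller must trade off the two command streams rather than merely concatenate them.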
The Road Ahead
Locomotion is rapidly becoming a "solved" subproblem in the same way computer vision "solved" image classification — not perfectly, but well enough to build on. The remaining challenges are energy efficiency (current humanoids manage 1–4 hours of battery life under walking loads), speed (most bipeds walk at or below the human average of roughly 1.4 m/s and can't sustain running), robustness in truly adversarial conditions (ice, steep slopes, high winds), and the loco-manipulation integration that will define the next generation of useful humanoids.
The convergence of cheap simulation (NVIDIA Isaac, MuJoCo going open-source), affordable hardware (Unitree's sub-$20K humanoids), and RL-based control means that locomotion is no longer the bottleneck it once was. The bottleneck has shifted upstream — to the VLA models and world models that decide where the robot should walk and why, not how.