Seamless multi-skill learning: learning and transitioning non-similar skills in quadruped robots with limited data.

Abstract

In multi-skill imitation learning for robots, expert datasets with complete motion features are crucial for enabling robots to learn and transition between different skills. However, such datasets are often difficult to obtain. As an alternative, datasets constructed using only joint positions are more accessible, but they are incomplete and lack details, making it challenging for existing methods to effectively learn and model skill transitions. To address these challenges, this study introduces the Seamless Multi-Skill Learning (SMSL) framework. Integrated within the Adversarial Motion Priors framework and incorporating self-trajectory augmentation techniques, SMSL effectively utilizes high-quality historical experiences to guide agents in learning skills and generating smooth, natural transitions between them, addressing the learning difficulties caused by incomplete expert datasets. Additionally, the research incorporates an adaptive command sampling mechanism to balance the training opportunities for skills of various difficulties and prevent catastrophic forgetting. Our experiments highlight potential issues with baseline methods when imitating incomplete expert datasets and demonstrate the superior performance of the SMSL framework. Sim-to-real experiments on real Solo8 robots further validate the effectiveness of SMSL. Overall, this study confirms the SMSL framework's capability in real robotic applications and underscores its potential for autonomous skill learning and generation from minimal data.
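The abstract does not say how the adaptive command sampling mechanism works. The following is only a minimal sketch of one plausible scheme, assuming each skill has a tracked success rate in [0, 1]: harder skills are sampled more often, and a floor keeps every skill in rotation so that mastered skills are still revisited. The function names, the `floor` parameter, and the skill names and rates are all hypothetical, not taken from the paper.

```python
import random


def sampling_weights(success, floor=0.1):
    """Map per-skill success rates in [0, 1] to sampling probabilities.

    Lower success (a harder skill) gives a larger weight. `floor` is an
    assumed constant that keeps every probability above zero.
    """
    raw = {name: (1.0 - rate) + floor for name, rate in success.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def sample_skill(success, rng, floor=0.1):
    """Draw one skill command according to the adaptive probabilities."""
    probs = sampling_weights(success, floor)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical skills and success rates, for illustration only.
success = {"trot": 0.9, "bound": 0.5, "jump": 0.2}
probs = sampling_weights(success)
rng = random.Random(0)
command = sample_skill(success, rng)
```

In a training loop, the success rates would be refreshed from recent episodes before each draw, so the sampling distribution shifts as skills improve.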

Similar Papers
  • Supplementary Content
  • 10.25534/tuprints-00014184
Interactive Machine Learning for Assistive Robots
  • Oct 16, 2020
  • TUbilio (Technical University of Darmstadt)
  • Dorothea Koert

  • Research Article
  • Cite Count Icon 1
  • 10.1109/tsmc.2025.3604639
Gait Adaptation and Iterative Control: A Switched Systems Optimization Framework for Quadrupedal Robots
  • Nov 1, 2025
  • IEEE Transactions on Systems, Man, and Cybernetics: Systems
  • Pietro Gori + 3 more

One of the primary challenges in quadrupedal locomotion pertains to the robot’s ability to adapt its gait to the surrounding environment and the desired task. This capability allows quadrupedal robots to select suitable foothold locations and adjust their gait for optimal performance. We address the problem of gait adaptation using trajectory optimization (TO), which takes into account the simplified switched system’s dynamics and optimizes the different phases of motion in which we split the robot’s movement. The robot dynamic model is a single rigid body (SRB) with a rigid contact model and foot positions. We apply contact and friction cone constraints to ensure a physically feasible motion of the real robot. We tackle the optimization using the direct multiple shooting (DMS) method. Leveraging kinematic inversion to map the base and feet positions into joint positions, velocities, and accelerations, we design a controller that combines iterative learning control (ILC) and proportional derivative (PD) feedback control. The iterative controller compensates for the sim-to-real gap, allowing the real robot to learn the task during the execution of the latter. We evaluate the performance of the proposed approach on two different quadrupedal robots and on different terrains.
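As an illustration of the friction cone constraint mentioned in this abstract, here is a minimal sketch of the standard Coulomb check for a single contact force: the normal component must be non-negative and the tangential magnitude must not exceed the friction coefficient times the normal component. The function name, the flat-ground assumption (normal along z), and the example values are assumptions for illustration, not taken from the paper.

```python
import math


def in_friction_cone(force, mu):
    """Return True if a contact force lies inside the Coulomb friction cone.

    `force` is (fx, fy, fz) with fz along the contact normal (flat ground
    assumed); `mu` is the friction coefficient.
    """
    fx, fy, fz = force
    # The foot can only push on the ground, and the tangential force is
    # bounded by mu times the normal force.
    return fz >= 0.0 and math.hypot(fx, fy) <= mu * fz


# Hypothetical forces in newtons with mu = 0.5.
ok = in_friction_cone((1.0, 0.0, 10.0), 0.5)
slipping = in_friction_cone((6.0, 0.0, 10.0), 0.5)
pulling = in_friction_cone((0.0, 0.0, -1.0), 0.5)
```

A trajectory optimizer would impose this condition at every stance-phase knot point, often in a linearized (pyramid) form.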

  • Conference Article
  • Cite Count Icon 1
  • 10.1109/iccre51898.2021.9435727
Design of an Unconventional Bionic Quadruped Robot with Low-degree-freedom of Movement
  • Apr 16, 2021
  • Dong Zhang + 3 more

Current quadruped robots are modeled on quadruped animals in nature and rely on complex control algorithms to move and walk, which makes bionic robots difficult to design and costly to produce. To address these problems, this paper designs a four-legged mobile robot that can walk quickly, turn, and cross obstacles at low cost and with simple control. First, a simplified three-dimensional model of the quadruped robot and a three-dimensional model of its specific mechanical structure are built in SOLIDWORKS to ensure that the mechanism design is sound. Second, the simplified model is imported into ADAMS for kinematics simulation, the parameters of the quadruped robot model are set and defined, and the feasibility of the mechanical structure is verified. Then, because the mechanical structure is complex, some key parts are imported into ANSYS for structural finite element analysis to verify that their stiffness and strength meet the design requirements. Finally, a real quadruped robot was built from the theoretical design and operated in a real environment. The results were compared with the simulation results to verify the feasibility and rationality of the overall quadruped robot design.

  • Book Chapter
  • Cite Count Icon 1
  • 10.1007/978-3-540-89933-4_9
A Behavior Based Control and Learning Approach to Real Robots
  • Dec 1, 2009
  • Dongbing Gu + 2 more

Programming a real robot to do a given task in unstructured dynamic environments is very challenging. Incomplete information, large learning space, and uncertainty are major obstacles for control in real robots. When programming a real robot in unstructured dynamic environments, it is impossible to predict all the potential situations robots may encounter and specify all robot behaviors optimally in advance. Robots have to learn from, and adapt to, their operating environment. In this chapter, we propose to use fuzzy logic to design robot behaviors and use a Markov decision process to model the coordination mechanism in the control and learning of real autonomous robotic systems. Based on the model, a Q-learning approach can be used to learn the behavior coordination. Two real robot applications are implemented using this approach: one is a Sony quadruped robot for soccer playing, and the other is a robotic fish for entertainment. Real robot testing results are provided to verify the proposed approach. Keywords: Fuzzy Logic, Fuzzy Rule, Fuzzy Controller, Autonomous Robot, Real Robot.
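The Q-learning step named in this abstract can be sketched with the standard tabular update rule, treating each candidate behavior as an action. This is a generic textbook form under assumed names; the state labels, behavior names, reward, and the `alpha` and `gamma` values are hypothetical and not taken from the chapter.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical behaviors to coordinate; unseen entries default to 0.0.
behaviors = ["approach_ball", "avoid_obstacle"]
q = defaultdict(float)
v1 = q_update(q, "ball_visible", "approach_ball", 1.0, "near_ball", behaviors)
v2 = q_update(q, "ball_visible", "approach_ball", 1.0, "near_ball", behaviors)
```

With these values the estimate moves from 0.0 to 0.5 and then to 0.75, approaching the reward of 1.0 as the same transition is repeated.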

  • Research Article
  • Cite Count Icon 22
  • 10.1089/end.2015.0774
Baseline Laparoscopic Skill May Predict Baseline Robotic Skill and Early Robotic Surgery Learning Curve.
  • Mar 23, 2016
  • Journal of Endourology
  • Ruaidhri Mcvey + 7 more

Robotic surgery is associated with a learning curve unique to each trainee. Knowledge about a trainee's baseline skill and learning curve would facilitate the development of a more individualized training curriculum. The aim of our study was to determine whether baseline laparoscopic skill is predictive of one's baseline robotic skill and short-term learning curve. Trainees from four different surgical specialties were included in the study. Each trainee participated in a 4-week, simulation-based robotic surgery basic skills training course. Precourse, baseline laparoscopic and robotic skills were assessed using validated test tasks: a basic peg transfer (PT) task and an advanced intracorporeal suturing and knot tying (ISKT) task. Trainee robotic skill was assessed again 1 week postcourse. Each task performance was video recorded and scored by two blinded expert surgeons. A total of 32 trainees were included: 14 urology, 7 gynecology, 8 thoracic surgery, and 3 general surgery. Most (91%) were senior residents or clinical fellows and 50% had no prior robotic experience. There were no differences in baseline laparoscopic and robotic skill related to reported prior robotic experience. Between specialties, no differences were seen on baseline laparoscopic skill and only small differences were seen on baseline robotic skill. Both baseline Lap PT (p = 0.01) and Lap ISKT (p = 0.01) performances correlated with baseline robotic ISKT performance, but not robotic PT scores. Only baseline Lap ISKT performance correlated with postcourse robotic PT (p = 0.01) and ISKT (p < 0.01) performance. Baseline robotic ISKT scores, but not PT scores, correlated with postcourse robotic performance (p = 0.02 for PT, p < 0.01 for ISKT). In this study, a trainee's baseline laparoscopic skill correlated with certain baseline robotic skills. Better baseline performance on an advanced, but not basic, laparoscopic and robotic skill task may correlate with a shorter learning curve for basic robotic skills. Further exploration of this finding may yield better training curricula.

  • Conference Article
  • Cite Count Icon 52
  • 10.1109/icra48506.2021.9561926
Circus ANYmal: A Quadruped Learning Dexterous Manipulation with Its Limbs
  • May 30, 2021
  • Fan Shi + 8 more

Quadrupedal robots are skillful at locomotion tasks while lacking manipulation skills, not to mention dexterous manipulation abilities. Inspired by the animal behavior and the duality between multi-legged locomotion and multi-fingered manipulation, we showcase a circus ball challenge on a quadrupedal robot, ANYmal. We employ a model-free reinforcement learning approach to train a deep policy that enables the robot to balance and manipulate a light-weight ball robustly using its limbs without any contact measurement sensor. The policy is trained in the simulation, in which we randomize many physical properties with additive noise and inject random disturbance force during manipulation, and achieves zero-shot deployment on the real robot without any adjustment. In the hardware experiments, dynamic performance is achieved with a maximum rotation speed of 15 °/s, and robust recovery is showcased under external poking. To our best knowledge, it is the first work that demonstrates the dexterous dynamic manipulation on a real quadrupedal robot.

  • Research Article
  • 10.4028/www.scientific.net/amm.433-435.138
A Surrogate Model Based Gait Learning for Biped Robot
  • Oct 15, 2013
  • Applied Mechanics and Materials
  • Ding Sheng Luo + 2 more

Gait learning is usually carried out under a so-called simulation-based framework, in which a simulation platform is first set up and the gait pattern is then learned on it with some learning algorithm. Because there are large differences between the simulation platform and real conditions, an additional adaptation procedure is always required when the learned gait pattern is applied to a real robot. This is more critical for a biped robot, because it is more difficult to control than others, such as a quadruped robot. This makes a scheme in which the gait is learned directly on the real robot attractive. However, under this real-robot-based learning scheme, most of the learning algorithms commonly used under the simulation-based framework become impractical, since they need too many learning trials, which may wear out the robot hardware. To address this situation, this paper proposes a surrogate-model-based gait learning approach for biped robots. Experimental results on a real humanoid robot, PKU-HR3, show the effectiveness of the proposed approach.

  • Research Article
  • Cite Count Icon 4
  • 10.1016/j.cjche.2020.10.048
Distributed model predictive control based on adaptive sampling mechanism
  • Apr 20, 2021
  • Chinese Journal of Chemical Engineering
  • Zhen Wang + 2 more

  • Research Article
  • Cite Count Icon 13
  • 10.1109/tcds.2021.3118294
A Novel Simulation-Reality Closed-Loop Learning Framework for Autonomous Robot Skill Learning
  • Dec 1, 2022
  • IEEE Transactions on Cognitive and Developmental Systems
  • Rong Jiang + 5 more

In recent years, data-driven learning methods have been widely studied for autonomous robot skill learning. However, these methods rely on large amounts of robot–environment interaction data for training, which largely prevents them from being applied to real-world robots. To address this problem, this article proposes a novel simulation-reality closed-loop learning framework for autonomous robot skill learning that can improve data efficiency, enhance policy stability, and achieve effective policy simulation-to-reality (sim2real) transfer. First, a hybrid control model combining the asymmetric deep deterministic policy gradients (Asym-DDPGs) model and the forward prediction control (FPC) model is proposed to learn vision-based manipulation policies in simulations, which can decompose complex tasks to improve learning efficiency. Second, a novel pixel-level domain adaptation method named Position-CycleGAN is designed to translate real images to simulated images while also preserving the task-related information. The policy trained in simulations can be directly migrated into real robots in a reverse reality-to-simulation manner using the Position-CycleGAN model. The experimental results validate the effectiveness of the proposed framework. This work provides an efficient and feasible path for achieving autonomous skill learning.

  • Research Article
  • Cite Count Icon 64
  • 10.1016/j.robot.2021.103844
Overcoming some drawbacks of Dynamic Movement Primitives
  • Jul 8, 2021
  • Robotics and Autonomous Systems
  • Michele Ginesi + 2 more

  • Research Article
  • Cite Count Icon 8
  • 10.1108/ir-06-2018-0119
Turning strategies for the bounding quadruped robot with an active spine
  • Oct 16, 2018
  • Industrial Robot: An International Journal
  • Zhong Wei + 5 more

Purpose: This paper aims to study the turning strategies for the bounding quadruped robot with an active spine and explore the significant role of the spine in the turning locomotion. Design/methodology/approach: Firstly, the bounding gait combining the pitch motion of the spine with the leg motion is presented. In this gait, the spine moves in phase with the front legs. All the joints of the legs and spine are controlled by cosine signals to simplify the control, and the initial position and oscillation amplitude of the joints can be tuned. To verify the effectiveness of the proposed gait, the spine joints are set with different initial positions and oscillation amplitudes, and the initial position and oscillation amplitude of the leg joints are tuned so that the virtual model achieves the best locomotion in terms of speed and stability in the simulation. The control signals are also used to control a real robot called Transleg. Then, three different turning strategies are proposed: driving the left and right legs with different strides, swaying the spine in the yaw direction, and combining the two methods. Finally, these strategies are tested on the real robot. Findings: Stable bounding locomotion can be achieved using the proposed gait. With the spine motion, the speed of the bounding locomotion is increased, the turning radius is reduced, and the angular velocity is increased. Originality/value: A simple and flexible planning of the bounding gait and three turning strategies for the bounding quadruped robot are proposed. The effectiveness of the proposed bounding gait, along with the beneficial effect of the spine motion in the yaw direction on the turning locomotion, is demonstrated with computer simulations and robot experiments. This will be instructive for the design and actuation of other quadruped robots.
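The cosine joint signals described in this abstract can be sketched as an initial position plus an oscillation amplitude times a cosine. This is a minimal illustration under assumed names and values: the frequency, offsets, and amplitudes are hypothetical, and only the in-phase relation between the spine and the front legs comes from the abstract.

```python
import math


def joint_angle(t, offset, amplitude, freq_hz, phase=0.0):
    """Cosine joint command: offset + amplitude * cos(2*pi*f*t + phase).

    `offset` plays the role of the tunable initial position and
    `amplitude` the tunable oscillation amplitude (radians).
    """
    return offset + amplitude * math.cos(2.0 * math.pi * freq_hz * t + phase)


FREQ_HZ = 2.0  # hypothetical gait frequency

# The spine and front legs share phase 0.0, so they oscillate in phase.
front_start = joint_angle(0.0, offset=0.1, amplitude=0.3, freq_hz=FREQ_HZ)
front_half = joint_angle(0.25, offset=0.1, amplitude=0.3, freq_hz=FREQ_HZ)
spine_start = joint_angle(0.0, offset=0.0, amplitude=0.2, freq_hz=FREQ_HZ)
spine_half = joint_angle(0.25, offset=0.0, amplitude=0.2, freq_hz=FREQ_HZ)
```

Both signals peak at t = 0 and reach their minimum half a period later (t = 0.25 s at 2 Hz), which is what moving in phase means here.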

  • Conference Article
  • Cite Count Icon 1
  • 10.1109/robio55434.2022.10011765
Quadruped Reinforcement Learning without Explicit State Estimation
  • Dec 5, 2022
  • Qikai Li + 5 more

Reinforcement learning is a promising approach to developing legged robot locomotion controllers. The general development process is large-scale training in a virtual simulation environment to obtain a reliable control policy network, which is then deployed to a real legged robot. During training, a complete robot state increases the speed of training and the stability of the policy. Robot states such as body velocities are easily available in simulation, but they are difficult to obtain on the real robot, hence specifically designed robot state estimators are needed. However, the development of state estimators requires expert knowledge related to control theory and robotics, limiting the direct application of reinforcement learning to robots. To take advantage of the end-to-end mapping of artificial neural networks, we simplify the existing reinforcement learning process for quadruped robots and propose a training method based on curriculum learning. The proposed method can produce a reliable policy that does not require a robot state estimator and takes only raw sensor data. The feasibility of the proposed method is verified in simulation and on a real quadrupedal robot. Video of the quadrupedal robot is available at www.youtube.com/watch?v=-iho4KIlEPw.

  • Conference Article
  • Cite Count Icon 45
  • 10.1109/humanoids.2015.7363584
Probabilistic segmentation applied to an assembly task
  • Nov 1, 2015
  • Rudolf Lioutikov + 3 more

Movement primitives are a well-established approach for encoding and executing robot movements. While the primitives themselves have been extensively researched, the concept of movement primitive libraries has not received as much attention. Libraries of movement primitives represent the skill set of an agent and can be queried and sequenced in order to solve specific tasks. The goal of this work is to segment unlabeled demonstrations into an optimal set of skills. Our novel approach segments the demonstrations while learning a probabilistic representation of movement primitives. The method differs from current approaches by taking advantage of the often neglected mutual dependencies between the segments contained in the demonstrations and the primitives to be encoded, thereby improving the combined quality of both segmentation and skill learning. Furthermore, our method allows incorporating domain-specific insights using heuristics, which are subsequently evaluated and assessed through probabilistic inference methods. We demonstrate our method on a real robot application, where the robot segments demonstrations of a chair assembly task into a skill library. The library is subsequently used to assemble the chair in an order not present in the demonstrations.

  • Research Article
  • 10.1080/02533839.2025.2565362
AMSS-KG: an improved Kriging model with adaptive multi-scale sampling
  • Oct 8, 2025
  • Journal of the Chinese Institute of Engineers
  • Huan Xie + 3 more

To address inefficiencies in Kriging modeling caused by traditional single/multi-point sampling approaches, this paper proposes an adaptive multi-scale sampling method that dynamically adjusts sample quantities using the change rates of the coefficient of determination and the mean absolute percentage error. The enhanced Kriging model combines this adaptive sampling with sequential Kriging sampling to progressively reduce sample acquisition as accuracy improves. Tests on four standard functions demonstrate that the proposed model matches classical methods in accuracy for low-dimensional weakly nonlinear problems while significantly outperforming them in three challenging cases: low-dimensional strong nonlinearity (lower prediction errors), high-dimensional weak nonlinearity, and high-dimensional strong nonlinearity (better goodness-of-fit). Remarkably, it achieves these results with only 36.68% of the samples required by conventional expected improvement methods, substantially cutting computational costs. Additionally, a computationally demanding aeroacoustic noise reduction optimization study for a natural gas orifice plate validates the method’s practical effectiveness. The results confirm its strong suitability for engineering applications. The adaptive sampling mechanism allows automatic sample reduction in later stages without accuracy loss, overcoming the dual challenges of slow single-point convergence and excessive computational demands in fixed multi-point approaches. This improvement enhances Kriging’s applicability for high-precision modeling of complex engineering problems.

  • Conference Article
  • Cite Count Icon 6
  • 10.1109/iros.1994.407518
Efficient search for robot skill learning: simulation and reality
  • Sep 12, 1994
  • J.G Schneider + 1 more

Table lookup with interpolation is used for many learning and adaptation tasks. Redundant mappings capture the important concept of motor in real, behaving systems. Few robot skill implementations have dealt with redundant mappings, in which the space to be searched to create the table has much higher dimensionality than the table. A practical method for inverting redundant mappings is important in physical systems with limited time for trials. The authors present the Guided Table Fill In algorithm, which uses data already stored in the table to guide search through the space of potential table entries. The algorithm is illustrated and tested on a skill learning task using a robot with a flexible link. The authors' experiments show that the ability to search high dimensional action spaces efficiently allows skill learners to find new behaviors that are qualitatively different from what they were presented or what the system designer may have expected. Thus the use of this technique can allow researchers to seek higher dimensional action spaces for their systems rather than constraining their search space at the risk of excluding the best actions. The authors also present a model for the robot arm, flexible link dynamics, and release mechanism of their robot. The authors' experiments suggest that the use of even a crude simulation model can be helpful for learning on the real robot.
