76
|
Ariki Y, Hyon SH, Morimoto J. Extraction of primitive representation from captured human movements and measured ground reaction force to generate physically consistent imitated behaviors. Neural Netw 2013; 40:32-43. [PMID: 23380596 DOI: 10.1016/j.neunet.2013.01.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/21/2010] [Revised: 12/14/2012] [Accepted: 01/04/2013] [Indexed: 11/28/2022]
Abstract
In this paper, we propose an imitation learning framework to generate physically consistent behaviors by estimating the ground reaction force from captured human behaviors. In the proposed framework, we first extract behavioral primitives, which are represented by linear dynamical models, from captured human movements and measured ground reaction force by using the Gaussian mixture of linear dynamical models. Therefore, our method depends little on classification criteria defined by an experimenter. By switching primitives with different combinations while estimating the ground reaction force, different physically consistent behaviors can be generated. We apply the proposed method to a four-link robot model to generate squat motion sequences. The four-link robot model successfully generated the squat movements by using our imitation learning framework. To show generalization performance, we also apply the proposed method to robot models whose torso weights and lengths differ from those of the human demonstrator and evaluate the control performance. In addition, we show that the robot model is able to recognize and imitate demonstrator movements even when the observed movements deviate from the movements that were used to construct the primitives. For further evaluation in a higher-dimensional state space, we apply the proposed method to a seven-link robot model. The seven-link robot model was able to generate squat-and-sway motions by using the proposed framework.
|
77
|
Sugimoto N, Morimoto J, Hyon SH, Kawato M. The eMOSAIC model for humanoid robot control. Neural Netw 2012; 29-30:8-19. [DOI: 10.1016/j.neunet.2012.01.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 11/16/2010] [Revised: 09/20/2011] [Accepted: 01/13/2012] [Indexed: 12/01/2022]
|
78
|
Matsubara T, Morimoto J, Nakanishi J, Hyon SH, Hale JG, Cheng G. Learning to Acquire Whole-Body Humanoid Center of Mass Movements to Achieve Dynamic Tasks. Adv Robot 2012. [DOI: 10.1163/156855308x324785] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
|
79
|
Cheng G, Hyon SH, Morimoto J, Ude A, Hale JG, Colvin G, Scroggin W, Jacobsen SC. CB: a humanoid research platform for exploring neuroscience. Adv Robot 2012. [DOI: 10.1163/156855307781389356] [Citation(s) in RCA: 133] [Impact Index Per Article: 11.1] [Indexed: 11/19/2022]
|
80
|
Morimoto J, Doya K. Hierarchical reinforcement learning for motion learning: learning 'stand-up' trajectories. Adv Robot 2012. [DOI: 10.1163/156855399x00513] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
|
81
|
Noda T, Hyon SH, Morimoto J. Exoskeleton assistive robot: Learning feedforward assist model iteratively through human–robot interaction. Neurosci Res 2011. [DOI: 10.1016/j.neures.2011.07.1796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
82
|
Morimoto J, Umeda T, Nishimura Y, Isa T, Kawato M, Toyama K. Canonical correlation analysis (CCA) of multi-joint motion (JM) and dorsal root ganglion (DRG) neuronal activities. Neurosci Res 2011. [DOI: 10.1016/j.neures.2011.07.871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
83
|
Suetani H, Ideta A, Morimoto J. Predicting upcoming falls of biped walkers in irregular environments: Approaches based on kernel multivariate analysis. Neurosci Res 2011. [DOI: 10.1016/j.neures.2011.07.1795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
84
|
Matsubara T, Hyon SH, Morimoto J. User-adaptive myoelectric interface for EMG-based robotic hand control. Neurosci Res 2011. [DOI: 10.1016/j.neures.2011.07.1793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
85
|
Matsubara T, Hyon SH, Morimoto J. Learning parametric dynamic movement primitives from multiple demonstrations. Neural Netw 2011; 24:493-500. [PMID: 21388784 DOI: 10.1016/j.neunet.2011.02.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 09/09/2010] [Revised: 12/10/2010] [Accepted: 02/09/2011] [Indexed: 10/18/2022]
Abstract
Learning from demonstration has been shown to be a suitable approach for learning control policies (CPs). However, most previous studies learn CPs from a single demonstration, which results in limited scalability and insufficient generalization toward a wide range of applications in real environments. This paper proposes a novel approach to learn highly scalable CPs of basis movement skills from multiple demonstrations. In contrast to conventional studies with a single demonstration, i.e., dynamic movement primitives (DMPs), our approach efficiently encodes multiple demonstrations by shaping a parametric-attractor landscape in a set of differential equations. Assuming a certain similarity among multiple demonstrations, our approach learns the parametric-attractor landscape by extracting a small number of common factors in multiple demonstrations. The learned CPs allow the synthesis of novel movements with novel motion styles by specifying the linear coefficients of the bases as parameter vectors without losing useful properties of the DMPs, such as stability and robustness against perturbations. For both discrete and rhythmic movement skills, we present a unified learning procedure for learning a parametric-attractor landscape from multiple demonstrations. The feasibility and highly extended scalability of DMPs are demonstrated on an actual dual-arm robot.
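The idea of synthesising new motion styles from linear coefficients can be illustrated with a minimal sketch. It assumes the widely used one-dimensional discrete DMP formulation (a critically damped spring toward the goal plus a phase-gated forcing term); the basis weight vectors, gains, and function names below are hypothetical and are not taken from the paper, which learns its bases from demonstrations.

```python
import math

def blend_weights(basis_weights, style):
    """Linearly combine basis weight vectors; `style` holds one coefficient per basis."""
    n = len(basis_weights[0])
    return [sum(s * b[i] for s, b in zip(style, basis_weights)) for i in range(n)]

def rollout_dmp(y0, goal, weights, tau=1.0, dt=0.001, duration=2.0,
                alpha_z=25.0, beta_z=6.25, alpha_x=8.0):
    """Integrate a 1-D discrete DMP with explicit Euler steps and return the path of y."""
    n = len(weights)
    # Gaussian basis centres placed along the decaying phase variable x in (0, 1].
    centres = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    widths = [n ** 1.5 / c / alpha_x for c in centres]  # heuristic widths (assumption)
    x, y, z = 1.0, y0, 0.0
    path = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centres)]
        # The forcing term is gated by x, so it vanishes as the phase decays.
        forcing = x * (goal - y0) * sum(p * w for p, w in zip(psi, weights)) / (sum(psi) + 1e-10)
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        path.append(y)
    return path

# Two hypothetical basis weight vectors standing in for learned common factors.
basis_weights = [[40.0, 20.0, 0.0, -10.0, 0.0], [-30.0, 10.0, 25.0, 0.0, 5.0]]
path_a = rollout_dmp(0.0, 1.0, blend_weights(basis_weights, [1.0, 0.0]))
path_b = rollout_dmp(0.0, 1.0, blend_weights(basis_weights, [0.3, 0.7]))
```

Both rollouts follow different transients but settle at the same goal, because the forcing term fades with the phase variable while the spring term remains. This is the stability property the abstract says is preserved.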
|
86
|
Abstract
Deep level transient spectroscopy (DLTS), which assumes a single exponential decay form for the transient junction capacitance, is the most commonly used method to characterize deep impurity levels in semiconductors. However, conventional DLTS may lead to erroneous results if there are several closely spaced energy levels or the emission rate has a continuous spectrum. To overcome this difficulty, a novel method named the multi-exponential analysis of DLTS by CONTIN (MEDLTS by CONTIN) is proposed. This method treats the emission rate as having a finite continuous spectrum S(λ), which appears in the transient junction capacitance C(t) = ∫ S(λ) exp(−λt) dλ, and recovers it by using the program “CONTIN” developed by Provencher in biophysics. Even if S(λ) includes two peaks at λ1 and λ2, those peaks can be distinguished for λ2/λ1 > 2. As an example of the application of this method, deep levels in Si:Au were experimentally investigated. According to the three-dimensional S(λ)-T²/λ-1/T representation, the single peak in the conventional DLTS was clarified to consist of two adjacent levels with activation energies and capture cross sections EB1 = 0.51 eV, σB1 = 4.0×10⁻¹⁵ cm² and EB2 = 0.47 eV, σB2 = 1.1×10⁻¹⁵ cm². With the assumption of the finite continuous spectrum S(λ) for the emission rate, MEDLTS by CONTIN makes it possible to extract more information reliably. MEDLTS by CONTIN is superior to the conventional DLTS because it is a single-temperature scan, multi-exponential analysis instead of the conventional multi-temperature scan, single-exponential analysis.
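The failure mode of a single-exponential assumption can be seen in a small numerical sketch. The example below is neither the DLTS rate-window procedure nor CONTIN; it only fits one decay rate to a synthetic two-level transient by log-linear least squares. The rates, amplitudes, and sampling times are hypothetical.

```python
import math

def apparent_rate(times, signal):
    """Least-squares slope of ln(signal) against time, returned as a decay rate."""
    logs = [math.log(s) for s in signal]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    cov = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    var = sum((t - t_mean) ** 2 for t in times)
    return -cov / var

# Hypothetical emission rates in 1/s with ratio 2.5, and equal amplitudes.
rate_1, rate_2 = 100.0, 250.0
times = [i * 1e-4 for i in range(1, 201)]
transient = [0.5 * math.exp(-rate_1 * t) + 0.5 * math.exp(-rate_2 * t) for t in times]

# A single-exponential fit returns one blended rate that matches neither level.
fitted = apparent_rate(times, transient)
```

The fitted value falls between the two true rates, so a single-exponential analysis reports one level where two are present. A spectral method such as the one described in the abstract instead estimates the whole distribution S(λ).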
|
87
|
Ude A, Gams A, Asfour T, Morimoto J. Task-Specific Generalization of Discrete and Periodic Dynamic Movement Primitives. IEEE T ROBOT 2010. [DOI: 10.1109/tro.2010.2065430] [Citation(s) in RCA: 245] [Impact Index Per Article: 17.5] [Indexed: 11/09/2022]
|
88
|
Ariki Y, Morimoto J, Hyon SH. Extraction of movement primitives without explicit labeling for imitation learning. Neurosci Res 2010. [DOI: 10.1016/j.neures.2010.07.1462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
89
|
Morimoto J, Umeda T, Sakatani T, Isa T, Kawato M. Estimation of common features shared by multiple sensor data including neural recordings. Neurosci Res 2010. [DOI: 10.1016/j.neures.2010.07.2517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
90
|
Umeda T, Sakatani T, Morimoto J, Yamashita O, Satoh M, Seki K, Kawato M, Isa T. Coding of hand/arm trajectories by neuronal activity in dorsal root ganglia of monkeys. Neurosci Res 2010. [DOI: 10.1016/j.neures.2010.07.1447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
91
|
Hyon SH, Morimoto J, Kawato M. Development of a P-E hybrid exoskeleton robot for walking and postural rehabilitation. Neurosci Res 2010. [DOI: 10.1016/j.neures.2010.07.1463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
92
|
Morimoto J, Atkeson CG. Nonparametric representation of an approximated Poincaré map for learning biped locomotion. Auton Robots 2009. [DOI: 10.1007/s10514-009-9133-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
|
93
|
Mori N, Okumoto M, Morimoto J, Imai S, Matsuyama T, Takamori Y, Yagasaki O. Genetic Analysis of Susceptibility to Radiation-induced Apoptosis of Thymocytes in Mice. Int J Radiat Biol 2009; 62:153-9. [PMID: 1355508 DOI: 10.1080/09553009214551961] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 10/23/2022]
Abstract
Genetic control of thymocyte susceptibility to radiation-induced apoptosis was investigated in BALB/cHeA, STS/A and five other strains of mice by counting pyknotic cells in a selected area of thymic cortex on histological specimens after whole-body X-irradiation. The number of dead cells increased almost linearly with dose (range 0.25-0.75 Gy) in BALB/cHeA and STS/A mice. However, dead cell counts in BALB/cHeA mice were more than twice those in STS/A mice at each dose. Of five other strains of mice, C57BL/6N and B10.BR mice exhibited a sensitive phenotype similar to BALB/cHeA mice, while C3H/HeAMsNrs and NFS/N mice showed a resistant phenotype similar to STS/A mice. A/J mice seemed to be rather resistant. A sex difference was not recognized in BALB/cHeA and STS/A mice. Resistance was dominant over susceptibility in the progenies of reciprocal crosses between the two strains, indicating an autosomal inheritance and no maternal effect. The segregation ratio of susceptible to resistant phenotypes in the backcrosses of female (BALB/cHeA x STS/A)F1 mice with male BALB/cHeA mice was not significantly different from 1:1, and all backcrosses of female (BALB/cHeA x STS/A)F1 mice with male STS/A mice exhibited a resistant phenotype. The results suggested that thymocyte susceptibility to radiation-induced apoptosis is controlled by one major autosomal allele.
|
94
|
Morimoto J, Hyon SH, Kawato M. Feature Extraction for Task Relevant System Identification. Neurosci Res 2009. [DOI: 10.1016/j.neures.2009.09.977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
95
|
Morimoto J, Endo G, Nakanishi J, Cheng G. A Biologically Inspired Biped Locomotion Strategy for Humanoid Robots: Modulation of Sinusoidal Patterns by a Coupled Oscillator Model. IEEE T ROBOT 2008. [DOI: 10.1109/tro.2008.915457] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Indexed: 11/08/2022]
|
96
|
Endo G, Morimoto J, Matsubara T, Nakanishi J, Cheng G. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot. Int J Rob Res 2008. [DOI: 10.1177/0278364907084980] [Citation(s) in RCA: 139] [Impact Index Per Article: 8.7] [Indexed: 11/16/2022]
Abstract
In this paper we describe a learning framework for a central pattern generator (CPG)-based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve CPG-based biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback controller can be acquired within a few thousand trials by numerical simulations and the controller obtained in numerical simulation achieves stable walking with a physical robot in the real world. Numerical simulations and hardware experiments evaluate the walking velocity and stability. The results suggest that the learning algorithm is capable of adapting to environmental changes. Furthermore, we present an online learning scheme with an initial policy for a hardware robot to improve the controller within 200 iterations.
|
97
|
Abstract
In this study, we propose a novel use of reinforcement learning for estimating hidden variables and parameters of nonlinear dynamical systems. A critical issue in hidden-state estimation is that we cannot directly observe estimation errors. However, by defining errors of observable variables as a delayed penalty, we can apply a reinforcement learning framework to state estimation problems. Specifically, we derive a method to construct a nonlinear state estimator by finding an appropriate feedback input gain using the policy gradient method. We tested the proposed method on single pendulum dynamics and showed that the joint angle variable could be successfully estimated by observing only the angular velocity, and vice versa. In addition, we show that we could acquire a state estimator for the pendulum swing-up task in which a swing-up controller is also acquired by reinforcement learning simultaneously. Furthermore, we demonstrate that it is possible to estimate the dynamics of the pendulum itself while the hidden variables are estimated in the pendulum swing-up task. Application of the proposed method to a two-linked biped model is also presented.
|
98
|
Shibata MA, Morimoto J, Doi H, Morishima S, Naka M, Otsuki Y. Electrogene therapy using endostatin, with or without suicide gene therapy, suppresses murine mammary tumor growth and metastasis. Cancer Gene Ther 2006; 14:268-78. [PMID: 17096028 DOI: 10.1038/sj.cgt.7701009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 01/03/2023]
Abstract
Syngeneic inoculated metastatic mammary cancers received direct intratumoral injection of a plasmid vector containing either endostatin (pEndo) with or without a suicide gene (pHSVtk), pHSVtk alone or control vector once a week for 8 weeks. We applied electrogene transfer to the tumors after each injection and administered ganciclovir (GCV) to pHSVtk-transfected mice using an osmotic minipump. Anticancer efficacy was monitored using a variety of parameters, namely tumor volume, intratumoral microvessel density and DNA synthesis, number of mice with metastasis, and number of sites of metastasis per mouse. Tumor volume was significantly lower in all therapeutic groups, with the most effective growth suppression in the pEndo+pHSVtk/GCV group. Lymph node metastasis was significantly less frequent in all therapeutic groups, whereas the multiplicity of lung metastases was significantly lower only in the pEndo and pEndo+pHSVtk/GCV groups. All therapeutic groups showed significantly lower intratumor microvessel density and DNA synthesis. The pEndo and pEndo+pHSVtk/GCV groups also showed a significant reduction in the numbers of dilated lymphatic vessels containing intralumenal tumor cells. Our data suggest that endostatin electrogene therapy alone or in combination with pHSVtk/GCV suicide gene therapy is more beneficial than suicide gene therapy alone. The observed antimetastatic activity of endostatin may be of high clinical significance in the treatment of metastatic breast cancer.
|
99
|
Mori N, Morimoto J, Nakamura K. Computer Simulation of Steady Shear Flow of Liquid Crystals with the Gay-Berne Potential. ACTA ACUST UNITED AC 2006. [DOI: 10.1080/10587259708047083] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
|
100
|
Abstract
This letter proposes a new reinforcement learning (RL) paradigm that explicitly takes into account input disturbance as well as modeling errors. The use of environmental models in RL is quite popular for both off-line learning using simulations and for online action planning. However, the difference between the model and the real environment can lead to unpredictable, and often unwanted, results. Based on the theory of H∞ control, we consider a differential game in which a “disturbing” agent tries to make the worst possible disturbance while a “control” agent tries to make the best control input. The problem is formulated as finding a min-max solution of a value function that takes into account the amount of the reward and the norm of the disturbance. We derive online learning algorithms for estimating the value function and for calculating the worst disturbance and the best control in reference to the value function. We tested the paradigm, which we call robust reinforcement learning (RRL), on the control task of an inverted pendulum. In the linear domain, the policy and the value function learned by online algorithms coincided with those derived analytically by the linear H∞ control theory. For a fully nonlinear swing-up task, RRL achieved robust performance with changes in the pendulum weight and friction, while a standard reinforcement learning algorithm could not deal with these changes. We also applied RRL to the cart-pole swing-up task, and a robust swing-up policy was acquired.
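The min-max formulation described in the abstract can be written schematically as follows. This is a sketch under assumed notation, not the letter's own equations: x is the state, u the control input, w the disturbance, r the reward, τ a discounting time constant, and η > 0 a hypothetical weight on the disturbance norm.

```latex
% Augmented reward: task reward plus a term that makes large disturbances
% costly for the disturbing agent, which minimizes q.
q(t) = r\bigl(x(t), u(t)\bigr) + \eta \,\lVert w(t) \rVert^{2}

% Value function of the differential game: the control agent maximizes,
% the disturbing agent minimizes.
V^{*}\bigl(x(t)\bigr) = \max_{u} \, \min_{w} \int_{t}^{\infty} e^{-(s-t)/\tau} \, q(s) \, ds
```

Under this reading, a larger η makes disturbances more expensive for the disturbing agent, so the learned policy approaches the one standard reinforcement learning would produce; a smaller η yields a more conservative, more robust controller.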
|