1. Masala GL, Esposito M, Maniscalco U, Calimera A. Editorial: Language and Vision in Robotics: Emerging Neural and On-Device Approaches. Front Comput Sci 2022. DOI: 10.3389/fcomp.2022.930067. Cited in RCA: 0.

2. Yang J, Chew E, Liu P. Service humanoid robotics: a novel interactive system based on bionic-companionship framework. PeerJ Comput Sci 2021; 7:e674. PMID: 34458575; PMCID: PMC8371998; DOI: 10.7717/peerj-cs.674. Cited in RCA: 0.
Abstract
At present, industrial robotics focuses largely on motion control and vision, whereas humanoid service robots (HSRs) are increasingly being investigated in the field of speech interaction. The quality of human-robot interaction (HRI) has become a widely debated topic in academia, particularly when HSRs are deployed in the hospitality industry, where some researchers argue that the current HRI model is poorly adapted to complex social environments: HSRs generally lack the ability to accurately recognize human intentions and understand social scenarios. This study proposes a novel interactive framework for HSRs, grounded in a novel integration of Trevarthen's (2001) companionship theory and a neural image captioning (NIC) generation algorithm. By integrating image-to-natural-language generation with communication about the environment, the robot can interact more appropriately with stakeholders, shifting from mere interaction to bionic companionship. Going beyond previous research, a novel interactive system is developed on the basis of this bionic-companionship framework, and the humanoid service robot was integrated with the system for preliminary tests. The results show that the interactive system helps the service humanoid robot respond effectively to changes in the interactive environment, for example by giving different responses to the same person in different scenes.
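As a loose illustration of the behaviour the abstract describes, the sketch below hardcodes a hypothetical response table keyed by person and scene keyword. In the actual system the scene description would come from the NIC caption generator; all names and strings here are invented.

```python
# Hypothetical response table keyed by (person, scene keyword); in the real
# system the scene keyword would come from an NIC-generated caption.
responses = {
    ("guest", "lobby"):      "Welcome! May I take your luggage?",
    ("guest", "restaurant"): "Good evening! Here is today's menu.",
}

def respond(person, caption):
    """Return a scene-dependent reply: same person, different scene, different response."""
    for (p, keyword), reply in responses.items():
        if p == person and keyword in caption:
            return reply
    return "Hello! How can I help you?"

reply = respond("guest", "a guest standing in the hotel lobby")
```

The point of the sketch is only the lookup structure: the same identity paired with different scene captions selects different replies.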
Affiliation(s)
- Jiaji Yang, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, United Kingdom
- Esyin Chew, Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, United Kingdom
- Pengcheng Liu, Department of Computer Science, University of York, York, United Kingdom

3. Giorgi I, Cangelosi A, Masala GL. Learning Actions From Natural Language Instructions Using an ON-World Embodied Cognitive Architecture. Front Neurorobot 2021; 15:626380. PMID: 34054452; PMCID: PMC8155541; DOI: 10.3389/fnbot.2021.626380. Cited in RCA: 2.
Abstract
Endowing robots with the ability to view the world the way humans do, to understand natural language, and to learn novel semantic meanings when deployed in the physical world is a compelling problem. Another significant aspect is linking language to action in artificial agents, in particular for utterances involving abstract words. In this work, we propose a novel methodology, using a brain-inspired architecture, to model an appropriate mapping of language onto percepts and internal motor representations in humanoid robots. This research presents the first robotic instantiation of a complex architecture based on Baddeley's Working Memory (WM) model. Our proposed method grants a scalable knowledge representation of verbal and non-verbal signals in the cognitive architecture, which supports incremental open-ended learning. Human spoken utterances about the workspace and the task are combined with the robot's internal knowledge map to accomplish task goals. We train the robot to understand instructions involving higher-order (abstract) linguistic concepts of developmental complexity, which cannot be directly grounded in the physical world and are not pre-defined in the robot's static self-representation. Our interactive learning method grants flexible run-time acquisition of novel linguistic forms and real-world information, without retraining the cognitive model. Hence, the robot can adapt to new workspaces that include novel objects and task outcomes. We assess the potential of the proposed methodology in verification experiments with a humanoid robot. The results suggest robust capabilities of the model to link language bi-directionally with the physical environment and to solve a variety of manipulation tasks, starting with limited knowledge and gradually learning from run-time interaction with the tutor, beyond the pre-trained stage.
Affiliation(s)
- Ioanna Giorgi, Department of Computer Science, The University of Manchester, Manchester, United Kingdom
- Angelo Cangelosi, Department of Computer Science, The University of Manchester, Manchester, United Kingdom
- Giovanni L Masala, Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, United Kingdom

4. Sharkawy AN, Koustoumpardis PN, Aspragathos N. A recurrent neural network for variable admittance control in human–robot cooperation: simultaneously and online adjustment of the virtual damping and inertia parameters. Int J Intell Robot Appl 2020. DOI: 10.1007/s41315-020-00154-z. Cited in RCA: 6.

5. Combined Sensing, Cognition, Learning, and Control for Developing Future Neuro-Robotics Systems: A Survey. IEEE Trans Cogn Dev Syst 2019. DOI: 10.1109/tcds.2019.2897618. Cited in RCA: 13.

6. Yamada T, Matsunaga H, Ogata T. Paired Recurrent Autoencoders for Bidirectional Translation Between Robot Actions and Linguistic Descriptions. IEEE Robot Autom Lett 2018. DOI: 10.1109/lra.2018.2852838. Cited in RCA: 34.

7. Suzuki K, Mori H, Ogata T. Motion Switching With Sensory and Instruction Signals by Designing Dynamical Systems Using Deep Neural Network. IEEE Robot Autom Lett 2018. DOI: 10.1109/lra.2018.2853651. Cited in RCA: 13.

8. Murata S, Li Y, Arie H, Ogata T, Sugano S. Learning to Achieve Different Levels of Adaptability for Human–Robot Collaboration Utilizing a Neuro-Dynamical System. IEEE Trans Cogn Dev Syst 2018. DOI: 10.1109/tcds.2018.2797260. Cited in RCA: 5.

9. Nakajo R, Murata S, Arie H, Ogata T. Acquisition of Viewpoint Transformation and Action Mappings via Sequence to Sequence Imitative Learning by Deep Neural Networks. Front Neurorobot 2018; 12:46. PMID: 30087605; PMCID: PMC6066551; DOI: 10.3389/fnbot.2018.00046. Cited in RCA: 0.
Abstract
We propose an imitative learning model that allows a robot to acquire the positional relations between a demonstrator and the robot, and to transform observed actions into robotic actions. Providing robots with imitative capabilities allows us to teach them novel actions without resorting to trial-and-error approaches. Existing methods for imitative robotic learning require mathematical formulations or conversion modules to translate positional relations between demonstrators and robots. The proposed model uses two neural networks: a convolutional autoencoder (CAE) and a multiple-timescale recurrent neural network (MTRNN). The CAE is trained to extract visual features from raw images captured by a camera, and the MTRNN is trained to integrate sensory-motor information and to predict next states. We implemented this model on a robot and conducted sequence-to-sequence learning that allows the robot to transform demonstrator actions into robot actions. Through training, representations of actions, manipulated objects, and positional relations formed in the hierarchical structure of the MTRNN. After training, we confirmed the model's capability to generate unlearned imitative patterns.
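A minimal sketch of the sequence-to-sequence idea, not the paper's CAE/MTRNN implementation: an untrained Elman-style recurrent network maps a sequence of demonstrator-view feature vectors (which the paper would obtain from the CAE) to a sequence of robot action targets, one step at a time. All dimensions and weights here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_rnn(n_in, n_hidden, n_out, rng):
    """Random weights for a minimal Elman-style RNN (a stand-in for the MTRNN)."""
    s = 0.1
    return {
        "W_in":  rng.normal(0, s, (n_hidden, n_in)),
        "W_rec": rng.normal(0, s, (n_hidden, n_hidden)),
        "W_out": rng.normal(0, s, (n_out, n_hidden)),
    }

def seq2seq(params, demo_features):
    """Map a demonstrator feature sequence to a robot action sequence, step by step."""
    h = np.zeros(params["W_rec"].shape[0])  # recurrent state carries the sequence context
    actions = []
    for x in demo_features:
        h = np.tanh(params["W_in"] @ x + params["W_rec"] @ h)
        actions.append(params["W_out"] @ h)
    return np.stack(actions)

params = init_rnn(n_in=8, n_hidden=16, n_out=4, rng=rng)
demo = rng.normal(size=(10, 8))   # 10 timesteps of visual features (the paper gets these from a CAE)
acts = seq2seq(params, demo)      # 10 timesteps of robot action targets
```

The recurrent state is what lets the mapping depend on the whole observed demonstration so far, rather than on each frame in isolation.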
Affiliation(s)
- Ryoichi Nakajo, Department of Intermedia Art and Science, Waseda University, Tokyo, Japan
- Shingo Murata, Department of Modern Mechanical Engineering, Waseda University, Tokyo, Japan
- Hiroaki Arie, Department of Modern Mechanical Engineering, Waseda University, Tokyo, Japan
- Tetsuya Ogata, Department of Intermedia Art and Science, Waseda University, Tokyo, Japan

10. Taniguchi A, Taniguchi T, Cangelosi A. Cross-Situational Learning with Bayesian Generative Models for Multimodal Category and Word Learning in Robots. Front Neurorobot 2018; 11:66. PMID: 29311888; PMCID: PMC5742219; DOI: 10.3389/fnbot.2017.00066. Cited in RCA: 13.
Abstract
In this paper, we propose a Bayesian generative model that can form multiple categories for each sensory channel and can associate words with any of four sensory channels (action, position, object, and color). The paper focuses on cross-situational learning, using the co-occurrence between words and sensory-channel information in complex situations rather than the conventional settings of cross-situational learning. We conducted a learning scenario using a simulator and a real humanoid iCub robot, in which a human tutor provided the robot with a sentence describing an object of visual attention and an accompanying action. The scenario was set as follows: the number of words per sensory channel was three or four, and the number of learning trials was 20 or 40 for the simulator and 25 or 40 for the real robot. The experimental results showed that the proposed method was able to estimate the multiple categorizations and to learn the relationships between the sensory channels and words accurately. In addition, we conducted an action generation task and an action description task based on the word meanings learned in the cross-situational scenario; the results showed that the robot could successfully use the meanings learned with the proposed method.
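The flavour of cross-situational learning can be illustrated with a much simpler co-occurrence counter, not the paper's Bayesian generative model: across trials, each word ends up most strongly associated with the channel value it reliably co-occurs with. The trial data below is invented for illustration.

```python
from collections import defaultdict

# Invented trials: each pairs an utterance with the values observed on each sensory channel.
trials = [
    (["grasp", "red", "ball"], {"action": "grasp", "color": "red",  "object": "ball"}),
    (["push", "red", "box"],   {"action": "push",  "color": "red",  "object": "box"}),
    (["grasp", "blue", "box"], {"action": "grasp", "color": "blue", "object": "box"}),
]

# Count word / channel-value co-occurrences across situations.
counts = defaultdict(lambda: defaultdict(int))
for words, percept in trials:
    for w in words:
        for channel, value in percept.items():
            counts[w][(channel, value)] += 1

def meaning(word):
    """Associate a word with the channel value it co-occurs with most often."""
    return max(counts[word], key=counts[word].get)
```

No single trial disambiguates "red" (it co-occurs with "grasp", "ball", and "box" too), but across trials the color channel wins, which is exactly the cross-situational effect.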
Affiliation(s)
- Akira Taniguchi, Emergent Systems Laboratory, Ritsumeikan University, Kusatsu, Japan
- Angelo Cangelosi, The Centre for Robotics and Neural Systems, Plymouth University, Plymouth, United Kingdom

11. Yamada T, Murata S, Arie H, Ogata T. Representation Learning of Logic Words by an RNN: From Word Sequences to Robot Actions. Front Neurorobot 2017; 11:70. PMID: 29311891; PMCID: PMC5744442; DOI: 10.3389/fnbot.2017.00070. Cited in RCA: 7.
Abstract
An important characteristic of human language is compositionality: we can efficiently express a wide variety of real-world situations, events, and behaviors by compositionally constructing the meaning of a complex expression from a finite number of elements. Previous studies have analyzed how machine-learning models, particularly neural networks, can learn from experience to represent compositional relationships between language and robot actions, with the aim of understanding the symbol grounding structure and achieving intelligent communicative agents. Such studies have mainly dealt with words (nouns, adjectives, and verbs) that directly refer to real-world matters. In addition to these, the current study also deals with logic words such as "not," "and," and "or." These words do not refer directly to the real world but act as logical operators that contribute to the construction of sentence meaning, and they may be used often in human-robot communication. The current study builds a recurrent neural network model with long short-term memory units and trains it to translate sentences including logic words into robot actions. We investigate what kind of compositional representations, mediating between sentences and robot actions, emerge as the network's internal states through the learning process. Analysis after learning shows that referential words are merged with visual information and the robot's own current state, while the logic words are represented in accordance with their functions as logical operators. Words such as "true," "false," and "not" work as non-linear transformations that encode orthogonal phrases into the same region of a memory cell state space. The word "and," which required the robot to lift both its hands, worked as if it were a universal quantifier, and the word "or," which required action generation that looked apparently random, was represented as an unstable region of the network's dynamical system.
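To make the "logic words as operators" finding concrete, here is a toy hand-coded interpreter, not the learned LSTM from the paper: referential phrases map to action vectors, and "not" acts as an operator that transforms a phrase's vector into the complementary action. All phrases and vectors are invented.

```python
import numpy as np

# Invented action vectors for two referential phrases.
phrase_vec = {
    "raise left":  np.array([1.0, 0.0]),
    "raise right": np.array([0.0, 1.0]),
}

def interpret(tokens):
    """Compose a sentence meaning: 'not' does not refer to anything in the world,
    it operates on the phrase's vector, mirroring the logical-operator role above."""
    negate = tokens[0] == "not"
    phrase = " ".join(tokens[1:] if negate else tokens)
    v = phrase_vec[phrase]
    # Here 'not X' maps to the complementary action: swap the two coordinates.
    return v[::-1].copy() if negate else v

action = interpret(["not", "raise", "left"])  # same vector as "raise right"
```

The hand-written swap plays the role that the paper reports the LSTM learning on its own: a non-linear transformation that maps two orthogonal phrase meanings onto each other.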
Affiliation(s)
- Tatsuro Yamada, Department of Intermedia Art and Science, Waseda University, Tokyo, Japan
- Shingo Murata, Department of Modern Mechanical Engineering, Waseda University, Tokyo, Japan
- Hiroaki Arie, Department of Modern Mechanical Engineering, Waseda University, Tokyo, Japan
- Tetsuya Ogata, Department of Intermedia Art and Science, Waseda University, Tokyo, Japan

12. Xiao L, Zhang Y, Liao B, Zhang Z, Ding L, Jin L. A Velocity-Level Bi-Criteria Optimization Scheme for Coordinated Path Tracking of Dual Robot Manipulators Using Recurrent Neural Network. Front Neurorobot 2017; 11:47. PMID: 28928651; PMCID: PMC5591439; DOI: 10.3389/fnbot.2017.00047. Cited in RCA: 10.
Abstract
A dual-robot system is a robotic device composed of two robot arms. To eliminate joint-angle drift and prevent high joint velocities, a velocity-level bi-criteria optimization scheme, which combines two criteria (the minimum velocity norm and repetitive motion), is proposed and investigated for coordinated path tracking by dual robot manipulators. Specifically, two subschemes are first presented, one for the left and one for the right manipulator. These two subschemes are then reformulated as two general quadratic programs (QPs), which can in turn be combined into one unified QP. A recurrent neural network (RNN) is presented to solve the unified QP effectively. Finally, computer simulation results based on a dual three-link planar manipulator validate the feasibility and efficacy of the velocity-level optimization scheme for coordinated path tracking using the recurrent neural network.
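The kind of QP described above can be illustrated for a single arm. The sketch below states a bi-criteria objective of this general form and solves the equality-constrained QP in closed form via its KKT system; the paper instead solves the unified dual-arm QP online with a recurrent neural network, and the Jacobian, weights, and gains here are arbitrary placeholders.

```python
import numpy as np

def bicriteria_qdot(J, ve, q, q0, alpha=0.5, lam=2.0):
    """One arm's velocity-level bi-criteria resolution:
        minimize  alpha*||qdot||^2 + (1-alpha)*||qdot + lam*(q - q0)||^2
        subject to  J @ qdot = ve   (end-effector tracks the desired path velocity)
    The first term is the minimum-velocity-norm criterion; the second pulls the
    joints back toward q0 (repetitive motion, suppressing joint-angle drift)."""
    n, m = J.shape[1], J.shape[0]
    H = np.eye(n)                        # both criteria are identity-weighted in qdot
    c = (1.0 - alpha) * lam * (q - q0)   # linear term from the repetitive-motion criterion
    # KKT system of the equality-constrained QP: [[H, J^T], [J, 0]] [qdot; mu] = [-c; ve]
    K = np.block([[H, J.T], [J, np.zeros((m, m))]])
    rhs = np.concatenate([-c, ve])
    return np.linalg.solve(K, rhs)[:n]

# Arbitrary placeholder values for a 3-joint planar arm.
J = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
ve = np.array([0.1, -0.2])
q, q0 = np.array([0.3, -0.1, 0.2]), np.zeros(3)
qdot = bicriteria_qdot(J, ve, q, q0)   # satisfies J @ qdot = ve
```

A direct linear solve works offline; the appeal of the RNN solver in the paper is that it tracks the time-varying QP continuously as the path and Jacobian change.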
Affiliation(s)
- Lin Xiao, College of Information Science and Engineering, Jishou University, Jishou, China
- Yongsheng Zhang, College of Information Science and Engineering, Jishou University, Jishou, China
- Bolin Liao, College of Information Science and Engineering, Jishou University, Jishou, China
- Zhijun Zhang, School of Automation Science and Engineering, South China University of Technology, Guangzhou, China
- Lei Ding, College of Information Science and Engineering, Jishou University, Jishou, China
- Long Jin, School of Information Science and Engineering, Lanzhou University, Lanzhou, China