EBookClubs

Read Books & Download eBooks Full Online

Book Learning Mobile Manipulation Actions from Human Demonstrations: An Approach to Learning and Augmenting Action Models and Their Integration Into Task Representations

Download or read book Learning Mobile Manipulation Actions from Human Demonstrations: An Approach to Learning and Augmenting Action Models and Their Integration Into Task Representations written by Tim Welschehold. This book was released in 2020. Available in PDF, EPUB and Kindle. Book excerpt: Abstract: While incredible advancements in robotics have been achieved over the last decade, direct physical interaction with an initially unknown and dynamic environment is still a challenging problem. To be used as service assistants that take over household chores in the user's home, robots must be able to perform goal-directed manipulation tasks autonomously and, further, learn these tasks intuitively from their owners. Consider, for instance, the task of setting a breakfast table: although it is a relatively simple task for a human being, it poses serious challenges to a robot. The robot must physically handle the user's customized household environment and the objects therein, i.e., how can the items needed to set the table be grasped and moved, how can the kitchen cabinets be opened, and so on. Additionally, the user's personal preferences for how the breakfast table should be arranged must be respected. Given the diverse characteristics of custom objects and individual human needs, even a standard task like setting a breakfast table is impossible to pre-program before knowing the place of use and its particulars. Therefore, the most promising way to engage robots as domestic help is to enable them to learn the tasks they should perform directly from their owners, without requiring the owner to possess any special knowledge of robotics or programming skills. Throughout this thesis we present several contributions addressing these challenges. Although learning from demonstration is a well-established approach to teaching robots without explicit programming, most approaches in the literature learn manipulation actions through kinesthetic training, since these actions require thorough knowledge of the interactions between robot and object, which kinesthetic teaching provides directly without any need for abstraction. In addition, most current imitation learning approaches do not consider mobile platforms. In this thesis we present a novel approach to learning joint robot base and end-effector action models from observing demonstrations carried out by a human teacher. To achieve this, we adapt trajectory data obtained from RGB-D recordings of the human teacher performing the action to the capabilities of the robot. We formulate a graph optimization problem that links the observed human trajectories with the robot's grasping capabilities and with kinematic constraints between co-occurring base and gripper poses, allowing us to generate trajectories suitable for the robot. In a next step, we go beyond learning individual manipulation actions and combine several actions into one task. Challenges arise from handling ambiguous goals and generalizing the task to new settings. We present an approach that learns two representations from the same teacher demonstrations: one for the individual mobile manipulation actions described above, and one for the overall task intent. We leverage a framework based on Monte Carlo tree search to compute sequences of feasible actions that imitate the teacher's intention in new settings without explicitly specifying a task goal.
In this way, we can reproduce complex tasks while ensuring that all constituent actions are executable in the given setting. The mobile manipulation models mentioned above are encoded as dynamic systems to facilitate interaction with objects in world coordinates. However, this poses the challenge of translating the robot's kinematic constraints into the task space and including them in the action models. In this thesis we propose to couple robot base and end-effector motions generated by arbitrary dynamical systems by modulating the base velocity while respecting the robot's kinematic design. To this end, we learn a closed-form approximation of the inverse reachability and implement the coupling as an obstacle avoidance problem. Furthermore, we address the challenge of imitating manipulation actions whose execution depends on additional non-geometric quantities, e.g., contact forces when handing over an object or the measured liquid height when pouring water into a cup. We suggest an approach that incorporates this additional information, in the form of measured features, directly into the action models. These features are recorded in the demonstrations alongside the geometric route of the manipulation action, and their correlation is captured in a Gaussian mixture model that parametrizes the dynamic system used. This also enables us to couple the motion's geometric trajectory to the perceived features in the scene during action imitation. All of the above contributions were evaluated extensively in real-world robot experiments on a PR2 system and a KUKA iiwa robot arm.
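
To make the trajectory-adaptation idea concrete, here is a minimal sketch of the kind of graph optimization the abstract describes: adapted gripper and base nodes are pulled toward the observed human trajectories while penalties enforce a maximum base-gripper reach and base smoothness. All data, weights, and the planar setting are invented for illustration and do not come from the thesis itself.

```python
# Toy sketch of trajectory adaptation as an optimization over a pose graph.
import numpy as np
from scipy.optimize import minimize

# One 2D "node" per time step: observed hand track and an assumed base track.
demo_hand = np.array([[0.0, 0.0], [0.4, 0.1], [0.8, 0.3], [1.2, 0.6]])
demo_base = demo_hand - np.array([0.5, 0.0])
REACH = 0.6   # hypothetical maximum base-to-gripper distance

def cost(x):
    n = len(demo_hand)
    grip = x[:2 * n].reshape(n, 2)
    base = x[2 * n:].reshape(n, 2)
    # Data term: stay close to the observed human trajectories.
    c = np.sum((grip - demo_hand) ** 2) + np.sum((base - demo_base) ** 2)
    # Kinematic term: penalize base-gripper distances beyond the reach limit.
    d = np.linalg.norm(grip - base, axis=1)
    c += 100.0 * np.sum(np.maximum(d - REACH, 0.0) ** 2)
    # Smoothness term: discourage jumps between consecutive base poses.
    c += 10.0 * np.sum((base[1:] - base[:-1]) ** 2)
    return c

x0 = np.concatenate([demo_hand.ravel(), demo_base.ravel()])
res = minimize(cost, x0, method="L-BFGS-B")
print("adapted gripper nodes:\n", res.x[:8].reshape(4, 2).round(3))
```

Likewise, the coupling of motion to measured features via a Gaussian mixture model can be sketched as Gaussian mixture regression: fit a joint GMM over (feature, command) pairs and condition on the currently perceived feature. Again, the data and component count below are illustrative assumptions, not the thesis's actual models.

```python
# Toy sketch of feature-coupled motion via a joint GMM and conditioning (GMR).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
feat = rng.uniform(0.0, 1.0, 500)                 # e.g. measured liquid height
cmd = np.sin(2.0 * np.pi * feat) + 0.05 * rng.standard_normal(500)
gmm = GaussianMixture(n_components=5, random_state=0).fit(
    np.column_stack([feat, cmd]))

def conditioned_command(f):
    """Expected command given a perceived feature value f (1-D GMR)."""
    mu, cov, w = gmm.means_, gmm.covariances_, gmm.weights_
    var_f = cov[:, 0, 0]
    # Responsibility of each component for the observed feature value.
    lik = w * np.exp(-0.5 * (f - mu[:, 0]) ** 2 / var_f) / np.sqrt(var_f)
    h = lik / lik.sum()
    # Blend the per-component conditional means of the command dimension.
    cond = mu[:, 1] + cov[:, 1, 0] / var_f * (f - mu[:, 0])
    return float(h @ cond)

print(conditioned_command(0.25))   # should be close to sin(pi/2) = 1.0
```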

Book Developing a Mobile Manipulation System to Handle Unknown and Unstructured Objects

Download or read book Developing a Mobile Manipulation System to Handle Unknown and Unstructured Objects written by Abdulrahman Al-Shanoon. This book was released in 2021. Available in PDF, EPUB and Kindle. Book excerpt: Humans' exceptional ability to interact with unknown objects based on minimal prior experience is a lasting inspiration to the field of robotic manipulation. The recent revolution in industrial and service robots demands highly autonomous, intelligent mobile manipulators. The goal of the thesis is to develop an autonomous mobile robotic manipulation system that can handle unknown and unstructured objects with minimal training and human involvement. First, an end-to-end vision-based mobile manipulation architecture requiring minimal training, thanks to synthetic datasets, is proposed in this thesis. The system includes: 1) an effective training strategy for a perception network that estimates object poses, and 2) a visual servoing system that uses the resulting pose estimates as sensing feedback to achieve autonomous mobile manipulation. Experimental findings from simulations and real-world settings showed the effectiveness of computer-generated datasets, which generalize to the physical mobile manipulation task. The model of the presented robot is experimentally verified and discussed. Second, a challenging manipulation scenario involving unknown adjacent objects is addressed by a scalable self-supervised system that learns grasping control strategies for unknown objects from limited knowledge and simple sample objects. The developed learning scheme benefits both generalization and transferability without requiring any additional training or prior object awareness. Finally, an end-to-end self-learning framework is proposed to learn manipulation policies for challenging scenarios with minimal training time and raw experience. The proposed model learns from scratch, from visual observations to sequential decision-making and manipulation actions, and generalizes to unknown scenarios. The agent discovers sequences of manipulations that purposely lead to successful grasps. Results of the experiments demonstrated the effectiveness of learning across manipulation actions, through which the grasping success rate increased dramatically. The proposed system was successfully tested and validated in both simulation and real-world settings.
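
As a rough illustration of how a pose estimate can serve as sensing feedback in a visual servoing loop, here is a minimal position-based servoing step. The gain, poses, and grasp offset are invented for the example, and orientation control is omitted for brevity; this is a generic sketch, not the thesis's actual controller.

```python
# Minimal position-based visual-servoing step on a perceived object pose.
import numpy as np

LAMBDA = 0.5   # assumed proportional gain

def pbvs_step(ee_pos, est_obj_pos, grasp_offset):
    """Velocity command driving the end-effector toward the estimated grasp pose."""
    goal = est_obj_pos + grasp_offset   # grasp pose derived from perceived object
    return LAMBDA * (goal - ee_pos)     # proportional control on the pose error

ee = np.array([0.2, 0.0, 0.5])
obj = np.array([0.6, 0.1, 0.3])         # pose estimate from the perception network
print(pbvs_step(ee, obj, np.array([0.0, 0.0, 0.1])))
```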

Book Robot Learning from Human Demonstration

Download or read book Robot Learning from Human Demonstration written by Sonia Dechter and published by Springer Nature. This book was released on 2022-06-01 with a total of 109 pages. Available in PDF, EPUB and Kindle. Book excerpt: Learning from Demonstration (LfD) explores techniques for learning a task policy from examples provided by a human teacher. The field of LfD has grown into an extensive body of literature over the past 30 years, with a wide variety of approaches for encoding human demonstrations and modeling skills and tasks. Additionally, we have recently seen a focus on gathering data from non-expert human teachers (i.e., domain experts but not robotics experts). In this book, we provide an introduction to the field with a focus on the unique technical challenges associated with designing robots that learn from naive human teachers. We begin, in the introduction, with a unification of the various terminology seen in the literature as well as an outline of the design choices one has in designing an LfD system. Chapter 2 gives a brief survey of the psychology literature that provides insights from human social learning relevant to designing robotic social learners. Chapter 3 walks through an LfD interaction, surveying the design choices one makes and state-of-the-art approaches in prior work. First is the choice of input: how the human teacher interacts with the robot to provide demonstrations. Next is the choice of modeling technique. Currently, there is a dichotomy in the field between approaches that model low-level motor skills and those that model high-level tasks composed of primitive actions. We devote a chapter to each of these. Chapter 7 is devoted to interactive and active learning approaches that allow the robot to refine an existing task model. And finally, Chapter 8 provides best practices for evaluation of LfD systems, with a focus on how to approach experiments with human subjects in this domain.

Book Robot Programming by Demonstration

Download or read book Robot Programming by Demonstration written by Sylvain Calinon and published by EPFL Press. This book was released on 2009-08-24 with a total of 248 pages. Available in PDF, EPUB and Kindle. Book excerpt: Recent advances in RbD have identified a number of key issues for ensuring a generic approach to the transfer of skills across various agents and contexts. This book focuses on the two generic questions of what to imitate and how to imitate and proposes active teaching methods.

Book Understanding and Learning Robotic Manipulation Skills from Humans

Download or read book Understanding and Learning Robotic Manipulation Skills from Humans written by Elena Galbally Herrero. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Humans are constantly learning new skills and improving upon their existing abilities. In particular, when it comes to manipulating objects, humans are extremely effective at generalizing to new scenarios and using physical compliance to their advantage. Compliance is key to generating robust behaviors because it reduces the need to rely on precise trajectories. Programming robots through predefined trajectories has been highly successful for performing tasks in structured environments, such as assembly lines. However, such an approach is not viable for real-time operation in real-world scenarios. Inspired by humans, we propose to program robots at a higher level of abstraction by using primitives that leverage contact information and compliant strategies. Compliance increases robustness to uncertainty in the environment, and primitives provide us with atomic actions that can be reused to avoid coding new tasks from scratch. We have developed a framework that allows us to: (i) collect and segment human data from multiple contact-rich tasks through direct or haptic demonstrations, (ii) analyze this data and extract the human's compliant strategy, and (iii) encode the strategy into robot primitives using task-level controllers. During autonomous task execution, haptic interfaces enable real-time human intervention and additional data collection for recovery from failures. At the core of this framework is the notion of a compliant frame: an origin and three directions in space along and about which we control motion and compliance. The compliant frame is attached to the object being manipulated and, together with the desired task parameters, defines a primitive. Task parameters include desired forces, moments, positions, and orientations. This task specification provides a physically meaningful, low-dimensional, and robot-independent representation. This thesis presents a novel framework for learning manipulation skills from demonstration data. Leveraging compliant frames enables us to understand human actions and extract strategies that generalize across objects and robots. The framework was extensively validated through simulation and hardware experiments, including five real-world construction tasks.
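
The compliant-frame primitive described above is essentially a small, robot-independent data structure. Here is a minimal sketch of one possible encoding; all field names are assumptions for illustration, not the thesis's actual representation.

```python
# Hypothetical encoding of a compliant-frame primitive: an object-attached
# frame plus per-axis task parameters.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class CompliantFramePrimitive:
    origin: np.ndarray               # frame origin on the manipulated object
    axes: np.ndarray                 # 3x3 matrix; one control direction per column
    desired_position: np.ndarray     # target along motion-controlled axes
    desired_orientation: np.ndarray  # target rotation (3x3) about motion axes
    desired_force: np.ndarray        # setpoint along force-controlled axes
    desired_moment: np.ndarray       # moment setpoint about force-controlled axes
    # Per-axis selection: True = force-controlled, False = motion-controlled.
    force_controlled: np.ndarray = field(
        default_factory=lambda: np.array([False, False, True]))

# Example: press down with 5 N along z while tracking position in x and y.
press = CompliantFramePrimitive(
    origin=np.zeros(3),
    axes=np.eye(3),
    desired_position=np.array([0.1, 0.0, 0.0]),
    desired_orientation=np.eye(3),
    desired_force=np.array([0.0, 0.0, -5.0]),
    desired_moment=np.zeros(3))
print(press.force_controlled)
```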

Book Interactive Task Learning

Download or read book Interactive Task Learning written by Kevin A. Gluck and published by MIT Press. This book was released on 2019-08-16 with a total of 355 pages. Available in PDF, EPUB and Kindle. Book excerpt: Experts from a range of disciplines explore how humans and artificial agents can quickly learn completely new tasks through natural interactions with each other. Humans are not limited to a fixed set of innate or preprogrammed tasks. We learn quickly through language and other forms of natural interaction, and we improve our performance and teach others what we have learned. Understanding the mechanisms that underlie the acquisition of new tasks through natural interaction is an ongoing challenge. Advances in artificial intelligence, cognitive science, and robotics are leading us to future systems with human-like capabilities. A huge gap exists, however, between the highly specialized niche capabilities of current machine learning systems and the generality, flexibility, and in situ robustness of human instruction and learning. Drawing on expertise from multiple disciplines, this Strüngmann Forum Report explores how humans and artificial agents can quickly learn completely new tasks through natural interactions with each other. The contributors consider functional knowledge requirements, the ontology of interactive task learning, and the representation of task knowledge at multiple levels of abstraction. They explore natural forms of interactions among humans as well as the use of interaction to teach robots and software agents new tasks in complex, dynamic environments. They discuss research challenges and opportunities, including ethical considerations, and make proposals to further understanding of interactive task learning and create new capabilities in assistive robotics, healthcare, education, training, and gaming. Contributors: Tony Belpaeme, Katrien Beuls, Maya Cakmak, Joyce Y. Chai, Franklin Chang, Ropafadzo Denga, Marc Destefano, Mark d'Inverno, Kenneth D. Forbus, Simon Garrod, Kevin A. Gluck, Wayne D. Gray, James Kirk, Kenneth R. Koedinger, Parisa Kordjamshidi, John E. Laird, Christian Lebiere, Stephen C. Levinson, Elena Lieven, John K. Lindstedt, Aaron Mininger, Tom Mitchell, Shiwali Mohan, Ana Paiva, Katerina Pastra, Peter Pirolli, Roussell Rahman, Charles Rich, Katharina J. Rohlfing, Paul S. Rosenbloom, Nele Russwinkel, Dario D. Salvucci, Matthew-Donald D. Sangster, Matthias Scheutz, Julie A. Shah, Candace L. Sidner, Catherine Sibert, Michael Spranger, Luc Steels, Suzanne Stevenson, Terrence C. Stewart, Arthur Still, Andrea Stocco, Niels Taatgen, Andrea L. Thomaz, J. Gregory Trafton, Han L. J. van der Maas, Paul Van Eecke, Kurt VanLehn, Anna-Lisa Vollmer, Janet Wiles, Robert E. Wray III, Matthew Yee-King

Book Learning Mobile Manipulation

Download or read book Learning Mobile Manipulation written by David Joseph Watkins. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Providing mobile robots with the ability to manipulate objects has, despite decades of research, remained a challenging problem. The problem is approachable in constrained environments where there is ample prior knowledge of the environment layout and the manipulable objects. The challenge is in building systems that scale beyond specific situational instances and gracefully operate in novel conditions. In the past, researchers used heuristic and simple rule-based strategies to accomplish tasks such as scene segmentation or reasoning about occlusion. These heuristic strategies work in constrained environments where a roboticist can make simplifying assumptions about everything from the geometry of the objects to be interacted with, to the level of clutter, camera position, lighting, and a myriad of other relevant variables. The work in this thesis will demonstrate how to build a system for robotic mobile manipulation that is robust to changes in these variables. This robustness is enabled by recent simultaneous advances in the fields of big data, deep learning, and simulation.

Book Constructing Mobile Manipulation Behaviors Using Expert Interfaces and Autonomous Robot Learning

Download or read book Constructing Mobile Manipulation Behaviors Using Expert Interfaces and Autonomous Robot Learning written by Hai Dai Nguyen. This book was released in 2013. Available in PDF, EPUB and Kindle. Book excerpt: With current state-of-the-art approaches, development of a single mobile manipulation capability can be a labor-intensive process that presents an impediment to the creation of general-purpose household robots. At the same time, we expect that involving a larger community of non-roboticists can accelerate the creation of novel behaviors. We introduce a software authoring environment called ROS Commander (ROSCo) that allows end-users to create, refine, and reuse robot behaviors of complexity similar to those currently created by roboticists. Akin to Photoshop, which provides end-users with interfaces to advanced computer vision algorithms, our environment provides interfaces to mobile manipulation algorithmic building blocks that can be combined and configured to suit the demands of new tasks and their variations. As our system can be more demanding of users than alternatives such as kinesthetic guidance or learning from demonstration, we performed a user study with 11 able-bodied participants and one person with quadriplegia to determine whether computer-literate non-roboticists can learn to use our tool. In our study, all participants were able to construct functional behaviors after being trained. Furthermore, participants were able to produce behaviors that demonstrated a variety of creative manipulation strategies, showing the power of enabling end-users to author robot behaviors. Additionally, we show how autonomous robot learning, where the robot captures its own training data, can complement human authoring of behaviors by freeing users from the repetitive task of capturing data for learning. By taking advantage of the robot's embodiment, our method creates classifiers that use visual appearance to predict the 3D locations on home mechanisms where user-constructed behaviors will succeed. With active learning, we show that such classifiers can be learned from a small number of examples. We also show that this learning system works with behaviors constructed by non-roboticists in our user study. As far as we know, this is the first instance of perception learning with behaviors not hand-crafted by roboticists.
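
The active-learning idea, letting the robot pick its own next training example, can be sketched as plain uncertainty sampling on synthetic data. The features, labels, and classifier below are illustrative stand-ins, not the system's actual perception pipeline.

```python
# Toy uncertainty-sampling loop: query the candidate location whose
# predicted success probability is closest to 0.5.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))            # candidate 3D locations
y = (X[:, 0] + X[:, 2] > 0).astype(int)          # hidden success outcomes

score = X[:, 0] + X[:, 2]
labeled = [int(np.argmax(score)), int(np.argmin(score))]  # one seed per class
for _ in range(10):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X)[:, 1]
    pool = [i for i in range(len(X)) if i not in labeled]
    # Try the location the classifier is least certain about.
    labeled.append(min(pool, key=lambda i: abs(proba[i] - 0.5)))

print("accuracy after 12 trials:", clf.score(X, y))
```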

Book Computational Human-Robot Interaction

Download or read book Computational Human-Robot Interaction written by Andrea Thomaz. This book was released on 2016-12-20 with a total of 140 pages. Available in PDF, EPUB and Kindle. Book excerpt: Computational Human-Robot Interaction provides the reader with a systematic overview of the field of Human-Robot Interaction over the past decade, with a focus on the computational frameworks, algorithms, techniques, and models currently used to enable robots to interact with humans.

Book IROS

Download or read book IROS. This book was released in 2001 with a total of 654 pages. Available in PDF, EPUB and Kindle.

Book Trust in Human-Robot Interaction

Download or read book Trust in Human-Robot Interaction written by Chang S. Nam and published by Academic Press. This book was released on 2020-11-17 with a total of 616 pages. Available in PDF, EPUB and Kindle. Book excerpt: Trust in Human-Robot Interaction addresses the gamut of factors that influence trust in robotic systems. The book presents the theory, fundamentals, techniques and diverse applications of the behavioral, cognitive and neural mechanisms of trust in human-robot interaction, covering topics like individual differences, transparency, communication, physical design, privacy and ethics. - Presents a repository of the open questions and challenges in trust in HRI - Includes contributions from many disciplines participating in HRI research, including psychology, neuroscience, sociology, engineering and computer science - Examines human information processing as a foundation for understanding HRI - Details the methods and techniques used to test and quantify trust in HRI

Book Learning Robotic Manipulation from User Demonstrations

Download or read book Learning Robotic Manipulation from User Demonstrations written by Rouhollah Rahmatizadeh. This book was released in 2017 with a total of 65 pages. Available in PDF, EPUB and Kindle. Book excerpt: Personal robots that help disabled or elderly people in their activities of daily living need to be able to autonomously perform complex manipulation tasks. Traditional approaches to this problem employ task-specific controllers. However, these must be designed by expert programmers, are focused on a single task, and will perform the task as programmed, not according to the preferences of the user. In this dissertation, we investigate methods that enable an assistive robot to learn to execute tasks as demonstrated by the user. First, we describe a learning from demonstration (LfD) method that learns assistive tasks that need to be adapted to the position and orientation of the user's head. Then we discuss a recurrent neural network controller that learns to generate movement trajectories for the end-effector of the robot arm to accomplish a task. The input to this controller is the pose of related objects and the current pose of the end-effector itself. Next, we discuss how to extract user preferences from the demonstrations using reinforcement learning. Finally, we extend this controller to one that learns to observe images of the environment and generate joint movements for the robot to accomplish a desired task. We discuss several techniques that improve the performance of the controller and reduce the number of required demonstrations. One of these is multi-task learning: learning multiple tasks simultaneously with the same neural network. Another is to make the controller output one joint per time step, thereby conditioning the prediction of each joint on the previously predicted joints. We evaluate these controllers on a set of manipulation tasks and show that they can learn complex tasks, overcome failure, and attempt a task several times until they succeed.
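
The one-joint-per-step decoding described above can be sketched as follows. The "network" is replaced by a toy stand-in function, since the point is only the autoregressive conditioning of each joint on the joints already predicted; nothing here is the dissertation's actual model.

```python
# Sketch of autoregressive joint decoding: each joint command is predicted
# conditioned on the observation and the joints already emitted.
import numpy as np

N_JOINTS = 7

def predict_next_joint(obs, previous_joints):
    """Toy stand-in for the learned controller head."""
    prev = previous_joints[-1] if previous_joints else 0.0
    return 0.1 * obs[len(previous_joints) % len(obs)] + 0.5 * prev

def decode_action(obs):
    joints = []
    for _ in range(N_JOINTS):
        # Conditioning on `joints` is the key autoregressive step.
        joints.append(predict_next_joint(obs, joints))
    return np.array(joints)

obs = np.array([0.3, -0.1, 0.7])   # e.g. object pose plus end-effector pose
print(decode_action(obs).round(4))
```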

Book Learning Multi-step Robotic Manipulation Tasks Through Visual Planning

Download or read book Learning Multi-step Robotic Manipulation Tasks Through Visual Planning written by Sulabh Kumra. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: "Multi-step manipulation tasks in unstructured environments are extremely challenging for a robot to learn. Such tasks interlace high-level reasoning, which consists of the expected states that can be attained to achieve an overall task, and low-level reasoning, which decides what actions will yield these states. A model-free deep reinforcement learning method is proposed to learn multi-step manipulation tasks. This work introduces a novel Generative Residual Convolutional Neural Network (GR-ConvNet) model that can generate robust antipodal grasps from n-channel image input at real-time speeds (20 ms). The proposed model architecture achieved state-of-the-art accuracy on three standard grasping datasets. The adaptability of the proposed approach is demonstrated by directly transferring the trained model to a 7-DoF robotic manipulator, with grasp success rates of 95.4% and 93.0% on novel household and adversarial objects, respectively. A novel Robotic Manipulation Network (RoManNet), a vision-based model architecture, is introduced to learn the action-value functions and predict manipulation action candidates. A Task Progress based Gaussian (TPG) reward function is defined to compute the reward based on actions that lead to successful motion primitives and progress toward the overall task goal. To balance the exploration/exploitation ratio, this research introduces a Loss Adjusted Exploration (LAE) policy that selects actions from the action candidates according to the Boltzmann distribution of loss estimates. The effectiveness of the proposed approach is demonstrated by training RoManNet to learn several challenging multi-step robotic manipulation tasks in both simulation and the real world. Experimental results show that the proposed method outperforms existing methods and achieves state-of-the-art performance in terms of success rate and action efficiency. The ablation studies show that TPG and LAE are especially beneficial for tasks like multiple block stacking."--Abstract.
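
The Boltzmann selection underlying the LAE policy can be illustrated in a few lines. This is a generic softmax-over-negative-losses sketch with a made-up temperature and loss values, not the paper's exact formulation.

```python
# Generic Boltzmann selection over per-action loss estimates: lower
# estimated loss yields a higher sampling probability.
import numpy as np

def boltzmann_select(loss_estimates, temperature=0.5, rng=None):
    rng = rng or np.random.default_rng()
    logits = -np.asarray(loss_estimates, dtype=float) / temperature
    p = np.exp(logits - logits.max())   # subtract max for numerical stability
    p /= p.sum()
    return int(rng.choice(len(p), p=p)), p

losses = [0.9, 0.2, 0.4]                # hypothetical per-candidate losses
idx, probs = boltzmann_select(losses, rng=np.random.default_rng(0))
print(idx, probs.round(3))
```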

Book A Concise Introduction to Models and Methods for Automated Planning

Download or read book A Concise Introduction to Models and Methods for Automated Planning written by Hector Radanovic and published by Springer Nature. This book was released on 2022-05-31 with a total of 132 pages. Available in PDF, EPUB and Kindle. Book excerpt: Planning is the model-based approach to autonomous behavior, in which the agent's behavior is derived automatically from a model of its actions, sensors, and goals. The main challenges in planning are computational, as all models, whether featuring uncertainty and feedback or not, are intractable in the worst case when represented in compact form. In this book, we look at a variety of models used in AI planning and at the methods that have been developed for solving them. The goal is to provide a modern and coherent view of planning that is precise, concise, and mostly self-contained, without being shallow. For this, we make no attempt at covering the whole variety of planning approaches, ideas, and applications, and focus on the essentials. The target audience of the book is students and researchers interested in autonomous behavior and planning from an AI, engineering, or cognitive science perspective. Table of Contents: Preface / Planning and Autonomous Behavior / Classical Planning: Full Information and Deterministic Actions / Classical Planning: Variations and Extensions / Beyond Classical Planning: Transformations / Planning with Sensing: Logical Models / MDP Planning: Stochastic Actions and Full Feedback / POMDP Planning: Stochastic Actions and Partial Feedback / Discussion / Bibliography / Author's Biography
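
For the MDP planning model mentioned in the table of contents, a minimal value-iteration sketch on a made-up four-state chain shows how behavior is derived automatically from a model of actions and goals; the states, transitions, and rewards are invented for the example.

```python
# Minimal value iteration on a four-state chain with two actions
# (move left / move right) and a goal reward in the last state.
import numpy as np

N_S, N_A, GAMMA = 4, 2, 0.95
P = np.zeros((N_A, N_S, N_S))            # P[a, s, s']: transition probabilities
for s in range(N_S):
    P[0, s, max(s - 1, 0)] = 1.0         # action 0: move left
    P[1, s, min(s + 1, N_S - 1)] = 1.0   # action 1: move right
R = np.array([0.0, 0.0, 0.0, 1.0])       # reward for being in the goal state

V = np.zeros(N_S)
for _ in range(100):                     # Bellman backups until (near) convergence
    V = R + GAMMA * np.max(P @ V, axis=0)
policy = np.argmax(P @ V, axis=0)
print("values:", V.round(2), "policy:", policy)   # policy should move right
```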

Book Springer Handbook of Robotics

Download or read book Springer Handbook of Robotics written by Bruno Siciliano and published by Springer. This book was released on 2016-07-27 with a total of 2259 pages. Available in PDF, EPUB and Kindle. Book excerpt: The second edition of this handbook provides a state-of-the-art overview of the various aspects of the rapidly developing field of robotics. Reaching for the human frontier, robotics is vigorously engaged in the growing challenges of new emerging domains. Interacting, exploring, and working with humans, the new generation of robots will increasingly touch people and their lives. The credible prospect of practical robots among humans is the result of the scientific endeavour of half a century of robotic developments that established robotics as a modern scientific discipline. The ongoing vibrant expansion and strong growth of the field during the last decade has fueled this second edition of the Springer Handbook of Robotics. The first edition of the handbook soon became a landmark in robotics publishing and won the Association of American Publishers PROSE Award for Excellence in Physical Sciences & Mathematics as well as the organization's Award for Engineering & Technology. The second edition of the handbook, edited by two internationally renowned scientists with the support of an outstanding team of seven part editors and more than 200 authors, continues to be an authoritative reference for robotics researchers, newcomers to the field, and scholars from related disciplines. The contents have been restructured to achieve four main objectives: the enlargement of foundational topics for robotics, the enlightenment of design of various types of robotic systems, the extension of the treatment on robots moving in the environment, and the enrichment of advanced robotics applications. Further to an extensive update, fifteen new chapters have been introduced on emerging topics, and a new generation of authors have joined the handbook's team. A novel addition to the second edition is a comprehensive collection of multimedia references to more than 700 videos, which bring valuable insight into the contents. The videos can be viewed directly, augmented into the text, with a smartphone or tablet using a unique and specially designed app. Springer Handbook of Robotics Multimedia Extension Portal: http://handbookofrobotics.org/

Book Learning Affordance, Environment, and Interaction Representations by Watching People in Video

Download or read book Learning Affordance, Environment, and Interaction Representations by Watching People in Video written by Tushar Nagarajan. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Understanding how to use a physical environment is a key requirement for embodied AI agents operating in human spaces. This involves both navigation (identifying obstacles to avoid, or semantically relevant objects to search for and move towards) and interaction with the environment (for example, opening doors, turning on lights, picking and placing objects, using tools like knives, or appliances like dishwashers). Current embodied vision models are proficient at detecting and naming things (objects, obstacles, scenes), making them especially qualified for the former navigation challenges. However, they are often unconcerned with how the things around them could be used, how they change upon interaction, and how they influence agent behavior. These factors are crucial for building embodied agents that can interact with and modify their environment to accomplish complex tasks. In contrast, humans can both navigate in and interact with their surroundings from years of experience, and can effortlessly reason about their actions in the context of new environments. Videos of human activity thus offer an immediate window into this experience and can be used to target the deficiencies of current embodied vision models. In this dissertation, I study how to upgrade embodied vision models with the capacity to reason about objects and their persistent environment, and to model agent actions within it, by watching humans navigate in and use their own environments. I study this problem along three axes: objects, environments, and embodied skill. Objects: First, I study how objects change and transform visually. Objects are dynamic, have specific functionality, and occur in various states. I propose a model that characterizes the underlying visual transformation that causes object state changes (e.g., a whole apple becoming a sliced apple). While this models the dynamic nature of objects in isolation, it does not reveal how an agent would interact with them to cause such changes. To this end, I propose a method to anticipate "interaction hotspots", spatial affordance maps indicating how an object would be manipulated during different interactions, directly from video of people using these objects and without the need for manual labels. Environments: Next, shifting the focus from individual objects to the overall physical space that the agent uses, I propose approaches to learn environment-centric representations from egocentric video that encode the environment at two levels, global and local. On the global level, I propose a topological, graph-structured video representation that links first-person video of human activity with the environment zones it is captured in. The collective knowledge of activities that take place in each zone naturally reveals environment affordances ("what actions can I do here?") and produces a strong encoding of long-term activity. On the local level, I propose an approach to learn representations that are predictive of the camera-wearer's (potentially unseen) local surroundings using walkthroughs from simulated agents in 3D environments. Such a representation implicitly widens the field of view for tasks that reason about short video clips by providing a way to access features of their surroundings in a persistent, geometrically consistent manner.
Embodied skill: Finally, equipped with environment-aware embodied vision models, I propose approaches to leverage them to solve downstream interaction tasks. I propose a model that learns priors from egocentric videos about how humans bring objects together to enable activities, and encodes them into a reward function for embodied agents that allows efficient exploration of interactions while solving complex, multi-step tasks. Further, in new environments where a person is unfamiliar with the objects, tools, or the space itself, I devise an approach that trains visual affordance models of where actions are likely to succeed directly from agent actions. I introduce the "interaction exploration" task, in which an agent must autonomously discover objects to interact with (and which interactions are valid with them) to train the affordance model online, thus priming it to address various downstream interaction tasks. My results show the importance of learning visual models of objects, environments, and the agents interacting with them, and highlight how to build such models by watching people. For object understanding, my research enables training models that go beyond appearance-based cues (what objects look like) and instead capture functional properties (how objects are used). My research on environment understanding leads to video models that do not just encode the instantaneous activity in a short window of time, but produce representations of human activity that are grounded in the structure of the underlying environment over long time periods. Finally, for embodied skill learning, my research paves the way for embodied agents to benefit from the object and environment understanding models I have proposed, thus providing a path for human experience to directly benefit agent skill learning. In the last chapter of my dissertation, I discuss future research directions leading to my overarching research goal beyond this dissertation.

Book AI and education

Download or read book AI and education written by Miao, Fengchun and published by UNESCO Publishing. This book was released on 2021-04-08 with a total of 50 pages. Available in PDF, EPUB and Kindle. Book excerpt: Artificial Intelligence (AI) has the potential to address some of the biggest challenges in education today, innovate teaching and learning practices, and ultimately accelerate progress towards SDG 4. However, these rapid technological developments inevitably bring multiple risks and challenges, which have so far outpaced policy debates and regulatory frameworks. This publication offers guidance for policy-makers on how best to leverage the opportunities and address the risks presented by the growing connection between AI and education. It starts with the essentials of AI: definitions, techniques and technologies. It continues with a detailed analysis of the emerging trends and implications of AI for teaching and learning, including how we can ensure the ethical, inclusive and equitable use of AI in education, how education can prepare humans to live and work with AI, and how AI can be applied to enhance education. It finally introduces the challenges of harnessing AI to achieve SDG 4 and offers concrete actionable recommendations for policy-makers to plan policies and programmes for local contexts. [Publisher summary, ed]