To achieve optimal robot behavior in dynamic scenarios, we need to consider complex dynamics in a predictive manner. In the vehicle dynamics community, it is well known that to achieve time-optimal driving on low-friction surfaces, the vehicle should utilize drifting. Hence, many authors have devised rules to split circuits and employ drifting on some segments. These rules are suboptimal and do not generalize to arbitrary circuit shapes (e.g., S-like curves). So, the question “When to go into which mode and how to drive in it?” remains unanswered. To choose the suitable mode (discrete decision), the algorithm needs information about the feasibility of different modes (continuous motion). This makes it a class of Task and Motion Planning (TAMP) problems, which are known to be hard to solve optimally in real time. In the AI planning community, search methods are commonly used. However, they cannot be directly applied to TAMP problems due to the continuous component. Here, we present a search-based method that effectively solves this problem and efficiently searches in a high-dimensional state space with nonlinear and unstable dynamics. The space of possible trajectories is explored by sampling different combinations of motion primitives guided by the search. Our approach allows the use of multiple locally approximated models to generate motion primitives (e.g., learned models of drifting) and effectively simplifies the problem without losing accuracy. The algorithm performance is evaluated in simulated driving on a mixed track with segments of different curvatures (right and left). Our code is available at https://git.io/JenvB.
Robotic Packaging Optimization with Reinforcement Learning
Eveline Drijver, Rodrigo Pérez-Dattari, Jens Kober, Cosimo Della Santina, and Zlatan Ajanović
In 2023 IEEE International Conference on Automation Science and Engineering, 2023
Intelligent manufacturing is becoming increasingly important due to the growing demand for maximizing productivity and flexibility while minimizing waste and lead times. This work investigates automated secondary robotic food packaging solutions that transfer food products from the conveyor belt into containers. A major problem in these solutions is varying product supply, which can cause drastic productivity drops. Conventional rule-based approaches used to address this issue are often inadequate, leading to violations of the industry’s requirements. Reinforcement learning, on the other hand, has the potential to solve this problem by learning a responsive and predictive policy based on experience. However, it is challenging to utilize it in highly complex control schemes. In this paper, we propose a reinforcement learning framework designed to optimize the conveyor belt speed while minimizing interference with the rest of the control system. When tested on real-world data, the framework exceeds the performance requirements (99.8% packed products) and maintains quality (100% filled boxes). Compared to the existing solution, our proposed framework improves productivity, has smoother control, and reduces computation time.
2022
Vision for Bosnia and Herzegovina in Artificial Intelligence Age: Global Trends, Potential Opportunities, Selected Use-cases and Realistic Goals
Zlatan Ajanović, Emina Aličković, Aida Branković, Sead Delalić, Eldar Kurtić, Salem Malikić, Adnan Mehonić, Hamza Merzić, Kenan Šehić, and Bahrudin Trbalić
In Scientific-Professional Conference “Artificial Intelligence in Bosnia and Herzegovina”- Research, Application and Development Perspectives, 2022
Artificial Intelligence (AI) is one of the most promising technologies of the 21st century, with an already noticeable impact on society and the economy. With this work, we provide a short overview of global trends, applications in industry, and selected use-cases from our international experience and work in industry and academia. The goal is to present global and regional positive practices and provide an informed opinion on the realistic goals and opportunities for positioning B&H on the global AI scene.
Interactive Imitation Learning in Robotics: A Survey
Carlos Celemin, Rodrigo Pérez-Dattari, Eugenio Chisari, Giovanni Franzese, Leandro Souza Rosa, Ravi Prakash, Zlatan Ajanović, Marta Ferraz, Abhinav Valada, and Jens Kober
Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL) where human feedback is provided intermittently during robot execution allowing an online improvement of the robot’s behavior.
In recent years, IIL has increasingly started to carve out its own space as a promising data-driven alternative for solving complex robotic tasks. The advantages of IIL are twofold, 1) it is data-efficient, as the human feedback guides the robot directly towards an improved behavior (in contrast with Reinforcement Learning (RL), where behaviors must be discovered by trial and error), and 2) it is robust, as the distribution mismatch between the teacher and learner trajectories is minimized by providing feedback directly over the learner’s trajectories (as opposed to offline IL methods such as Behavioral Cloning).
Nevertheless, despite the opportunities that IIL presents, its terminology, structure, and applicability are neither clear nor unified in the literature, which slows down its development and, therefore, the research of innovative formulations and discoveries.
In this work, we attempt to facilitate research in IIL and lower entry barriers for new practitioners by providing a survey of the field that unifies and structures it. In addition, we aim to raise awareness of its potential, what has been accomplished, and which research questions remain open.
We organize the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception about the learning process), applications, and benchmarks. Furthermore, we analyze similarities and differences between IIL and RL, providing a discussion on how the concepts offline, online, off-policy and on-policy learning should be transferred to IIL from the RL literature.
We particularly focus on robotic applications in the real world and discuss their implications, limitations, and promising future areas of research.
PARTNR: Pick and Place Ambiguity Resolving by Trustworthy iNteractive leaRning
Jelle Luijkx, Zlatan Ajanovic, Laura Ferranti, and Jens Kober
Several recent works show impressive results in mapping language-based human commands and image scene observations to direct robot executable policies (e.g., pick and place poses). However, these approaches do not consider the uncertainty of the trained policy and simply always execute actions suggested by the current policy as the most probable ones. This makes them vulnerable to domain shift and inefficient in the number of required demonstrations. We extend previous works and present the PARTNR algorithm, which can detect ambiguities in the trained policy by analyzing multiple modalities in the pick and place poses using topological analysis. PARTNR employs an adaptive, sensitivity-based gating function that decides if additional user demonstrations are required. User demonstrations are aggregated to the dataset and used for subsequent training. In this way, the policy can adapt promptly to domain shift and minimize the number of demonstrations required for a well-trained policy. The adaptive threshold makes it possible to reach a user-acceptable level of ambiguity for executing the policy autonomously and, in turn, increases the trustworthiness of our system. We demonstrate the performance of PARTNR in a table-top pick and place task.
2020
A Multi-Heuristic Search-Based Motion Planning for Autonomous Parking
Bhargav Adabala, and Zlatan Ajanovic
In 30th International Conference on Automated Planning and Scheduling: Planning and Robotics Workshop, 2020
Planning is a crucial component of autonomous vehicle control. It is responsible for finding a collision-free sequence of states that takes the vehicle towards its goal. In unstructured environments like parking lots or construction sites, real-time planning is challenging due to the large search space and the kinodynamic constraints of the vehicle. Several state-of-the-art solutions utilize heuristic search-based planning algorithms. However, they rely heavily on the quality of the single heuristic function used to guide the search, and they are not capable of achieving reasonable performance, resulting in unnecessary delays in the response of the vehicle. This work solves the planning problem by adopting a Multi-Heuristic Search approach that enables the use of multiple heuristic functions and their advantages to capture different complexities of a given search space. To the best of our knowledge, this approach has not been used in this domain so far. For this purpose, multiple admissible and non-admissible heuristic functions are defined, the original Multi-Heuristic A* Search is extended for bidirectional use and for dealing with a hybrid continuous-discrete search space, and a mechanism for adapting the scale of motion primitives is introduced. To demonstrate the advantage, the Multi-Heuristic A* algorithm is benchmarked against a very popular heuristic search-based algorithm, Hybrid A*. The Multi-Heuristic A* algorithm outperformed Hybrid A* in terms of computational efficiency and motion plan (path) quality.
Search-Based Motion Planning for Performance Autonomous Driving
Zlatan Ajanovic, Enrico Regolin, Georg Stettinger, Martin Horn, and Antonella Ferrara
In Advances in Dynamics of Vehicles on Roads and Tracks, 2020
Driving on the limits of vehicle dynamics requires predictive planning of future vehicle states. In this work, search-based motion planning is used to generate suitable reference trajectories of dynamic vehicle states with the goal of achieving the minimum lap time on slippery roads. The search-based approach makes it possible to explicitly consider a nonlinear vehicle dynamics model as well as constraints on states and inputs, so that even challenging scenarios can be handled in a safe and optimal way. The algorithm performance is evaluated in simulated driving on a track with segments of different curvatures. Our code is available at https://git.io/JenvB.
Closed-loop validation of autonomous vehicles is an open problem that significantly influences the development and adoption of this technology. The main contribution of this paper is a novel approach to reproducible, scenario-based validation that decouples the problem into several sub-problems without breaking the crucial couplings. First, a realistic scenario is generated from real urban traffic. Second, human participants drive in a virtual scenario (in a driving simulator) based on the real traffic. Third, human and automated driving trajectories are reproduced and compared in a real vehicle on an empty track without traffic. Thus, the benefits of automation with respect to safety, efficiency, and comfort can be clearly benchmarked in a reproducible manner. The presented approach is used to benchmark the performance of the SBOMP planner in one scenario and to validate SuperHuman driving performance.
2019
A Novel Approach to Model Exploration for Value Function Learning
Zlatan Ajanovic, Halil Beglerovic, and Bakir Lacevic
In RSS 2019 Workshop on Combining Learning and Reasoning, 2019
Planning and Learning are complementary approaches. Planning relies on deliberative reasoning about the current state and the sequence of future reachable states to solve the problem. Learning, on the other hand, is focused on improving system performance based on experience or available data. Learning to improve the performance of planning based on experience in similar, previously solved problems is ongoing research. One approach is to learn a value function (cost-to-go), which can be used as a heuristic for speeding up search-based planning. Existing approaches in this direction use the results of the previous search for learning the heuristic. In this work, we present a search-inspired approach of systematic model exploration for learning the value function, which does not stop when a plan is available but rather prolongs the search so that not only the resulting optimal path is used but also an extended region around it. This, in turn, improves both the efficiency and robustness of successive planning. Additionally, the effect of losing admissibility by using an ML heuristic is managed by bounding the ML heuristic with other admissible heuristics.
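The last sentence of the abstract above mentions bounding a learned heuristic with admissible ones. A minimal sketch of one way this could look is given below; the clamping rule, the function name `bounded_heuristic`, and the inflation factor `epsilon` are illustrative assumptions, not details taken from the paper.

```python
def bounded_heuristic(h_learned, h_admissible, epsilon=1.5):
    """Clamp a learned estimate between h_admissible and epsilon * h_admissible.

    Assumption for illustration: h_admissible never overestimates the true
    cost-to-go, so the clamped value never exceeds epsilon times that cost.
    """
    def h(state):
        lower = h_admissible(state)   # optimistic (admissible) estimate
        upper = epsilon * lower       # cap on how far the ML value may inflate it
        return min(max(h_learned(state), lower), upper)
    return h


# Toy 1-D example: the goal is at position 10 and each step costs 1.
h_adm = lambda s: abs(10 - s)              # exact distance, hence admissible
h_over = lambda s: 3 * abs(10 - s) + 1     # stand-in for an overestimating model
h_under = lambda s: 0.0                    # stand-in for an uninformative model

h_capped = bounded_heuristic(h_over, h_adm, epsilon=1.5)
h_raised = bounded_heuristic(h_under, h_adm, epsilon=1.5)
```

With this clamp, an overestimating model is cut back to at most `epsilon` times the admissible value, which is the same kind of inflation that weighted A* applies, and an underestimating model is raised to the admissible value.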
This thesis presents a behavior and motion planning method that enables Autonomous Vehicles (AV) to achieve SuperHuman driving performance in terms of safety, efficiency, and comfort. The developed method enables a synergy of research in behavior and motion planning for Automated Driving with research in the eco-driving community, which target mainly complementary problem variations. The established approach in eco-driving considers long planning horizons and multiple constraints (e.g., traffic lights and speed limits), but exclusively single-lane driving. On the other hand, motion planning for Automated Driving considers multilane driving, but short planning horizons and decoupled (or hierarchical) solutions, focused on effectively reacting to changing situations and not on long-term optimal behavior.
As a result of the synergy, the developed search-based optimal motion planning (SBOMP) solution enables optimal Automated Driving scalable to various challenging scenarios in urban, rural, and highway environments. As a highlight, SBOMP enables what is believed to be the first demonstration of optimal multilane driving in dense traffic with traffic lights, while achieving SuperHuman driving performance. Even though this scenario is common in everyday driving, it was not tackled by any of these research communities before.
The presented SBOMP framework is also extended to a third use-case, Performance Driving. By considering a more detailed vehicle model, SBOMP enables minimum lap-time driving on a slippery road, effectively entering and exiting drifting maneuvers and switching between right and left turns.
The presented work is extensively tested in simulation and benchmarked against human driving behavior acquired in a driving simulator study and in in-vehicle testing on a proving ground. The results show that in a challenging urban driving scenario with traffic lights, the AV outperforms even the best human drivers in terms of safety, efficiency, and comfort. While human drivers violate traffic rules and even cause crashes, the AV, by using predictive planning, manages to drive smoothly through the traffic.
Hopefully, this work contributes to the effort to make Autonomous Vehicles the first mass product of intelligent mobile robots in our society.
Predictive motion planning is key to achieving energy-efficient driving, which is one of the major visions of automated driving nowadays. Motion planning is a challenging task, especially in the presence of other dynamic traffic participants. Two main issues have to be addressed. First, for globally optimal driving, the entire trip has to be considered at once. Second, the movement of other traffic participants is usually not known in advance. Both issues lead to increased computational effort. The prediction horizon is usually long, and the unknown future movement of other traffic participants usually requires frequent replanning. This work proposes a novel motion planning approach for vehicles operating in dynamic environments. The above-mentioned problems are addressed by splitting the planning into a strategic planning part and a situation-dependent replanning part. Strategic planning is done without considering other dynamic participants and is reused later in order to lower the computational effort during the replanning phase.
A Novel Model-Based Heuristic for Energy-Optimal Motion Planning for Automated Driving
Predictive motion planning is key to achieving energy-efficient driving, which is one of the main benefits of automated driving. Researchers have been studying the planning of velocity trajectories, a simpler form of motion planning, for over a decade now, and many different methods are available. Dynamic programming has proven to be the most common choice due to its numerical background and ability to include nonlinear constraints and models. Although planning of an optimal trajectory is done in a systematic way, dynamic programming does not use any knowledge about the considered problem to guide the exploration and therefore explores all possible trajectories.
A* is a search algorithm that uses knowledge about the problem to guide the exploration to the most promising solutions first. This knowledge has to be represented in the form of a heuristic function, which gives an optimistic estimate of the cost of transitioning to the final state; designing such a function is not a straightforward task. This paper presents a novel heuristic incorporating air drag and auxiliary power as well as operational costs of the vehicle, in addition to the kinetic and potential energy and rolling resistance known in the literature. Furthermore, the optimal cruising velocity, which depends on vehicle aerodynamic properties and auxiliary power, is derived. Results are compared for different variants of heuristic functions and for dynamic programming as well.
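To make the role of the heuristic function concrete, here is a minimal, generic A* sketch on a toy weighted graph. It is not the paper's energy model: the graph, the edge costs, and the heuristic table `h_table` are made-up values chosen so that the heuristic is optimistic (it never exceeds the true remaining cost).

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Return (cost, path) of a cheapest path, or (inf, []) if none exists."""
    open_heap = [(heuristic(start), 0.0, start)]   # entries are (f = g + h, g, node)
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:                # walk parents back to the start
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        if g > best_cost[node]:
            continue                               # stale entry, a cheaper one was handled
        for nxt, step_cost in graph[node]:
            new_g = g + step_cost
            if new_g < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return float("inf"), []


# Hypothetical toy graph: node -> list of (successor, transition cost).
graph = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("C", 1.0), ("D", 4.0)],
    "C": [("D", 1.0)],
    "D": [],
}
# Optimistic cost-to-go estimates for goal "D" (true values are 4, 2, 1, 0).
h_table = {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.0}

cost, path = a_star(graph, "A", "D", h_table.get)
zero_cost, _ = a_star(graph, "A", "D", lambda node: 0.0)
```

Passing a heuristic that always returns zero turns the same routine into uniform-cost search, which finds the same cost but gets no guidance toward the goal.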
Safe Learning-Based Optimal Motion Planning for Automated Driving
Zlatan Ajanovic, Bakir Lacevic, Georg Stettinger, Daniel Watzenig, and Martin Horn
In ICML/IJCAI/AAMAS 2018 Workshop on Planning and Learning (PAL-18), Dec 2018
This paper presents preliminary work on learning the search heuristic for optimal motion planning for automated driving in urban traffic. Previous work considered a search-based optimal motion planning (SBOMP) framework that utilized numerical or model-based heuristics that did not consider dynamic obstacles. An optimal solution was still guaranteed, since dynamic obstacles can only increase the cost. However, significant variations in the search efficiency are observed depending on whether dynamic obstacles are present or not. This paper introduces a machine learning (ML)-based heuristic that takes dynamic obstacles into account, thus adding to the performance consistency needed for real-time implementation.
Search-Based Optimal Motion Planning for Automated Driving
Zlatan Ajanovic, Bakir Lacevic, Barys Shyrokau, Michael Stolz, and Martin Horn
In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Dec 2018
This paper presents a framework for fast and robust motion planning designed to facilitate automated driving. The framework allows for real-time computation even for horizons of several hundred meters and thus enables automated driving in urban conditions. This is achieved through several features. Firstly, a convenient geometrical representation of both the search space and driving constraints enables the use of a classical path planning approach. Thus, a wide variety of constraints can be tackled simultaneously (other vehicles, traffic lights, etc.). Secondly, an exact cost-to-go map, obtained by solving a relaxed problem, is used by an A*-based algorithm with a model predictive flavour in order to compute the optimal motion trajectory. The algorithm takes into account both distance and time horizons. The approach is validated within a simulation study with realistic traffic scenarios. We demonstrate the capability of the algorithm to devise plans in both fast and slow driving conditions, even when a full stop is required.
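The idea of an exact cost-to-go map obtained from a relaxed problem can be illustrated with a small sketch. The grid world, unit step costs, and the helper names `cost_to_go_map` and `grid_neighbors` are assumptions for illustration only; the paper works with vehicle motion, not grid cells. Solving the problem without obstacles gives a map that never overestimates the cost in the constrained problem, which is what makes it usable as a search heuristic.

```python
import heapq


def cost_to_go_map(neighbors, goal):
    """Dijkstra from the goal; assumes every move can be reversed at equal cost."""
    dist = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell]:
            continue                      # stale heap entry
        for nxt, step_cost in neighbors(cell):
            nd = d + step_cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist


def grid_neighbors(size, blocked=frozenset()):
    """4-connected moves with unit cost on a size x size grid, skipping blocked cells."""
    def neighbors(cell):
        x, y = cell
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in blocked:
                yield nxt, 1.0
    return neighbors


goal = (4, 0)
wall = frozenset({(1, 0), (1, 1), (1, 2), (1, 3)})   # hypothetical obstacle

relaxed = cost_to_go_map(grid_neighbors(5), goal)              # obstacles ignored
constrained = cost_to_go_map(grid_neighbors(5, wall), goal)    # obstacles respected
```

Here `relaxed[(0, 0)]` is the obstacle-free distance, while `constrained[(0, 0)]` is the longer detour around the wall, so the relaxed map stays a lower bound in every reachable cell.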
2017
Energy Efficient Driving in Dynamic Environment: Considering Other Traffic Participants and Overtaking Possibility
Zlatan Ajanović, Michael Stolz, and Martin Horn
In Comprehensive Energy Management – Eco Routing & Velocity Profiles, Dec 2017
This chapter studies energy-efficient driving of (semi)autonomous electric vehicles operating in a dynamic environment with other traffic participants on a unidirectional, multi-lane road. This scenario is considered to be a so-called hard problem, as the imposed constraints vary in time and space. If the constraints imposed by the surrounding traffic are neglected, the generated energy-optimal speed trajectory may lead to bad results, with the risk of low driver acceptance when applied in a real driving environment. An existing approach satisfies constraints from surrounding traffic by modifying an existing unconstrained trajectory. In contrast to this, the proposed approach incorporates a leading vehicle’s motion as a constraint in order to generate a new speed trajectory that is optimal in a global sense. First simulation results show that energy-optimal driving that considers other traffic participants is important. Even in simple setups, significantly (8%) less energy is consumed at only 1.3% longer travelling time compared to the best constant-speed driving strategy. Additionally, the proposed driving strategy uses 4.5% less energy and leads to 1.6% shorter travelling time compared to the existing overtaking approach. Using simulation studies, the proposed energy-optimal driving strategy is analyzed in different scenarios.