Algorithm Selection in Optimization and Application to Angry Birds • Shahaf S. Shperberg, Avinoam Yehezkel, Solomon Eyal Shimony · [AWARD] Algorithm Selection in Optimization and Application to Angry Birds Shahaf S. Shperberg (Ben-Gurion University), Avinoam Yehezkel (Ben-Gurion University), Solomon Eyal Shimony (Ben-Gurion University) Consider the MaxScore algorithm selection problem: given some optimization problem instances, a set of algorithms that solve them, and a time limit, what is the optimal policy for selecting (algorithm, instance) runs so as to maximize the sum of solution qualities for all problem instances? We analyze the computational complexity of restrictions of MaxScore (NP-hard), and provide a dynamic programming approximation algorithm. This algorithm, as well as new greedy algorithms, are evaluated empirically on data from agent runs on Angry Birds problem instances. Results show a significant improvement over a hyper-agent greedy scheme from related work. li.ca.ugb.tsop@hsbrephs Efficient Heuristic Search for Optimal Environment Redesign • Sarah Keren, Luis Pineda, Avigdor Gal, Erez Karpas, Shlomo Zilberstein · [AWARD] Efficient Heuristic Search for Optimal Environment Redesign Sarah Keren (Harvard University), Luis Pineda (University of Massachusetts Amherst), Avigdor Gal (Technion), Erez Karpas (Technion), Shlomo Zilberstein (University of Massachusetts Amherst) Given an environment, the utility measure of the agents acting within it, a set of possible environment modifications, and a description of design constraints, the objective of equi-reward utility maximizing design (ER-UMD) is to find a valid sequence of modifications to apply to the environment in order to maximize agent utility. To efficiently traverse the typically large space of possible design options, we use heuristic search and propose new heuristics, which relax the design process; instead of computing the value achieved by a single modification, we use a dominating modification guaranteed to be at least as beneficial. The proposed technique enables heuristic caching for similar nodes thereby saving computational overhead. We specify sufficient conditions under which our approach is guaranteed to produce admissible estimates, and describe a range of models that comply with these requirements. Also, for models with lifted representations of environment modifications, we provide simple methods to automatically generate dominating modifications. We evaluate our approach on a range of stochastic settings for which our heuristic is admissible. We demonstrate its efficiency by comparing it to a previously suggested heuristic, that employs a relaxation of the environment, and to a compilation from ER-UMD to planning. moc.liamg@nerek.e.haras Robust Operations Management on Mars • Michael Saint-Guillain · [AWARD] Robust Operations Management on Mars Michael Saint-Guillain (Université catholique de Louvain) We compare both deterministic and robust stochastic approaches to the problem of scheduling a set of scientific tasks under processing time uncertainty. While dealing with strict time windows and minimum transition time constraints, we provide closed-form expressions to compute the exact probability that a solution has to remain feasible. Experiments, taking uncertainty on the stochastic knowledge itself into account, are conducted on real instances involving the constraints faced and objectives pursued during a recent two-week Mars analog mission in the desert of Utah, USA. The results reveal that, even when using very bad approximations of probability distributions, solutions computed from the stochastic models we introduce, significantly outperform the ones obtained from a classical deterministic formulation, while preserving most of the solution’s quality. eb.niavuolcu@tnias.leahcim Quantifying Degrees of Controllability in Temporal Networks with Uncertainty • Shyan Akmal, Savana Ammons, Hemeng Li, James C. Boerkoel, Jr. · Best Student Paper Award (Runner-up) Quantifying Degrees of Controllability in Temporal Networks with Uncertainty Shyan Akmal (Harvey Mudd College), Savana Ammons (Harvey Mudd College), Hemeng Li (Harvey Mudd College), James C. Boerkoel, Jr. (Harvey Mudd College) Controllability for Simple Temporal Networks with Uncertainty (STNUs) has thus far been limited to three levels: strong, dynamic, and weak. Because of this, there is currently no systematic way for an agent to assess just how far from being controllable an uncontrollable STNU is. We use a new geometric interpretation of STNUs to introduce the degrees of strong and dynamic controllability – continuous metrics that measure how far a network is from being controllable. We utilize these metrics to approximate the probabilities that an STNU can be dispatched successfully offline and online respectively. We introduce new methods for predicting the degrees of strong and dynamic controllability for uncontrollable networks. In addition, we show empirically that both metrics are good predictors of the actual dispatch success rate. ude.cmh@lamkas Backward Sequence Analysis for Single-Armed Cluster Tools • Jun-Ho Lee, Hyun-Jung Kim · [AWARD] Backward Sequence Analysis for Single-Armed Cluster Tools Jun-Ho Lee (Konkuk University), Hyun-Jung Kim (Sungkyunkwan University) This paper analyzes a backward sequence for single-armed cluster tools with processing time variations. We first define a fundamental cycle with the backward sequence and derive a formula for the cycle time by considering processing time variations. We then develop conditions for which the backward sequence is optimal. The upper bound on the average cycle time from the backward sequence is also analyzed. We then show experimentally that the sequence performs well even with processing time variations. ude.ukks@jh.mik On the Pathological Search Behavior of Distributed Greedy Best First Search • Ryo Kuroiwa, Alex Fukunaga · [AWARD] On the Pathological Search Behavior of Distributed Greedy Best First Search Ryo Kuroiwa (The University of Tokyo), Alex Fukunaga (The University of Tokyo) Although A* search can be efficiently parallelized using methods such as Hash-Distributed A* (HDA*), distributed parallelization of Greedy Best First Search (GBFS), a sub-optimal search which often finds solutions much faster than A*, has received little attention. We show that surprisingly, HDGBFS, an adaptation of HDA* to GBFS, often performs significantly worse than sequential GBFS. We analyze and explain this performance degradation, and propose a novel method for distributed parallelization of GBFS, which significantly outperforms HDGBFS. moc.liamg@eoeghm An Empirical Study of Perfect Potential Heuristics • Augusto B. Corrêa, Florian Pommerening · [AWARD] An Empirical Study of Perfect Potential Heuristics Augusto B. Corrêa (University of Basel), Florian Pommerening (University of Basel) Potential heuristics are weighted functions over state features of a planning task. A recent study defines the complexity of a task as the minimum required feature complexity for a potential heuristic that makes a search backtrack-free. This gives an indication of how complex potential heuristics need to be to achieve good results in satisficing planning. However, these results do not directly transfer to optimal planning. In this paper, we empirically study how complex potential heuristics must be to represent the perfect heuristic and how close to perfect heuristcs can get with a limited number of features. We aim to identify the practical trade-offs between size, complexity and time for the quality of potential heuristics. Our results show that, even for simple planning tasks, finding perfect potential heuristics might be harder than expected. hc.sabinu@aerrocsaalb.otsugua Error-Tolerant Anytime Approach for Plan Recognition using a Particle Filter • Jean Massardi, Mathieu Gravel, Éric Beaudry · [AWARD] Error-Tolerant Anytime Approach for Plan Recognition using a Particle Filter Jean Massardi (Université du Québec à Montréal), Mathieu Gravel (Université du Québec à Montréal), Éric Beaudry (Université du Québec à Montréal) Classical plan recognition approaches require restrictive assumptions and are generally off-line. However, many real-world applications of plan recognition need to deal with real-time constraints, noisy information, temporal relations in actions, agent preferences, and so on. Many existing approaches tried to relax assumptions, but none can deal with all previously cited needs. This paper proposes an extension of previous works on plan recognition based on plan tree grammar. Our anytime top-down approach uses a particle filter. This approach manages to give a quick reliable solution to the plan recognition problem while dealing with noisy observations and without reducing the expressiveness of plan libraries. We show empirical evidence of efficiency by using this approach on simulated problems. ac.maqu.reirruoc@naej.idrassam Planning with Global State Constraints and State-Dependent Action Costs • Franc Ivankovic, Patrik Haslum, Dan Gordon · [AWARD] Planning with Global State Constraints and State-Dependent Action Costs Franc Ivankovic (Data61), Patrik Haslum (Australian National University, Data61), Dan Gordon (Australian National University) Planning with global state constraints refers to an extension of classical planning in which some properties of each state are derived via a set of equations, rules or constraints. This extension enables more elegant modeling of networked physical systems such as power grids. So far, planning research in this setting focused on domains where action costs are constant, rather than a function of a state in which the action is applied. This restriction prevents us from accurately specifying the objective in some real-world domains, leading to generation of sub-optimal plans. For example, when reconfiguring a power network, we often need to temporarily leave some users without electricity for a certain amount of time, and in such circumstances it is desirable to reduce the unsupplied load over the total time span. This preference can be expressed using state-dependent action costs. We extend planning with global state constraints to include state-dependent action costs, adapt abstraction heuristics to this setting, and show improved performance on a set of problems. moc.liamg@civoknavikecnarf Tree-REX: SAT-based Tree Exploration for Efficient and High-Quality HTN Planning • Dominik Schreiber, Tomáš Balyo, Damien Pellier, Humbert Fiorino · [AWARD] Tree-REX: SAT-based Tree Exploration for Efficient and High-Quality HTN Planning Dominik Schreiber (Karlsruhe Institute of Technology), Tomáš Balyo (Karlsruhe Institute of Technology), Damien Pellier (Laboratoire d'Informatique de Grenoble), Humbert Fiorino (Laboratoire d'Informatique de Grenoble) In this paper, we propose a novel SAT-based planning approach to solve totally ordered hierarchical planning problems. Our approach called "Tree-like Reduction Exploration" (Tree-REX) makes two contributions: (1) it allows us to rapidly solve hierarchical planning problems by making effective use of incremental SAT solving, and (2) it implements an anytime approach that gradually improves plan quality (makespan) as time resources are allotted. Incremental SAT solving is important as it reduces the encoding volume of planning problems, it builds on information obtained from previous search iterations and speeds up the search for plans. We show that Tree-REX outperforms state-of-the-art SAT-based HTN planning with respect to run times and plan quality on most of the considered IPC domains. ed.rebierhcskinimod@liam Explicability? Legibility? Predictability? Transparency? Privacy? Security? The Emerging Landscape of Interpretable Robot Behavior • Tathagata Chakraborti, Anagha Kulkarni, Sarath Sreedharan, David Smith, Subbarao Kambhampati · [AWARD] Explicability? Legibility? Predictability? Transparency? Privacy? Security? The Emerging Landscape of Interpretable Robot Behavior Tathagata Chakraborti (IBM Research), Anagha Kulkarni (Arizona State University), Sarath Sreedharan (Arizona State University), David Smith, Subbarao Kambhampati (Arizona State University) There has been significant interest of late in generating behavior of agents that is interpretable to the human (observer) in the loop. However, the work in this area has typically lacked coherence on the topic, with proposed solutions for “explicable”, “legible”, “predictable” and “transparent” planning with overlapping, and sometimes conflicting, semantics all aimed at some notion of understanding what intentions the observer will ascribe to an agent by observing its behavior. This is also true for the recent works on “security” and “privacy” of plans which are also trying to answer the same question, but from the opposite point of view – i.e. when the agent is trying to hide instead of reveal its intentions. This paper attempts to provide a workable taxonomy of relevant concepts in this exciting and emerging field of inquiry. ude.usa@itrobarkahC.atagahtaT Improving the Combination of JPS and Geometric Containers • Yue Hu, Long Qin, Quanjun Yin, Daniel Harabor, Cong Hu · [AWARD] Improving the Combination of JPS and Geometric Containers Yue Hu (National University of Defense Technology), Long Qin (National University of Defense Technology), Quanjun Yin (National University of Defense Technology), Daniel Harabor (Monash University), Cong Hu (National University of Defense Technology) The JPS family of grid-based pathfinding algorithms can be improved with preprocessing methods such as Geometric Containers. However, such enhancements require a Dijkstra search for every node in the grid and the space and time costs of all this additional computation can be prohibitive. In this work we consider an alternative approach where we run Dijkstra only from every node where a jump point is located. We also compute and store geometric containers only for those outgoing edges which are consistent with the diagonal-first ordering in JPS. Since the number of jump points on a grid is usually much smaller than the total number of grid cells, we can save up to orders of magnitude of time and space. In addition to improving preprocessing overheads, we also present a partial expansion strategy which can improve the performance of online search by reducing the total number of operations on the open list. moc.liamg@esc.euyuh Mixed Discrete Continuous Non-Linear Planning Through Piecewise Linear Approximation • Elad Denenberg, Amanda Coles · [AWARD] Mixed Discrete Continuous Non-Linear Planning Through Piecewise Linear Approximation Elad Denenberg (King's College London), Amanda Coles (King's College London) Reasoning with numeric quantities that change continuously is vital to the application of planners in many real-world scenarios. Therefore, over recent years there has been a significant push, resulting in the development of several planners capable of handling this. Scalability remains a challenge for such planners, especially those capable of reasoning with non-linear numeric change. In this paper, we present a novel approach to reasoning with non-linear domains. Using linear over and under-estimators to bound the problem, allowing us to use scalable planners that handle linear change to find plans for non-linear domains. We compare the performance of our approach to existing planners capable of reasoning with non-linear change on several domains and demonstrate that our planner can achieve state-ofthe-art performance in non-linear planning. moc.liamg@neddale Subset-Saturated Cost Partitioning for Optimal Classical Planning • Jendrik Seipp, Malte Helmert · [AWARD] Subset-Saturated Cost Partitioning for Optimal Classical Planning Jendrik Seipp (University of Basel), Malte Helmert (University of Basel) Cost partitioning is a method for admissibly adding multiple heuristics for state-space search. Saturated cost partitioning considers the given heuristics in sequence, assigning to each heuristic the minimum fraction of remaining costs that it needs to preserve its estimates for all states. We generalize saturated cost partitioning by allowing to preserve the heuristic values of only a subset of states and show that this often leads to stronger heuristics. hc.sabinu@ppies.kirdnej Efficiently Exploring Ordering Problems through Conflict-directed Search • Jingkai Chen, Cheng Fang, David Wang, Andrew Wang, Brian Williams · [AWARD] Efficiently Exploring Ordering Problems through Conflict-directed Search Jingkai Chen (Massachusetts Institute of Technology), Cheng Fang (Massachusetts Institute of Technology), David Wang (Massachusetts Institute of Technology), Andrew Wang (Massachusetts Institute of Technology), Brian Williams (Massachusetts Institute of Technology) In planning and scheduling, solving problems with both state and temporal constraints is hard, since they may be highly coupled. Judicious orderings of events enable solvers to efficiently make decisions over sequences of actions to satisfy complex hybrid specifications. The ordering problem is thus fundamental to planning. Promising recent works have explored the ordering problem as search, incorporating a special tree structure for efficiency. However, such approaches only reason over partial order specifications. Having observed that an ordering is inconsistent with respect to underlying constraints, prior works do not exploit the tree structure to efficiently generate orderings which resolve the inconsistencies. In this paper, we present Conflict-directed Incremental Total Ordering (CDITO), a conflict-directed search method to incrementally and systematically generate event total orders given ordering constraints and conflicts returned by sub-solvers. Due to its ability to reason over these conflicts, CDITO is much more efficient than Incremental Total Ordering. We demonstrate this by benchmarking on temporal network configuration problems that involve routing network flows and managing bandwidth resources over time. ude.tim.liasc@nehckj Lazy CBS: Implict Conflict-Based Search Using Lazy Clause Generation • Graeme Gange, Daniel Harabor, Peter J. Stuckey · [AWARD] Lazy CBS: Implict Conflict-Based Search Using Lazy Clause Generation Graeme Gange (Monash University), Daniel Harabor (Monash University), Peter J. Stuckey (The University of Melbourne) Conflict-based Search (CBS) is a effective approach to optimal multi-agent path finding. However, performance of CBS approaches degrade rapidly in highly-contended graphs with many agents. One of the reasons this occurs is that CBS does not detect independent subproblems; i.e. it can re-solve the same conflicts between the same pairs of agents up to exponentially many times, each time along a different branch. Constraint programming approaches with nogood learning avoid this kind of duplication of effort by storing nogoods that record the reasons for conflicts. This can exponentially reduce search in constraint programming. In this work, we present Lazy CBS, a new approach to multi-agent pathfinding which replaces the high-level solver of CBS with a lazily constructed constraint programming model with nogoods. We use core-guided depth-first search to explore the space of conflicts and we detect along each branch reusable nogoods which help to quickly identify feasible solutions. Our experiments show that Lazy CBS can significantly improve on the state-of-the-art for optimal MAPF problems under the sumof-costs metric, especially in cases where there exists significant contention. ude.hsanom@egnag.emearg Tabu-Based Large Neighbourhood Search for Time/Sequence-Dependent Scheduling Problems with Time Windows • Lei He, Mathijs de Weerdt, Neil Yorke-Smith · [AWARD] Tabu-Based Large Neighbourhood Search for Time/Sequence-Dependent Scheduling Problems with Time Windows Lei He (National University of Defense Technology), Mathijs de Weerdt (Delft University of Technology), Neil Yorke-Smith (Delft University of Technology) An important class of scheduling problems is characterised by time-dependency and/or sequence-dependency with time windows. We introduce and analyze four algorithmic ideas for this class: a novel hybridization of adaptive large neighbourhood search (ALNS) and tabu search (TS), randomized generic neighbourhood operators, a partial sequence dominance heuristic, and a fast insertion strategy. An evaluation of the resulting hybrid algorithm on two domains, a realworld multi-orbit agile Earth observation satellite scheduling problem, and an order acceptance and scheduling problem, shows that it robustly outperforms a mixed integer programming method, a two-stage hybridization of ALNS and TS, and recent state-of-the-art metaheuristic methods. ln.tfledut@eh.l Propagating Piecewise-Linear Weights in Temporal Networks • Luke Hunsberger, Roberto Posenato · [AWARD] Propagating Piecewise-Linear Weights in Temporal Networks Luke Hunsberger (Vassar College), Roberto Posenato (Università degli Studi di Verona) This paper presents a novel technique using piecewise-linear functions (PLFs) as weights on edges in the graphs of two kinds of temporal networks to solve several previously open problems. Generalizing constraint-propagation rules to accommodate PLF weights requires implementing a small handful of functions. Most problems are solved by inserting one or more edges with an initial unknown weight (represented as a variable), and then using the modified rules to propagate the PLF weights. For one kind of network, a new set of propagation rules is introduced to avoid a non-termination issue that arises when propagating PLF weights. The paper also presents two new results for determining the tightest horizon that can be imposed while preserving a network's dynamic consistency/controllability. ti.rvinu@otanesop.otrebor Relaxed BDDs: An Admissible Heuristic for Delete-Free Planning Based on a Discrete Relaxation • Margarita Castro, Chiara Piacentini, Andre Augusto Cire, Chris Beck · [AWARD] Relaxed BDDs: An Admissible Heuristic for Delete-Free Planning Based on a Discrete Relaxation Margarita Castro (University of Toronto), Chiara Piacentini (University of Toronto), Andre Augusto Cire (University of Toronto), Chris Beck (University of Toronto) We investigate the use of relaxed binary decision diagrams (BDDs) as an alternative to linear programming (LP) for computing an admissible heuristic for the cost-optimal delete-free planning (DFP) problem. Our main contributions are the introduction of a novel BDD encoding, a construction algorithm for the sequential relaxation of a DFP task and a study of the effectiveness of relaxed BDD heuristics, both from a theoretical and practical perspective. We further show that relaxed BDDs can be used beyond heuristic computation to extract delete-free plans, find action landmarks, and identify redundant actions. Our empirical analysis shows that while BDD-based heuristics trail the state of the art, even small relaxed BDDs are competitive with the LP heuristic for the DFP task. ac.otnorotu.eim@ortsacpm Model Recognition as Planning • Diego Aineto, Sergio Jiménez, Eva Onaindia, Miquel Ramírez · [AWARD] Model Recognition as Planning Diego Aineto (Universitat Politècnica de València), Sergio Jiménez (Universitat Politècnica de València), Eva Onaindia (Universitat Politècnica de València), Miquel Ramírez (The University of Melbourne) Given a partially observed plan execution, and a set of possible planning models (models that share the same state variables but different action schemata), model recognition is the task of identifying the model that explains the observation. The paper formalizes this task and introduces a novel method that estimates the probability of a STRIPS model to produce an observation of a plan execution. This method builds on top of off-the-shelf classical planning algorithms and it is robust to missing actions and intermediate states in the observation. The effectiveness of the method is tested in three experiments, each encoding a set of different STRIPS models and all using empty-action observations: (1) a classical string classification task; (2) identification of the model that encodes a failure present in an observation; and (3) recognition of a robot navigation policy. se.vpu.cisd@ragiaeid A theoretical and algorithmic analysis of configurable MDPs • Rui Silva, Gabriele Farina, Francisco S. Melo, Manuela Veloso · [AWARD] A theoretical and algorithmic analysis of configurable MDPs Rui Silva (Instituto Superior Técnico and INESC-ID and Carnegie Mellon University), Gabriele Farina (Carnegie Mellon University), Francisco S. Melo (Instituto Superior Tecnico/INESC-ID), Manuela Veloso (Carnegie Mellon University) This paper analyzes, from theoretical and algorithmic perspectives, a class of problems recently introduced in the literature of Markov decision processes—configurable Markov decision processes. In this new class of problems we jointly optimize the probability transition function and associated optimal policy, in order to improve the performance of a decision-making agent. We contribute a complexity analysis on the problem from a computational perspective, where we show that, in general, solving a configurable MDP is NP-Hard. We also discuss practical challenges often faced in solving this class of problems. Additionally, we formally derive a gradient-based approach that sheds some light on the correctness and limitations of existing methods. We conclude by demonstrating the application of different parameterizations of configurable MDPs in two scenarios, offering a discussion on advantages and drawbacks from modeling and algorithmic perspectives. Our contributions set the foundation for a better understanding of this recent problem, and the wider applicability of the underlying ideas to different planning problems. tp.ltu.tsi@avlis.ariexiet.iur Towards a Unified View of AI Planning and Reactive Synthesis • Alberto Camacho, Meghyn Bienvenu, Sheila A. McIlraith · [AWARD] Towards a Unified View of AI Planning and Reactive Synthesis Alberto Camacho (University of Toronto), Meghyn Bienvenu (CNRS, University of Bordeaux), Sheila A. McIlraith (University of Toronto) AI automated planning and reactive synthesis are well established techniques for sequential decision making. In this paper we examine a collection of AI planning problems with temporally extended goals, specified in Linear Temporal Logic (LTL). We characterize these so-called LTL planning problems as two-player games and thereby establish their correspondence to reactive synthesis problems. This unifying view furthers our understanding of the relationship between plan and program synthesis, establishing complexity results for LTL planning tasks. Building on this correspondence, we identify restricted fragments of LTL for which plan synthesis can be realized more efficiently. ude.otnorot.sc@ohcamaca Online Risk-Bounded Motion Planning for Autonomous Vehicles in Dynamic Environments • Xin Huang, Sungkweon Hong, Andreas Hofmann, Brian Williams · [AWARD] Online Risk-Bounded Motion Planning for Autonomous Vehicles in Dynamic Environments Xin Huang (MIT), Sungkweon Hong (MIT), Andreas Hofmann (MIT), Brian Williams (MIT) A crucial challenge to efficient and robust motion planning for autonomous vehicles is understanding the intentions of the surrounding agents. Ignoring the intentions of the other agents in dynamic environments can lead to risky or overconservative plans. In this work, we model the motion planning problem as a partially observable Markov decision process (POMDP) and propose an online system that combines an intent recognition algorithm and a POMDP solver to generate risk-bounded plans for the ego vehicle navigating with a number of dynamic agent vehicles. The intent recognition algorithm predicts the probabilistic hybrid motion states of each agent vehicle over a finite horizon using Bayesian filtering and a library of pre-learned motion models. We update the POMDP model with the intent recognition results in real time and solve it using a heuristic search algorithm which produces policies with upper-bound guarantees on the probability of colliding with other dynamic agents. We demonstrate that our system is able to generate better motion plans in terms of efficiency and safety in a number of challenging environments including unprotected intersection left turns and lane changes as compared to the baseline methods. ude.tim@nixgnauh Stochastic Planning with Lifted Symbolic Trajectory Optimization • Hao Cui, Thomas Keller, Roni Khardon · [AWARD] Stochastic Planning with Lifted Symbolic Trajectory Optimization Hao Cui (Tufts University), Thomas Keller (University of Basel), Roni Khardon (Indiana University Bloomington) This paper investigates online stochastic planning for problems with large factored state and action spaces. One promising approach in recent work estimates the quality of applicable actions in the current state through aggregate simulation from the states they reach. This leads to significant speedup, compared to search over concrete states and actions, and suffices to guide decision making in cases where the performance of a random policy is informative of the quality of a state. The paper makes two significant improvements to this approach. The first, taking inspiration from lifted belief propagation, exploits the structure of the problem to derive a more compact computation graph for aggregate simulation. The second improvement replaces the random policy embedded in the computation graph with symbolic variables that are optimized simultaneously with the search for high quality actions. This expands the scope of the approach to problems that require deep search and where information is lost quickly with random steps. An empirical evaluation shows that these ideas significantly improve performance, leading to state of the art performance on hard planning problems. ude.stfut@iuC.oaH Landmark-Enhanced Heuristics for Goal Recognition in Incomplete Domain Models • Ramon Fraga Pereira, André Grahl Pereira, Felipe Meneguzzi · [AWARD] Landmark-Enhanced Heuristics for Goal Recognition in Incomplete Domain Models Ramon Fraga Pereira (Pontifical Catholic University of Rio Grande do Sul), André Grahl Pereira (Universidade Federal do Rio Grande do Sul), Felipe Meneguzzi (Pontifical Catholic University of Rio Grande do Sul) Recent approaches to goal recognition have progressively relaxed the assumptions about the amount and correctness of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume completeness and correctness of the domain theory against which their algorithms match observations: this is too strong for most real-world domains. In this paper, we develop goal recognition techniques that are capable of recognizing goals using incomplete domain theories by considering different notions of planning landmarks in such domains. We evaluate the resulting techniques empirically in a large dataset of incomplete domains, and perform an ablation study to understand their effect on recognition performance. moc.liamg@arierepfnomar Using FastMap to Solve Graph Problems in a Euclidean Space • Jiaoyang Li, Ariel Felner, Sven Koenig, T. K. Satish Kumar · [AWARD] Using FastMap to Solve Graph Problems in a Euclidean Space Jiaoyang Li (University of Southern California), Ariel Felner (Ben-Gurion University), Sven Koenig (University of Southern California), T. K. Satish Kumar (University of Southern California) It is well known that many graph problems, like the Traveling Salesman Problem, are easier to solve in a Euclidean space. This motivates the idea of quickly preprocessing a given graph by embedding it in a Euclidean space to solve graph problems efficiently. In this paper, we study a near-linear time algorithm, called FastMap, that embeds a given non-negative edge-weighted undirected graph in a Euclidean space and approximately preserves the pairwise shortest path distances between vertices. The Euclidean space can then be used either for heuristic guidance of A* (as suggested previously) or for geometric interpretations that facilitate the application of techniques from analytical geometry. We present a new variant of FastMap and compare it with the original variant theoretically and empirically. We demonstrate its usefulness for solving a path-finding and a multi-agent meeting problem. ude.csu@lnayoaij Disjoint Splitting for Multi-Agent Path Finding with Conflict-Based Search • Jiaoyang Li, Daniel Harabor, Peter J. Stuckey, Ariel Felner, Hang Ma, Sven Koenig · [AWARD] Disjoint Splitting for Multi-Agent Path Finding with Conflict-Based Search Jiaoyang Li (University of Southern California), Daniel Harabor (Monash University), Peter J. Stuckey (Monash University), Ariel Felner (Ben-Gurion University), Hang Ma (University of Southern California), Sven Koenig (University of Southern California) Multi-Agent Path Finding (MAPF) is the planning problem of finding collision-free paths for a team of agents. We focus on Conflict-Based Search (CBS), a two-level tree-search state-of-the-art MAPF algorithm. The standard splitting strategy used by CBS is not disjoint, that is, when it splits a problem into two subproblems, some solutions are shared by both subproblems, which can create duplication of search effort. In this paper, we demonstrate how to improve CBS with disjoint splitting and how to modify the low-level search of CBS to take maximal advantage of it. Experiments show that disjoint splitting increases the success rates and speeds of CBS and its variants by up to 3 orders of magnitude. ude.csu@lnayoaij Learning Scheduling Models from Event Data • Arik Senderovich, Kyle E. C. Booth, J. Christopher Beck · [AWARD] Learning Scheduling Models from Event Data Arik Senderovich (University of Toronto), Kyle E. C. Booth (University of Toronto), J. Christopher Beck (University of Toronto) A significant challenge in declarative approaches to scheduling is the creation of a parameterized model: the set of resources and their capacities and the types of activities and their temporal and resource requirements. In practice, such models are developed manually by skilled consultants and used repeatedly to solve different problem instances. For example, in a factory, the model may be used each day to schedule the current customer orders. In this work, we aim to automate the creation of such models by learning them from event data. We introduce a novel methodology that combines process mining, timed Petri nets (TPNs), and constraint programming (CP). The approach learns a sub-class of TPN from event logs of executions of past schedules and maps the TPN to a broad class of scheduling problems. We show how any problem of the scheduling class can be converted to a CP model. With new instance data (e.g., the day’s orders), the CP model can then be solved by an off-the-shelf solver. Our approach provides an end-to-end solution, going from event logs to model-based optimal schedules. To demonstrate the value of the methodology we conduct experiments in which we learn and solve scheduling models from two types of data: event logs generated from job-shop scheduling benchmarks and real-world event logs from an outpatient hospital. ac.otnorotu.eim@skiras Theoretical Foundations for Structural Symmetries of Lifted PDDL Tasks • Silvan Sievers, Gabriele Röger, Martin Wehrle, Michael Katz · [AWARD] Theoretical Foundations for Structural Symmetries of Lifted PDDL Tasks Silvan Sievers (University of Basel), Gabriele Röger (University of Basel), Martin Wehrle (University of Basel), Michael Katz (IBM) We transfer the notion of structural symmetries to lifted planning task representations, based on abstract structures which we define to model planning tasks. We show that symmetries are preserved by common grounding methods and we shed some light on the relation to previous symmetry concepts used in planning. Using a suitable graph representation of lifted tasks, our experimental analysis of common planning benchmarks reveals that symmetries occur in the lifted representation of many domains. Our work establishes the theoretical ground for exploiting symmetries beyond their previous scope, such as for faster grounding and mutex generation, as well as for state space transformations and reductions. hc.sabinu@sreveis.navlis A Factored Approach to Contingent Multi-Agent Planning • Shashank Shekhar, Ronen Brafman, Guy Shani · [AWARD] A Factored Approach to Contingent Multi-Agent Planning Shashank Shekhar (BGU of the Negev, Israel), Ronen Brafman (Ben-Gurion University), Guy Shani (Ben Gurion University) Collaborative Multi-Agent Planning (MAP) under uncertainty with partial observability is a notoriously difficult problem. Such MAP problems are often modeled as DecPOMDPs, or its qualitative variant, QDec-POMDP, which is essentially a MAP version of contingent planning. The QDecPOMDP model was introduced with the hope that its simpler, non-probabilistic structure will allow for better scalability. Indeed, the recent IMAP algorithm scales much better than comparable Dec-POMDP algorithms (Bazinin and Shani 2018). In this work we suggest a new approach to solving QDec-POMDPs based on problem factoring. First, we find a solution to a MAP problem where the results of all observations are known to all agents. This is essentially a single-agent planning problem for the entire team. Then, we project the solution tree into sub-trees, one per agent, and let each agent transform its projected tree into a legal local tree. If all agents succeed, we combine the trees into a valid joint-plan. Otherwise, we continue to explore the space of team solutions. This approach is sound, complete, and as our empirical evaluation demonstrates, scales better than the IMAP algorithm. moc.liamg@0102rahkehsknahsahs On the Relation between Star-Topology Decoupling and Petri Net Unfolding • Daniel Gnad, Joerg Hoffmann · [AWARD] On the Relation between Star-Topology Decoupling and Petri Net Unfolding Daniel Gnad (Saarland University), Joerg Hoffmann (Saarland University) Petri net unfolding expands concurrent sub-threads of a transition system separately. In AI Planning, star-topology decoupling (STD) finds a partitioning of state variables into components whose dependencies take a star shape, and expands leafcomponent state spaces separately. Thus both techniques rely on the separate expansion of state-space composites. How do they relate? We show that, provided compatible search orderings, STD state space size dominates that of unfolding if every component contains a single state variable, and unfolding dominates STD in the absence of prevail conditions (nondeleted action preconditions). In all other cases, exponential state space size advantages are possible on either side. Thus the sources of exponential advantages of STD are exactly a) state space size in the presence of prevail conditions (our results), and b) decidability of reachability in time linear in state space size vs. NP-hard for unfolding (known results). ed.dnalraas-inu.sc@dang Temporal Planning as Refinement-Based Model Checking • Alexander Heinz, Martin Wehrle, Sergiy Bogomolov, Daniele Magazzeni, Marius Greitschus, Andreas Podelski · [AWARD] Temporal Planning as Refinement-Based Model Checking Alexander Heinz (University of Freiburg), Martin Wehrle (University of Basel), Sergiy Bogomolov (Australian National University), Daniele Magazzeni (King's College London), Marius Greitschus (University of Freiburg), Andreas Podelski (University of Freiburg) Planning as model checking based on source-to-source compilations has found increasing attention. Previously proposed approaches for temporal and hybrid planning are based on static translations, in the sense that the resulting model checking problems are uniquely defined by the given input planning problems. As a drawback, the translations can become too large to be efficiently solvable. In this paper, we address propositional temporal planning, lifting static translations to a more flexible framework. Our framework is based on a refinement cycle that allows for adaptively computing suitable translations of increasing size. Our experiments on temporal IPC domains show that the resulting translations to timed automata often become succinct, resulting in promising performance when applied with the directed model checker MCTA. moc.liamg@2rednaxela.h Advanced Factoring Strategies for Decoupled Search using Linear Programming • Frederik Schmitt, Daniel Gnad, Joerg Hoffmann · [AWARD] Advanced Factoring Strategies for Decoupled Search using Linear Programming Frederik Schmitt (Saarland University), Daniel Gnad (Saarland University), Joerg Hoffmann (Saarland University) Star-topology decoupled state space search decomposes a planning task and searches over the component state spaces instead of multiplying the state variables. This can lead to an exponential reduction of the search effort. To do so, in a preprocess before the search, the given planning task needs to be decomposed into factors, such that the interaction between these factors forms a star topology. Prior work has identified several ways to automatically decompose planning tasks, however, was not able to release the full potential of decoupled search. We try to close this gap by introducing an integer linear programming formulation of the factoring process, allowing us to explicitly specify the properties that a factoring should have. We prove that our approach returns the factoring that maximizes the number of factors, if this is the objective, and employ two other properties to assess the quality of a factoring. Our experimental evaluation shows that this leads to superior performance and substantially increases the applicability of decoupled search. ed.dnalraas-inu.sc@dang Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf restraining specifications • Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi · [AWARD] Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf restraining specifications Giuseppe De Giacomo (Sapienza Univ. Roma), Marco Favorito (Sapienza University of Rome), Luca Iocchi (Sapienza University of Rome), Fabio Patrizi (Sapienza University of Rome) In this work we investigate on the concept of “restraining bolt”, envisioned in Science Fiction. Specifically we introduce a novel problem in AI. We have two distinct sets of features extracted from the world, one by the agent and one by the authority imposing restraining specifications (the “restraining bolt”). The two sets are apparently unrelated since of interest to independent parties, however they both account for (aspects of) the same world. We consider the case in which the agent is a reinforcement learning agent on the first set of features, while the restraining bolt is specified logically using lineartimelogiconfinitetracesLTLf/LDLf overthesecond set of features. We show formally, and illustrate with examples, that, under general circumstances, the agent can learn while shaping its goals to suitably conform (as much as possible) to the restraining bolt specifications. ti.1amorinu.sid@omocaiged Symbolic Planning with Axioms • David Speck, Florian Geißer, Robert Mattmüller, Álvaro Torralba · [AWARD] Symbolic Planning with Axioms David Speck (University of Freiburg), Florian Geißer (Research School of Computer Science, ANU), Robert Mattmüller (University of Freiburg), Álvaro Torralba (Saarland University) Axioms are an extension for classical planning models that allow for modeling complex preconditions and goals exponentially more compactly. Although axioms were introduced in planning more than a decade ago, modern planning techniques rarely support axioms, especially in cost-optimal planning. Symbolic search is a popular and competitive optimal planning technique based on the manipulation of sets of states. In this work, we extend symbolic search algorithms to support axioms natively. We analyze different ways of encoding derived variables and axiom rules to evaluate them in a symbolic representation. We prove that all encodings are sound and complete, and empirically show that the presented approach outperforms the previous state of the art in costoptimal classical planning with axioms. ed.grubierf-inu.kitamrofni@dkceps Finding Centroids and Minimum Covering States in Planning • Alberto Pozanco, Yolanda E-Martín, Susana Fernández, Daniel Borrajo · [AWARD] Finding Centroids and Minimum Covering States in Planning Alberto Pozanco (Universidad Carlos III de Madrid), Yolanda E-Martín (Universidad Carlos III de Madrid), Susana Fernández (Universidad Carlos III de Madrid), Daniel Borrajo (Universidad Carlos III de Madrid) In automated planning, the most common task consists of finding a plan that achieves a set of goals. In this paper, we focus on a different task; that of finding states that minimize some goal-related metric. First, we present some domains for which that task is useful. Second, we propose two of such types of states: (1) centroid states, which minimize the distance to all the goals in the problem; and (2) minimum covering states, which minimize the maximum distance to any of the goals. Third, we define optimal and suboptimal algorithms to find such states. Finally, we show some experimental results in planning instances from different domains. se.m3cu.ap@ocnazopa A stochastic dual dynamic integer programming for the uncapacitated lot-sizing problem with uncertain demand and costs • Franco Quezada, Céline Gicquel, Safia Kedad-Sidhoum · [AWARD] A stochastic dual dynamic integer programming for the uncapacitated lot-sizing problem with uncertain demand and costs Franco Quezada (Sorbonne Université, LIP6), Céline Gicquel (Université Paris Saclay, LRI), Safia Kedad-Sidhoum (Conservatoire national des arts et métiers, CEDRIC) We study the uncapacitated lot-sizing problem with uncertain demand and costs. We consider a multi-stage decision process and rely on a scenario tree to represent the uncertainty. We propose to solve this stochastic combinatorial optimization problem thanks to a new extension of the stochastic dual dynamic integer programming algorithm. Our results show that this approach can provide good quality solutions in a reasonable time for large-size instances. rf.6pil@adazeuq.ocnarf Counterexample-Guided Abstraction Refinement for Pattern Selection in Optimal Classical Planning • Alexander Rovner, Silvan Sievers, Malte Helmert · [AWARD] Counterexample-Guided Abstraction Refinement for Pattern Selection in Optimal Classical Planning Alexander Rovner (University of Basel), Silvan Sievers (University of Basel), Malte Helmert (University of Basel) We describe a new algorithm for generating pattern collections for pattern database heuristics in optimal classical planning. The algorithm uses the counterexample-guided abstraction refinement (CEGAR) principle to guide the pattern selection process. Our experimental evaluation shows that a single run of the CEGAR algorithm can compute informative pattern collections in a fairly short time. Using multiple CEGAR algorithm runs, we can compute much larger pattern collections, still in shorter time than existing approaches, which leads to a planner that outperforms the state-of-the-art pattern selection methods by a significant margin. hc.sabinu.duts@renvor.xela Privacy Leakage of Search-based Multi-Agent Planning Algorithms • Michal Štolba, Daniel Fišer, Antonín Komenda · [AWARD] Privacy Leakage of Search-based Multi-Agent Planning Algorithms Michal Štolba (Czech Technical University in Prague), Daniel Fišer (Czech Technical University in Prague), Antonín Komenda (Czech Technical University in Prague) Privacy-Preserving Multi-Agent Planning (PP-MAP) has recently gained the attention of the research community, resulting in a number of PP-MAP planners and theoretical works. Many such planners lack strong theoretical guarantees, thus in order to compare their abilities w.r.t. privacy, a versatile and practical metric is crucial. In this work, we propose such a metric, building on the existing theoretical work. We generalize and implement the approach in order to be applicable on real planning domains and provide an evaluation of stateof-the-art PP-MAP planners over the standard set of benchmarks. The evaluation shows that the proposed privacy leakage metric is able to provide comparison of PP-MAP planners and reveal important properties. zc.tuvc.lef.stnega@ablots.lahcim Lagrangian Decomposition for Optimal Cost Partitioning • Florian Pommerening, Gabriele Röger, Malte Helmert, Hadrien Cambazard, Louis-Martin Rousseau, Domenico Salvagnin · Best Paper Award Lagrangian Decomposition for Optimal Cost Partitioning Florian Pommerening (University of Basel), Gabriele Röger (University of Basel), Malte Helmert (University of Basel), Hadrien Cambazard (Université Grenoble Alpes, Grenoble INP), Louis-Martin Rousseau (Polytechnique Montreal), Domenico Salvagnin (University of Padova) Optimal cost partitioning of classical planning heuristics has been shown to lead to excellent heuristic values but is often prohibitively expensive to compute. Lagrangian decomposition and Lagrangian relaxation are classical tools in mathematical programming that apply to optimization problems with a special block structure. We analyze the application of Lagrangian decomposition to cost partitioning in the context of operator-counting heuristics and interpret Lagrangian multipliers as cost functions for the combined heuristics. This allows us to view the computation of an optimal cost partitioning as an iterative process that can be seeded with any cost partitioning and improves over time. For non-negative cost partitioning, we derive an any-time algorithm to compute an optimal cost partitioning of abstraction heuristics without involving an LP solver. In each iteration, the computation reduces to independent shortest path problems in all abstractions. Finally, we discuss the extension to general cost functions. hc.sabinu@gnineremmop.nairolf On Computational Complexity of Automorphism Groups in Classical Planning • Alexander Shleyfman · [AWARD] On Computational Complexity of Automorphism Groups in Classical Planning Alexander Shleyfman (Technion – Israel Institute of Technology) Symmetry based pruning is a family of powerful methods for reducing search effort in planning as heuristic search. Applying these methods requires first establishing an automorphism group that is then used for pruning within the search process. Despite the growing popularity of state-space symmetries in planning techniques, the computational complexity of finding the automorphism group of a compactly represented planning task has not been formally established. In a series of reductions, we show that computing the automorphism group of a grounded planning task is GI-hard. Furthermore, we discuss the presentations of these symmetry groups and list some of their drawbacks. moc.liamg@rednaxela.namfyelhs Eliminating Redundant Actions in Partially Ordered Plans – A Complexity Analysis • Conny Olz, Pascal Bercher · [AWARD] Eliminating Redundant Actions in Partially Ordered Plans – A Complexity Analysis Conny Olz (Ulm University), Pascal Bercher (Ulm University) In this paper we study the computational complexity of post-optimizing partially ordered plans, i.e., we investigate the problem that is concerned with detecting and deleting unnecessary actions. For totally ordered plans it can easily be tested in polynomial time whether a single action can be removed without violating executability. Identifying an executable sub-plan, i.e., asking whether k plan steps can be removed, is known to be NP-hard. We investigate the same questions for partially ordered input plans, as they are created by many search algorithms or used by real-world applications – in particular time-critical ones that exploit parallelism of non-conflicting actions. More formally, we investigate the computational complexity of removing an action from a partially ordered solution plan in which every linearization is a solution in the classical sense while allowing ordering insertions afterwards to repair arising executability issues. It turns out that this problem is NP-complete – even if just a single action is removed – and thereby show that this reasoning task is harder than for totally ordered plans. Moreover, we identify the structural properties responsible for this hardness by providing a fixed-parameter tractability (FPT) result. ed.mlu-inu@rehcreb.lacsap On Compiling Away PDDL3 Qualitative Preferences without Using Automata • Francesco Percassi, Alfonso Emilio Gerevini · [AWARD] On Compiling Away PDDL3 Qualitative Preferences without Using Automata Francesco Percassi (University of Brescia), Alfonso Emilio Gerevini (University of Brescia) We address the problem of propositional planning extended with the class of soft temporally extended goals supported in PDDL3, also called qualitative preferences since IPC-5. Such preferences are useful to characterise plan quality by allowing the user to express certain soft constraints on the state trajectory of the desired solution plans. We propose and evaluate a compilation approach that extends previous work on compiling soft reachability goals and always goals to the full set of PDDL3 qualitative preferences. This approach directly compiles qualitative preferences into propositional planning without using automata to represent the trajectory constraints. Moreover, since no numerical fluent is used, it allows many existing STRIPS planners to immediately address planning with preferences without changing their algorithms or codef. An experimental analysis presented in the paper evaluates the performance of state-of-the-art propositional planners supporting action costs using our compilation of PDDL 3 qualitative preferences. The results indicate that the proposed approach is highly competitive with respect to current planners that natively support the considered class of preference, as well as with a recent automata-based compilation approach. ti.sbinu@issacrep.f Discovery of Optimal Solution Horizons in Non-Stationary Markov Decision Processes with Unbounded Rewards • Grigory Neustroev, Mathijs de Weerdt, Remco Verzijlbergh · [AWARD] Discovery of Optimal Solution Horizons in Non-Stationary Markov Decision Processes with Unbounded Rewards Grigory Neustroev (Delft University of Technology), Mathijs de Weerdt (Delft University of Technology), Remco Verzijlbergh (Delft University of Technology) Infinite-horizon non-stationary Markov decision processes provide a general framework to model many real-life decision-making problems, e.g., planning equipment maintenance. Unfortunately, these problems are notoriously difficult to solve, due to their infinite dimensionality. Often, only the optimality of the initial action is of importance to the decision-maker: once it has been identified, the procedure can be repeated to generate a plan of arbitrary length. The optimal initial action can be identified by finding a time horizon so long that data beyond it has no effect on the initial decision. This horizon is known as a solution horizon and can be discovered by considering a series of truncations of the problem until a stopping rule guaranteeing initial decision optimality is satisfied. We present such a stopping rule for problems with unbounded rewards. Given a candidate policy, the rule uses a mathematical program that searches for other possibly optimal initial actions within the space of feasible truncations. If no better action can be found, the candidate action is deemed optimal. Our rule runs faster than comparable rules and discovers shorter, more efficient solution horizons. ln.tfledut@veortsueN.G Best-First Width Search for Multi Agent Privacy-preserving Planning • Alfonso E. Gerevini, Nir Lipovetzky, Francesco Percassi, Alessandro Saetti, Ivan Serina · [AWARD] Best-First Width Search for Multi Agent Privacy-preserving Planning Alfonso E. Gerevini (University of Brescia), Nir Lipovetzky (The University of Melbourne), Francesco Percassi (University of Brescia), Alessandro Saetti (University of Brescia), Ivan Serina (University of Brescia) In multi-agent planning, preserving the agents’ privacy has become an increasingly popular research topic. For preserving the agents’ privacy, agents jointly compute a plan that achieves mutual goals by keeping certain information private to the individual agents. Unfortunately, this can severely restrict the accuracy of the heuristic functions used while searching for solutions. It has been recently shown that, for centralized planning, the performance of goal oriented search can be improved by combining goal oriented search and width-based search. The combination of these techniques has been called best-first width search. In this paper, we investigate the usage of best-first width search in the context of (decentralised) multi-agent privacy-preserving planning, addressing the challenges related to the agents’ privacy and performance. In particular, we show that best-first width search is a very effective approach over several benchmark domains, even when the search is driven by heuristics that roughly estimate the distance from goal states, computed without using the private information of other agents. An experimental study analyses the effectiveness of our techniques and compares them with the state-of-the-art. ti.sbinu@inivereg.osnofla Oversubscription Planning as Classical Planning with Multiple Cost Functions • Michael Katz, Emil Keyder, Florian Pommerening, Dominik Winterer · [AWARD] Oversubscription Planning as Classical Planning with Multiple Cost Functions Michael Katz (IBM), Emil Keyder (Invitae), Florian Pommerening (University of Basel), Dominik Winterer (University of Konstanz) The aim of classical planning is to minimize the summed cost of operators among those plans that achieve a fixed set of goals. Oversubscription planning (OSP), on the other hand, seeks to maximize the utility of the set of facts achieved by a plan, while keeping the cost of the plan at or below some specified bound. Here, we investigate the use of reformulations that yield planning problems with two separate cost functions, but no utilities, for solving OSP tasks. Such reformulations have also been proposed in the context of netbenefit planning, where the planner tries to maximize the difference between the utility achieved and the cost of the plan. One of our reformulations is adapted directly from that setting, while the other is novel. In both cases, they allow for easy adaptation of existing classical planning heuristics to the OSP problem within a simple branch and bound search. We validate our approach using state of the art admissible heuristics in this framework, and report our results. moc.mbi@1ztak.leahcim Planning under LTL Environment Specifications • Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, Sasha Rubin · [AWARD] Planning under LTL Environment Specifications Benjamin Aminof (Hebrew University), Giuseppe De Giacomo (Sapienza University of Rome), Aniello Murano (University of Naples Federico II), Sasha Rubin (University of Naples Federico II) Planning domains represent what an agent assumes or believes about the environment it acts in. In the presence of nondeterminism, additional temporal assumptions, such as fairness, are often expressed as extra conditions on the domain. Here we consider environment specifications expressed in arbitrary LTL, which generalize many forms of environment specifications, including classical specifications of nondeterministic domains, fairness, and other forms of linear-time constraints on the domain itself. Existing literature typically implicitly or explicitly considers environment specifications as constraints on possible traces. In contrast, in spite of the fact that we use a linear-time formalism, we propose to consider environment specifications as specifications of environment strategies. Planning in this framework is the problem of computing an agent strategy that achieves its goal against all environment strategies satisfying the specification. We study the mathematical and computational properties of planning in this general setting. We observe that not all LTL formulas correspond to legitimate environment specifications, and formally characterize the ones that do. Moreover, we show that our notion of planning generalizes the classical notion of Church’s synthesis, and that in spite of this one can still solve it optimally using classical Church’s synthesis. ti.1amorinu.sid@omocaiged An Exact Algorithm to make a Trade-off between Cost and Probability in SSPs • Valdinei Freire, Karina Valdivia Delgado, Willy Arthur Silva Reis · [AWARD] An Exact Algorithm to make a Trade-off between Cost and Probability in SSPs Valdinei Freire (University of São Paulo), Karina Valdivia Delgado (University of São Paulo), Willy Arthur Silva Reis (University of São Paulo) In stochastic sequential decision problems, such as Stochastic Shortest Path Problems, GUBS (Goal with Utility-Based Semantic) criterion considers a trade-off between probabilityto-goal and cost-to-goal using a goal semantic based on Expected Utility Theory (EUT); in such a semantic, goal paths have priority over non-goal paths, but it implies neither MAXPROB criterion nor Dual criterion. Whereas evaluation criteria based on a sounding theory such as EUT are desirable, optimal policies under GUBS are non-stationary. Non-stationary solutions are undesirable because there is not always a finite representation and, when it is not the case, the representation may be too large to be stored. Considering exponential cost utility, we contribute with: (i) the first exact algorithm to obtain finite optimal policies for GUBS criterion, and (ii) four strategies to find sub-optimal policies. We conduct experiments on one synthetic problem to evaluate each strategy. Although optimal solutions have a high memory cost, sub-optimal policies can save memory space with a small decrease in performance. rb.psu@erierf.ienidlav A Multi-Label A* Algorithm for Multi-Agent Pathfinding • Florian Grenouilleau, Willem-Jan van Hoeve, J. N. Hooker · [AWARD] A Multi-Label A* Algorithm for Multi-Agent Pathfinding Florian Grenouilleau (Polytechnique Montréal), Willem-Jan van Hoeve (Carnegie Mellon University), J. N. Hooker (Carnegie Mellon University) Given a set of agents, the multi-agent pathfinding problem consists in determining, for each agent, a path from its start location to its assigned goal while avoiding collisions with other agents. Recent work has studied variants of the problem in which agents are assigned a sequence of goals (tasks) that become available over time, such as the online multiagent pickup and delivery (MAPD) problem. In this paper, we propose a multi-label A* algorithm (MLA*) for this problem. It extends the classic A* algorithm by allowing the computation of paths with multiple ordered goals (such as a pickup and delivery). Moreover, we develop a new h-value-based centralized heuristic for the MAPD. Computational experiments show that our proposed MLA* obtains substantial improvements in terms of makespan and service time as compared to existing methods, while being more computationally efficient. On instances with a thousand tasks and hundreds of agents, our method reduces the average service time by 43% compared to the state of the art, with considerably less computational effort. ten.tlerric@uaelliuonerg.nairolf Bridging the Gap Between Abstractions and Critical-Path Heuristics via Hypergraphs • Marcel Steinmetz, Álvaro Torralba · [AWARD] Bridging the Gap Between Abstractions and Critical-Path Heuristics via Hypergraphs Marcel Steinmetz (CISPA Helmholtz Center for Information Security), Álvaro Torralba (Saarland University, Saarland Informatics Campus) Abstractions and critical-path heuristics are among the most important families of admissible heuristics in classical planning. In this paper, we present a new family of heuristics, which we name hyperabstractions, given by the combination of the principal ideas underlying abstractions and critical-path heuristics. Hyperabstractions approximate goal distances through a mapping from states to sets of abstract states. The abstract transition behavior forms a relation between abstract states and sets of abstract states, and is formally represented via the notion of hypergraphs. We show that both abstractions and critical-path heuristics can naturally be expressed as members of this family. Moreover, we devise a method to construct hyperabstractions, using either a set of abstractions or a critical-path heuristic as a seed, in a way that guarantees that the resulting distance estimations dominate those of the input heuristics, sometimes even strictly. By finding suitable cost partitionings for hyperabstraction heuristics, this dominance result is preserved even in comparison to the additive combination of the input heuristics. Our experiments indicate the potential of this new class of heuristics, opening a wide range of future research topics. dnalraas.apsic@ztemniets.lecram A Logical Semantics for PDDL+ • Vitaliy Batusov, Mikhail Soutchanski · [AWARD] A Logical Semantics for PDDL+ Vitaliy Batusov (York University), Mikhail Soutchanski (Ryerson University, Toronto, Ontario) PDDL+ is an extension of PDDL2.1 which incorporates fully-featured autonomous processes and allows for better modelling of mixed discrete-continuous domains. Unlike PDDL2.1, PDDL+ lacks a logical semantics, relying instead on state-transitional semantics enriched with hybrid automata semantics for the continuous states. This complex semantics makes analysis and comparisons to other action formalisms difficult. In this paper, we propose a natural extension of Reiter’s situation calculus theories inspired by hybrid automata. The kinship between PDDL+ and hybrid automata allows us to develop a direct mapping between PDDL+ and situation calculus, thereby supplying PDDL+ with a logical semantics and the situation calculus with a modern way of representing autonomous processes. We outline the potential benefits of the mapping by suggesting a new approach to effective planning in PDDL+. ac.ukroy.esc@vosutabv Measuring and Optimizing Durability Against Scheduling Disturbances • Joon Young Lee, Vivaswat Ojha, James C. Boerkoel, Jr. · [AWARD] Measuring and Optimizing Durability Against Scheduling Disturbances Joon Young Lee (Harvey Mudd College), Vivaswat Ojha (Harvey Mudd College), James C. Boerkoel, Jr. (Harvey Mudd College) Flexibility is a useful and common metric for measuring the amount of slack in a Simple Temporal Network (STN) solution space. We extend this concept to specific schedules within an STN’s solution space, developing a related notion of durability that captures an individual schedule’s ability to withstand disturbances and still remain valid. We identify practical sources of scheduling disturbances that motivate the need for durable schedules, and create a geometrically-inspired empirical model that enables testing a given schedule’s ability to withstand these disturbances. We develop a number of durability metrics and use these to characterize and compute specific schedules that we expect to have high durability. Using our model of practical disturbances, we show that our durability metrics strongly predict a schedule’s resilience to practical scheduling disturbances. We also demonstrate that the schedules we identify as having high durability are up to three times more resilient to disturbances than an arbitrarily chosen schedule is. ude.cmh@eelooj DREAM: An Algorithm for Mitigating the Overhead of Robust Rescheduling • Jordan R. Abrahams, David A. Chu, Grace Diehl, Marina Knittel, Judy Lin, William Lloyd, James C. Boerkoel, Jr., Jeremy Frank · [AWARD] DREAM: An Algorithm for Mitigating the Overhead of Robust Rescheduling Jordan R. Abrahams (Harvey Mudd College), David A. Chu (Harvey Mudd College), Grace Diehl (Harvey Mudd College), Marina Knittel (Harvey Mudd College), Judy Lin (Harvey Mudd College), William Lloyd (Harvey Mudd College), James C. Boerkoel, Jr. (Harvey Mudd College), Jeremy Frank (NASA Ames Research Center) Generating and executing temporal plans is difficult in uncertain environments. The current state-of-the-art algorithm for probabilistic temporal networks maintains a high success rate by rescheduling frequently as uncertain events are resolved, but this approach involves substantial resource overhead due to computing and communicating new schedules between agents. Aggressive rescheduling could thus reduce overall mission duration or success in situations where agents have limited energy, computing power, and may not be feasible when communication is limited. In this paper, we propose new approaches for heuristically deciding when rescheduling is most efficacious. We propose two compatible approaches, Allowable Risk and Sufficient Change, that can be employed in combination to compromise between the computation rate, the communication rate, and success rate for new schedules. We show empirically that both approaches allow us to gracefully trade success rate for lower computation and/or communication as compared to a state-of-the-art dynamic scheduling algorithm. ude.cmh@leokreob Cyber-Physical Planning: Deliberation for Hybrid Systems with a Continuous Numeric State • Arthur Bit-Monnot, Luca Pulina, Armando Tacchella · [AWARD] Cyber-Physical Planning: Deliberation for Hybrid Systems with a Continuous Numeric State Arthur Bit-Monnot (University of Sassari), Luca Pulina (University of Sassari), Armando Tacchella (Università di Genova) Cyber-physical systems pose unique deliberation challenges, where complex strategies must be autonomously derived and executed in the physical world, relying on continuous state representations and subject to safety and security constraints. Robots are a typical example of cyber-physical system where high-level decisions must be reconciled with motion-level decisions in order to provide guarantees on the validity and efficiency of the plan. In this work we propose techniques to refine a high-level plan into a continuous state trajectory. The refinement is done by translating a high-level plan into a nonlinear optimization problem with constraints that can encode the intrinsic limitations and dynamics of the system as well as the rules for its continuous control. The refinement process either succeeds or yields an explanation that we exploit to refine the search space of a high-level task planner. We evaluate our approach on existing PDDL+ benchmarks as well as on a more realistic and challenging rover navigation problem. ti.ssinu@tibfa Robust Bayes-Adaptive Planning under Model Uncertainty • Apoorva Sharma, James Harrison, Matthew Tsao, Marco Pavone · [AWARD] Robust Bayes-Adaptive Planning under Model Uncertainty Apoorva Sharma (Stanford University), James Harrison (Stanford University), Matthew Tsao (Stanford University), Marco Pavone (Stanford University) Planning under model uncertainty is a fundamental problem across many applications of decision making and learning. In this paper, we propose the Robust Adaptive Monte Carlo Planning (RAMCP) algorithm, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness. RAMCP formulates the risk-sensitive planning problem as a two-player zero-sum game, in which an adversary perturbs the agent’s belief over the models. We introduce two versions of the RAMCP algorithm. The first, RAMCP-F, converges to an optimal risk-sensitive policy without having to rebuild the search tree as the underlying belief over models is perturbed. The second version, RAMCP-I, improves computational efficiency at the cost of losing theoretical guarantees, but is shown to yield empirical results comparable to RAMCP-F. RAMCP is demonstrated on an n-pull multi-armed bandit problem, as well as a patient treatment scenario. ude.drofnats@avroopa Approximate Gradient Descent Convergence Dynamics for Adaptive Control on Heterogeneous Networks • Jean Carpentier, Sebastien Blandin · [AWARD] Approximate Gradient Descent Convergence Dynamics for Adaptive Control on Heterogeneous Networks Jean Carpentier (Ecole Polytechnique), Sebastien Blandin (IBM Research) Adaptive control is a classical control method for complex cyber-physical systems, including transportation networks. In this work, we analyze the convergence properties of such methods on exemplar graphs, both theoretically and numerically. We first illustrate a limitation of the standard backpressure algorithm for scheduling optimization, and prove that a re-scaling of the model state can lead to an improvement in the overall system optimality by a factor of at most O(k) depending on the network parameters, where k characterizes the network heterogeneity. We exhaustively describe the associated transient and steady-state regimes, and derive convergence properties within this generalized class of backpressure algorithms. Extensive simulations are conducted on both a synthetic network and a more realistic large-scale network modeled on the Manhattan grid on which theoretical results are verified. moc.mbi.gs@nidnalbs Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control • Hsu-Chieh Hu, Stephen Smith · [AWARD] Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control Hsu-Chieh Hu (Carnegie Mellon University), Stephen Smith (Carnegie Mellon University) Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In order to achieve network level coordination in a scalable manner, scheduling agents communicate only with their direct neighbors. Each time an agent generates a new intersection schedule it communicates its expected outflows to its downstream neighbors as a prediction of future demand and these outflows are appended to the downstream agent’s locally perceived demand. In this paper, we extend this basic coordination algorithm to additionally incorporate the complementary flow of information reflective of an intersection’s current congestion level to its upstream neighbors. We present an asynchronous decentralized algorithm for updating intersection schedules and congestion level estimates based on these bi-directional information flows. By relating this algorithm to the self-optimized decision making of the basic operation, we are able to approach network-wide optimality and reduce inefficiency due to strictly self-interested intersection control decisions. ude.umc.werdna@heihcush Cutting the Size of Compressed Path Databases With Wildcards and Redundant Symbols • Mattia Chiari, Shizhe Zhao, Adi Botea, Alfonso Gerevini, Daniel Harabor, Alessandro Saetti, Matteo Salvetti, Peter J. Stuckey · [AWARD] Cutting the Size of Compressed Path Databases With Wildcards and Redundant Symbols Mattia Chiari (University of Brescia), Shizhe Zhao (Monash University), Adi Botea (IBM), Alfonso Gerevini (University of Brescia), Daniel Harabor (Monash University), Alessandro Saetti (University of Brescia), Matteo Salvetti (University of Brescia), Peter J. Stuckey (The University of Melbourne) Path planning on gridmaps is a core AI problem, very popular in applications such as games. Compressed path databases (CPDs) represent a state-of-the-art approach to the problem, in terms of the speed of computing full optimal paths and individual optimal moves. Despite significant improvements in recent years, the memory required to store a CPD can still be a bottleneck for large maps. We present an approach to improving the compression in CPDs. Our techniques use an extended notion of wildcards, and a novel concept called a redundant symbol. We implemented our ideas on top of a state-of-the-art CPD system. Experiments demonstrate a substantial reduction of the size of CPDs. ti.sbinu.itneduts@710iraihc.m
Personalized Medication and Activity Planning in PDDL+ • Fares K. Alaboud, Andrew Coles · [AWARD] Personalized Medication and Activity Planning in PDDL+ Fares K. Alaboud (King's College London), Andrew Coles (King's College London) The emergence of planners capable of reasoning with continuous dynamics, as expressed in PDDL+, has increased the range of problems that fall within the capabilities of PDDL planners. One such problem is planning patients’ activities and medication regimes, considering non-linear medication pharmacokinetics. In this paper we explore the application of contemporary PDDL+ planners to this problem. To address their performance limitations, we present a linearize–validate cycle; tasks are solved by iterative refinement of a linear approximation of the domain, solved by a linear planner, then validated at each stage against the full non-linear semantics. In doing this we allow this domain to fall within the capabilities of current planners; and in our evaluation we use OPTIC to demonstrate this. ku.ca.lck@duobala.seraf Temporal Brittleness Analysis of Task Networks for Planetary Rovers • Tiago Stegun Vaquero, Steve Chien, Jagriti Agrawal, Wayne Chi, Terrance Huntsberger · [AWARD] Temporal Brittleness Analysis of Task Networks for Planetary Rovers Tiago Stegun Vaquero (Jet Propulsion Laboratory, California Institute of Technology), Steve Chien (Jet Propulsion Laboratory, California Institute of Technology), Jagriti Agrawal (Jet Propulsion Laboratory, California Institute of Technology), Wayne Chi (Jet Propulsion Laboratory, California Institute of Technology), Terrance Huntsberger (Jet Propulsion Laboratory, California Institute of Technology) We propose a new method to analyze the temporal brittleness of task networks, which allows the detection and enumeration of activities that, with modest task execution duration variation make the execution of the task network dynamically uncontrollable. In this method, we introduce a metric for measuring an activity brittleness – defined as the degree of acceptable deviation of its nominal duration – and describe how that measurement is mapped to task network structure. Complementary to existing work on plan robustness analysis which informs how likely a task network is to succeed or not, the proposed analysis and metric go deeper to pinpoint the sources of potential brittleness due to temporal constraints and to focus either human designers and/or automated task network generators (e.g. scheduler/planners) to address sources of undesirable brittleness. We apply the approach to a set of task networks (called sol types) in development for NASA’s next planetary rover and present common patterns that are sources of brittleness. These techniques are currently under evaluation for potential use supporting operations of the Mars 2020 rover. ude.hcetlac.lpj@oreuqav.nugets.ogait Optimizing Parameters for Uncertain Execution and Rescheduling Robustness • Wayne Chi, Jagriti Agrawal, Steve Chien, Elyse Fosse, Usha Guduri · [AWARD] Optimizing Parameters for Uncertain Execution and Rescheduling Robustness Wayne Chi (Jet Propulsion Laboratory, California Institute of Technology), Jagriti Agrawal (Jet Propulsion Laboratory, California Institute of Technology), Steve Chien (Jet Propulsion Laboratory, California Institute of Technology), Elyse Fosse (Jet Propulsion Laboratory, California Institute of Technology), Usha Guduri (Jet Propulsion Laboratory, California Institute of Technology) We describe the application of using Monte Carlo simulation to optimize a schedule for both execution and rescheduling robustness and activity score in the face of execution uncertainties. We apply these techniques to the problem of optimizing a schedule for a planetary rover with very limited onboard computation. We search in the activity input parameter space where a) the onboard scheduler is a one shot non-backtracking scheduler in which once an activity is placed it is never moved or deleted and b) the activity priority determines the order in which activities are considered for placement in the schedule. We show that simulation driven search for activity parameters outperforms alternative schemes that use static priority assignment. Our approach can be viewed as using simulation feedback to determine problem specific heuristics much like squeaky wheel optimization. These techniques are currently baselined for use in the ground operations of NASA's next planetary rover, the Mars 2020 rover. ude.hcetlac.lpj@ihc.enyaw The Clustered Dial-a-Ride Problem • Fabian Feitsch, Sabine Storandt · [AWARD] The Clustered Dial-a-Ride Problem Fabian Feitsch (University of Würzburg), Sabine Storandt (University of Konstanz) We study a generalization of the classical dial-a-ride problem, with an application to public transport planning in rural areas. In the classical dial-a-ride problem, n users each specify a pick-up and a delivery location, and the aim is to plan the least cost route to cater all requests. This can be modeled as a traveling salesmen problem in a complete graph with precedence constraints (pick-ups need to happen before deliveries). In this paper, we consider the clustered dial-a-ride problem, where we do not operate on a complete graph but on a graph composed of serially numbered cliques where each clique is connected to the next one via a single edge. This setting is inspired by door-to-door transportation for people from remote villages who want to get to another village or the next city by a bus which operates on demand. We argue that in case the optimal route exhibits certain structural properties, it can be computed significantly faster. To make use of this observation, we devise a classification algorithm which can decide whether the optimal route exhibits these structural properties before computing it. Extensive experiments on artificial and real-world instances reveal that the majority of optimal routes indeed have the desired properties and that our classifier is an efficient tool to recognize the respective instances. ed.hcstief-naibaf@liam Exact Methods for Extended Rotating Workforce Scheduling Problems • Lucas Kletzander, Nysret Musliu, Johannes Gärtner, Werner Schafhauser, Thomas Krennwallner · [AWARD] Exact Methods for Extended Rotating Workforce Scheduling Problems Lucas Kletzander (CD Laboratory, TU Wien), Nysret Musliu (CD Laboratory, TU Wien), Johannes Gärtner (XIMES GmbH), Werner Schafhauser (XIMES GmbH), Thomas Krennwallner (XIMES GmbH) In many professions daily demand for different shifts varies during the week. The rotating workforce scheduling problem deals with the creation of repeating schedules for such demand and is therefore of high practical relevance. This paper investigates solving this real-life problem with several new practically relevant features. This includes early recognition of certain infeasibility criteria, complex rest time constraints regarding weekly rest time, and optimization goals to deal with optimal assignments of free weekends. We introduce a state-of-the-art constraint model and evaluate it with different extensions. The evaluation shows that many real-life instances can be solved to optimality using a constraint solver. Our approach is under deployment in a state-of-the-art commercial solver for rotating workforce scheduling. ta.ca.neiwut@rednaztelk.sacul Towards Automating Crime Prevention through Environmental Design (CPTED) Analysis to Predict Burglary • Leanne Monchuk, Simon Parkinson, James Kitchen · [AWARD] Towards Automating Crime Prevention through Environmental Design (CPTED) Analysis to Predict Burglary Leanne Monchuk (University of Huddersfield), Simon Parkinson (University of Huddersfield), James Kitchen (West Yorkshire Police) The design of the built environment (such as housing developments, street networks) can increase the opportunity for crime and disorder to occur. For example, a housing development with poor surveillance can provide an opportunity for offenders to commit residential burglary and avoid detection. Crime Prevention through Environmental Design (CPTED) aims to reduce crime and disorder through the design and manipulation of the built environment. The police typically play an important role in the delivery and application of CPTED by assessing planning applications, identifying design features that may provide an opportunity for crime and offering remedial advice. In England and Wales, it is common practice for police specialists – Designing out Crime Officers (DOCOs) – to review architectural site plans during the planning process. However, owing to significant cuts to policing budgets, the number of DOCOs in post is reducing whilst the demand for new housing is on the increase. In this novel work, it is demonstrated that key knowledge about the opportunities for crime and disorder within the built environment can be elicited from a purposive sample of 28 experienced DOCOs, encoded in a domain model and utilised by Automated Planning techniques to automatically assess architectural site plans for future crime risk. ku.ca.duh@kuhcnoM.Y.L ZAC: A Zone pAth Construction Approach for Effective Real-Time Ride Sharing • Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet · UTRC Best Applications Paper Award ZAC: A Zone pAth Construction Approach for Effective Real-Time Ride Sharing Meghna Lowalekar (Singapore Management University), Pradeep Varakantham (Singapore Management University), Patrick Jaillet (Massachusetts Institute of Technology) Real-time ridesharing systems such as UberPool, Lyft Line, GrabShare have become hugely popular as they reduce the costs for customers, improve per trip revenue for drivers and reduce traffic on the roads by grouping customers with similar itineraries. The key challenge in these systems is to group the right requests to travel in available vehicles in real time, so that the objective (e.g., requests served, revenue or delay) is optimized. The most relevant existing work has focused on generating as many relevant feasible (with respect to available delay for customers) combinations of requests (referred to as trips) as possible in real-time. Since the number of trips increases exponentially with the increase in vehicle capacity and number of requests, unfortunately, such an approach has to employ ad hoc heuristics to identify relevant trips. To that end, we propose an approach that generates many zone (abstraction of individual locations) paths – where each zone path can represent multiple trips (combinations of requests) – and assigns available vehicles to these zone paths to optimize the objective. The key advantage of our approach is that these zone paths are generated using a combination of offline and online methods, consequently allowing for the generation of many more relevant combinations in real-time than competing approaches. We demonstrate that our approach outperforms (with respect to both objective and runtime) the current best approach for ridesharing on both real world and synthetic datasets. gs.ude.ums.sidhp@5102.lanhgem Reinforcement Learning Based Querying in Camera Networks for Efficient Target Tracking • Anil Sharma, Saket Anand, Sanjit Kaul · [AWARD] Reinforcement Learning Based Querying in Camera Networks for Efficient Target Tracking Anil Sharma (Indraprastha Institute of Information Technology, Dehli), Saket Anand (Indraprastha Institute of Information Technology, Dehli), Sanjit Kaul (Indraprastha Institute of Information Technology, Dehli) Surveillance camera networks are a useful monitoring infrastructure that can be used for various visual analytics applications, where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on re-identification problems and trajectory association problems. However, as camera networks grow in size, the volume of data generated is humongous, and scalable processing of this data is imperative for deploying practical solutions. In this paper, we address the largely overlooked problem of scheduling cameras for processing by selecting one where the target is most likely to appear next. The inter-camera handover can then be performed on the selected cameras via re-identification or another target association technique. We model this scheduling problem using reinforcement learning and learn the camera selection policy using Q-learning. We do not assume the knowledge of the camera network topology but we observe that the resulting policy implicitly learns it. We evaluate our approach using NLPR MCT dataset, which is a real multi-camera multi-target tracking benchmark and show that the proposed policy substantially reduces the number of frames required to be processed at the cost of a small reduction in recall. ni.ca.dtiii@slina Solution Approaches for an Automotive Paint Shop Scheduling Problem • Felix Winter, Emir Demirović, Nysret Musliu, Christoph Mrkvicka · [AWARD] Solution Approaches for an Automotive Paint Shop Scheduling Problem Felix Winter (CD Laboratory, TU Wien), Emir Demirović (The University of Melbourne), Nysret Musliu (CD Laboratory, TU Wien), Christoph Mrkvicka (MCP GmbH) In the paint shops of the automotive supply industry a large number of synthetic material pieces need to be painted every day to provide the large variety of items required for car manufacturing. Because of the sophisticated automated production process and the tight due dates requested by car manufacturers, finding an optimized production schedule becomes a challenging task that is at the present time performed by multiple human planners. In this paper we formulate and solve a novel real life paint shop scheduling problem from the automotive supply industry which introduces unique constraints and objectives that do not appear in the existing literature. Additionally, we provide a new collection of benchmark instances based on real life planning scenarios that can be used to evaluate solution techniques for the problem. Exact methods are only able to solve few of the smaller instances in reasonable running time. Therefore, we propose a metaheuristic method based on local search that uses novel neighborhood relations and various ways to escape local optima. Our approach is able to provide feasible solutions for all instances within reasonable running time. ta.ca.neiwut.iabd@retniw Mixed Integer Programming versus Evolutionary Computation for Optimizing a Hard Real-World Staff Assignment Problem • Jannik Peters, Daniel Stephan, Isabel Amon, Hans Gawendowicz, Julius Lischeid, Lennart Salabarria, Jonas Umland, Felix Werner, Martin S. Krejca, Ralf Rothenberger, Timo Kötzing, Tobias Friedrich · [AWARD] Mixed Integer Programming versus Evolutionary Computation for Optimizing a Hard Real-World Staff Assignment Problem Jannik Peters (Hasso Plattner Institute, University of Potsdam), Daniel Stephan (Hasso Plattner Institute, University of Potsdam), Isabel Amon (Hasso Plattner Institute, University of Potsdam), Hans Gawendowicz (Hasso Plattner Institute, University of Potsdam), Julius Lischeid (Hasso Plattner Institute, University of Potsdam), Lennart Salabarria (Hasso Plattner Institute, University of Potsdam), Jonas Umland (Hasso Plattner Institute, University of Potsdam), Felix Werner (Hasso Plattner Institute, University of Potsdam), Martin S. Krejca (Hasso Plattner Institute, University of Potsdam), Ralf Rothenberger (Hasso Plattner Institute, University of Potsdam), Timo Kötzing (Hasso Plattner Institute, University of Potsdam), Tobias Friedrich (Hasso Plattner Institute, University of Potsdam) Assigning staff to engagements according to hard constraints while optimizing several objectives is a task encountered by many companies on a regular basis. Simplified versions of such assignment problems are NP-hard. Despite this, a typical approach to solving them consists of formulating them as mixed integer programming (MIP) problems and using a state-of-the-art solver to get solutions that closely approximate the optimum. In this paper, we consider a complex real-world staff assignment problem encountered by the professional service company KPMG, with the goal of finding an algorithm that solves it faster and with a better solution than a commercial MIP solver. We follow the evolutionary algorithm (EA) metaheuristic and design a search heuristic which iteratively improves a solution using domain-specific mutation operators. Furthermore, we use a flow algorithm to optimally solve a subproblem, which tremendously reduces the search space for the EA. For our real-world instance of the assignment problem, given the same total time budget of 100 hours, a parallel EA approach finds a solution that is only 1.7 % away from an upper bound for the (unknown) optimum within under five hours, while the MIP solver Gurobi still has a gap of 10.5 %. ed.iph.tneduts@sretep.kinnaj
Towards Stable Symbol Grounding with Zero-Suppressed State AutoEncoder • Masataro Asai, Hiroshi Kajino · [AWARD] Towards Stable Symbol Grounding with Zero-Suppressed State AutoEncoder Masataro Asai (IBM), Hiroshi Kajino (IBM) While classical planning has been an active branch of AI, its applicability is limited to the tasks precisely modeled by humans. Fully automated high-level agents should be instead able to find a symbolic representation of an unknown environment without supervision, otherwise it exhibits the knowledge acquisition bottleneck. Meanwhile, Latplan (Asai and Fukunaga 2018) partially resolves the bottleneck with a neural network called State AutoEncoder (SAE). SAE obtains the propositional representation of the image-based puzzle domains with unsupervised learning, generates a state space and performs classical planning. In this paper, we identify the problematic, stochastic behavior of the SAE-produced propositions as a new sub-problem of symbol grounding problem, the symbol stability problem. Informally, symbols are stable when their referents (e.g. propositional values) do not change against small perturbation of the observation, and unstable symbols are harmful for symbolic reasoning. We analyze the problem in Latplan both formally and empirically, and propose “Zero-Suppressed SAE”, an enhancement that stabilizes the propositions. We show that it finds the more stable propositions and the more compact representations, resulting in an improved success rate of Latplan. It is robust against various hyperparameters and eases the tuning effort, and also provides a weight pruning capability as a side effect. moc.mbi@iasA.oratasaM Deep Policies for Width-Based Planning in Pixel Domains • Miquel Junyent, Anders Jonsson, Vicenç Gómez · [AWARD] Deep Policies for Width-Based Planning in Pixel Domains Miquel Junyent (Oesia), Anders Jonsson (Universitat Pompeu Fabra), Vicenç Gómez (Universitat Pompeu Fabra) Width-based planning has demonstrated great success in recent years due to its ability to scale independently of the size of the state space. For example, Bandres et al. (2018) introduced a rollout version of the Iterated Width algorithm whose performance compares well with humans and learning methods in the pixel setting of the Atari games suite. In this setting, planning is done on-line using the “screen” states and selecting actions by looking ahead into the future. However, this algorithm is purely exploratory and does not leverage past reward information. Furthermore, it requires the state to be factored into features that need to be pre-defined for the particular task, e.g., the B-PROST pixel features. In this work, we extend width-based planning by incorporating an explicit policy in the action selection mechanism. Our method, called π-IW, interleaves width-based planning and policy learning using the state-actions visited by the planner. The policy estimate takes the form of a neural network and is in turn used to guide the planning step, thus reinforcing promising paths. Surprisingly, we observe that the representation learned by the neural network can be used as a feature space for the width-based planner without degrading its performance, thus removing the requirement of pre-defined features for the planner. We compare π-IW with previous width-based methods and with AlphaZero in simple environments and show that π-IW has superior performance. We also show that our proposed algorithm outperforms previous width-based methods in the pixel setting of Atari games suite. moc.liamg@ynabrabtneynuj.leuqim Unsupervised Grounding of Plannable First-Order Logic Representation from Images • Masataro Asai · [AWARD] Unsupervised Grounding of Plannable First-Order Logic Representation from Images Masataro Asai (IBM) Recently, there is an increasing interest in obtaining the relational structures of the environment in the Reinforcement Learning community. However, the resulting “relations” are not the discrete, logical predicates compatible to the symbolic reasoning such as classical planning or goal recognition. Meanwhile, Latplan (Asai and Fukunaga 2018) bridged the gap between deep-learning perceptual systems and symbolic classical planners. One key component of the system is a Neural Network called State AutoEncoder (SAE), which encodes an image-based input into a propositional representation compatible to classical planning. To get the best of both worlds, we propose First-Order State AutoEncoder, an unsupervised architecture for grounding the first-order logic predicates. Each predicate models a relationship between objects by taking the interpretable arguments and returning a propositional value. In the experiment using 8-Puzzle and a photorealistic Blocksworld environment, we show that (1) the resulting predicates capture the interpretable relations (e.g. spatial), (2) they help obtaining the compact, abstract model of the environment, and finally, (3) the resulting model is compatible to symbolic classical planning. moc.liamg@82817.2ohciug Resource Constrained Deep Reinforcement Learning • Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar · [AWARD] Resource Constrained Deep Reinforcement Learning Abhinav Bhatia (Singapore Management University), Pradeep Varakantham (Singapore Management University), Akshat Kumar (Singapore Management University) In urban environments, supply resources have to be constantly matched to the ""right"" locations (where customer demand is present) so as to improve quality of life. For instance, ambulances have to be matched to base stations regularly so as to reduce response time for emergency incidents in EMS (Emergency Management Systems); vehicles (cars, bikes, scooters etc.) have to be matched to docking stations so as to reduce lost demand in shared mobility systems. Such problem domains are challenging owing to the demand uncertainty, combinatorial action spaces (due to allocation) and constraints on allocation of resources (e.g., resource capacity, minimum and maximum number of resources at locations and regions). Existing systems typically employ myopic and greedy optimization approaches to optimize allocation of supply resources to locations. Such approaches typically are unable to handle surges or variances in demand patterns well. Recent research has demonstrated the ability of Deep RL methods in adapting well to highly uncertain environments. However, existing Deep RL methods are unable to handle combinatorial action spaces and constraints on allocation of resources. To that end, we have developed three approaches on top of the well known actor critic approach, DDPG (Deep Deterministic Policy Gradient) that are able to handle constraints on resource allocation. More importantly, we demonstrate that they are able to outperform leading approaches on simulators validated on semi-real and real data sets. gs.ude.ums@bvanihba Fast Feature Selection for Linear Value Function Approximation • Bahram Behzadian, Soheil Gharatappeh, Marek Petrik · [AWARD] Fast Feature Selection for Linear Value Function Approximation Bahram Behzadian (University of New Hampshire), Soheil Gharatappeh (University of New Hampshire), Marek Petrik (University of New Hampshire) Linear value function approximation is a standard approach to solving reinforcement learning problems with large state spaces. Since designing good approximation features is difficult, automatic feature selection is an important research topic. We propose a new method for feature selection, which is based on a low-rank factorization of the transition matrix. Our approach derives features directly from high-dimensional raw inputs, such as image data. The method is easy to implement using SVD, and our experiments show that it is faster and more stable than alternative methods. ude.hnu.sc@marhab Learning Interpretable Models Expressed in Linear Temporal Logic • Alberto Camacho, Sheila A. McIlraith · [AWARD] Learning Interpretable Models Expressed in Linear Temporal Logic Alberto Camacho (University of Toronto), Sheila A. McIlraith (University of Toronto) We examine the problem of learning models that characterize the high-level behaviour of a system based on observation traces. Our aim is to develop models that are human interpretable. To this end, we introduce the problem of learning a Linear Temporal Logic (LTL) formula that parsimoniously captures a given set of positive and negative example traces. Our approach to learning LTL exploits a symbolic state representation, searching through a space of labeled skeleton formulae to construct an alternating automaton that models observed behaviour, from which the LTL can be read off. Construction of interpretable behaviour models is central to a diversity of applications related to planning and plan recognition. We showcase the relevance and significance of our work in the context of behaviour description and discrimination: i) active learning of a human-interpretable behaviour model that describes observed examples obtained by interaction with an oracle ii) passive learning of a classifier that discriminates individual agents, based on the human-interpretable signature way in which they perform particular tasks. Experiments demonstrate the effectiveness of our symbolic model learning approach in providing human-interpretable models and classifiers from reduced example sets. ude.otnorot.sc@ohcamaca Entropy based Independent Learning in Anonymous Multi-Agent Settings • Tanvi Verma, Pradeep Varakantham, Hoong Chuin Lau · [AWARD] Entropy based Independent Learning in Anonymous Multi-Agent Settings Tanvi Verma (Singapore Management University), Pradeep Varakantham (Singapore Management University), Hoong Chuin Lau (Singapore Management University) Efficient sequential matching of supply and demand is a problem of interest in many online to offline services. For instance, Uber, Lyft, Grab for matching taxis to customers; Ubereats, Deliveroo, FoodPanda etc for matching restaurants to customers. In these online to offline service problems, individuals who are responsible for supply (e.g., taxi drivers, delivery bikes or delivery van drivers) earn more by being at the ”right” place at the ”right” time. We are interested in developing approaches that learn to guide individuals to be in the ”right” place at the ”right” time (to maximize revenue) in the presence of other similar ”learning” individuals and only local aggregated observation of other agents states (e.g., only number of other taxis in same zone as current agent). Existing approaches in Multi-Agent Reinforcement Learning (MARL) are either not scalable (e.g., about 40000 taxis/cars for a city like Singapore) or assumptions of common objective (or action coordination) or centralized learning are not viable. A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents. We model these problems using the Anonymous MARL (AyMARL) model. To ensure scalability and individual learning, we focus on improving performance of independent reinforcement learning methods, specifically Deep Q-Networks (DQN) and Advantage Actor Critic (A2C) for AyMARL. The key contribution of this paper is in employing principle of maximum entropy to provide a general framework of independent learning that is both empirically effective (even with only local aggregated information of agent state distribution) and theoretically justified. Finally, our approaches provide a significant improvement with respect to joint and individual revenue on a generic simulator for online to offline services and a real world taxi problem over existing approaches. More importantly, this is achieved while having the least variance in revenues earned by the learning individuals, an indicator of fairness. gs.ude.ums.sidhp@5102.amrevivnat Learning Classical Planning Strategies with Policy Gradient • Pawel Gomoluch, Dalal Alrajeh, Alessandra Russo · [AWARD] Learning Classical Planning Strategies with Policy Gradient Pawel Gomoluch (Imperial College London), Dalal Alrajeh (Imperial College London), Alessandra Russo (Imperial College London) A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on simple best-first search which remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. Selection of the approach is performed using a trainable stochastic policy, mapping the state of the search to a probability distribution over the approaches. This enables using policy gradient to learn search strategies tailored to a specific distributions of planning problems and a selected performance metric, e.g. the IPC score. We instantiate the framework by constructing a policy space consisting of five search approaches and a two-dimensional representation of the planner’s state. Then, we train the system on randomly generated problems from five IPC domains using three different performance metrics. Our experimental results show that the learner is able to discover domain-specific search strategies, improving the planner’s performance relative to the baselines of plain bestfirst search and a uniform policy. ku.ca.lairepmi@41hculomog.lewap Size-Independent Neural Transfer for RDDL Planning • Sankalp Garg, Aniket Bajpai, Mausam · [AWARD] Size-Independent Neural Transfer for RDDL Planning Sankalp Garg (Indian Institute of Technology, Delhi), Aniket Bajpai (Indian Institute of Technology, Delhi), Mausam (Indian Institute of Technology, Delhi) Neural planners for RDDL MDPs produce deep reactive policies in an offline fashion. These scale well with large domains, but are sample inefficient and time-consuming to train from scratch for each new problem. To mitigate this, recent work has studied neural transfer learning, so that a generic planner trained on other problems of the same domain can rapidly transfer to a new problem. However, this approach only transfers across problems of the same size. We present the first method for neural transfer of RDDL MDPs that can transfer across problems of different sizes. Our architecture has two key innovations to achieve size independence: (1) a state encoder, which outputs a fixed length state embedding by max pooling over varying number of object embeddings, (2) a single parameter-tied action decoder that projects object embeddings into action probabilities for the final policy. On the two challenging RDDL domains of SysAdmin and Game Of Life, our approach powerfully transfers across problem sizes and has superior learning curves over training from scratch. moc.liamg@8991262plaknas
Replanning for Situated Robots • Michael Cashmore, Andrew Coles, Bence Cserna, Erez Karpas, Daniele Magazzeni, Wheeler Ruml · [AWARD] Replanning for Situated Robots Michael Cashmore (King's College London), Andrew Coles (King's College London), Bence Cserna (University of New Hampshire), Erez Karpas (Technion), Daniele Magazzeni (King's College London), Wheeler Ruml (University of New Hampshire) Planning enables intelligent agents, such as robots, to act so as to achieve their long term goals. To make the planning process tractable, a relatively low fidelity model of the world is often used, which sometimes leads to the need to replan. The typical view of replanning is that the robot is given the current state, the goal, and possibly some data from the previous planning process. However, for robots (or teams of robots) that exist in continuous physical space, act concurrently, have deadlines, or must otherwise consider durative actions, things are not so simple. In this paper, we address the problem of replanning for situated robots. Relying on previous work on situated temporal planning, we frame the replanning problem as a situated temporal planning problem, where currently executing actions are handled via Timed Initial Literals (TILs), under the assumptions that actions cannot be interrupted. We then relax this assumption, and address situated replanning with interruptible actions. We bridge the gap between the low-level model of the robot and the high-level model used for planning by the novel notion of a bail out action generator, which relies on the low-level model to generate high-level actions that describe possible ways to interrupt currently executing actions. Because actions can be interrupted at different times during their execution, we also propose a novel algorithm to handle temporal planning with time-dependent durations. li.ca.noinhcet@esaprak A Hierarchical Approach to Active Semantic Mapping Using Probabilistic Logic and Information Reward POMDPs • Tiago S. Veiga, Miguel Silva, Rodrigo Ventura, Pedro U. Lima · [AWARD] A Hierarchical Approach to Active Semantic Mapping Using Probabilistic Logic and Information Reward POMDPs Tiago S. Veiga (Institute for Systems and Robotics, Instituto Superior Técnico, U. Lisboa), Miguel Silva (Institute for Systems and Robotics, Instituto Superior Técnico, U. Lisboa), Rodrigo Ventura (Institute for Systems and Robotics, Instituto Superior Técnico, U. Lisboa), Pedro U. Lima (Institute for Systems and Robotics, Instituto Superior Técnico, U. Lisboa) Maintaining a semantic map of a complex and dynamic environment, where the uncertainty originates in both noisy perception and unexpected changes, is a challenging problem. In particular, we focus on the problem of maintaining a semantic map of an environment by a mobile agent. In this paper we address this problem in a hierarchical fashion. Firstly, we employ a probabilistic logic model representing the semantic map, as well as the associated uncertainty. Secondly, we model the interaction of the robot with the environment with a set of information-reward POMDP models, one for each partition of the environment (e.g., a room). The partition is performed in order to address the scalability limitations of POMDP models over very large state spaces. We then use probabilistic inference to determine which POMDP and policy to execute next. Experimental results show the efficiency of this architecture in real domestic service robotic scenarios. tp.aobsilu.ocincet.rsi@agievst Mars On-site Shared Analytics, Information, and Computing • Joshua Vander Hook, Tiago Stegun Vaquero, Federico Rossi, Martina Troesch, Marc Sanchez-Net, Joshua Schoolcraft, Jean-Pierre de la Croix, Steve Chien · [AWARD] Mars On-site Shared Analytics, Information, and Computing Joshua Vander Hook (Jet Propulsion Laboratory, California Institute of Technology), Tiago Stegun Vaquero (Jet Propulsion Laboratory, California Institute of Technology), Federico Rossi (Jet Propulsion Laboratory, California Institute of Technology), Martina Troesch (Jet Propulsion Laboratory, California Institute of Technology), Marc Sanchez-Net (Jet Propulsion Laboratory, California Institute of Technology), Joshua Schoolcraft (Jet Propulsion Laboratory, California Institute of Technology), Jean-Pierre de la Croix (Jet Propulsion Laboratory, California Institute of Technology), Steve Chien (Jet Propulsion Laboratory, California Institute of Technology) We study the use of distributed computation in a representative multi-robot planetary exploration mission. We model a network of small rovers with access to computing resources from a static base station based on current design efforts and extrapolation from the Mars 2020 rover autonomy. The key algorithmic problem is simultaneous scheduling of computation, communication, and caching of data, as informed by an autonomous mission planner. We consider minimum makespan scheduling and present a consensus-backed scheduler for shared-world, distributed scheduling based on an Integer Linear Program. We validate the pipeline with simulation and field results. Our results are intended to provide a baseline comparison and motivating application domain for future research into network-aware decentralized scheduling and resource allocation. ude.hcetlac.lpj@kooh Provable Infinite-Horizon Real-Time Planning for Repetitive Tasks • Fahad Islam, Oren Salzman, Maxim Likhachev · [AWARD] Provable Infinite-Horizon Real-Time Planning for Repetitive Tasks Fahad Islam (Carnegie Mellon University), Oren Salzman (Carnegie Mellon University), Maxim Likhachev (Carnegie Mellon University) In manufacturing and automation settings, robots often have to perform highly-repetitive manipulation tasks in structured environments. In this work we are interested in settings where tasks are similar, yet not identical (e.g., due to uncertain orientation of objects) and motion planning needs to be extremely fast. Preprocessing-based approaches prove to be very beneficial in these settings—they analyze the configuration-space offline to generate some auxiliary information which can then be used in the query phase to speedup planning times. Typically, the tighter the requirement is on query times the larger the memory footprint will be. In particular, for high-dimensional spaces, providing real-time planning capabilities is extremely challenging. While there are planners that guarantee real-time performance by limiting the planning horizon, we are not aware of general-purpose planners capable of doing it for infinite horizon (i.e., planning to the goal). To this end, we propose a preprocessingbased method that provides provable bounds on the query time while incurring only a small amount of memory overhead in the query phase. We evaluate our method on a 7-DOF robot arm and show a speedup of over tenfold in query time when compared to the PRM algorithm, while provably guaranteeing a maximum query time of less than 3 milliseconds. gro.liamthgirbluf@malsi.dahaf Learning Heuristic Functions for Mobile Robot Path Planning Using Deep Neural Networks • Takeshi Takahashi, He Sun, Dong Tian, Yebin Wang · [AWARD] Learning Heuristic Functions for Mobile Robot Path Planning Using Deep Neural Networks Takeshi Takahashi (University of Massachusetts Amherst), He Sun (Princeton University), Dong Tian (InterDigital, Inc.), Yebin Wang (Mitsubishi Electric Research Laboratories) Resorting to certain heuristic functions to guide the search, the computation efficiency of prevailing path planning algorithms such as A*, D* and their variants is solely determined by how good the heuristic function approximates the true path cost. In this study, we propose a novel approach to learn heuristic functions using a deep neural network (DNN) to improve the computation efficiency. Even though DNNs have been widely used for object segmentation, natural language processing, and perception, their role in helping to solve path planning has not been well investigated. This work shows how DNNs can be applied to path planning problems and what kind of loss functions is suitable for learning such a heuristic. Our preliminary results show that an appropriately designed and trained DNN can learn a heuristic which effectively guides conventional path planning algorithms and speeds up the path generation. moc.lrem@gnawnibey Goal Reasoning in a CLIPS-based Executive for Integrated Planning and Execution • Tim Niemueller, Till Hofmann, Gerhard Lakemeyer · [AWARD] Goal Reasoning in a CLIPS-based Executive for Integrated Planning and Execution Tim Niemueller (RWTH Aachen University), Till Hofmann (RWTH Aachen University), Gerhard Lakemeyer (RWTH Aachen University) The close integration of planning and execution is a challenging problem. Key questions are how to organize and explicitly represent the program flow to enable reasoning about it, how to dynamically create goals from run-time information and decide on-line which to pursue, and how to unify representations used during planning and execution. In this work, we present an integrated system that uses a goal reasoning model which represents this flow and supports dynamic goal generation. With an explicit world model representation, it allows reasoning about the current state of the world, the progress of the execution flow, and what goals should be pursued – or postponed or abandoned. Our executive implements a specific goal lifecycle with compound goal types that combine sub-goals by conjunctions, disjunctions, concurrency, or that impose temporal constraints. Goals also provide a frame of reference for execution monitoring. The current system can utilize PDDL as the underlying modeling language with extensions to aid execution and it contains well-defined extension points for domain-specific code. It has been used successfully in several scenarios. ed.nehcaa-htwr.gsbk@relleumein POMDP-based Candy Server: Lessons Learned from a Seven Day Demo • Marcus Hoerger, Joshua Mun Liang Song, Hanna Kurniawati, Alberto Elfes · [AWARD] POMDP-based Candy Server: Lessons Learned from a Seven Day Demo Marcus Hoerger (The University of Queensland), Joshua Mun Liang Song (The University of Queensland), Hanna Kurniawati (Australian National University), Alberto Elfes (CSIRO) An autonomous robot must decide a good strategy to achieve its long term goal, despite various types of uncertainty. The Partially Observable Markov Decision Processes (POMDPs) is a principled framework to address such a decision making problem. Despite the computational intractability of solving POMDPs, the past decade has seen substantial advancement in POMDP solvers. This paper presents our experience in enabling on-line POMDP solving to become the sole motion planner for a robot manipulation demo at IEEE SIMPAR and ICRA 2018. The demo scenario is a candy-serving robot: A 6-DOFs robot arm must pick-up a cup placed on a table by a user, use the cup to scoop candies from a box, and put the cup of candies back on the table. The average perception error is _3cm (≈ the radius of the cup), affecting the position of the cup and the surface level of the candies. This paper presents a strategy to alleviate the curse of history issue plaguing this scenario, the perception system and its integration with the planner, and lessons learned in enabling an online POMDP solver to become the sole motion planner of this entire task. The POMDP-based system were tested through a 7 days live demo at the two conferences. In this demo, 150 runs were attempted and 98% of them were successful. We also conducted further experiments to test the capability of our POMDP-based system when the environment is relatively cluttered by obstacles and when the user moves the cup while the robot tries to pick it up. In both cases, our POMDP-based system reaches a success rate of 90% and above. ua.ude.qu@regreoh.m POMHDP: Search-based Belief Space Planning using Multiple Heuristics • Sung-Kyun Kim, Oren Salzman, Maxim Likhachev · [AWARD] POMHDP: Search-based Belief Space Planning using Multiple Heuristics Sung-Kyun Kim (Carnegie Mellon University), Oren Salzman (Carnegie Mellon University), Maxim Likhachev (Carnegie Mellon University) Robots operating in the real world encounter substantial uncertainty that cannot be modeled deterministically before the actual execution. This gives rise to the necessity of robust motion planning under uncertainty also known as belief space planning. Belief space planning can be formulated as Partially Observable Markov Decision Process (POMDP). However, computing optimal policies for non-trivial POMDPs is computationally intractable. Building upon recent progress from the search community, we propose a novel anytime POMDP solver, Partially Observable Multi-Heuristic Dynamic Programming (POMHDP), that leverages multiple heuristics to efficiently compute high-quality solutions while guaranteeing asymptotic convergence to an optimal policy. Through iterative forward search, POMHDP utilizes domain knowledge to solve POMDPs with specific goals and an infinite horizon. We demonstrate the efficacy of our proposed framework on a real-world, highly-complex, truck unloading application. moc.liamg@7ksyks Trajectory Tracking Control for Robotic Vehicles using Counterexample Guided Training of Neural Networks • Arthur Clavière, Souradeep Dutta, Sriram Sankaranarayanan · [AWARD] Trajectory Tracking Control for Robotic Vehicles using Counterexample Guided Training of Neural Networks Arthur Clavière (Ecole Polytechnique and ISAE-SUPAERO, France), Souradeep Dutta (University of Colorado, Boulder), Sriram Sankaranarayanan (University of Colorado, Boulder) We investigate approaches to train neural networks for controlling vehicles to follow a fixed reference trajectory robustly, while respecting limits on their velocities and accelerations. Here robustness means that if a vehicle starts inside a fixed region around the reference trajectory, it remains within this region while moving along the reference from an initial set to a target set. We consider the combination of two ideas in this paper: (a) demonstrations of the correct control obtained from a model-predictive controller (MPC) and (b) falsification approaches that actively search for violations of the property, given a current candidate. Thus, our approach builds an initial training set using the MPC loop and creates a first candidate neural network controller. This controller is repeatedly analyzed using falsification that searches for counterexample trajectories, and the resulting counterexamples are used to create new training examples. This process proceeds iteratively until the falsifier no longer succeeds within a given computational budget. We propose falsification approaches using a combination of random sampling and gradient descent to systematically search for violations. We evaluate our combined approach on a variety of benchmarks that involve controlling dynamical models of cars and quadrotor aircraft. ude.odaroloc@attud.peedaruos Generalized Lazy Search for Robot Motion Planning: Interleaving Search and Edge Evaluations via Event-based Toggles • Aditya Mandalika, Sanjiban Choudhury, Oren Salzman, Siddhartha Srinivasa · Best Student Paper Award Generalized Lazy Search for Robot Motion Planning: Interleaving Search and Edge Evaluations via Event-based Toggles Aditya Mandalika (University of Washington), Sanjiban Choudhury (University of Washington), Oren Salzman (Carnegie Mellon University), Siddhartha Srinivasa (University of Washington) Lazy search algorithms are efficient at solving problems where edge evaluation is the bottleneck in computation, as is the case in robotic motion planning. The optimal algorithm in this class, LazySP, does so by lazily restricting edge evaluation only to the shortest path. However, this comes at the expense of search effort, i.e. LazySP has to recompute the search tree every time an edge is found to be invalid. This can get prohibitively expensive when dealing with large graphs or highly-cluttered environments. Our key insight is that both edge evaluation and search effort must be balanced to minimize the total planning time. Our contribution is two-fold. First, we propose a framework, Generalized Lazy Search (GLS), that seamlessly toggles between search and evaluation to manage wasted efforts. We show that for a choice of toggle, GLS is provably more efficient than LazySP. Secondly, we leverage prior experience in terms of edge probabilities to derive policies within the GLS framework that minimize expected planning time. We show that GLS armed with such priors significantly outperforms competitive baselines on a number of simulated environments in R2 and 7-DoF manipulation. moc.liamg@117kvaytida Open-world Reasoning for Service Robots • Yuqian Jiang, Nick Walker, Justin Hart, Peter Stone · [AWARD] Open-world Reasoning for Service Robots Yuqian Jiang (The University of Texas at Austin), Nick Walker (University of Washington), Justin Hart (The University of Texas at Austin), Peter Stone (The University of Texas at Austin) A service robot accepting verbal commands from a human operator is likely to encounter requests that reference objects not currently represented in its knowledge base. In domestic or office settings, the construction of a complete knowledge base would be cumbersome and unlikely to succeed in most real-world deployments. The world that such a robot operates in is thus “open” in the sense that some objects that it must act on in the real world are not described in its internal representation. However, when an operator gives a command referencing an object that the robot has not yet observed (and thus not incorporated into its knowledge base), we can think of the object as being hypothetical to the robot. This paper presents a novel method for closing the robot’s world model for planning purposes by introducing hypothetical objects into the robot’s knowledge base, reasoning about these hypothetical objects, and acting on these hypotheses in the real world. We use our implementation of this method on a domestic service robot as an illustrative demonstration to explore how it works in practice. ude.saxetu@naiquygnaij Speeding Up Search-based Motion Planning via Conservative Heuristics • Ishani Chatterjee, Maxim Likhachev, Ashwin Khadke, Manuela Veloso · [AWARD] Speeding Up Search-based Motion Planning via Conservative Heuristics Ishani Chatterjee (Carnegie Mellon University), Maxim Likhachev (Carnegie Mellon University), Ashwin Khadke (Carnegie Mellon University), Manuela Veloso (Carnegie Mellon University) Weighted A* search (wA*) is a popular tool for robot motionplanning. Its efficiency however depends on the quality of heuristic function used. In fact, it has been shown that the correlation between the heuristic function and the true costto-goal affects heavily the efficiency of the search, when used with a large weight on the heuristics. Motivated by this observation, we investigate the problem of computing heuristics that explicitly aim to minimize the amount of efforts the search has to do to find a feasible plan. The key observation we exploit is that while heuristics tries to guide the search along what looks like an optimal path towards the goal, there are other paths that are clearly sub-optimal yet are much easier to compute. For example, in motion planning domains like footstep-planning for humanoids, a heuristic that guides the search along a path away from obstacles is less likely to encounter local minima compared with the heuristics that guides the search along an optimal but closeto-obstacles path. We utilize this observation to define the concept of conservative heuristics and propose a simple algorithm for computing such a heuristic function. Experimental analysis on (1) humanoid footstep planning (simulation), (2) path planning for a UAV (simulation), and a real-world experiment in footstep-planning for a NAO robot shows the utility of the approach. ude.umc.werdna@rettahci