A Benchmark Comparison of Learned Control Policies for Agile Quadrotor Flight


Kaufmann, Elia; Bauersfeld, Leonard; Scaramuzza, Davide (2022). A Benchmark Comparison of Learned Control Policies for Agile Quadrotor Flight. In: 39th IEEE International Conference on Robotics and Automation, ICRA 2022, Philadelphia, PA, United States of America, 23 May 2022 - 27 May 2022. Institute of Electrical and Electronics Engineers, 10504-10510.

Abstract

Quadrotors are highly nonlinear dynamical systems that require carefully tuned controllers to be pushed to their physical limits. Recently, learning-based control policies have been proposed for quadrotors, as they could potentially learn direct mappings from high-dimensional raw sensory observations to actions. Due to sample inefficiency, training such learned controllers on the real platform is impractical or even impossible. Training in simulation is attractive but requires transferring policies between domains, which demands that trained policies be robust to the resulting domain gap. In this work, we make two contributions: (i) we perform the first benchmark comparison of existing learned control policies for agile quadrotor flight and show that training a control policy that commands body-rates and thrust results in more robust sim-to-real transfer than a policy that directly specifies individual rotor thrusts; and (ii) we demonstrate for the first time that such a control policy trained via deep reinforcement learning can control a quadrotor in real-world experiments at speeds over 45 km/h.

Statistics

Citations

6 citations in Web of Science®
14 citations in Scopus®

Downloads

2 downloads since deposited on 27 Feb 2024
2 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Software; Physical Sciences > Control and Systems Engineering; Physical Sciences > Electrical and Electronic Engineering; Physical Sciences > Artificial Intelligence
Scope: Discipline-based scholarship (basic research)
Language: English
Event End Date: 27 May 2022
Deposited On: 27 Feb 2024 13:47
Last Modified: 02 May 2024 12:48
Publisher: Institute of Electrical and Electronics Engineers
Series Name: IEEE International Conference on Robotics and Automation. Proceedings
ISSN: 1050-4729
ISBN: 978-1-7281-9681-7
Additional Information: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
OA Status: Closed
Publisher DOI: https://doi.org/10.1109/ICRA46639.2022.9811564