
Continual learning with hypernetworks


von Oswald, Johannes; Henning, Christian; Grewe, Benjamin F; Sacramento, João (2020). Continual learning with hypernetworks. In: ICLR 2020, Virtual Conference, 26 April 2020 - 1 May 2020, ICLR.

Abstract

Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key feature: instead of recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing task-specific weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving state-of-the-art performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display a very large capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable hypernetwork weights is comparable to or smaller than the target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
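The mechanism the abstract describes (a hypernetwork maps a task embedding to target-network weights, and a regularizer penalizes drift of the weights generated for earlier tasks) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: all names (`target_weights`, `cl_regularizer`, `task_embs`) are hypothetical, and the hypernetwork is reduced to a single linear map for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim, target_dim = 4, 10  # toy sizes

# Hypernetwork reduced to one linear map: theta_target = H @ e_task.
H = rng.normal(scale=0.1, size=(target_dim, emb_dim))
task_embs = [rng.normal(size=emb_dim) for _ in range(3)]  # one embedding per task

def target_weights(H, e):
    """Generate target-network weights from a task embedding."""
    return H @ e

# Before training on a new task, store the weight realizations the
# hypernetwork currently produces for all previously seen tasks.
stored = [target_weights(H, e) for e in task_embs[:2]]

def cl_regularizer(H, old_embs, stored, beta=1.0):
    """Output regularizer: penalize changes to previously generated weights.

    Note that only one (embedding, weights) pair per past task is rehearsed,
    not the past tasks' data.
    """
    return beta * sum(
        np.sum((target_weights(H, e) - w) ** 2)
        for e, w in zip(old_embs, stored)
    )

# Immediately after storing, the penalty is exactly zero; it grows only if
# training on the new task changes what H outputs for the old embeddings.
print(cl_regularizer(H, task_embs[:2], stored))            # 0.0
H_drifted = H + 0.05                                       # simulate an update
print(cl_regularizer(H_drifted, task_embs[:2], stored) > 0)  # True
```

In training, this penalty would be added to the new task's loss, so the hypernetwork stays consistent on old task embeddings while fitting the new one.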

Statistics

135 downloads since deposited on 16 Feb 2021
94 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Language: English
Event End Date: 1 May 2020
Deposited On: 16 Feb 2021 08:16
Last Modified: 24 Apr 2022 06:57
Publisher: ICLR
OA Status: Green

Download

Green Open Access

Content: Published Version
Filetype: PDF
Size: 2 MB