Publication:

Sparse Attention with Linear Units

Date

2021
Conference or Workshop Item
Published version

Citations

Zhang, B., Titov, I., & Sennrich, R. (2021). Sparse Attention with Linear Units. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 6507–6520). https://aclanthology.org/2021.emnlp-main.523

Abstract

Recently, it has been argued that encoder-decoder models can be made more interpretable by replacing the softmax function in the attention with its sparse variants. In this work, we introduce a novel, simple method for achieving sparsity in attention: we replace the softmax activation with a ReLU, and show that sparsity naturally emerges from such a formulation. Training stability is achieved with layer normalization with either a specialized initialization or an additional gating function. Our model, which we call Rectified Linear Attent
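
The core idea in the abstract, swapping softmax for a rectifier so that some attention weights become exactly zero, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy scores and values, and the use of a plain root-mean-square rescaling as a stand-in for the layer normalization mentioned in the abstract are all illustrative choices, and the specialized initialization and gating function are omitted.

```python
import math

def softmax_weights(scores):
    """Standard softmax: every weight is strictly positive and they sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relu_weights(scores):
    """Rectifier in place of softmax: negative scores get exactly zero weight."""
    return [max(0.0, s) for s in scores]

def attend(weights, values):
    """Weighted sum of the value vectors."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def rms_normalize(vec, eps=1e-8):
    """Assumed stabilizer (hypothetical stand-in for the paper's normalization):
    rescale the output by its root mean square, since the rectified weights
    are unnormalized and do not sum to 1."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec))
    return [x / (rms + eps) for x in vec]

# Toy query-key scores and value vectors (made up for illustration).
scores = [2.0, -1.0, 0.5, -3.0]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]

soft = softmax_weights(scores)
relu = relu_weights(scores)
context = rms_normalize(attend(relu, values))

print("softmax weights:", soft)    # all entries are non-zero
print("rectified weights:", relu)  # entries for negative scores are exactly 0.0
print("normalized context:", context)
```

With softmax, every position receives some weight, however small. With the rectifier, positions whose score is negative are dropped entirely, which is the sense in which sparsity emerges without any explicit sparsity constraint.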

Metrics

Downloads

4 since deposited on 2021-11-08
Acq. date: 2025-11-12

Views

1 since deposited on 2021-11-08
Acq. date: 2025-11-12

Additional indexing

Creators (Authors)

Zhang, B.; Titov, I.; Sennrich, R.

Event Title

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Event Location

Online and Punta Cana

Event Country

Dominican Republic

Event Start Date

2021-11-07

Event End Date

2021-11-11

Publisher

ACL Anthology

Page range/Item number

6507

Page end

6520

Item Type

Conference or Workshop Item

Language

English

Date available

2021-11-08

OA Status

Green

Free Access at

Official URL

Files

Files available to download:1
