ZORA (Zurich Open Repository and Archive)

What Do Language Representations Really Represent?

Bjerva, Johannes; Östling, Robert; Veiga, Maria Han; Tiedemann, Jörg; Augenstein, Isabelle (2019). What Do Language Representations Really Represent? Computational Linguistics, 45(2):381-389.

Abstract

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words end up with similar representations. If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations. We show that this holds even when the multilingual corpus has been translated into English, by picking up the faint signal left by the source languages. However, just as it is a thorny problem to separate semantic from syntactic similarity in word representations, it is not obvious what type of similarity is captured by language representations. We investigate correlations and causal relationships between language representations learned from translations on the one hand, and genetic, geographical, and several levels of structural similarity between languages on the other. Of these, structural similarity is found to correlate most strongly with language representation similarity, whereas genetic relationships (a convenient benchmark used for evaluation in previous work) appear to be a confounding factor. Apart from implications about translation effects, we see this more generally as a case where NLP and linguistic typology can interact and benefit one another.

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:07 Faculty of Science > Institute for Computational Science
Dewey Decimal Classification:530 Physics
Scopus Subject Areas:Social Sciences & Humanities > Language and Linguistics; Social Sciences & Humanities > Linguistics and Language; Physical Sciences > Computer Science Applications; Physical Sciences > Artificial Intelligence
Uncontrolled Keywords:Linguistics and Language, Artificial Intelligence, Language and Linguistics, Computer Science Applications
Language:English
Date:1 June 2019
Deposited On:18 Feb 2020 14:14
Last Modified:06 Sep 2024 03:31
Publisher:MIT Press
ISSN:0891-2017
OA Status:Gold
Free access at:Publisher DOI. An embargo period may apply.
Publisher DOI:https://doi.org/10.1162/coli_a_00351
Content:Published Version
Licence:Creative Commons: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Statistics

Citations

22 citations in Web of Science®
40 citations in Scopus®

Downloads

24 downloads since deposited on 18 Feb 2020
3 downloads in the past 12 months