Binomials in Swedish corpora – ‘Ordpar 1965’ revisited


Volk, Martin; Graën, Johannes (2022). Binomials in Swedish corpora – ‘Ordpar 1965’ revisited. In: Volodina, Elena; Dannélls, Dana; Berdicevskis, Aleksandrs; Forsberg, Markus; Virk, Shafqat. Live and Learn : Festschrift in honor of Lars Borin. Göteborg: Department of Swedish, Multilingualism and Language Technology, University of Gothenburg, 139-144.

Abstract

This paper describes a corpus study on Swedish binomials, a special type of multi-word expression. Binomials are of the type "X conjunction Y" where X and Y are words, typically of the same part of speech. Bendz (1965) investigated the various use cases and functions of such binomials and included a list of more than 1000 candidates in his appendix. We wanted to know to what extent these binomials can still be found in modern corpora. We therefore checked this list against the Swedish Europarl and OpenSubtitles corpora. We found that many of the binomials are still in use today, even in these diverse text genres. The relative frequency of binomials in Europarl is much higher than in OpenSubtitles.
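
The kind of check described in the abstract, matching a list of "X conjunction Y" candidates against a corpus and comparing relative frequencies, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `count_binomials` and `per_million`, the restriction to the conjunction "och", the per-million-tokens normalization, and the toy sentences and word pairs are all hypothetical and are not taken from Bendz's list or from the paper.

```python
from collections import Counter


def count_binomials(sentences, binomials, conjunction="och"):
    """Count 'X conjunction Y' trigrams in tokenized sentences.

    sentences: iterable of token lists (assumed to be pre-tokenized).
    binomials: iterable of (x, y) pairs in lowercase.
    Returns (Counter keyed by (x, y), total number of tokens seen).
    """
    wanted = set(binomials)
    counts = Counter()
    total_tokens = 0
    for tokens in sentences:
        lowered = [t.lower() for t in tokens]
        total_tokens += len(lowered)
        # Slide a three-token window over the sentence.
        for x, conj, y in zip(lowered, lowered[1:], lowered[2:]):
            if conj == conjunction and (x, y) in wanted:
                counts[(x, y)] += 1
    return counts, total_tokens


def per_million(count, total_tokens):
    """Relative frequency as occurrences per million tokens."""
    return 1_000_000 * count / total_tokens if total_tokens else 0.0


# Toy data, assumed for illustration only.
binomials = [("lag", "ordning"), ("kropp", "själ")]
sentences = [
    ["Vi", "behöver", "lag", "och", "ordning", "."],
    ["Med", "kropp", "och", "själ", "."],
    ["Lag", "och", "ordning", "gäller", "här", "."],
]

counts, total = count_binomials(sentences, binomials)
for pair, n in counts.items():
    print(pair, n, round(per_million(n, total), 1))
```

Normalizing by corpus size is what makes counts from corpora of different sizes comparable; running the same function over each corpus separately would give one relative frequency per binomial per corpus.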

Additional indexing

Item Type: Book Section, not refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 06 Faculty of Arts > Zurich Center for Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Uncontrolled Keywords: Corpus Linguistics, Multi-word Expressions, Binomials, Mutual Information Score
Language: English
Date: 18 November 2022
Deposited On: 05 Dec 2022 06:27
Last Modified: 21 Apr 2023 16:25
Publisher: Department of Swedish, Multilingualism and Language Technology, University of Gothenburg
ISBN: 978-91-87850-82-0
OA Status: Green
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)