Reflections and a Proposal for a Query and Reporting Language for Richly Annotated Multiparallel Corpora


Clematide, Simon (2015). Reflections and a Proposal for a Query and Reporting Language for Richly Annotated Multiparallel Corpora. In: Gintare, Grigonyte; Clematide, Simon; Utka, Andrius; Volk, Martin. Proceedings of the Workshop on Innovative Corpus Query and Visualization Tools at NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania. Linköping, Sweden: Linköping University Electronic Press, Linköpings universitet, 6-16.

Abstract

Large and open multiparallel corpora are a valuable resource for contrastive corpus linguists if the data is annotated and stored in a way that allows precise and flexible ad hoc searches. A linguistic query language should also support computational linguists in automated multilingual data mining. We review a broad range of approaches for linguistic query and reporting languages according to usability criteria such as expressibility, expressiveness, and efficiency. We propose an architecture that tries to strike the right balance to suit practical purposes.


Additional indexing

Item Type: Book Section, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics; 420 English & Old English languages
Uncontrolled Keywords: linguistic query systems, parallel corpora
Language: English
Date: 2015
Deposited On: 27 Aug 2015 06:37
Last Modified: 30 May 2016 20:21
Publisher: Linköping University Electronic Press, Linköpings universitet
Series Name: Linköping Electronic Conference Proceedings
Number: 111
ISSN: 1650-3740
ISBN: 978-91-7519-035-8
Funders: SNF
Official URL: http://www.ep.liu.se/ecp_home/index.en.aspx?issue=111
Permanent URL: https://doi.org/10.5167/uzh-112066

Download

Content: Published Version
Filetype: PDF
Size: 199kB
