ZORA (Zurich Open Repository and Archive)

Innovations in parallel corpus search tools

Volk, Martin; Graën, Johannes; Callegaro, Elena (2014). Innovations in parallel corpus search tools. In: Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, 26 May 2014 - 31 May 2014, European Language Resources Association (ELRA).

Abstract

Recent years have seen an increased interest in and availability of parallel corpora. Large corpora from international organizations (e.g. European Union, United Nations, European Patent Office), or from multilingual Internet sites (e.g. OpenSubtitles) are now easily available and are used for statistical machine translation but also for online search by different user groups. This paper gives an overview of different usages and different types of search systems. In the past, parallel corpus search systems were based on sentence-aligned corpora. We argue that automatic word alignment allows for major innovations in searching parallel corpora. Some online query systems already employ word alignment for sorting translation variants, but none supports the full query functionality that has been developed for parallel treebanks. We propose to develop such a system for efficiently searching large parallel corpora with a powerful query language.
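The idea of using word alignment to sort translation variants can be illustrated with a minimal sketch. It is not the system proposed in the paper: the toy corpus, the alignment links, and the function name `translation_variants` are all assumptions made up for illustration. Each corpus entry holds source tokens, target tokens, and word-alignment links as (source index, target index) pairs; the query collects the target words aligned to each hit and ranks them by frequency.

```python
from collections import Counter

# Toy word-aligned parallel corpus (invented data, for illustration only).
# Each entry: (source tokens, target tokens, alignment links as (src_idx, tgt_idx)).
CORPUS = [
    (["the", "house", "is", "old"],
     ["das", "Haus", "ist", "alt"],
     [(0, 0), (1, 1), (2, 2), (3, 3)]),
    (["a", "big", "house"],
     ["ein", "großes", "Haus"],
     [(0, 0), (1, 1), (2, 2)]),
    (["the", "house", "of", "lords"],
     ["das", "Oberhaus"],
     [(0, 0), (1, 1), (3, 1)]),
]


def translation_variants(corpus, query):
    """Return aligned target-side variants of `query`, most frequent first."""
    counts = Counter()
    for src, tgt, links in corpus:
        for i, token in enumerate(src):
            if token.lower() != query.lower():
                continue
            # Target positions linked to this source token, in sentence order.
            aligned = sorted(j for s, j in links if s == i)
            if aligned:
                counts[" ".join(tgt[j] for j in aligned)] += 1
    return counts.most_common()


if __name__ == "__main__":
    for variant, freq in translation_variants(CORPUS, "house"):
        print(f"{variant}\t{freq}")
```

With sentence alignment alone, a search for "house" could only return the three full sentence pairs; the alignment links are what make it possible to isolate and count the corresponding target words.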

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 08 Research Priority Programs > Language and Space
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Scopus Subject Areas: Social Sciences & Humanities > Linguistics and Language; Social Sciences & Humanities > Library and Information Sciences; Social Sciences & Humanities > Education; Social Sciences & Humanities > Language and Linguistics
Language: English
Event End Date: 31 May 2014
Deposited On: 14 Jul 2014 07:01
Last Modified: 26 Mar 2022 08:05
Publisher: European Language Resources Association (ELRA)
ISBN: 978-2-9517408-8-4
OA Status: Green
Official URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/504_Paper.pdf
Related URLs: http://www.lrec-conf.org/proceedings/lrec2014/summaries/504.html