Semantic annotation for concept-based cross-language medical information retrieval


Volk, M; Ripplinger, B; Vintar, S; Buitelaar, P; Raileanu, D; Sacaleanu, B (2002). Semantic annotation for concept-based cross-language medical information retrieval. International Journal of Medical Informatics, 67(1-3):97-112.

Abstract

We present a framework for concept-based cross-language information retrieval in the medical domain, which is under development in the MUCHMORE project. Our approach is based on using the Unified Medical Language System (UMLS) as the primary source of semantic data. Documents and queries are annotated with multiple layers of linguistic information. Linguistic processing includes part-of-speech tagging, morphological analysis, phrase recognition and the identification of medical terms and semantic relations between them.
The paper describes experiments in monolingual and cross-language document retrieval, performed on a corpus of medical abstracts. Results show that linguistic processing, especially lemmatization and compound analysis for German, is a crucial step toward achieving good baseline performance. They also show that semantic information, specifically the combined use of concepts and relations, increases performance in monolingual and cross-language retrieval.

Citations

12 citations in Web of Science®
31 citations in Scopus®

Downloads

75 downloads since deposited on 24 Aug 2009
29 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Date: 2002
Deposited On: 24 Aug 2009 11:49
Last Modified: 05 Apr 2016 13:19
Publisher: Elsevier
ISSN: 1386-5056
Publisher DOI: https://doi.org/10.1016/S1386-5056(02)00058-8
Permanent URL: https://doi.org/10.5167/uzh-20335

Download

Filetype: PDF
Size: 1MB
