Using Distributional Similarity to Organise BioMedical Terminology


Weeds, J; Dowdall, J; Schneider, G; Keller, B; Weir, D (2005). Using Distributional Similarity to Organise BioMedical Terminology. Terminology, 11(1):3-4.

Abstract

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy, reaching an optimal value of 63.1%.

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification:000 Computer science, knowledge & systems; 410 Linguistics
Date:2005
Deposited On:12 Jun 2009 15:53
Last Modified:05 Apr 2016 13:15
Publisher:John Benjamins
ISSN:0929-9971
Official URL:http://search.ebscohost.com/login.aspx?direct=true&db=ufh&AN=17786336&loginpage=Login.asp&site=ehost-live

Download

Filetype: PDF - Registered users only
Size: 1MB
