Using Distributional Similarity to Organise BioMedical Terminology


Weeds, J; Dowdall, J; Schneider, G; Keller, B; Weir, D (2005). Using Distributional Similarity to Organise BioMedical Terminology. Terminology, 11(1):3-4.

Abstract

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy, reaching an optimal value of 63.1%.
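The general idea of predicting a term's semantic type from its distributionally nearest neighbour can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only that "a variety of measures" was used, so cosine similarity is an assumed stand-in here. The co-occurrence counts, the (relation, word) feature format, and the type labels are all hypothetical toy data rather than output of the Pro3Gres parser or entries from the GENIA ontology.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse feature-count vectors."""
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_type(term, vectors, known_types):
    """Return the semantic type of the most similar term with a known type."""
    nearest = max(
        (t for t in known_types if t != term),
        key=lambda t: cosine(vectors[term], vectors[t]),
    )
    return known_types[nearest]


# Hypothetical toy data: each term maps to counts of the
# (grammatical relation, co-occurring word) pairs it was seen with.
vectors = {
    "IL-2": Counter({("subj", "activate"): 3, ("obj", "express"): 2}),
    "IL-4": Counter({("subj", "activate"): 2, ("obj", "express"): 1}),
    "T cell": Counter({("subj", "proliferate"): 4, ("obj", "stimulate"): 2}),
    "B cell": Counter({("subj", "proliferate"): 3, ("obj", "stimulate"): 1}),
}

# Illustrative type labels only; these are assumptions, not gold-standard data.
known_types = {"IL-4": "protein", "T cell": "cell_type", "B cell": "cell_type"}

predicted = predict_type("IL-2", vectors, known_types)
print(predicted)
```

In this toy setup the unlabelled term shares grammatical contexts only with one labelled term, so it inherits that term's type. A realistic evaluation would compare such predictions against the gold-standard types for every term.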


Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Date: 2005
Deposited On: 12 Jun 2009 15:53
Last Modified: 05 Apr 2016 13:15
Publisher: John Benjamins
ISSN: 0929-9971
Official URL: http://search.ebscohost.com/login.aspx?direct=true&db=ufh&AN=17786336&loginpage=Login.asp&site=ehost-live
