Detecting Protein-Protein Interactions in Biomedical Literature Using a Parser


Schneider, Gerold (2009). Detecting Protein-Protein Interactions in Biomedical Literature Using a Parser. In: Clematide, Simon; Klenner, Manfred; Volk, Martin. Searching Answers. Münster: MV Verlag, 109-118.

Abstract

We describe the task of automatically detecting interactions between proteins in biomedical literature. We use a syntactic parser, a corpus annotated for proteins, and manual decisions as training material. After automatically parsing the GENIA corpus, which is manually annotated for proteins, we extract all syntactic paths between proteins. These paths are then manually classified as either meaningful or irrelevant: meaningful paths express an interaction between the syntactically connected proteins, whereas irrelevant paths do not convey any interaction. The resource created by these manual decisions is used in two ways. First, words that appear frequently inside a meaningful path are learnt using simple machine learning. Second, these resources are applied to the task of automatically detecting interactions between proteins in biomedical literature.
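The two steps named in the abstract, extracting the syntactic path between two proteins and learning which words are typical of meaningful paths, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the parse is represented as a list of head indices (`-1` for the root), the function names `syntactic_path` and `path_word_scores` are hypothetical, the toy sentence and labels are invented, and a plain relative frequency stands in for the unspecified "simple machine learning".

```python
from collections import Counter


def path_to_root(heads, i):
    """Return the token indices from token i up to the root (head == -1)."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def syntactic_path(heads, a, b):
    """Return the token indices on the dependency path from a to b,
    passing through their lowest common ancestor."""
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    ancestors_a = set(up_a)
    # First node on b's way to the root that is also an ancestor of a.
    lca = next(n for n in up_b if n in ancestors_a)
    left = up_a[: up_a.index(lca) + 1]   # a ... lca
    right = up_b[: up_b.index(lca)]      # b ... (child of lca)
    return left + right[::-1]


def path_word_scores(labelled_paths):
    """For each word, the share of paths containing it that were
    manually judged meaningful (assumed scoring, not the paper's)."""
    meaningful, total = Counter(), Counter()
    for words, is_meaningful in labelled_paths:
        for w in set(words):
            total[w] += 1
            if is_meaningful:
                meaningful[w] += 1
    return {w: meaningful[w] / total[w] for w in total}


# Invented toy sentence: "ProtA activates ProtB", with "activates" as root.
tokens = ["ProtA", "activates", "ProtB"]
heads = [1, -1, 1]
path = syntactic_path(heads, 0, 2)
inner_words = [tokens[i] for i in path[1:-1]]

# Invented manual decisions: (words inside the path, judged meaningful?)
labelled = [
    (["activates"], True),
    (["and"], False),
    (["activates"], True),
    (["binds", "and"], True),
]
scores = path_word_scores(labelled)
```

In this sketch, a word such as "activates" that only ever occurs in meaningful paths receives a score of 1.0, while a word that also occurs in irrelevant paths scores lower; a threshold on such scores could then be applied to paths in unseen text.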

Additional indexing

Item Type: Book Section, not refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 06 Faculty of Arts > English Department
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 820 English & Old English literatures; 410 Linguistics
Uncontrolled Keywords: IR, NLP, text mining, parsing, biomedicine
Language: English
Date: 2009
Deposited On: 23 Dec 2009 13:29
Last Modified: 06 Jun 2016 07:42
Publisher: MV Verlag
ISBN: 978-3-642-00381-3
Funders: Swiss National Science Fund, Grant 100014-118396/1
Related URLs: https://biblio.unizh.ch/F/BF7ST932V6CMIRNS815B6T74N23MEE6IT75F668GNMVDF48M8M-12370?func=full-set-set&set_number=012424&set_entry=000001&format=001
Permanent URL: http://doi.org/10.5167/uzh-24602
