Abstract
Research scientists and companies working in the domains of biomedicine and genomics are increasingly
faced with the problem of efficiently locating, in the vast amount of published scientific results, the critical
pieces of information that are needed in order to assess current and future research investment.
In this paper, we describe the approaches we took in the second BioCreative competition to address
two aspects of this problem: detecting novel protein interactions reported in scientific articles, and
identifying the experimental method used to confirm each interaction.
Our approach is based on a high-recall protein annotation step, followed by two sharp disambiguation
steps. The remaining proteins are then combined according to a number of lexico-syntactic filters, which
deliver high-precision results while maintaining reasonable recall.