
UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition


Furrer, Lenz; Cornelius, Joseph; Rinaldi, Fabio (2019). UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition. In: Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, Hong Kong, China, 4 November 2019 - 5 November 2019. Association for Computational Linguistics, 185-195.

Abstract

As our submission to the CRAFT shared task 2019, we present two neural approaches to concept recognition. We propose two different systems for joint named entity recognition (NER) and normalization (NEN), both of which model the task as a sequence labeling problem. Our first system is a BiLSTM network with two separate outputs for NER and NEN trained from scratch, whereas the second system is an instance of BioBERT fine-tuned on the concept-recognition task. We exploit two strategies for extending concept coverage, ontology pretraining and backoff with a dictionary lookup. Our results show that the backoff strategy effectively tackles the problem of unseen concepts, addressing a major limitation of the chosen design. In the cross-system comparison, BioBERT proves to be a strong basis for creating a concept-recognition system, although some entity types are predicted more accurately by the BiLSTM-based system.
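
The paper itself is not reproduced on this page, but as a rough illustration of the joint NER/NEN sequence-labeling setup described in the abstract, the following PyTorch sketch shows a BiLSTM tagger with two separate output layers over shared hidden states. It is a minimal, hypothetical example: the class name, layer sizes, and tag inventories are illustrative assumptions, not details taken from the paper.

    import torch
    import torch.nn as nn

    class TwoHeadBiLstmTagger(nn.Module):
        """Sequence tagger with separate output heads for NER spans and NEN concept IDs.
        Hypothetical sketch; dimensions and tag sets are placeholders."""

        def __init__(self, vocab_size, embed_dim, hidden_dim, num_ner_tags, num_nen_tags):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
            # Two independent classification heads over the shared BiLSTM states:
            # one predicts span tags (NER), the other concept identifiers (NEN).
            self.ner_head = nn.Linear(2 * hidden_dim, num_ner_tags)
            self.nen_head = nn.Linear(2 * hidden_dim, num_nen_tags)

        def forward(self, token_ids):
            states, _ = self.bilstm(self.embedding(token_ids))
            return self.ner_head(states), self.nen_head(states)

    # Toy usage: a batch of 2 sentences with 30 token IDs each.
    model = TwoHeadBiLstmTagger(vocab_size=5000, embed_dim=100, hidden_dim=128,
                                num_ner_tags=5, num_nen_tags=200)
    ner_logits, nen_logits = model(torch.randint(1, 5000, (2, 30)))
    print(ner_logits.shape, nen_logits.shape)  # (2, 30, 5) and (2, 30, 200)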

Statistics

Downloads

75 downloads since deposited on 15 Nov 2019
9 downloads in the last 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 08 Research Priority Programs > Digital Society Initiative
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 5 November 2019
Deposited On: 15 Nov 2019 16:24
Last Modified: 18 Feb 2022 08:25
Publisher: Association for Computational Linguistics
OA Status: Hybrid
Free access at: Official URL. An embargo period may apply.
Publisher DOI: https://doi.org/10.18653/v1/D19-5726
Official URL: https://www.aclweb.org/anthology/D19-5726/
Project Information:
  • Funder: SNSF
  • Grant ID: CR30I1_162758
  • Project Title: MelanoBase
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)