BNC Dependency Bank 1.0


Lehmann, Hans Martin; Schneider, Gerold (2012). BNC Dependency Bank 1.0. In: Oksefjell, Signe; Ebeling, Jarle; Hasselgard, Hilde. Aspects of corpus linguistics: compilation, annotation, analysis. Helsinki: Research Unit for Variation, Contacts, and Change in English, online.

Abstract

In this paper we present the first release version of our dependency bank for the British National Corpus. We describe the process of annotating the corpus with syntactic information, discuss the resulting dependency annotation and outline a database storage model for the annotation. We then present a web-based interface to the syntactically annotated data and provide an overview of its functionality. The use of fully automatically parsed data without massive manual intervention is far from unproblematic, given the limited accuracy of state-of-the-art parsers. We discuss the problems inherent in automatic annotation and present strategies for coping with them. The purpose of this project is to give general linguists access to the wealth of syntactic and distributional information present in a large corpus like the British National Corpus.

Additional indexing

Item Type: Book Section, refereed, original work
Communities & Collections: 06 Faculty of Arts > English Department
Dewey Decimal Classification: 820 English & Old English literatures
Language: English
Date: 2012
Deposited On: 04 Jan 2013 12:31
Last Modified: 07 Dec 2017 17:26
Publisher: Research Unit for Variation, Contacts, and Change in English
Series Name: Studies in Variation, Contacts and Change in English
Number: 12
ISSN: 1797-4453
Free access at: Official URL. An embargo period may apply.
Official URL: http://www.helsinki.fi/varieng/journal/volumes/12/lehmann_schneider/

Download

Full text not available from this repository.