Publication:

Parsing early and late modern English corpora

Date

2015
Journal Article
Published version

Citations

Schneider, G., Lehmann, H. M., & Schneider, P. (2015). Parsing early and late modern English corpora. Literary and Linguistic Computing, 30(3), 423–439. https://doi.org/10.1093/llc/fqu001

Abstract


We describe, evaluate, and improve the automatic annotation of diachronic corpora at the levels of word-class, lemma, chunks, and dependency syntax. As corpora we use the ARCHER corpus (texts from 1600 to 2000) and the ZEN corpus (texts from 1660 to 1800). Performance on Modern English is considerably lower than on Present Day English (PDE). We present several methods that improve performance. First, we use the spelling normalization tool VARD to map spelling variants to their PDE equivalent, which improves tagging. We investigate

Metrics

Downloads

250 since deposited on 2015-02-20
242 last week
Acq. date: 2025-11-12

Views

354 since deposited on 2015-02-20
353 last week
Acq. date: 2025-11-12

Additional indexing

Creators (Authors)

  • Schneider, Gerold
  • Lehmann, Hans Martin
  • Schneider, Peter

Journal/Series Title

Literary and Linguistic Computing

Volume

30

Number

3

Page range/Item number

423

Page end

439

Item Type

Journal Article

Dewey Decimal Classification

Language

English

Publication date

2015

Date available

2015-02-20

Publisher


ISSN or e-ISSN

0268-1145

Additional Information

This is a pre-copyedited, author-produced PDF of an article accepted for publication in Literary and Linguistic Computing following peer review. The definitive publisher-authenticated version [Parsing Early and Late Modern English corpora, Gerold Schneider, Hans Martin Lehmann, Peter Schneider, Digital Scholarship in the Humanities, Feb 2014, DOI: 10.1093/llc/fqu001] is available online at: http://dsh.oxfordjournals.org/content/early/2014/12/02/llc.fqu001.

OA Status

Green

Free Access at

Unspecified



Files

Files available to download: 2
