Publication:

Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts

Date
2017
Conference or Workshop Item
Published version

Citations

Schneider, G., Pettersson, E., & Percillier, M. (2017). Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts. Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language (No. 133), 40–46. http://www.ep.liu.se/ecp/article.asp?issue=133&article=008&volume=#

Abstract

To be able to use existing natural language processing tools for analysing historical text, an important preprocessing step is spelling normalisation, converting the original spelling to present-day spelling, before applying tools such as taggers and parsers. In this paper, we compare a probabilistic, language-independent approach to spelling normalisation based on statistical machine translation (SMT) techniques to a rule-based system combining dictionary lookup with rules and non-probabilistic weights. The rule-based system reaches

Metrics

Downloads

85 since deposited on 2017-05-30
Acq. date: 2025-11-12

Views

290 since deposited on 2017-05-30
Acq. date: 2025-11-12

Additional indexing

Creators (Authors)

  • Schneider, Gerold
  • Pettersson, Eva
  • Percillier, Michael

Event Title
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language

Event Location
Gothenburg

Event Start Date
2017-05-22

Event End Date
2017-05-22

Publisher
Linköping University Electronic Press, Linköpings universitet

Page range/Item number
40

Page end
46

Item Type
Conference or Workshop Item

Language
English

Date available
2017-05-30

Number
133

OA Status
Green

Files
Files available to download: 1