Publication:

Supervised OCR Error Detection and Correction Using Statistical and Neural Machine Translation Methods

Date
2018
Journal Article
Published version

Citations

Amrhein, C., & Clematide, S. (2018). Supervised OCR Error Detection and Correction Using Statistical and Neural Machine Translation Methods. Journal for Language Technology and Computational Linguistics, 33(1), 49–76. https://jlcl.org/content/2-allissues/1-heft1-2018/jlcl_2018-1_3.pdf

Abstract

For indexing the content of digitized historical texts, optical character recognition (OCR) errors are a hampering problem. To explore the effectivity of new strategies for OCR post-correction, this article focuses on methods of character-based machine translation, specifically neural machine translation and statistical machine translation. Using the ICDAR 2017 data set on OCR post-correction for English and French, we experiment with different strategies for error detection and error correction. We analyze how OCR post-correction wit

Metrics

Downloads

1113 since deposited on 2019-02-01
1081 last week
Acq. date: 2025-11-12

Views

1106 since deposited on 2019-02-01
1104 last week
Acq. date: 2025-11-12

Additional indexing

Creators (Authors)
Amrhein, C.; Clematide, S.

Journal/Series Title
Journal for Language Technology and Computational Linguistics

Volume
33

Number
1

Page range/Item number
49

Page end
76

Item Type
Journal Article

Keywords

OCR post-correction; Machine Learning; Neural Machine Translation; Statistical Machine Translation

Language
English

Publication date
2018

Date available
2019-02-01

ISSN or e-ISSN
0175-1336

OA Status
Green

Free Access at
DOI

Files
Files available to download: 1