Publication:

Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus

Date
2016
Conference or Workshop Item
Published version

Citations

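Clematide, S., Furrer, L., & Volk, M. (2016). Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 975–982. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2016/pdf/917_Paper.pdf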

Abstract

Crowdsourcing approaches for post-correction of OCR output (Optical Character Recognition) have been successfully applied to several historic text collections. We report on our crowd-correction platform Kokos, which we built to improve the OCR quality of the digitized yearbooks of the Swiss Alpine Club (SAC) from the 19th century. This multilingual heritage corpus consists of Alpine texts mainly written in German and French, all typeset in Antiqua font. Finding and engaging volunteers for correcting large amounts of pages into high qu

Metrics

Downloads

233 since deposited on 2016-07-05
Acq. date: 2025-11-13

Views

1528 since deposited on 2016-07-05
Acq. date: 2025-11-13

Additional indexing

Event Title
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Event Location
Portorož

Event Country
Slovenia

Event Start Date
2016-05-23

Event End Date
2016-05-28

Publisher
European Language Resources Association (ELRA)

Page range/Item number
975

Page end
982

Item Type
Conference or Workshop Item

Language
English

Date available
2016-07-05

ISBN or e-ISBN
978-2-9517408-9-1

OA Status
Green

Free Access at
Official URL

Files
Files available to download:1
