Publication:

MemSum: Extractive Summarization of Long Documents using Multi-step Episodic Markov Decision Processes

Date
2021
Working Paper
dc.contributor.institution: Cornell University
dc.date.accessioned: 2022-03-28T12:40:19Z
dc.date.available: 2022-03-28T12:40:19Z
dc.date.issued: 2021
dc.description.abstract

We introduce MemSum (Multi-step Episodic Markov decision process extractive SUMmarizer), a reinforcement-learning-based extractive summarizer enriched at any given time step with information on the current extraction history. Similar to previous models in this vein, MemSum iteratively selects sentences into the summary. Our innovation is in considering a broader information set when summarizing that would intuitively also be used by humans in this task: 1) the text content of the sentence, 2) the global text context of the rest of the document, and 3) the extraction history consisting of the set of sentences that have already been extracted. With a lightweight architecture, MemSum nonetheless obtains state-of-the-art test-set performance (ROUGE score) on long document datasets (PubMed, arXiv, and GovReport). Supporting analysis demonstrates that the added awareness of extraction history gives MemSum robustness against redundancy in the source document.
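To make the role of the extraction history concrete, the following is a minimal toy sketch and not the MemSum model: the learned reinforcement-learning policy is swapped for a hand-written score, and the names `tokenize`, `extract`, `max_steps`, and `stop_threshold` are hypothetical. It only illustrates how a multi-step selection loop can combine the three signals listed in the abstract (sentence content, global document context, and extraction history) and stop once nothing novel remains.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase word tokens in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract(sentences, max_steps=3, stop_threshold=0.0):
    """Greedy, history-aware sentence selection (toy stand-in for a learned policy).

    Returns the indices of the selected sentences in extraction order.
    """
    # 1) Local content: the word set of each sentence.
    token_sets = [tokenize(s) for s in sentences]
    # 2) Global context: in how many sentences each word occurs.
    doc_freq = Counter(w for words in token_sets for w in words)
    # 3) Extraction history: words covered by sentences selected so far.
    selected, history_words = [], set()

    for _ in range(max_steps):
        best_idx, best_score = None, stop_threshold
        for i, words in enumerate(token_sets):
            if i in selected or not words:
                continue
            salience = sum(doc_freq[w] for w in words) / len(words)
            novelty = len(words - history_words) / len(words)
            score = salience * novelty
            if score > best_score:  # strict: ties keep the earlier sentence
                best_idx, best_score = i, score
        if best_idx is None:
            break  # "stop" action: no remaining sentence beats the threshold
        selected.append(best_idx)
        history_words |= token_sets[best_idx]
    return selected


if __name__ == "__main__":
    doc = [
        "MemSum extracts sentences iteratively.",
        "MemSum extracts sentences iteratively.",  # redundant duplicate
        "History awareness reduces redundancy.",
    ]
    print(extract(doc))
```

In this toy setting the duplicated sentence receives a novelty of zero once its twin has been extracted, so it is never selected. This mirrors, in a very simplified form, the redundancy robustness that the abstract attributes to history awareness.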

dc.identifier.issn: 2331-8422
dc.identifier.uri: https://www.zora.uzh.ch/handle/20.500.14742/195120
dc.language.iso: eng
dc.subject.ddc: 570 Life sciences; biology
dc.title

MemSum: Extractive Summarization of Long Documents using Multi-step Episodic Markov Decision Processes

dc.type: working_paper
dcterms.accessRights: info:eu-repo/semantics/openAccess
dcterms.bibliographicCitation.number: 2107.08929
dcterms.bibliographicCitation.url: https://arxiv.org/abs/2107.08929
dspace.entity.type: Publication (en)
uzh.contributor.author: Gu, Nianlong
uzh.contributor.author: Ash, Elliott
uzh.contributor.author: Hahnloser, Richard H R
uzh.contributor.correspondence: Yes
uzh.contributor.correspondence: No
uzh.contributor.correspondence: No
uzh.document.availability: published_version
uzh.eprint.datestamp: 2022-03-28 12:40:19
uzh.eprint.lastmod: 2023-09-22 13:09:57
uzh.eprint.statusChange: 2022-03-28 12:40:19
uzh.harvester.eth: Yes
uzh.harvester.nb: No
uzh.identifier.doi: 10.5167/uzh-217773
uzh.oastatus.zora: Green
uzh.publication.citation: Gu, Nianlong; Ash, Elliott; Hahnloser, Richard H R (2021). MemSum: Extractive Summarization of Long Documents using Multi-step Episodic Markov Decision Processes. ArXiv.org 2107.08929, Cornell University.
uzh.publication.freeAccessAt: officialurl
uzh.publication.seriesTitle: ArXiv.org
uzh.workflow.eprintid: 217773
uzh.workflow.fulltextStatus: public
uzh.workflow.revisions: 12
uzh.workflow.rightsCheck: offen
uzh.workflow.status: archive
Files

Original bundle

Name: 2107.08929.pdf
Size: 877.52 KB
Format: Adobe Portable Document Format