Detecting Summary-worthy Sentences: the Effect of Discourse Features
Abstract
We examine the benefit of a range of discourse and semantic features for identifying summary-worthy content in narrative stories. Using logistic regression models, we find that the most informative features are those related to the narrative structure of a text. We show that automatic feature extraction performs significantly worse than full manual annotation, but that, with optimization, a fully automatic approach can outperform several existing extractive approaches to summarization.