Technical Program

Paper Detail

Paper Title An information theoretic model for summarization, and some basic results
Paper Identifier MO4.R1.2
Authors Eric Graves, ARL, United States; Qiang Ning, University of Illinois at Urbana-Champaign, United States; Prithwish Basu, Raytheon BBN Technologies, United States
Session Information Theory and Learning I
Location Le Théatre (Parterre), Level -1
Session Time Monday, 08 July, 16:40 - 18:00
Presentation Time Monday, 08 July, 17:00 - 17:20
Abstract A basic information theoretic model for summarization is formulated. Here summarization is considered as the process of taking a report of $v$ binary objects, and producing from it a $j$ element subset that captures most of the important features of the original report, with importance being defined via an arbitrary set function endemic to the model. The loss of information is measured by a weighted average of variational distances, which we term the semantic loss. Our results cover both the case where the probability distribution generating the $v$-length reports is known and the case where it is unknown. In the case where the generating distribution is known, our results demonstrate how to construct minimal semantic loss summarizers. For the case where the probability distribution is unknown, we show how to construct summarizers whose semantic loss, when averaged uniformly over all possible distributions, converges to the minimum.