Document Understanding Conferences
Procedure for human comparison of model (reference) and peer (system-generated and other) abstracts using SEE
- For each document set (in randomized order):
  - For each summary type (single-document or multi-document summary):
    - For each peer summary (in randomized order), composed of peer units (PUs), which will be sentences:
      - If the peer target size is greater than 10, the evaluator reads the peer summary and then makes overall judgments of the peer summary's quality, independent of the model. The answers are chosen in every case from the following set of 5 ordered categories. Here is the text of the questions.
      - View the model summary, composed of model units (MUs), which are human-corrected chunks of a type to be determined.
      - The evaluator steps through the MUs. For each MU s/he:
        - marks any/all PU(s) sharing content with the current MU
        - indicates whether the marked PUs, taken together, express about 0%, 20%, 40%, 60%, 80%, or 100% of the content in the current MU.
      - The evaluator reviews the unmarked PUs and indicates once, for the entire peer summary, that about 0%, 20%, 40%, 60%, 80%, or 100% of the unmarked PUs are related but needn't be included in the model summary.
- (Evaluators will be allowed to review and revise earlier peer summary judgments before moving to the next document set, to mitigate learning effects.)
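The procedure above can be sketched as code: a randomized iteration order over document sets and peer summaries, plus the ordered coverage scale used for per-MU judgments. SEE itself is an interactive evaluation tool, so all names and data shapes here (`COVERAGE_SCALE`, `evaluation_order`, `mean_mu_coverage`) are hypothetical illustrations of the protocol, not its implementation.

```python
import random

# The six ordered coverage categories the evaluator chooses from
# (about 0%, 20%, 40%, 60%, 80%, or 100%).
COVERAGE_SCALE = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)

def snap_to_scale(fraction):
    """Snap a raw coverage estimate to the nearest allowed category."""
    return min(COVERAGE_SCALE, key=lambda c: abs(c - fraction))

def evaluation_order(doc_sets, peers_by_set, seed=0):
    """Yield (document set, peer summary) pairs, randomizing both the
    document-set order and the peer-summary order within each set,
    as the procedure requires."""
    rng = random.Random(seed)
    sets = list(doc_sets)
    rng.shuffle(sets)
    for ds in sets:
        peers = list(peers_by_set[ds])
        rng.shuffle(peers)
        for peer in peers:
            yield ds, peer

def mean_mu_coverage(judgments):
    """Average per-MU coverage over one peer summary: one snapped
    judgment per model unit (MU)."""
    if not judgments:
        return 0.0
    return sum(snap_to_scale(j) for j in judgments) / len(judgments)
```

For example, raw per-MU estimates of 0.55, 1.0, and 0.15 snap to the categories 0.6, 1.0, and 0.2, giving a mean coverage of 0.6 for that peer summary.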