A Unified Multi-Reference Framework for Training and Evaluation in Abstractive Summarization

  • Abishek B. Rao
  • Shivani G. Aithal
  • Sanjay Singh*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Abstractive summarization models rely on a single gold reference for training and evaluation, assuming it is optimal. We show that machine-generated summaries frequently exhibit stronger semantic alignment with source articles than gold references on CNN/DM and XSum. To address this, we propose a unified framework comprising the Multi-Reference Training Framework (MRTF) and Multi-Reference Evaluation Framework (MREF), both driven by an LSI-based Semantic Selection Mechanism (LSI-SSM). Grounded in the Eckart–Young–Mirsky theorem, LSI-SSM uses only k=2 dimensions to achieve 98.9% of neural embedding quality at zero GPU cost. Across BRIO, BART, PEGASUS, and DistilBART, MRTF yields statistically significant (p < 0.001) ROUGE-1 gains of +2.13 to +2.61 on CNN/DM. MREF reveals that traditional evaluation undervalues summaries by 3–5 ROUGE-1 points, with distributional bias analysis confirming 83–87% of corrections reflect genuine quality. Cross-domain validation on Multi-News, SAMSum, and PubMed confirms generalizability.
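As a rough illustration of the idea behind an LSI-based selection step, the sketch below projects a source article and several candidate summaries into a k = 2 latent space with a truncated SVD and picks the candidate closest to the source by cosine similarity. It is not the authors' LSI-SSM: the raw term counts, the regex tokenizer, the argmax selection rule, and the names `tokenize` and `lsi_select` are all assumptions made for this example. The only link to the abstract is the use of a rank-k truncation, which the Eckart–Young–Mirsky theorem identifies as the best rank-k approximation of the term-document matrix in the Frobenius norm.

```python
import re
import numpy as np


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def lsi_select(source, candidates, k=2):
    """Return (best_index, similarities) for candidates vs. source in a k-dim LSI space.

    Illustrative sketch only; not the paper's LSI-SSM.
    """
    docs = [source] + list(candidates)
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-document count matrix: rows are terms, columns are documents.
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokens):
        for t in doc:
            A[index[t], j] += 1.0

    # Thin SVD, then keep the top-k singular triplets (rank-k truncation).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dim row per document

    # Cosine similarity between the source (row 0) and each candidate.
    src = doc_vecs[0]
    sims = []
    for v in doc_vecs[1:]:
        denom = np.linalg.norm(src) * np.linalg.norm(v)
        sims.append(float(src @ v / denom) if denom > 0 else 0.0)
    return int(np.argmax(sims)), sims


if __name__ == "__main__":
    article = "the cat sat on the mat and the cat slept"
    references = [
        "stock markets fell sharply today",
        "the cat slept on the mat",
    ]
    best, scores = lsi_select(article, references, k=2)
    print(best, scores)
```

In a multi-reference setting, the same scoring could in principle rank a gold reference alongside machine-generated summaries, but how the paper weights terms, builds the reference pool, and uses the selected reference for training or evaluation is not stated in the abstract.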

Original language: English
Journal: IEEE Access
Publication status: Accepted/In press - 2026

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Materials Science
  • General Engineering
