Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

  • S. Divya*
  • N. Sripriya
  • J. Andrew*
  • Manuel Mazzara

* Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

With the exponential proliferation of digital documents, there arises a pressing need for automated document summarization (ADS). Summarization, a compression technique, condenses a source document into concise sentences that encapsulate its salient information for summary generation. A primary challenge lies in crafting a dependable summary, contingent upon both extracted features and human-established parameters. This article introduces an intelligent methodology that seamlessly integrates extractive and abstractive techniques to ensure heightened relevance between the input document and its summary. Initially, input sentences undergo transformation into representations utilizing BERT, subsequently transposed into a symmetric matrix based on their similarity. Semantically congruent sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted summaries. Employing this hybrid summarization technique on the CNN/DailyMail dataset and DUC2004, we evaluate its efficacy using ROUGE metrics. Results demonstrate the superiority of our proposed technique over conventional summarization methods.
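
The extractive stage described in the abstract (sentence representations, a symmetric similarity matrix, then selection of semantically close sentences) can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the centrality score used here (each sentence's summed similarity to all others) is an assumption, and the toy 2-D vectors stand in for real BERT sentence embeddings. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Return the symmetric matrix of pairwise cosine similarities."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def extract_top_sentences(embeddings, k):
    """Pick the k most central sentences, returned in document order.

    Assumption: centrality is the sum of a sentence's similarities to all
    other sentences; the paper's actual criterion may differ.
    """
    sim = cosine_similarity_matrix(embeddings)
    np.fill_diagonal(sim, 0.0)            # ignore self-similarity
    scores = sim.sum(axis=1)
    top = np.argsort(-scores, kind="stable")[:k]
    return sorted(top.tolist())


# Toy stand-ins for BERT sentence embeddings (one row per sentence).
toy_embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.8, 0.2],
])

sim = cosine_similarity_matrix(toy_embeddings)
selected = extract_top_sentences(toy_embeddings, k=2)
print(selected)
```

In a full pipeline of the kind the abstract outlines, the selected sentences would then be passed to a transformer model that rewrites them into an abstractive summary; that stage is not sketched here.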

Original language: English
Article number: e2424
Pages (from-to): 1-26
Number of pages: 26
Journal: PeerJ Computer Science
Volume: 10
Publication status: Published - 2024

All Science Journal Classification (ASJC) codes

  • General Computer Science
