Using Graph-Based Maximum Independent Sets with Large Language Models for Extractive Text Summarization


HARK C.

Applied Sciences (Switzerland), vol. 15, no. 12, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15 Issue: 12
  • Publication Date: 2025
  • DOI Number: 10.3390/app15126395
  • Journal Name: Applied Sciences (Switzerland)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Agricultural & Environmental Science Database, Applied Science & Technology Source, Communication Abstracts, INSPEC, Metadex, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: graph theory, LLM, maximum independent set, NLP, text summarization
  • İnönü University Affiliated: Yes

Abstract

Large Language Models (LLMs) have shown strong performance across various tasks but still face challenges in automatic text summarization. While they are effective in capturing semantic patterns from large corpora, they typically lack mechanisms for encoding structural relationships between sentences or paragraphs. Their high hardware requirements and the limited analysis of their processing efficiency further constrain their applicability. This paper proposes a framework employing the Graph Independent Set approach to extract the essence of textual graphs and address the limitations of LLMs. The framework encapsulates nodes and relations into structural graphs generated through Natural Language Processing (NLP) techniques based on the Maximum Independent Set (MIS) theory. The incorporation of graph-derived structural features enables more semantically cohesive and accurate summarization outcomes. Experiments on the Document Understanding Conference (DUC) and Cable News Network (CNN)/DailyMail datasets are conducted with different summary lengths to evaluate the performance of the framework. The proposed method provides up to a 41.05% increase in summary quality (Recall-Oriented Understudy for Gisting Evaluation, ROUGE-2 F1) and a 60.71% improvement in response times on models such as XLNet, Pegasus, and DistilBERT. The proposed framework enables more informative and concise summaries by embedding structural relationships into LLM-driven semantic representations, while reducing computational costs. In this study, we explore whether integrating MIS-based graph filtering with LLMs significantly enhances both the accuracy and efficiency of extractive text summarization.
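
To illustrate the general idea of MIS-based graph filtering ahead of an LLM summarizer, the sketch below builds a sentence-similarity graph and keeps an independent set of mutually dissimilar sentences as summary candidates. It is a minimal illustration, not the paper's implementation: the embedding model (`all-MiniLM-L6-v2`), the cosine-similarity threshold, and the greedy maximal-independent-set heuristic from NetworkX (a stand-in for an exact maximum independent set, which is NP-hard) are all assumptions introduced here.

```python
# Minimal sketch of MIS-style filtering before LLM summarization.
# Assumptions: embedding model, similarity threshold, and the greedy
# maximal-independent-set heuristic are illustrative, not the paper's setup.
import itertools

import networkx as nx
from sentence_transformers import SentenceTransformer, util


def mis_filter_sentences(sentences, threshold=0.6, seed=0):
    """Keep mutually dissimilar sentences via an independent set.

    Edges connect highly similar (redundant) sentences, so an independent
    set of the graph retains non-redundant summary candidates.
    """
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(sentences, convert_to_tensor=True)

    graph = nx.Graph()
    graph.add_nodes_from(range(len(sentences)))
    for i, j in itertools.combinations(range(len(sentences)), 2):
        if util.cos_sim(embeddings[i], embeddings[j]).item() >= threshold:
            graph.add_edge(i, j)

    # NetworkX provides a greedy *maximal* independent set; the exact
    # *maximum* independent set would require an exact or approximate solver.
    kept = sorted(nx.maximal_independent_set(graph, seed=seed))
    return [sentences[i] for i in kept]


if __name__ == "__main__":
    sentences = [
        "LLMs perform well on many NLP tasks.",
        "Large language models achieve strong results across NLP tasks.",
        "Graph structure is rarely exploited by LLM summarizers.",
    ]
    # The filtered sentences would then be passed to an LLM-based summarizer.
    print(mis_filter_sentences(sentences))
```

In this reading of the approach, filtering redundant sentences before the LLM step is what reduces input length, which is consistent with the reported gains in both ROUGE scores and response times.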