TY - GEN
T1 - LLM-Driven Summarization and Distinguish Analysis of Multiple Entities in RDF Graphs
AU - Iqbal, Hamza
AU - Stefanidis, Kostas
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2025
Y1 - 2025
N2 - This research applies Large Language Models (LLMs) to the summarization and distinguish analysis of multiple entities within Resource Description Framework (RDF) graphs. As the volume of structured data on the web grows exponentially, the need for efficient and effective methods to interpret and summarize this data becomes increasingly important. This study focuses on using LLMs to generate human-readable summaries from RDF graphs, with particular emphasis on distinguishing between multiple entities. The study applies SPARQL queries to extract relevant data from DBpedia, followed by a thorough process of frequency analysis and property unification to refine the dataset. Three LLMs (ChatGPT, DeepSeek, and Mistral) are evaluated for their ability to generate coherent and informative summaries. The evaluation process combines human-based assessments with automated metrics for a thorough analysis of the generated texts. Key outcomes include the effectiveness of LLMs in generating summaries that are both informative and contextually relevant. The research also highlights the importance of data preprocessing techniques, such as frequency analysis and property unification, in enhancing the quality of the generated summaries. Moreover, the study provides insights into the strengths and limitations of different LLMs in summarizing RDF data, offering a foundation for future research in this area. The framework designed in this research for evaluating the performance of LLMs in summarization tasks opens the way for future explorations in the application of advanced AI technologies in data interpretation and knowledge representation.
KW - Distinguish Analysis
KW - Large Language Models
KW - RDF
KW - Summarization
U2 - 10.1007/978-3-032-05727-3_31
DO - 10.1007/978-3-032-05727-3_31
M3 - Conference contribution
AN - SCOPUS:105017377389
SN - 9783032057266
T3 - Communications in Computer and Information Science
SP - 367
EP - 383
BT - New Trends in Database and Information Systems
A2 - Chrysanthis, Panos K.
A2 - Nørvåg, Kjetil
A2 - Stefanidis, Kostas
A2 - Zhang, Zheying
A2 - Quintarelli, Elisa
A2 - Zumpano, Ester
PB - Springer
T2 - European Conference on Advances in Databases and Information Systems
Y2 - 23 September 2025 through 26 September 2025
ER -