
Transparent RDF Question Answering and Summarisation

  • Elizaveta Zimina

Research output: Book/Report, Doctoral thesis, Collection of Articles

Abstract

The rapid expansion of information on the Web has resulted in the development of collections of structured data known as knowledge bases (KBs), which often store information as triples in the Resource Description Framework (RDF) format. Finding the necessary pieces of data in these KBs usually requires knowledge of their structure and of query languages such as SPARQL, which can be challenging for an inexperienced user. The dissertation aims to facilitate users’ communication with KBs, specifically by developing systems that perform question answering (QA) and summarisation over large RDF KBs, such as DBpedia and Wikidata, using natural language (NL).

The research process included the publication of four papers. The first describes the development of GQA, a system exploiting the Grammatical Framework technology to convert complex NL questions into semantic grammar parses. MuGQA, a multilingual extension of GQA answering questions in English, German, French, and Italian, is the focus of the second publication. The third is devoted to TraQuLA, the most advanced QA system developed, which uses flexible pattern matching and allows users to trace the system’s reasoning in transforming NL questions into SPARQL queries, thus ensuring transparency. The fourth publication tackles the challenge of generating human-readable summaries for multiple RDF entities, whereas prior research has focused on summarising individual entities.

The dissertation contributes to the fields of QA over knowledge graphs and entity summarisation by creating transparent, multilingual, and user-accessible systems that bridge the gap between extensive knowledge bases and non-expert users. Testing demonstrated that a rule-based QA system (TraQuLA) can successfully compete with advanced machine learning techniques on popular QA datasets, while remaining easily interpretable for users and developers. While exploring the novel field of NL summarisation of multiple RDF entities, we designed an experimental framework with evaluation criteria to assess the quality of machine-generated summaries and their effectiveness in helping humans write their own summaries. Overall, the dissertation advances QA and summarisation in the field of RDF data, tackling both technical challenges and user-focused aspects to enhance the accessibility of structured KBs.
Original language: English
Publisher: Tampere University
ISBN (Electronic): 978-952-03-3914-2
ISBN (Print): 978-952-03-3913-5
Publication status: Published - 2025
Publication type: G5 Doctoral dissertation (articles)

Publication series

Name: Tampere University Dissertations - Tampereen yliopiston väitöskirjat
Volume: 1230
ISSN (Print): 2489-9860
ISSN (Electronic): 2490-0028
