Abstract
The rapid expansion of information on the Web has resulted in the
development of collections of structured data known as knowledge bases
(KBs), often storing information in triples in the Resource Description
Framework (RDF) format. Finding necessary pieces of data in these KBs
usually requires knowledge of their structure and query languages, such
as SPARQL, which can be challenging for an inexperienced user. The
dissertation aims to facilitate users’ communication with KBs,
specifically by developing systems that perform question answering (QA)
and summarisation over large RDF KBs, such as DBpedia and Wikidata,
using natural language (NL).
The research resulted in four publications. The first describes the development of GQA, a system that uses the Grammatical Framework technology to convert complex NL questions into semantic grammar parses. The second focuses on MuGQA, a multilingual extension of GQA that answers questions in English, German, French, and Italian. The third is devoted to TraQuLA, the most advanced QA system developed, which uses flexible pattern matching and lets users trace the system’s reasoning as it transforms NL questions into SPARQL queries, thus ensuring transparency. The fourth tackles the challenge of generating human-readable summaries for multiple RDF entities, whereas prior research has focused on summarising individual entities.
The dissertation contributes to the fields of QA over knowledge graphs and entity summarisation by creating transparent, multilingual, and user-accessible systems that bridge the gap between extensive knowledge bases and non-expert users. Testing demonstrated that a rule-based QA system (TraQuLA) can compete with advanced machine learning techniques on popular QA datasets while remaining easily interpretable for users and developers. In exploring the novel field of NL summarisation of multiple RDF entities, we designed an experimental framework with evaluation criteria to assess the quality of machine-generated summaries and their effectiveness in helping humans write their own summaries. Overall, the dissertation advances QA and summarisation over RDF data, tackling both technical challenges and user-focused aspects to enhance the accessibility of structured KBs.
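To make the idea of traceable, rule-based translation from NL questions to SPARQL more concrete, the following is a minimal sketch only. It is not the TraQuLA implementation: the two question patterns, the helper names `to_resource` and `question_to_sparql`, and the choice of DBpedia-style properties are all assumptions made for illustration. The generated query omits `PREFIX` declarations and assumes the endpoint predefines `dbr:` and `dbo:`.

```python
import re

# Hypothetical question patterns mapped to DBpedia-style properties.
# Both the patterns and the property choices are illustrative assumptions.
PATTERNS = [
    (re.compile(r"^who wrote (?P<entity>.+?)\??$", re.IGNORECASE), "dbo:author"),
    (re.compile(r"^what is the capital of (?P<entity>.+?)\??$", re.IGNORECASE), "dbo:capital"),
]


def to_resource(label):
    """Naively turn a surface label into a DBpedia-style resource name."""
    return "dbr:" + "_".join(label.strip().split())


def question_to_sparql(question):
    """Return (query, trace); query is None when no pattern matches.

    The trace lists each decision taken, so a user can see how the
    question was mapped onto the query.
    """
    trace = []
    for pattern, prop in PATTERNS:
        match = pattern.match(question.strip())
        if match is None:
            continue
        entity = to_resource(match.group("entity"))
        trace.append(f"matched pattern: {pattern.pattern}")
        trace.append(f"entity: {entity}")
        trace.append(f"property: {prop}")
        query = f"SELECT ?answer WHERE {{ {entity} {prop} ?answer . }}"
        return query, trace
    trace.append("no pattern matched")
    return None, trace


if __name__ == "__main__":
    query, trace = question_to_sparql("What is the capital of Finland?")
    print(query)
    for step in trace:
        print(" -", step)
```

A real system would need entity linking, disambiguation, and far richer patterns. The point here is only that each step of a rule-based mapping can be recorded and shown to the user.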
| Original language | English |
|---|---|
| Publisher | Tampere University |
| ISBN (electronic) | 978-952-03-3914-2 |
| ISBN (print) | 978-952-03-3913-5 |
| Publication status | Published - 2025 |
| Ministry of Education and Culture (OKM) publication type | G5 Doctoral dissertation (articles) |
Publication series
| Name | Tampere University Dissertations - Tampereen yliopiston väitöskirjat |
|---|---|
| Volume | 1230 |
| ISSN (print) | 2489-9860 |
| ISSN (electronic) | 2490-0028 |
Fingerprint
Dive into the research topics of 'Transparent RDF Question Answering and Summarisation'. Together they form a unique fingerprint.