Exploring COVID-related relationship extraction: Contrasting data sources and analyzing misinformation

Research output: Contribution to journal › Article › Scientific › peer-review


The COVID-19 pandemic presented an unparalleled challenge to global healthcare systems. A central issue revolves around the urgent need to swiftly amass critical biological and medical knowledge concerning the disease, its treatment, and containment. Remarkably, text data remains an underutilized resource in this context. In this paper, we delve into the extraction of COVID-related relations using transformer-based language models, including Bidirectional Encoder Representations from Transformers (BERT) and DistilBERT. Our analysis scrutinizes the performance of five language models, comparing information from both PubMed and Reddit, and assessing their ability to make novel predictions, including the detection of “misinformation.” Key findings reveal that, despite inherent differences, both PubMed and Reddit data contain remarkably similar information, suggesting that Reddit can serve as a valuable resource for rapidly acquiring information during times of crisis. Furthermore, our results demonstrate that language models can unveil previously unseen entities and relations, a crucial aspect in identifying instances of misinformation.

Original language: English
Article number: e26973
Issue number: 5
Publication status: Published - 15 Mar 2024
Publication type: A1 Journal article-refereed


  • Artificial intelligence
  • Data science
  • Deep learning
  • Misinformation
  • Natural language processing
  • Public health
  • Relation extraction

Publication forum classification

  • Publication forum level 1

ASJC Scopus subject areas

  • General

