Efficient Search Algorithms for Identifying Synergistic Associations in High-Dimensional Datasets

Cillian Hourican, Jie Li, Pashupati P. Mishra, Terho Lehtimäki, Binisha H. Mishra, Mika Kähönen, Olli T. Raitakari, Reijo Laaksonen, Liisa Keltikangas-Järvinen, Markus Juonala, Rick Quax

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

In recent years, interest in multivariate interactions and emergent higher-order dependencies has increased notably. This is particularly evident in the identification of synergistic sets, defined as combinations of elements whose joint interactions give rise to information that is not present in any individual subset of those elements. The scalability of frameworks such as partial information decomposition (PID) and those based on multivariate extensions of mutual information, such as O-information, is limited by the combinatorial explosion in the number of sets that must be assessed. To address this challenge, we propose a novel approach that uses stochastic search strategies to identify synergistic triplets within datasets. The methodology is also extensible to larger sets and to various synergy measures. By employing stochastic search, our approach circumvents the constraints of exhaustive enumeration, offering a scalable and efficient means to uncover intricate dependencies. The flexibility of our method is illustrated through its application to two epidemiological datasets: the Young Finns Study and the UK Biobank Nuclear Magnetic Resonance (NMR) data. Additionally, we present a heuristic for reducing the number of synergistic sets to analyse in large datasets by excluding sets with overlapping information. We also illustrate the risks of performing feature selection before assessing synergistic information in the system.
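
As a rough illustration of the idea described in the abstract, the sketch below runs a simulated-annealing search for the triplet with the most negative O-information (negative values indicate synergy-dominated interactions). It is not the authors' implementation. It assumes jointly Gaussian variables so that entropies reduce to log-determinants of covariance sub-matrices, and it assumes the dataset has more than three variables. The function names (`gaussian_entropy`, `o_information`, `anneal_triplet`) and all parameter defaults are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_entropy(cov):
    """Gaussian differential entropy, omitting the (k/2)*log(2*pi*e) term.

    The omitted constants cancel exactly in the O-information sum below.
    """
    _, logdet = np.linalg.slogdet(np.atleast_2d(cov))
    return 0.5 * logdet


def o_information(cov, idx):
    """O-information of the variables in idx (assumption: Gaussian data).

    Omega = (n - 2) * H(X) + sum_j [H(X_j) - H(X without X_j)]
    """
    idx = list(idx)
    n = len(idx)
    sub = cov[np.ix_(idx, idx)]
    total = (n - 2) * gaussian_entropy(sub)
    for j in range(n):
        rest = [k for k in range(n) if k != j]
        total += gaussian_entropy(sub[j, j])
        total -= gaussian_entropy(sub[np.ix_(rest, rest)])
    return total


def anneal_triplet(data, n_steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over triplets, minimising O-information.

    data: array of shape (n_samples, n_variables), with n_variables > 3.
    Returns the best triplet visited (sorted indices) and its score.
    """
    rng = np.random.default_rng(seed)
    cov = np.cov(data, rowvar=False)
    n_vars = cov.shape[0]

    current = rng.choice(n_vars, size=3, replace=False)
    current_score = o_information(cov, current)
    best, best_score = current.copy(), current_score
    temp = t0

    for _ in range(n_steps):
        # Neighbour move: swap one member for a variable not in the triplet.
        candidate = current.copy()
        options = np.setdiff1d(np.arange(n_vars), current)
        candidate[rng.integers(3)] = rng.choice(options)
        score = o_information(cov, candidate)

        # Metropolis rule: always accept improvements, sometimes accept worse.
        if score < current_score or rng.random() < np.exp((current_score - score) / temp):
            current, current_score = candidate, score
            if score < best_score:
                best, best_score = candidate.copy(), score
        temp *= cooling

    return tuple(sorted(int(i) for i in best)), float(best_score)
```

To try it, one can generate independent Gaussian columns, overwrite one column with the noisy sum of two others, and pass the array to `anneal_triplet`; under the Gaussian assumption that constructed triplet has clearly negative O-information, while triplets containing an independent variable score close to zero. Each step scores a single candidate, so the cost is set by `n_steps` rather than by the number of possible triplets.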

Original language: English
Article number: 968
Number of pages: 35
Journal: Entropy
Volume: 26
Issue number: 11
Publication status: Published - Nov 2024
Publication type: A1 Journal article-refereed

Keywords

  • O-information
  • particle swarm optimization
  • simulated annealing
  • stochastic search
  • synergy

Publication forum classification

  • Publication forum level 0

ASJC Scopus subject areas

  • Information Systems
  • Mathematical Physics
  • Physics and Astronomy (miscellaneous)
  • General Physics and Astronomy
  • Electrical and Electronic Engineering
