A multi-view genomic data simulator

Michele Fratello, Angela Serra, Vittorio Fortino, Giancarlo Raiconi, Roberto Tagliaferri, Dario Greco

Research output: Contribution to journal › Article › Scientific › peer-review

7 Citations (Scopus)

Abstract

BACKGROUND: OMICs technologies make it possible to assay the state of a large number of different features (e.g., mRNA expression, miRNA expression, copy number variation, DNA methylation) from the same samples. The objective of these experiments is usually to find a reduced set of significant features that can be used to differentiate the conditions assayed. For the development of novel computational feature selection methods, this task is challenging because fully annotated biological datasets for benchmarking are lacking. A possible way to tackle this problem is to generate appropriate synthetic datasets whose composition and behaviour are fully controlled and known a priori.
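
To illustrate why a priori knowledge of the data helps benchmarking, the following minimal Python sketch builds a two-condition dataset in which the informative features are known, then scores a deliberately simple selector against that ground truth. It is not part of the paper or of MVBioDataSim; the function names, the Gaussian noise model, and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def make_dataset(n_samples=40, n_features=50, n_informative=5, shift=3.0, seed=0):
    """Build a two-condition dataset with a known set of informative features.

    Assumption: features 0..n_informative-1 have their mean shifted by `shift`
    in condition 1; every other feature is pure Gaussian noise.
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_samples)]
    data = [
        [rng.gauss(shift * y if j < n_informative else 0.0, 1.0)
         for j in range(n_features)]
        for y in labels
    ]
    truth = set(range(n_informative))
    return data, labels, truth


def rank_features(data, labels):
    """Rank features by absolute difference of class means (a toy selector)."""
    n_features = len(data[0])
    scores = []
    for j in range(n_features):
        group0 = [row[j] for row, y in zip(data, labels) if y == 0]
        group1 = [row[j] for row, y in zip(data, labels) if y == 1]
        scores.append(abs(statistics.mean(group0) - statistics.mean(group1)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def recall_at_k(ranking, truth, k):
    """Fraction of truly informative features found among the top k."""
    return len(set(ranking[:k]) & truth) / len(truth)


data, labels, truth = make_dataset()
ranking = rank_features(data, labels)
recall = recall_at_k(ranking, truth, k=len(truth))
print(f"recall at k={len(truth)}: {recall:.2f}")
```

Because the informative features are fixed at generation time, any selector can be scored the same way, which is not possible on real datasets with incomplete annotation.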

RESULTS: Here we propose a novel method centred on the generation of networks of interactions among different biological molecules, especially those involved in regulating gene expression. Synthetic datasets are obtained from models based on ordinary differential equations with known parameters. Our results show that the generated datasets mimic the behaviour of real data well, since popular data analysis methods are able to selectively identify existing interactions.
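
As a rough illustration of the general idea of generating data from an ordinary differential equation with known parameters, the sketch below simulates one mRNA that is activated by a transcription factor and repressed by a miRNA. This is not the model used in the paper or in MVBioDataSim: the Hill-type activation, the form of the equation, the forward Euler integration, and every name and parameter value are assumptions made for this example.

```python
import random


def hill_activation(x, k=1.0, n=2):
    """Hill function: saturating activation with threshold k and steepness n."""
    return x**n / (k**n + x**n)


def simulate_sample(tf_level, mirna_level, alpha=2.0, delta=0.5, gamma=0.8,
                    dt=0.01, steps=5000):
    """Integrate dm/dt = alpha*H(tf) - (delta + gamma*mirna)*m with forward Euler.

    Hypothetical parameters: alpha is the maximal transcription rate, delta the
    basal mRNA decay rate, gamma the strength of miRNA-mediated degradation.
    Returns the mRNA level after steps*dt time units, starting from zero.
    """
    m = 0.0
    production = alpha * hill_activation(tf_level)
    decay_rate = delta + gamma * mirna_level
    for _ in range(steps):
        m += dt * (production - decay_rate * m)
    return m


def simulate_dataset(n_samples=20, noise_sd=0.05, seed=1):
    """Draw regulator levels per sample and record the resulting noisy mRNA level."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        tf = rng.uniform(0.2, 3.0)
        mirna = rng.uniform(0.0, 2.0)
        mrna = simulate_sample(tf, mirna) + rng.gauss(0.0, noise_sd)
        rows.append({"tf": tf, "mirna": mirna, "mrna": mrna})
    return rows


dataset = simulate_dataset()
for row in dataset[:3]:
    print(row)
```

Each row holds one measurement per view (regulators and target) for the same sample, and the interactions that produced it are known by construction, so an analysis method can be checked against them.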

CONCLUSIONS: The proposed method can be used in conjunction with real biological datasets in the assessment of data mining techniques. The main strength of this method lies in the full control over the simulated data while retaining coherence with real biological processes. The R package MVBioDataSim is freely available to the scientific community at http://neuronelab.unisa.it/?p=1722.

Original language: English
Pages (from-to): 151
Journal: BMC Bioinformatics
Volume: 16
Publication status: Published - 12 May 2015
Externally published: Yes
Publication type: A1 Journal article-refereed

Keywords

  • Algorithms
  • Computational Biology/methods
  • Computer Simulation
  • DNA Copy Number Variations
  • DNA Methylation
  • Datasets as Topic
  • Gene Expression Profiling/methods
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Genomics/methods
  • Humans
  • MicroRNAs/genetics
