Data Fusion Methods and an Application on Exploration of Gene Regulatory Mechanisms

Xiaofeng Dai

    Research output: Book/Report › Doctoral thesis › Collection of Articles



    Understanding the regulatory mechanisms of gene regulatory networks (GRNs) is an important topic in the field of Systems Biology. It is widely accepted that holistic approaches are needed to explore biological systems given, for example, the noisy dynamics of gene expression and the complex interactions between genes and between gene expression products and other cellular components. As new high-throughput technologies emerge and more information sources become available, it is becoming feasible to investigate this problem thoroughly from multiple perspectives. The main objective of this thesis is to provide solutions to problems related to gene regulatory mechanisms using data fusion methods, aiming at a more precise understanding of a GRN's structure and its dynamics. The thesis is divided into two parts: the presentation of the new data fusion methods proposed to explore GRN topologies and, subsequently, the application of one of these methods to investigate the dynamics of such networks. In the 'Methods' chapter, two methods are proposed: one for transcription factor binding site (TFBS) prediction and the other for gene clustering. The results of TFBS prediction can be used as an input to the gene clustering algorithm. In particular, a new data fusion method is developed and novel information sources are explored to improve TFBS prediction accuracy in comparison with previous methods. Three finite joint mixture models are developed to cluster genes from multiple data sources: the beta-Gaussian mixture model (BGMM), the stratified beta-Gaussian mixture model (sBGMM) and the Gaussian-Bernoulli mixture model (GBMM). These methods are shown to significantly improve the accuracy of TFBS predictions and clustering results.

    In the 'Application' chapter, one of the developed methods is applied to detect noisy attractors in delayed stochastic models of GRNs. The detection of noisy attractors is carried out for a model of a genetic toggle switch (TS) and for a model of an excitable genetic circuit of Bacillus subtilis responsible for phenotypic changes, by fusing multiple data sources extracted from the dynamics of the corresponding GRN. The results suggest that a single data source alone is, in general, insufficient to reveal the underlying structure of the GRN or to capture the changes in the dynamics of a GRN modeled according to the delayed stochastic framework. In summary, this thesis focuses on developing and applying data fusion methods to explore the topology and dynamics of a GRN, including TFBS prediction, gene clustering and noisy attractor detection. The developed algorithms and strategies can be applied to investigate real biological phenomena, and the findings can be used to guide future wet- or dry-lab experiments.
    Translated title of the contribution: Data Fusion Methods and an Application on Exploration of Gene Regulatory Mechanisms
    Original language: English
    Publisher: Tampere University of Technology
    Number of pages: 85
    ISBN (Electronic): 978-952-15-2312-0
    ISBN (Print): 978-952-15-2299-4
    Publication status: Published - 12 Jan 2010
    Publication type: G5 Doctoral dissertation (article)

    Publication series

    Name: Tampere University of Technology. Publication
    Publisher: Tampere University of Technology
    ISSN (Print): 1459-2045

