Goodness-of-fit tests and heavy-tailed distributions in network traffic data analysis

M. Laurikkala

    Research output: Book/Report, Doctoral thesis, Monograph

    Abstract

    A network management system is a vital part of a modern telecommunication network. The duties of the system include, among other things, fault management, configuration management, and performance management. For these purposes the network management system collects vast amounts of data, the processing and analysis of which have developed into a discipline of their own. Network traffic data analysis involves, for example, change detection, prediction, and modelling. This thesis concentrates on network traffic data analysis with statistical tools, goodness-of-fit tests in particular. Instead of artificially generated data, data sets collected from real networks serve as case examples. Since real network data fit analytical distributions and textbook examples poorly, Monte Carlo simulation is used for modelling the properties of the data. The various quantities measured from telecommunication networks reportedly exhibit heavy-tailed distributions. Heavy-tailed distributions possess special features (such as infinite variance) that make them problematic for statistical analysis as well as for network management. This is why heavy-tailed distributions are one of the premises of this work. The network management system usually does not allow the measurements to be tailored for a specific purpose, so the analysis has to adapt to the data available. A histogram is one of the most popular means of compressing data; that is, the data from the system often arrive as a histogram. This work develops a method for change detection in histogram data. Furthermore, classical goodness-of-fit tests are largely inadequate for network traffic data. In addition to heavy-tailed distributions, the huge amount of data causes problems. This thesis collects several test statistics proposed in the literature for testing heavy-tailed distributions. Their usefulness is assessed through a power study, in which a scenario of true traffic change detection is created. According to the results, the plain median outperforms all the more complicated test statistics in change detection. A suitable sample size is sought with a similar power study, because the large amount of data may easily ruin the feasibility of the test. Some sources cite predictability as an advantage of heavy-tailed distributions, but this feature has never been exploited. This thesis first generalizes the predictability to the continuous-time domain and then develops it further into a model that tries to predict traffic volume. However, the usefulness of the predictability remains limited, because several assumptions have to be made that do not necessarily hold in real network applications.
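
    The idea of a Monte Carlo power study with the sample median as test statistic can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the Pareto model with tail index 1.5 (infinite variance), the sample size of 100, the 30 % scale change, and the helper names `pareto_sample`, `median_critical_values`, and `estimate_power`.

```python
import random
import statistics


def pareto_sample(rng, n, alpha, scale):
    """Draw n Pareto values with tail index alpha and minimum value scale."""
    # Inverse-transform sampling: X = scale * U**(-1/alpha), U uniform on (0, 1].
    return [scale * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def median_critical_values(rng, n, alpha, scale, runs, level):
    """Simulate the sample median under the reference model and return
    empirical two-sided critical values at the given significance level."""
    medians = sorted(
        statistics.median(pareto_sample(rng, n, alpha, scale)) for _ in range(runs)
    )
    lower = medians[int(runs * level / 2)]
    upper = medians[int(runs * (1 - level / 2)) - 1]
    return lower, upper


def estimate_power(rng, n, alpha, ref_scale, new_scale, runs=1000, level=0.05):
    """Estimate how often the median test rejects when data come from new_scale."""
    lower, upper = median_critical_values(rng, n, alpha, ref_scale, runs, level)
    rejections = 0
    for _ in range(runs):
        m = statistics.median(pareto_sample(rng, n, alpha, new_scale))
        if m < lower or m > upper:
            rejections += 1
    return rejections / runs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# No change: the rejection rate should stay close to the nominal level.
size = estimate_power(rng, n=100, alpha=1.5, ref_scale=1.0, new_scale=1.0)
# Assumed 30 % increase in scale: the rejection rate is the estimated power.
power = estimate_power(rng, n=100, alpha=1.5, ref_scale=1.0, new_scale=1.3)
print(f"rejection rate without change: {size:.3f}")
print(f"estimated power with change:   {power:.3f}")
```

    Because the critical values are themselves simulated, the sketch does not rely on the variance of the data being finite. Repeating the power estimate over several values of `n` would be one way to look for a workable sample size.
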
    Original language: English
    Place of Publication: Tampere
    Publisher: Tampere University of Technology
    Number of pages: 127
    ISBN (Electronic): 978-952-15-2233-8
    ISBN (Print): 978-952-15-2191-1
    Publication status: Published - 21 Aug 2009
    Publication type: G4 Doctoral dissertation (monograph)

    Publication series

    Name: Tampere University of Technology. Publication
    Publisher: Tampere University of Technology
    Volume: 823
    ISSN (Print): 1459-2045

    Publication forum classification

    • No publication forum level
