Data Vault Mappings to Dimensional Model Using Schema Matching

Mikko Puonti, Timo Raitalaakso

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

In data warehousing, business-driven development defines data requirements to fulfill reporting needs. A data warehouse stores current and historical data in a single place. Data warehouse architecture consists of several layers, each with its own purpose. A staging layer is a data storage area that assists data loading, a data vault modelled layer is the persistent storage that integrates data and stores its history, and a publish layer presents data using a vocabulary that is familiar to the information users. Because the process is driven by business requirements and starts from the publish layer structure, the manual transformation work requires a specialist who knows the data vault model. Our goal is to reduce the number of entities that can be selected in a transformation so that an individual developer does not need to know the whole solution but can focus on a subset of entities (a partial schema). In this paper, we present two schema matchers, one based on attribute names and another based on data flow mapping information. Schema matching based on data flow mappings is a novel addition to the current schema matching literature. Using the Northwind example, we show how these two matchers affect the formation of a partial schema for transformation source entities. Based on our experiment with Northwind, we conclude that combining schema matching algorithms produces correct entities in the partial schema.
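To make the idea of combining two matchers concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the data vault schema is available as a mapping from entity names to attribute names, and that data flow mappings are recorded as pairs of (staging column, data vault column). All entity names, attribute names, and function names are hypothetical, loosely inspired by Northwind. The name matcher proposes data vault entities whose attribute names equal a target attribute name after normalization; the data flow matcher proposes data vault entities that are loaded from a staging column with such a name, which can catch attributes renamed on the way into the vault. The partial schema is taken here as the union of both results.

```python
# Hypothetical sketch: the schema, flows, and names are illustrative
# assumptions, not data or algorithms taken from the paper.

def normalize(name):
    """Lower-case a name and drop underscores, so 'Product_ID' equals 'productid'."""
    return name.replace("_", "").lower()


def match_by_name(target_attrs, vault_schema):
    """Vault entities that own an attribute named like a target attribute."""
    wanted = {normalize(a) for a in target_attrs}
    return {
        entity
        for entity, attrs in vault_schema.items()
        if wanted & {normalize(a) for a in attrs}
    }


def match_by_data_flow(target_attrs, flows):
    """Vault entities loaded from a staging column named like a target attribute."""
    wanted = {normalize(a) for a in target_attrs}
    return {
        vault_entity
        for (_stg_entity, stg_attr), (vault_entity, _vault_attr) in flows
        if normalize(stg_attr) in wanted
    }


def partial_schema(target_attrs, vault_schema, flows):
    """Combine both matchers by taking the union of their candidate entities."""
    return match_by_name(target_attrs, vault_schema) | match_by_data_flow(
        target_attrs, flows
    )


# Assumed data vault layer: entity name -> attribute names.
vault_schema = {
    "hub_customer": ["customer_id"],
    "sat_customer": ["company_name", "city"],
    "hub_product": ["product_id"],
    "sat_product": ["product_title", "unit_price"],
}

# Assumed staging-to-vault data flow mappings; ProductName is renamed in the vault.
flows = [
    (("stg_products", "ProductName"), ("sat_product", "product_title")),
    (("stg_customers", "CompanyName"), ("sat_customer", "company_name")),
]

# Attributes requested for a hypothetical product dimension in the publish layer.
dim_product_attrs = ["ProductID", "ProductName"]

by_name = match_by_name(dim_product_attrs, vault_schema)
by_flow = match_by_data_flow(dim_product_attrs, flows)
candidates = partial_schema(dim_product_attrs, vault_schema, flows)
print(sorted(candidates))
```

In this toy setup the name matcher alone finds only the hub, the data flow matcher alone finds only the satellite, and their union gives a developer both entities needed for the product dimension without browsing the customer entities.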

Original language: English
Title of host publication: Research and Practical Issues of Enterprise Information Systems - 13th IFIP WG 8.9 International Conference, CONFENIS 2019, Proceedings
Editors: Petr Doucek, Josef Basl, Antonin Pavlicek, A Min Tjoa, Katrin Detter, Maria Raffai
Publisher: Springer
Pages: 55-64
Number of pages: 10
ISBN (Print): 9783030376314
Publication status: Published - 13 Dec 2019
Publication type: A4 Article in conference proceedings
Event: IFIP WG 8.9 Working Conference on Research and Practical Issues of Enterprise Information Systems - Prague, Czech Republic
Duration: 16 Dec 2019 → 17 Dec 2019

Publication series

Name: Lecture Notes in Business Information Processing
Volume: 375
ISSN (Print): 1865-1348
ISSN (Electronic): 1865-1356

Conference

Conference: IFIP WG 8.9 Working Conference on Research and Practical Issues of Enterprise Information Systems
Country/Territory: Czech Republic
City: Prague
Period: 16/12/19 → 17/12/19

Keywords

  • Data flow
  • Data vault
  • Data warehouse
  • Dimensional model
  • Schema matching

Publication forum classification

  • Publication forum level 1

ASJC Scopus subject areas

  • Management Information Systems
  • Control and Systems Engineering
  • Business and International Management
  • Information Systems
  • Modelling and Simulation
  • Information Systems and Management
