TREATS: Fairness-aware entity resolution over streaming data

Tiago Brasileiro Araújo, Vasilis Efthymiou, Vassilis Christophides, Evaggelia Pitoura, Kostas Stefanidis

Research output: Article › Scientific › peer-reviewed


Abstract

Currently, the growing proliferation of information systems continuously generates large volumes of data, stemming from a variety of sources such as web platforms, social networks, and multiple devices. These data, often lacking a defined schema, require an initial process of consolidation and cleansing before analysis and knowledge extraction can occur. In this context, Entity Resolution (ER) plays a crucial role, facilitating the integration of knowledge bases and identifying similarities among entities from different sources. However, the traditional ER process is computationally expensive, and becomes more complicated in the streaming context, where the data arrive continuously. Moreover, few studies address fairness, understood as the absence of discrimination or bias, in the context of ER. In this sense, fairness criteria aim to mitigate the implications of data bias in ER systems, which requires more than just optimizing accuracy, as traditionally done. Considering this context, this work presents TREATS, a schema-agnostic and fairness-aware ER workflow developed for managing streaming data incrementally. The proposed fairness-aware ER framework tackles constraints across various groups of interest, presenting a resilient and equitable solution to the related challenges. Through experimental evaluation, the proposed techniques and heuristics are compared against state-of-the-art approaches over five real-world data source pairs. The results demonstrate significant improvements in terms of fairness, without degradation of effectiveness and efficiency measures in the streaming environment. In summary, our contributions aim to propel the ER field forward by providing a workflow that addresses both technical challenges and ethical concerns.

Original language: English
Article number: 102506
Journal: Information Systems
Volume: 129
Early online date: 12 Dec 2024
Status: Published - Mar 2025
OKM publication type: A1 Original article in a scientific journal

Publication Forum level

  • Jufo level 2

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
