MUTUAL: Multi-Domain Sentiment Classification via Uncertainty Sampling

Katerina Katsarou, Roxana Jeney, Kostas Stefanidis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Multi-domain sentiment classification trains a classifier on multiple domains and then tests it on one of them. Importantly, no domain is assumed to have sufficient labeled data; instead, the goal is to leverage information across domains, which makes multi-domain sentiment classification a very realistic scenario. Labeled data is typically costly because humans must annotate it manually. In this context, we propose the MUTUAL approach, which learns general and domain-specific sentence embeddings that are also context-aware thanks to an attention mechanism. Both types of sentence embeddings are generated by a stacked BiLSTM-based autoencoder with an attention mechanism. Then, using the Jensen-Shannon (JS) distance, the general sentence embeddings of the four domains most similar to the target domain are selected. The selected general sentence embeddings and the domain-specific embeddings are concatenated and fed into a dense layer for training. Evaluation results on public datasets with 16 different domains demonstrate the efficiency of our model. In addition, we propose an active learning algorithm that first applies the elliptic envelope for outlier removal to a pool of unlabeled data, which the MUTUAL model then classifies. Next, the most uncertain data points are selected for labeling based on the least confidence metric. The experiments show that querying 38% of the original data in this way yields higher accuracy than random sampling.
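The two selection steps named in the abstract, ranking domains by Jensen-Shannon distance and querying by least confidence, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each domain is summarized as a discrete probability distribution (the abstract does not say what the JS distance is computed over) and that the classifier outputs per-class probabilities. The function names `js_distance`, `most_similar_domains`, and `least_confidence_query` are hypothetical.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (log base 2)."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


def most_similar_domains(target, candidates, k=4):
    """Return the names of the k candidate domains closest to the target.

    `target` is a distribution; `candidates` maps domain name -> distribution.
    How these distributions are built is an assumption of this sketch.
    """
    ranked = sorted(candidates, key=lambda name: js_distance(target, candidates[name]))
    return ranked[:k]


def least_confidence_query(probabilities, budget):
    """Return indices of the `budget` samples with the lowest top-class probability."""
    uncertainty = [1.0 - max(row) for row in probabilities]
    order = sorted(range(len(probabilities)), key=lambda i: uncertainty[i], reverse=True)
    return order[:budget]
```

With log base 2, `js_distance` ranges from 0 for identical distributions to 1 for distributions with disjoint support. The outlier-removal step is not shown here; scikit-learn provides an `EllipticEnvelope` estimator in `sklearn.covariance` that could be applied to the unlabeled pool before querying.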

Original language: English
Title of host publication: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, SAC 2023
Publisher: ACM
Pages: 331-339
Number of pages: 9
ISBN (Electronic): 9781450395175
Publication status: Published - 27 Mar 2023
Publication type: A4 Article in conference proceedings
Event: Annual ACM Symposium on Applied Computing - Tallinn, Estonia
Duration: 27 Mar 2023 to 31 Mar 2023

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing

Conference

Conference: Annual ACM Symposium on Applied Computing
Country/Territory: Estonia
City: Tallinn
Period: 27/03/23 to 31/03/23

Keywords

  • active learning
  • BiLSTM
  • Jensen-Shannon distance
  • multi-domain sentiment classification
  • self-attention
  • sentence embeddings
  • uncertainty sampling

Publication forum classification

  • Publication forum level 1

ASJC Scopus subject areas

  • Software
