Abstract
Visualization is crucial in the first steps of data analysis. In visual data exploration with scatter plots, no single plot is sufficient to analyze complicated high-dimensional data sets. Given numerous visualizations created with different features or methods, meta-visualization is needed to analyze the visualizations together. We address the problem of arranging numerous visualizations on a meta-visualization display so that their similarities and differences can be analyzed. Visualization has recently been formalized as an information retrieval task; we extend this approach and formalize meta-visualization as an information retrieval task whose performance can be rigorously quantified and optimized. We introduce a machine learning approach to optimizing the meta-visualization, based on an information retrieval perspective: two visualizations are similar if the analyst would retrieve similar neighborhoods between data samples from either visualization. Based on this approach, we introduce a nonlinear embedding method for meta-visualization: it optimizes the locations of visualizations on a display so that visualizations giving similar information about the data are close to each other. In experiments, we show that such meta-visualization outperforms alternatives and yields insight into data in several case studies.
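The similarity notion in the abstract (two visualizations are similar if they lead the analyst to retrieve similar neighborhoods of data samples) can be illustrated with a minimal sketch. This is not the paper's method: it swaps in a hard k-nearest-neighbor overlap as a stand-in for the retrieval-based comparison, and it stops at a pairwise distance matrix rather than optimizing an embedding. The function names `knn_sets`, `neighborhood_distance`, and `pairwise_distances`, the parameter `k`, and the toy plots are all hypothetical and chosen only for illustration.

```python
import math

def knn_sets(points, k):
    """For each point in one plot, return the index set of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors

def neighborhood_distance(plot_a, plot_b, k=2):
    """One minus the mean fraction of k-nearest neighbors shared by two plots.

    Both plots must show the same samples in the same order.
    This is an assumed stand-in measure, not the cost function from the paper.
    """
    sets_a, sets_b = knn_sets(plot_a, k), knn_sets(plot_b, k)
    overlap = sum(len(a & b) / k for a, b in zip(sets_a, sets_b))
    return 1.0 - overlap / len(plot_a)

def pairwise_distances(plots, k=2):
    """Distance matrix between plots; an embedding method could place plots using it."""
    n = len(plots)
    return [[neighborhood_distance(plots[i], plots[j], k) for j in range(n)]
            for i in range(n)]

# Toy data: three scatter plots of the same six samples.
plot_a = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
# Rotated and rescaled copy of plot_a: every sample keeps the same neighbors.
plot_b = [(0, 0), (0, 2), (0, 4), (0, 6), (0, 8), (0, 10)]
# Samples placed in a shuffled order: most neighborhoods change.
plot_c = [(0, 0), (3, 0), (1, 0), (4, 0), (2, 0), (5, 0)]

D = pairwise_distances([plot_a, plot_b, plot_c], k=2)
for row in D:
    print([round(d, 3) for d in row])
```

Under this measure, `plot_a` and `plot_b` are at distance zero even though their coordinates differ, because they present identical neighborhoods, while `plot_c` ends up far from both. A meta-visualization would then place the first two plots close together on the display and the third farther away.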
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 189-229 |
Number of pages | 41 |
Journal | Machine Learning |
Volume | 99 |
Issue number | 2 |
Publication status | Published - 2015 |
Publication type | A1 Journal article-refereed |
Keywords
- Meta-visualization
- Neighbor embedding
- Nonlinear dimensionality reduction
Publication forum classification
- Publication forum level 3