# Linear regression for numeric symbolic variables: a least squares approach based on Wasserstein Distance

@article{Irpino2015LinearRF,
title={Linear regression for numeric symbolic variables: a least squares approach based on Wasserstein Distance},
author={Antonio Irpino and Rosanna Verde},
journal={Advances in Data Analysis and Classification},
year={2015},
volume={9},
pages={81--106}
}
• Published 7 February 2012
• Computer Science, Mathematics
• Advances in Data Analysis and Classification
In this paper we present a new linear regression technique for distributional symbolic variables, i.e., variables whose realizations can be histograms, empirical distributions or empirical estimates of parametric distributions. Such data are known as numerical modal data according to the Symbolic Data Analysis definitions. In order to measure the error between the observed and the predicted distributions, the $\ell_2$ Wasserstein distance is proposed. Some properties of such a metric are…
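As a rough illustration of the error measure named in the abstract, the squared $\ell_2$ Wasserstein distance between two one-dimensional distributions can be written as the integral of the squared difference between their quantile functions. The sketch below approximates that integral for two empirical samples. It is not the authors' implementation: the function name `wasserstein2_squared`, the parameter `n_grid`, and the midpoint grid are assumptions made for this example.

```python
import numpy as np


def wasserstein2_squared(sample_a, sample_b, n_grid=1000):
    """Approximate the squared l2 Wasserstein distance between two
    1-D empirical distributions (hypothetical helper for illustration).

    Uses the quantile-function form: the integral over t in (0, 1) of
    (Q_a(t) - Q_b(t))^2, approximated on a midpoint grid of n_grid levels.
    """
    # Midpoints of n_grid equal-width probability intervals in (0, 1).
    t = (np.arange(n_grid) + 0.5) / n_grid
    # Empirical quantile functions evaluated at the same levels.
    q_a = np.quantile(sample_a, t)
    q_b = np.quantile(sample_b, t)
    # Mean squared quantile difference approximates the integral.
    return float(np.mean((q_a - q_b) ** 2))


observed = np.array([0.0, 1.0, 2.0, 3.0])
predicted = observed + 2.0  # same shape, shifted by 2
print(wasserstein2_squared(observed, predicted))
```

When one distribution is a pure shift of the other by a constant c, the quantile functions differ by c everywhere, so the squared distance equals c squared.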
Linear regression model with histogram‐valued variables
• Mathematics, Computer Science
Stat. Anal. Data Min.
• 2015
A new linear regression model for histogram‐valued variables is proposed that solves the quadratic optimization problem, subject to non‐negativity constraints on the unknowns; the error measure between the predicted and observed distributions uses the Mallows distance.
Distribution and Symmetric Distribution Regression Model for Histogram-Valued Variables
• Mathematics, Computer Science
• 2013
This work proposes a new linear regression model for histogram-valued variables, named the Distribution and Symmetric Distribution Regression Model, that solves this problem and is associated with a goodness-of-fit measure whose values range between 0 and 1.
Factor Analysis of Interval Data
• Mathematics
• 2017
This paper presents a factor analysis model for symbolic data, focusing on the particular case of interval-valued variables. The proposed method describes the correlation structure among the measured
Linear regression models for data with variability
• Mathematics
• 2013
Symbolic Data Analysis is concerned with data tables where the values in each cell are not single values but elements that express the variability of the records, e.g., intervals or histograms.
Linear regression with empirical distributions
In the classical data framework one numerical value or one category is associated with each individual (microdata). However, the interest of many studies lies in groups of records gathered according
Artificial Neural Network with Histogram Data Time Series Forecasting: A Least Squares Approach Based on Wasserstein Distance
• Computer Science
• 2020
The empirical results demonstrate that the AR-ANN model based on the Irpino-Verde approach performs better than other models.
New models for symbolic data analysis
• Mathematics, Computer Science
• 2018
This work introduces a new general method for constructing likelihood functions for symbolic data based on a desired probability model for the underlying measurement-level data, while only observing the distributional summaries.
Trajectories from Distribution-Valued Functional Curves: A Unified Wasserstein Framework
• Computer Science
MICCAI
• 2020
A novel, comprehensive framework that models the temporal evolution trajectories of distribution-valued functional curves under the unifying scheme of the Wasserstein distance metric: it preserves the functional characteristics of the curve, models the temporal change in distribution profiles, and forces the estimated distributions to be valid.
On the use of Wasserstein metric in topological clustering of distributional data
• Computer Science
ArXiv
• 2021
This paper deals with a clustering algorithm for histogram data based on a Self-Organizing Map (SOM) learning. It combines a dimension reduction by SOM and the clustering of the data in a reduced