Hierarchical Clustering for Boxplot Variables

@inproceedings{Arroyo2006HierarchicalCF,
  title={Hierarchical Clustering for Boxplot Variables},
  author={Javier Arroyo and Carlos Mat{\'e} and Antonio Mu{\~n}oz San Roque},
  booktitle={Data Science and Classification},
  year={2006}
}
Boxplots are well-known exploratory charts used to extract meaningful information from batches of data at a glance. Their strength lies in their ability to summarize data while retaining the key information, which is also a desirable property of symbolic variables. In this paper, boxplots are presented as a new kind of symbolic variable. In addition, two different approaches to measuring distances between boxplot variables are proposed. The usefulness of these distances is illustrated by means of a…
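
The abstract is cut off before the two proposed distances are defined, so they are not reproduced here. As a rough illustration of the general idea only, the sketch below treats each boxplot as its five-number summary and clusters the summaries hierarchically. The Euclidean distance, the single-linkage merge rule, the helper names (`five_number_summary`, `boxplot_distance`, `single_linkage`), and the toy batches are all assumptions made for this example, not the authors' method.

```python
from itertools import combinations
from statistics import quantiles


def five_number_summary(batch):
    """Return (min, Q1, median, Q3, max) for a batch of numbers."""
    q1, med, q3 = quantiles(batch, n=4, method="inclusive")
    return (min(batch), q1, med, q3, max(batch))


def boxplot_distance(a, b):
    """Euclidean distance between two five-number summaries.

    Assumed stand-in: the paper's own distances are not given in the excerpt.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def single_linkage(boxplots, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in boxplots]
    while len(clusters) > n_clusters:
        def gap(pair):
            # Single linkage: distance between the closest members.
            left, right = pair
            return min(boxplot_distance(boxplots[p], boxplots[q])
                       for p in left for q in right)

        left, right = min(combinations(clusters, 2), key=gap)
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
    return sorted(sorted(c) for c in clusters)


# Hypothetical batches of data, one boxplot per batch.
batches = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 3, 4, 5, 6],
    "c": [20, 22, 24, 26, 28],
    "d": [21, 23, 25, 27, 29],
}
boxplots = {name: five_number_summary(values) for name, values in batches.items()}
clusters = single_linkage(boxplots, n_clusters=2)
print(clusters)
```

With these toy batches, the two low-valued boxplots and the two high-valued boxplots end up in separate clusters. Any other distance on five-number summaries could be swapped in by replacing `boxplot_distance`.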
Regularized boxplot via convex clustering
A boxplot is a simple and effective exploratory data analysis tool for graphically summarizing a distribution of data. However, in cases where the quartiles in a boxplot are inaccurately
Functional boxplots for summarizing and detecting changes in environmental data coming from sensors
TLDR
A new strategy for summarizing and describing this kind of data, based on functional data representation, is proposed using an informative exploratory tool: the functional boxplot.
THE DENSITY VALUED DATA ANALYSIS IN A TEMPORAL FRAMEWORK: THE DATA MODEL APPROACH
TLDR
The main advantage of using this kind of representation and the corresponding visualization is in their capacity to highlight anomalies or anticipate structural pattern changes in a beanplot time series, as well as to provide useful tools for short period forecasting.
Using the Boxplot analysis in marketing research
Taking into account the needs of decision makers inside companies, marketing research is meant to provide the best information, which can really help in making the best decisions. In this
Knowledge discovery methods for data streams (Methodological contributions and applications)
This thesis proposes two methodologies for summarizing and studying the evolution over time of data streams. Data streams are flows of data that are produced at high frequency and continuously in the

References

Some Implementations of the Boxplot
TLDR
This work examines alternatives for the boxplot and their consequences, discusses related background for boxplots (such as the probability that a sample contains one or more outside observations and the average proportion of outside observations in a sample), and offers recommendations that lead to a single standard form of the boxplot.
Opening the Box of a Boxplot
TLDR
Variations of the boxplot are suggested, in which the sides of the boxes are used to convey information about the density of the values in a batch, in a way designed to keep their ease of computation by computer.
From the Statistics of Data to the Statistics of Knowledge
TLDR
This article attempts to review the methods currently available to analyze symbolic data, and it quickly becomes clear that the range of methodologies available draws analogies with developments before 1900 that formed a foundation for the inferential statistics of the 1900s.
Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data
TLDR
This work focuses on Symbolic Data Analysis and the SODAS Project: Purpose, History, Perspective, and Symbolic Objects.
Generalized Minkowski metrics for mixed feature-type data analysis
TLDR
The effectiveness of the generalized Minkowski metrics, an approach to hierarchical conceptual clustering, and a generalization of principal component analysis for mixed feature data are presented.
Performance of Some Resistant Rules for Outlier Labeling
TLDR
The techniques of exploratory data analysis include a resistant rule for identifying possible outliers in univariate data that uses the lower and upper fourths, FL and FU (approximate quartiles), and defines the some-outside rate per sample as the probability that a sample will contain one or more outside observations.
QBIC project: querying images by content, using color, texture, and shape
TLDR
The main algorithms for color, texture, shape, and sketch queries are presented, along with example query results and a discussion of future directions.
Analysis of Symbolic Data
Exploratory data analysis
  • Addison-Wesley series in behavioral science : quantitative methods
  • 1977