We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. In fact, in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify …
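A minimal sketch of this kind of unicity computation, assuming each trace is represented as a set of (antenna, hour) points; the function names, parameters and synthetic data below are illustrative assumptions, not the paper's dataset or exact method:

```python
import random

def is_unique(traces, target, points):
    """True iff `target` is the only trace containing all sampled points."""
    matches = [t for t in traces if points <= t]
    return matches == [target]

def unicity(traces, p, trials=200, seed=0):
    """Fraction of randomly chosen individuals uniquely identified
    by p random spatio-temporal points drawn from their own trace."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t = rng.choice(traces)
        points = set(rng.sample(sorted(t), min(p, len(t))))
        hits += is_unique(traces, t, points)
    return hits / trials

# Synthetic example: 10 individuals, each seen at a distinct antenna
# for 5 hours, so a single point already identifies everyone.
traces = [{(i, hour) for hour in range(5)} for i in range(10)]
print(unicity(traces, 1))
```

With real traces, plotting `unicity` against `p` would show how quickly a handful of points pins down an individual.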
Motivation: high-dimensional data are difficult to represent, difficult to understand, and difficult to analyze. Example: an MLP (Multi-Layer Perceptron) or RBFN (Radial-Basis Function Network) with many inputs suffers from difficult convergence, local minima, etc. Hence the need to reduce the dimension of the data while keeping their information content. (Michel Verleysen) …
Nearest neighbor search and many other numerical data analysis tools most often rely on the use of the Euclidean distance. When data are high-dimensional, however, Euclidean distances seem to concentrate: all distances between pairs of data elements become very similar. Therefore, the relevance of the Euclidean distance has been questioned in the …
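This concentration effect can be sketched numerically: as the dimension grows, the relative spread of pairwise Euclidean distances shrinks. The uniform data, sample sizes and function names below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_contrast(d, n=500):
    """(max - min) / min over pairwise Euclidean distances of
    n i.i.d. uniform points in [0, 1]^d."""
    x = rng.random((n, d))
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared distances
    dist = np.sqrt(np.clip(d2[np.triu_indices(n, k=1)], 0.0, None))
    return (dist.max() - dist.min()) / dist.min()

# The relative contrast shrinks as the dimension grows:
for d in (2, 10, 100, 1000):
    print(d, round(relative_contrast(d), 3))
```

When the contrast approaches zero, "nearest" and "farthest" neighbors become nearly indistinguishable, which is exactly why the relevance of the Euclidean distance is questioned.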
Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of …
Dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new nonlinear methods have been proposed in recent years, yet the question of their assessment and comparison remains open. This paper first reviews some of the existing quality measures that are based on distance ranking and K-ary …
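A simple rank-based criterion of the kind reviewed here can be sketched as the average preservation of K-ary neighborhoods between the original space and the embedding; the name `q_nx` and the brute-force neighbor search are illustrative assumptions:

```python
import numpy as np

def knn_indices(X, K):
    """Indices of the K nearest neighbours of each row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :K]

def q_nx(X_high, X_low, K):
    """Average fraction of each point's K nearest neighbours in the
    high-dimensional space that survive in the low-dimensional embedding."""
    hi, lo = knn_indices(X_high, K), knn_indices(X_low, K)
    overlap = [len(set(h) & set(l)) for h, l in zip(hi, lo)]
    return float(np.mean(overlap)) / K
```

A perfect embedding gives a score of 1; comparing methods across several values of K yields quality curves rather than a single number.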
Dimension reduction techniques are widely used for the analysis and visualization of complex sets of data. This paper compares two recently published methods for nonlinear projection: Isomap and Curvilinear Distance Analysis (CDA). Unlike traditional linear PCA, these methods work like multidimensional scaling, by reproducing in the projection …
Extreme learning machines are models that nearly match standard SVMs in accuracy while training much faster. However, they optimise a sum of squared errors, whereas SVMs are maximum-margin classifiers. This paper proposes to merge both approaches by defining a new kernel. This kernel is computed by the first layer of an extreme learning machine …
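A hedged sketch of the kernel idea: use the random first layer of an extreme learning machine as a feature map and define the kernel as the inner product of hidden activations. The tanh activation, Gaussian weights and all names are illustrative assumptions, not necessarily the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_feature_map(X, W, b):
    """Hidden-layer activations of an ELM first layer: tanh(X W + b)."""
    return np.tanh(X @ W + b)

def elm_kernel(X, Y, W, b):
    """K[i, j] = <h(x_i), h(y_j)> in the random ELM feature space."""
    return elm_feature_map(X, W, b) @ elm_feature_map(Y, W, b).T

# Usage: a random first layer with 50 hidden units for 3-dimensional inputs;
# the resulting Gram matrix could be passed to any kernel machine
# (e.g. an SVM with a precomputed kernel).
d, n_hidden = 3, 50
W = rng.normal(size=(d, n_hidden))
b = rng.normal(size=n_hidden)
X = rng.normal(size=(10, d))
K = elm_kernel(X, X, W, b)
```

Because K is a Gram matrix of explicit features, it is symmetric and positive semi-definite by construction, so it is a valid kernel for a maximum-margin classifier.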
Dimensionality reduction aims at representing high-dimensional data in low-dimensional spaces, in order to facilitate their visual interpretation. Many techniques exist, ranging from simple linear projections to more complex nonlinear transformations. The large variety of methods emphasizes the need for quality criteria that allow fair comparisons …
Long-term ECG recordings are often required for monitoring cardiac function in clinical applications. Due to the high number of beats to evaluate, inter-patient computer-aided heart beat classification is of great importance for physicians. The main difficulty is the extraction of discriminative features from the heart beat time series. The …