- Hoang Vu Nguyen, Hock Hee Ang, Vivekanand Gopalkrishnan
- DASFAA
- 2010

- Hoang Vu Nguyen, Vivekanand Gopalkrishnan
- ECML/PKDD
- 2009

In many real-world applications, data is collected in multi-dimensional spaces, with the knowledge hidden in subspaces (i.e., subsets of the dimensions). It is an open research issue to select meaningful subspaces without any prior knowledge about such hidden patterns. Standard approaches, such as pairwise correlation measures, or statistical approaches…

- Hoang Vu Nguyen, Vivekanand Gopalkrishnan
- FSDM
- 2010

This work addresses the problem of feature extraction for boosting the performance of outlier detectors in high-dimensional spaces. Recent years have seen multidimensional data become prominent, and traditional detection techniques usually fail to work as expected on such data due to the curse of dimensionality. This paper introduces an efficient feature…

Correlation analysis is one of the key elements of statistics, and has various applications in data analysis. Whereas most existing measures can only detect pairwise correlations between two dimensions, modern analysis aims at detecting correlations in multi-dimensional spaces. We propose MAC, a novel multivariate correlation measure designed for…

- Hoang Vu Nguyen, Vivekanand Gopalkrishnan, Ira Assent
- DASFAA
- 2011

- Hoang Vu Nguyen, Jilles Vreeken
- SDM
- 2016

Most data is multi-dimensional. Discovering whether any subset of dimensions, or subspaces, of such data is significantly correlated is a core task in data mining. To do so, we require a measure that quantifies how correlated a subspace is. For practical use, such a measure should be universal in the sense that it captures correlation in subspaces of any…

- Hoang Vu Nguyen, Jilles Vreeken
- ECML/PKDD
- 2015

Quantifying the difference between two distributions is a common problem in many machine learning and data mining tasks. What is also common in many tasks is that we only have empirical data. That is, we do not know the true distributions nor their form, and hence, before we can measure their divergence we first need to assume a distribution or perform…

- Hoang Vu Nguyen, Emmanuel Müller, Jilles Vreeken, Klemens Böhm
- Data Mining and Knowledge Discovery
- 2014

Discretization is the transformation of continuous data into discrete bins. It is an important and general pre-processing technique, and a critical element of many data mining and data management tasks. The general goal is to obtain data that retains as much information in the continuous original as possible. In general, but in particular for exploratory…

- Hoang Vu Nguyen, Emmanuel Müller, Klemens Böhm
- 2013 IEEE International Conference on Big Data
- 2013

In many real-world applications, data is collected in multi-dimensional spaces. However, not all dimensions are relevant for data analysis. Instead, interesting knowledge is hidden in correlated subsets of dimensions (i.e., subspaces of the original space). Detecting these correlated subspaces independent of the underlying mining task is an open research…