An Unsupervised Feature Selection Framework for Social Media Data

@article{Tang2014AnUF,
  title={An Unsupervised Feature Selection Framework for Social Media Data},
  author={Jiliang Tang and Huan Liu},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2014},
  volume={26},
  pages={2914--2927}
}
  • Jiliang Tang, Huan Liu
  • Published 1 December 2014
  • Computer Science
  • IEEE Transactions on Knowledge and Data Engineering
The explosive usage of social media produces massive amounts of unlabeled and high-dimensional data. Feature selection has been proven to be effective in dealing with high-dimensional data for efficient learning and data mining. Unsupervised feature selection remains a challenging task due to the absence of label information based on which feature relevance is often assessed. The unique characteristics of social media data further complicate the already challenging problem of unsupervised… 
Linked Unsupervised Based Advanced Feature Selection Framework with Artificial Bee Colony for Social Media Data
TLDR
A novel method, LUAFS-ABC, is proposed for linked data in social media to exploit the linked information of selected features, and the experimental results show that the proposed method exploits link information effectively in comparison with state-of-the-art unsupervised feature selection methods.
An unsupervised approach for feature selection in linked data
TLDR
Features are ranked by optimizing a novel objective function that incorporates both the relationships between users and the users' feature information, and the top-ranked features are extracted for further processing in social network datasets.
A Social-aware online short-text feature selection technique for social media
Unsupervised Spectral Sparse Regression Feature Selection using Social Media Datasets
TLDR
An unsupervised feature selection method employing spectral analysis is proposed for social media datasets, and experiments on datasets from different social media are conducted to evaluate its accuracy and performance.
Short-text feature construction and selection in social media data: a survey
TLDR
This paper surveys feature selection techniques for dealing with short texts in both offline and online settings, and open issues and research opportunities for performing online feature selection over social media data are discussed.
Feature Selection on Linked Data: A Review
TLDR
Unsupervised feature selection on linked data, as used in several applications, is outlined, along with the various challenges faced during the process.
Unsupervised Nonlinear Feature Selection from High-Dimensional Signed Networks
TLDR
A nonlinear unsupervised feature selection method for signed networks, called SignedLasso, is proposed; it can select a small number of important features with nonlinear associations between inputs and output from high-dimensional data, and it uses a deep learning-based node embedding to represent node similarity without label information.
Feature Selection Methods for Linked Data: Limitations, Capabilities and Potentials
TLDR
A review of current feature selection techniques for linked data is presented and several approaches are examined in various contexts so that performance issues and ongoing challenges can be assessed.
Exploiting Hierarchical Structures for Unsupervised Feature Selection
TLDR
This work provides a principled method to exploit hierarchical structures of features and proposes a novel framework HUFS, which utilizes the given hierarchical structures to help select features without labels.

References

Unsupervised feature selection for linked social media data
TLDR
The differences between social media data and traditional attribute-value data are studied, whether the relations revealed in linked data can be used to help select relevant features is investigated, and a novel unsupervised feature selection framework, LUFS, is proposed for linked social media data mining.
Feature Selection with Linked Data in Social Media
TLDR
The differences between attribute-value data and social media data are illustrated, whether linked data can be exploited in a new feature selection framework by taking advantage of social science theories is investigated, and the effects of user-user and user-post relationships manifested in linked data on feature selection are evaluated.
Unsupervised feature selection for multi-cluster data
TLDR
Inspired by recent developments in manifold learning and L1-regularized models for subset selection, a new approach called Multi-Cluster Feature Selection (MCFS) is proposed for unsupervised feature selection; it selects those features such that the multi-cluster structure of the data can be best preserved.
Relational learning via latent social dimensions
TLDR
This work proposes to extract latent social dimensions based on network information, and then utilize them as features for discriminative learning, and outperforms representative relational learning methods based on collective inference, especially when few labeled data are available.
Laplacian Score for Feature Selection
TLDR
This paper proposes a "filter" method for feature selection which is independent of any learning algorithm, based on the observation that, in many real world classification problems, data from the same class are often close to each other.
Discovering Overlapping Groups in Social Media
TLDR
A novel co-clustering framework is proposed, which takes advantage of networking information between users and tags in social media, to discover these overlapping communities.
Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weighted-based approach
  • Lior Wolf, A. Shashua
  • Computer Science
    Proceedings Ninth IEEE International Conference on Computer Vision
  • 2003
TLDR
A definition of "relevancy" based on spectral properties of the Affinity (or Laplacian) of the features' measurement matrix is presented, which shows that sparse solutions for the ranking values naturally emerge as a result of a "biased nonnegativity" of a key matrix in the process.
Towards feature selection in network
TLDR
This paper presents a supervised feature selection method based on Laplacian Regularized Least Squares (LapRLS) for networked data, which uses linear regression to utilize the content information and adopts graph regularization to consider the link information.
Feature Selection for Unsupervised Learning
TLDR
This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.
l2,1-Norm Regularized Discriminative Feature Selection for Unsupervised Learning
TLDR
This work incorporates discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection under the assumption that the class label of input data can be predicted by a linear classifier.
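Several of the entries above rely on the l2,1-norm, which sums the Euclidean norms of the rows of a feature weight matrix and therefore tends to push whole rows (that is, whole features) toward zero. The sketch below only illustrates that norm and the common practice of ranking features by row norm. The matrix `W` and the helper names `l21_norm` and `rank_features_by_row_norm` are hypothetical, chosen for illustration; in the cited methods the weights would come from optimizing each paper's own objective, which is not reproduced here.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1-norm of W: the sum of the Euclidean norms of its rows."""
    return float(np.linalg.norm(W, axis=1).sum())


def rank_features_by_row_norm(W):
    """Return feature (row) indices ordered from largest to smallest row norm."""
    row_norms = np.linalg.norm(W, axis=1)
    # A stable sort keeps the original order among features with equal norms.
    return np.argsort(-row_norms, kind="stable")


# Hypothetical weight matrix: 3 features (rows) by 2 latent dimensions (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

print(l21_norm(W))                   # 6.0, from row norms 5 + 0 + 1
print(rank_features_by_row_norm(W))  # [0 2 1]
```

A feature whose row is entirely zero, such as the second row here, contributes nothing to the norm and falls to the bottom of the ranking, which is why this regularizer is often used to discard features.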