Email Surveillance Using Non-negative Matrix Factorization

@article{Berry2005EmailSU,
  title={Email Surveillance Using Non-negative Matrix Factorization},
  author={Michael W. Berry and Murray Browne},
  journal={Computational \& Mathematical Organization Theory},
  year={2005},
  volume={11},
  pages={249--264}
}
  • M. Berry, M. Browne
  • Published 1 October 2005
  • Computer Science
  • Computational & Mathematical Organization Theory
In this study, we apply a non-negative matrix factorization approach for the extraction and detection of concepts or topics from electronic mail messages. For the publicly released Enron electronic mail collection, we encode sparse term-by-message matrices and use a low rank non-negative matrix factorization algorithm to preserve natural data non-negativity and avoid subtractive basis vector and encoding interactions present in techniques such as principal component analysis. Results in topic… 
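The factorization described in the abstract can be illustrated with a minimal sketch. It assumes the multiplicative update rules for squared error from "Algorithms for Non-negative Matrix Factorization" (listed in the references); the abstract does not say which algorithm the authors actually used. The toy term-by-message matrix, the term list, and the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) are hypothetical and are not taken from the paper or the Enron collection.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (terms x messages) by W (terms x rank) times H (rank x messages).

    Assumption: multiplicative updates for the squared-error objective.
    Random positive initialization keeps every factor entry non-negative.
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h

# Hypothetical toy data: rows are terms, columns are messages.
terms = ["gas", "pipeline", "trading", "football", "game"]
v = [
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [3, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 3],
]

w, h = nmf(v, rank=2)

# List the terms of each basis vector (column of W), heaviest first.
for k in range(2):
    ranked = sorted(zip(terms, (row[k] for row in w)), key=lambda t: -t[1])
    print("feature", k, [term for term, _ in ranked[:3]])
```

Because the updates only multiply and divide non-negative quantities, W and H stay non-negative throughout, which is the property the abstract contrasts with the subtractive interactions of principal component analysis. Each column of W can be read as a candidate topic over terms, and each column of H as the weights of those topics in one message.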

Non-negative Matrix Factorization, A New Tool for Feature Extraction: Theory and Applications Workshop invited key lecture
TLDR
The underlying mathematical theory of NMF is described along with some extensions, and several relevant applications from different scientific areas are presented.
Automating the Detection of Anomalies and Trends from Text
TLDR
Recent studies with documents from the Aviation Safety Reporting System have shown that (known) anomalies of training documents can be directly mapped to NMF-generated feature vectors and used to generate anomaly relevance scores for those documents.
Non-negative Matrix Factorization, A New Tool for Feature Extraction: Theory and Applications
TLDR
The underlying mathematical theory of NMF is described along with some extensions, and several relevant applications from different scientific areas are presented.
Gröbner Basis of Non-Negative Matrix Factorization and Feature Extraction of Cross-Site Scripting Attacks
TLDR
This paper investigates an affine algebraic variety of NMF and proposes a feature extraction method for cross-site scripting attacks using non-negative matrix factorization.
Pairwise Constraints-Guided Non-negative Matrix Factorization for Document Clustering
  • Yu-Jiu Yang, B. Hu
  • Computer Science
    IEEE/WIC/ACM International Conference on Web Intelligence (WI'07)
  • 2007
TLDR
This paper addresses the text clustering problem via a novel strategy, called Pairwise Constraints-guided Non-negative Matrix Factorization (PCNMF), which captures the abundant prior constraints available in the original space, resulting in more effective clustering or information retrieval.
Non-Negative Matrix Factorization for Stock Market Pricing
  • Tang Liu
  • Computer Science
    2009 2nd International Conference on Biomedical Engineering and Informatics
  • 2009
TLDR
Non-negative matrix factorization (NMF) is used to analyze stock market data, decomposing the data matrix V of the daily closing prices of the 40 stocks into two matrices W and H, in which the columns of W correlate to the underlying trends.
Finding Hierarchy of Topics from Twitter Data
TLDR
This paper introduces a conceptual topic modeling based on the idea of stability analysis to detect a hierarchy of topics given a text source and applies this approach to a large-scale Twitter dataset to investigate the content topics.
Automatic Relevance Determination in Nonnegative Matrix Factorization
TLDR
This paper addresses the important issue of nonnegative matrix factorization by using a Bayesian approach to estimate the latent dimensionality and selecting the model order via automatic relevance determination (ARD), a technique that has been employed in Bayesian PCA and sparse Bayesian learning.
Unsupervised Multi-Level Non-Negative Matrix Factorization Model: Binary Data Case
TLDR
This paper proposes an unsupervised multi-level non-negative matrix factorization model to extract the hidden data structure and seek the rank of the base matrix, and demonstrates that this approach is able to retrieve the hidden structure of the data and determine the correct rank of the base matrix.

References

SHOWING 1-10 OF 28 REFERENCES
Document clustering based on non-negative matrix factorization
TLDR
This paper proposes a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus that surpasses latent semantic indexing and spectral clustering methods not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracy.
Document clustering using nonnegative matrix factorization
Structure in the Enron Email Dataset
TLDR
It is shown that relative changes to individuals' word usage over time can be used to identify key players in major company events and that word use is correlated to function within the organization, as expected.
Determining a suitable metric when using non-negative matrix factorization
TLDR
The use of the Earth Mover's Distance (EMD) is introduced as a relevant metric that takes into account the positive definition of the NMF bases, leading to better recognition results when the dimensionality of the problem is correctly chosen.
Algorithms for Non-negative Matrix Factorization
TLDR
Two different multiplicative algorithms for non-negative matrix factorization are analyzed and one algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence.
Summarizing video using non-negative similarity matrix factorization
  • M. Cooper, J. Foote
  • Computer Science
    2002 IEEE Workshop on Multimedia Signal Processing.
  • 2002
TLDR
A novel approach to automatically extracting summary excerpts from audio and video by maximizing the average similarity between the excerpt and the source, generating a summary comprised of excerpts from the main components.
Non-negative sparse coding
  • P. Hoyer
  • Computer Science
    Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing
  • 2002
TLDR
A simple yet efficient multiplicative algorithm for finding the optimal values of the hidden components of non-negative sparse coding is presented, and it is shown how the basis vectors can be learned from the observed data.
When Does Non-Negative Matrix Factorization Give a Correct Decomposition into Parts?
TLDR
Theoretical results are shown to be predictive of the performance of published NMF code, by running the published algorithms on one of the synthetic image articulation databases.
Learning the parts of objects by non-negative matrix factorization
TLDR
An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Matrices, Vector Spaces, and Information Retrieval
TLDR
The purpose of this paper is to show how fundamental mathematical concepts from linear algebra can be used to manage and index large text collections.