Russell Albright

It is well known that good initializations can improve the speed and accuracy of the solutions of many nonnegative matrix factorization (NMF) algorithms [56]. Many NMF algorithms are sensitive to the initialization of W or H or both. This is especially true of algorithms of the alternating least squares (ALS) type [55], including the two new …
Processing and conceptualizing large sparse matrices effectively and efficiently (typically via low-rank approximations) is essential for many data mining applications, including document and image analysis, recommendation systems, and gene expression analysis. Nonnegative matrix factorization (NMF) has many advantages over alternative techniques …
The ranking of sports teams is of significant importance to those who are involved with or interested in the various professional and amateur leagues that exist around the world. We present a ranking algorithm that is simple to implement in SAS code and that gives results consistent with some of the best and most well-known computer methods for …
Learning from your customers and your competitors has become a real possibility because of the massive amount of web and social media data available. However, this abundance of data requires significantly more time and computer memory to perform analytical tasks. This paper introduces high-performance text mining technology for SAS® High-Performance …
Applying category labels to textual documents can be useful for (1) search indexing, (2) document filtering, and (3) summarization. Many different algorithms have been proposed for applying category labels to text documents. We compare and contrast different approaches to text mining using Enterprise Miner for Text. INTRODUCTION: The automatic classification of …
Text mining models routinely represent each document with a vector of weighted term frequencies. This bag-of-words approach has many strengths, one of which is representing the document in a compact form that can be used by standard data mining tools. However, this approach loses most of the contextual information that is conveyed in the relationship of …
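
To make the bag-of-words idea in the abstract above concrete, here is a minimal Python sketch of a weighted term-frequency representation. The abstract does not say which weighting is used, so the TF-IDF weighting, the whitespace tokenizer, the function name `tfidf_vectors`, and the sample documents are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document to a vector of TF-IDF weights (an assumed weighting)."""
    # Naive tokenizer: lowercase and split on whitespace (illustrative only).
    tokenized = [doc.lower().split() for doc in docs]
    # Fixed, sorted vocabulary so every vector uses the same term order.
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Weight = raw term count * log(N / document frequency).
        vectors.append(
            [counts[term] * math.log(n_docs / doc_freq[term]) for term in vocab]
        )
    return vocab, vectors


# Hypothetical sample documents.
docs = ["dog bites man", "man bites dog", "cat sleeps"]
vocab, vectors = tfidf_vectors(docs)
```

In this sketch the first two documents receive identical vectors even though they describe different events, because word order is one kind of context that a bag-of-words vector discards.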
Sparse data sets are common in applications of text and data mining, social network analysis, and recommendation systems. In SAS software, sparse data sets are usually stored in the coordinate list (COO) transactional format. Two major drawbacks are associated with this sparse data representation: First, most SAS procedures are designed to handle dense data …
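
As a rough illustration of the coordinate list (COO) format mentioned in the abstract above, the following Python sketch stores only the nonzero entries of a matrix as (row, column, value) triples and converts back to a dense layout. It is a generic sketch, not SAS code; the helper names `dense_to_coo` and `coo_to_dense` and the sample matrix are hypothetical.

```python
def dense_to_coo(matrix):
    """Return (row, col, value) triples for the nonzero entries of a dense matrix."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    ]


def coo_to_dense(triples, n_rows, n_cols):
    """Rebuild a dense list-of-lists matrix from COO triples."""
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, value in triples:
        # Duplicate coordinates, if any, are summed.
        dense[i][j] += value
    return dense


# Hypothetical 3x3 matrix with two nonzero entries.
dense = [
    [0, 0, 3],
    [0, 0, 0],
    [5, 0, 0],
]
coo = dense_to_coo(dense)
restored = coo_to_dense(coo, 3, 3)
```

Here nine dense cells reduce to two triples, which is why the format suits very sparse data, while a tool that expects one column per variable needs the conversion back to the dense layout.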