- Patrick O. Perry
- 2009

This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left out entries by low rank…

- Patrick O. Perry, Patrick J. Wolfe
- ArXiv
- 2010

Network data often take the form of repeated interactions between senders and receivers tabulated over time. A primary question to ask of such data is which traits and behaviors are predictive of interaction. To answer this question, a model is introduced for treating directed interactions as a multivariate point process: a Cox multiplicative intensity…

- Patrick O. Perry, Patrick J. Wolfe
- IEEE Journal of Selected Topics in Signal…
- 2010

Rank estimation is a classical model order selection problem that arises in a variety of important statistical signal and array processing systems, yet is addressed relatively infrequently in the extant literature. Here we present sample covariance asymptotics stemming from random matrix theory, and bring them to bear on the problem of optimal rank…

- Patrick O. Perry, Patrick J. Wolfe
- ArXiv
- 2012

The analysis of datasets taking the form of simple, undirected graphs continues to gain in importance across a variety of disciplines. Two choices of null model, the logistic-linear model and the implicit log-linear model, have come into common use for analyzing such network data, in part because each accounts for the heterogeneity of network node degrees…

- Patrick O. Perry, Michael W. Mahoney
- NIPS
- 2011

Recently, Mahoney and Orecchia demonstrated that popular diffusion-based procedures to compute a quick approximation to the first nontrivial eigenvector of a data graph Laplacian exactly solve certain regularized Semi-Definite Programs (SDPs). In this paper, we extend that result by providing a statistical interpretation of their approximation procedure.…

A data set with n measurements on p variables can be represented by an n × p data matrix X. In high-dimensional settings where p is large, it is often desirable to work with a low-rank approximation to the data matrix. The most prevalent low-rank approximation is the singular value decomposition (SVD). Given X, an n × p data matrix, the SVD factorizes X as X…

- Patrick O. Perry, Art B. Owen
- Journal of Machine Learning Research
- 2010

In multivariate regression models we have the opportunity to look for hidden structure unrelated to the observed predictors. However, when one fits a model involving such latent variables it is important to be able to tell if the structure is real, or just an artifact of correlation in the regression errors. We develop a new statistical test based on random…

- Patrick O. Perry, Kenneth Benoit
- ArXiv
- 2017

Probabilistic methods for classifying text form a rich tradition in machine learning and natural language processing. For many important problems, however, class prediction is uninteresting because the class is known, and instead the focus shifts to estimating latent quantities related to the text, such as affect or ideology. We focus on one such problem of…

- Patrick O. Perry
- 2002

The Military Airspace Management System (MAMS) is a multi-user distributed scheduling prototype designed to support the scheduling of Special Use Airspace in the CONUS region. The prototype has emphasized the user interface design of the scheduling system as the primary means of producing de-conflicted schedules. This paper reports on work in progress and…