- Published 2014 in Biostatistics

Immunological experiments that record primary molecular sequences of T-cell receptors produce moderate to high-dimensional categorical data, some of which may be subject to extra-multinomial variation caused by technical constraints of cell-based assays. Motivated by such experiments in melanoma research, we develop a statistical procedure for testing the equality of two discrete populations, where one population delivers multinomial data and the other is subject to a specific form of overdispersion. The procedure computes a conditional-predictive p-value by splitting the data set into two, obtaining a predictive distribution for one piece given the other, and using the observed predictive ordinate to generate a p-value. The procedure has a simple interpretation, requires fewer modeling assumptions than would be required of a fully Bayesian analysis, and has reasonable operating characteristics as evidenced empirically and by asymptotic analysis.
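The split-and-predict idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it ignores the overdispersion in the second population and treats both samples as plain multinomials. It also assumes a symmetric Dirichlet prior (the `prior` argument), so that the predictive distribution of one count vector given the other is Dirichlet-multinomial, and it approximates the p-value by Monte Carlo. All function and parameter names are hypothetical.

```python
import math
import random


def log_dirmult_pmf(counts, alpha):
    """Log pmf of a Dirichlet-multinomial with parameter vector alpha."""
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        out += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return out


def sample_dirmult(n, alpha, rng):
    """Draw one count vector of total size n from the Dirichlet-multinomial."""
    # Unnormalized gamma draws are proportional to a Dirichlet(alpha) draw.
    weights = [rng.gammavariate(al, 1.0) for al in alpha]
    counts = [0] * len(alpha)
    for k in rng.choices(range(len(alpha)), weights=weights, k=n):
        counts[k] += 1
    return counts


def conditional_predictive_pvalue(x, y, prior=0.5, n_sim=2000, seed=0):
    """Monte Carlo p-value for count vector x, conditioning on count vector y.

    Assumption (not from the source): under the null of equal populations,
    x given y is Dirichlet-multinomial with parameters prior + y.
    """
    rng = random.Random(seed)
    alpha = [prior + yk for yk in y]
    n = sum(x)
    # Observed predictive ordinate, on the log scale.
    observed = log_dirmult_pmf(x, alpha)
    # Count simulated data sets that are no more probable than the observed one.
    hits = sum(
        log_dirmult_pmf(sample_dirmult(n, alpha, rng), alpha) <= observed + 1e-12
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)
```

A small p-value indicates that the observed counts `x` sit in a low-probability region of the predictive distribution built from `y`, which is evidence against equality of the two populations under the stated assumptions.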

@article{Pei2014ACP,
title={A conditional predictive p-value to compare a multinomial with an overdispersed multinomial in the analysis of T-cell populations},
author={Qinglin Pei and Cindy L. Zuleger and Michael D. Macklin and Mark R. Albertini and Michael A. Newton},
journal={Biostatistics},
year={2014}
}