Breast cancer classification and prognosis based on gene expression profiles from a population-based study
Gene expression patterns were found to be strongly associated with estrogen receptor (ER) status and moderately associated with grade, but not associated with menopausal status, nodal status, or tumor size, in an unselected group of 99 node-negative and node-positive breast cancer patients.
Benign overfitting in linear regression
- P. Bartlett, Philip M. Long, G. Lugosi, Alexander Tsigler
- Computer Science, Proceedings of the National Academy of Sciences
- 26 June 2019
A characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size.
The Singular Values of Convolutional Layers
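The singular values of a multi-channel convolutional layer are characterized, which leads to an algorithm for projecting the layer onto an operator-norm ball. It is shown that this projection is an effective regularizer; for example, it improves the test error of a deep residual network using batch normalization on CIFAR-10 from 6.2% to 5.3%.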
Random classification noise defeats all convex potential boosters
This paper shows that for a broad class of convex potential functions, any such boosting algorithm is highly susceptible to random classification noise: there is a simple data set that such a booster learns efficiently when there is no noise, but that cannot be learned to accuracy better than 1/2 when random classification noise is present.
The Power of Localization for Efficiently Learning Linear Separators with Noise
This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).
The Relaxed Online Maximum Margin Algorithm
This work describes a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA, and proves a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm.
Comparative full-length genome sequence analysis of 14 SARS coronavirus isolates and common mutations associated with putative origins of infection
Improved bounds on the sample complexity of learning
A new general upper bound on the number of examples required to estimate all of the expectations of a set of random variables uniformly well is presented and implies improved bounds on the sample complexity of learning according to Haussler's decision theoretic model.
Comment on " 'Stemness': Transcriptional Profiling of Embryonic and Adult Stem Cells" and "A Stem Cell Molecular Signature" (I)
The same three “stem cell” populations are compared with their counterparts: embryonic stem cells (ESCs); neural stem cells (NSCs), referred to as neural progenitor/stem cells (NPCs) in the present study; and hematopoietic stem cells (HSCs).
Gradient Descent with Identity Initialization Efficiently Learns Positive-Definite Linear Transformations by Deep Residual Networks
It is shown that if the least-squares matrix Φ is symmetric and has a negative eigenvalue, then all members of a class of algorithms that perform gradient descent with identity initialization, and optionally regularize toward the identity in each layer, fail to converge.