In this paper we introduce the idea of improving the performance of parametric temporal-difference (TD) learning algorithms by selectively emphasizing or de-emphasizing their updates on different time steps. In particular, we show that varying the emphasis of linear TD(λ)'s updates in a particular way causes its expected update to become stable under …
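To make the emphasis idea concrete, here is a minimal sketch of one emphatic linear TD(λ) update, following the emphatic-TD construction: a follow-on trace accumulates past interest, an emphasis term scales how strongly each state enters the eligibility trace, and the weights are updated with the usual TD error. It assumes linear function approximation, constant discount and trace parameters, and an importance-sampling ratio for off-policy data; the names and constants below are illustrative placeholders.

```python
import numpy as np

def emphatic_td_step(theta, e, F, phi_t, phi_next, reward, rho, rho_prev,
                     interest=1.0, gamma=0.99, lam=0.9, alpha=0.01):
    """One emphatic TD(lambda) update with linear function approximation.

    theta : weight vector, e : eligibility trace, F : follow-on trace.
    rho   : importance-sampling ratio pi(a|s)/b(a|s) at the current step.
    """
    # Follow-on trace: discounted, reweighted interest carried in from the past.
    F = rho_prev * gamma * F + interest
    # Emphasis blends the immediate interest with the follow-on trace.
    M = lam * interest + (1.0 - lam) * F
    # The emphasis scales how strongly this state's features enter the trace.
    e = rho * (gamma * lam * e + M * phi_t)
    # Standard TD error under the current linear value estimate.
    delta = reward + gamma * theta @ phi_next - theta @ phi_t
    theta = theta + alpha * delta * e
    return theta, e, F
```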
One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well …
BACKGROUND Shared decision-making has been advocated; however, there are relatively few studies on physician preferences for, and experiences of, different styles of clinical decision-making, as most research has focused on patient preferences and experiences. The objectives of this study were to determine 1) physician preferences for different styles of …
Automated feature discovery is a fundamental problem in machine learning. Although classical feature discovery methods do not guarantee optimal solutions in general, it has recently been noted that certain subspace learning and sparse coding problems can be solved efficiently, provided the number of features is not restricted a priori. We provide an …
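As a small illustration of the flavor of such results (not the specific formulation in this abstract): when the number of latent features is left unrestricted and both factors carry squared penalties, the non-convex matrix factorization objective collapses to a convex, nuclear-norm-regularized problem whose global optimum is obtained by singular-value soft-thresholding. The names and the penalty weight below are illustrative.

```python
import numpy as np

# Unrestricted-feature factorization:
#   min_{U,V} 0.5*||X - U V||_F^2 + 0.5*lam*(||U||_F^2 + ||V||_F^2)
# is equivalent (for unrestricted inner dimension) to the convex problem
#   min_Z 0.5*||X - Z||_F^2 + lam*||Z||_*
# solved in closed form by shrinking the singular values of X.

def singular_value_threshold(X, lam):
    """Global optimum of the nuclear-norm-regularized reconstruction problem."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

X = np.random.randn(50, 20)
Z = singular_value_threshold(X, lam=1.0)
n_features = np.linalg.matrix_rank(Z)  # the number of features emerges from the data
```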
An investigational 10% liquid intravenous immunoglobulin (IVIG) was studied in 63 patients with primary immunodeficiency (PID) at 15 study sites. Patients were treated every 3 or 4 weeks with 254–1029 mg/kg/infusion of IVIG. Overall, Biotest-IVIG infusions were well tolerated. The proportion of infusions that were associated with adverse events during …
Robust regression and classification are often thought to require non-convex loss functions that prevent scalable, global training. However, such a view neglects the possibility of reformulated training methods that can yield practically solvable alternatives. A natural way to make a loss function more robust to outliers is to truncate loss values that …
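For illustration only, the sketch below shows a squared loss truncated at a threshold, so no single point can contribute more than a bounded penalty to the objective, together with a naive subgradient fit. This naive loop is exactly the non-convex approach the abstract pushes back against, not the reformulated training method it alludes to; the threshold and step size are placeholder choices.

```python
import numpy as np

def truncated_squared_loss(residuals, tau=1.0):
    # Cap each point's penalty at tau**2 so outliers stop dominating the fit.
    return np.minimum(residuals ** 2, tau ** 2)

def fit_truncated_ls(X, y, tau=1.0, lr=0.01, n_iters=1000):
    """Naive subgradient descent on the truncated squared loss (non-convex)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        r = X @ w - y
        inliers = np.abs(r) < tau            # truncated points contribute zero gradient
        grad = 2.0 * X[inliers].T @ r[inliers] / len(y)
        w -= lr * grad
    return w
```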
Training principles for unsupervised learning are often derived from motivations that appear to be independent of supervised learning. In this paper we present a simple unification of several supervised and unsupervised training principles through the concept of optimal reverse prediction: predict the inputs from the target labels, optimizing both …
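A minimal sketch of the reverse-prediction idea: fit a linear map that reconstructs the inputs X from a label matrix Z, and score a labelling by the resulting reconstruction error. The variable names and the small ridge term are illustrative, not part of the original formulation.

```python
import numpy as np

def reverse_prediction_error(X, Z, reg=1e-8):
    """Least-squares reverse prediction: min_W ||X - Z W||_F^2 (tiny ridge for stability)."""
    W = np.linalg.solve(Z.T @ Z + reg * np.eye(Z.shape[1]), Z.T @ X)
    return np.linalg.norm(X - Z @ W) ** 2

# Supervised case: Z is observed (e.g., one-hot class labels), and the error above
# scores the fitted reverse model. Unsupervised case: Z is unknown, and minimizing
# the same error over candidate label assignments recovers familiar clustering and
# dimensionality-reduction objectives in this reverse-prediction view.
```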