Data often comes in the form of a graph. When it does not, it often makes sense to represent it as a graph for learning tasks that rely on the similarities or relationships between data points. As data size grows, traditional methods for learning on graphs often become computationally intractable in terms of time and space requirements. We describe new …
We investigate the p-voltages algorithm, which labels nodes in a graph based on their theoretical voltages in a reformulated system of electricity. Building on previous work concerning p-electric networks, we prove that the p-voltage solution is well-formed and has desirable properties for semi-supervised learning. Our experiments confirm that the p-voltages …
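The idea of labeling nodes by voltage can be illustrated with a minimal sketch. It assumes the classical p = 2 electrical network, where labeled nodes are clamped to fixed voltages and every other node settles at the weighted mean of its neighbours; the truncated abstract does not give the general p-voltage objective, so this is only the familiar special case, not the authors' algorithm. The function name `harmonic_voltages`, the 0.5 starting value, and the toy path graph are hypothetical.

```python
def harmonic_voltages(edges, fixed, n_nodes, sweeps=500):
    """Gauss-Seidel sweeps for the p = 2 case (an assumption, not the paper's method).

    edges: list of (i, j, weight) for an undirected graph.
    fixed: dict mapping labeled node -> clamped voltage.
    """
    neighbours = {i: [] for i in range(n_nodes)}
    for i, j, w in edges:
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    # Labeled nodes keep their clamped voltage; free nodes start at 0.5.
    v = [fixed.get(i, 0.5) for i in range(n_nodes)]
    for _ in range(sweeps):
        for i in range(n_nodes):
            if i in fixed or not neighbours[i]:
                continue
            total_w = sum(w for _, w in neighbours[i])
            # A free node's voltage is the weighted mean of its neighbours.
            v[i] = sum(w * v[j] for j, w in neighbours[i]) / total_w
    return v


# Toy path graph 0-1-2-3-4 with unit weights; node 0 is labeled 1, node 4 is labeled 0.
path_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
voltages = harmonic_voltages(path_edges, fixed={0: 1.0, 4: 0.0}, n_nodes=5)
```

Unlabeled nodes can then be assigned to the class whose clamped voltage they sit closer to, for example by thresholding at the midpoint.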
In several mission-critical domains (e.g., self-driving cars, cybersecurity, robotics) where machine learning algorithms are being used heavily, it is becoming increasingly important to ensure that the learned models satisfy some domain properties (e.g., temporal constraints). Towards this goal, we propose Trusted Machine Learning (TML), wherein we combine …
We propose density-ratio bagging (dragging), a semi-supervised extension of the bootstrap aggregation (bagging) method. Additional unlabeled training data are used to calculate the weight on each labeled training point by a density-ratio estimator. The weight is then used to construct a weighted labeled empirical distribution, from which bags of bootstrap …
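The weighting-then-resampling step described above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' estimator: it assumes one-dimensional inputs, takes the weight of a labeled point to be the ratio of the unlabeled input density to the labeled input density at that point, and estimates both densities with a plain Gaussian kernel. The names `gaussian_kde`, `density_ratio_weights`, `draw_bags`, the bandwidth, and the toy data are all hypothetical.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

    return density


def density_ratio_weights(labeled_x, unlabeled_x, bandwidth=0.5):
    """Weight each labeled input by p_unlabeled(x) / p_labeled(x) (assumed ratio)."""
    p_unlabeled = gaussian_kde(unlabeled_x, bandwidth)
    p_labeled = gaussian_kde(labeled_x, bandwidth)
    raw = [p_unlabeled(x) / p_labeled(x) for x in labeled_x]
    total = sum(raw)
    # Normalize so the weights form a distribution over the labeled points.
    return [w / total for w in raw]


def draw_bags(labeled, weights, n_bags, rng):
    """Draw bootstrap bags from the weighted labeled empirical distribution."""
    return [rng.choices(labeled, weights=weights, k=len(labeled)) for _ in range(n_bags)]


# Toy data: (input, label) pairs, plus unlabeled inputs clustered near x = 3.
labeled = [(0.0, 0), (0.2, 0), (1.0, 1), (3.0, 1)]
unlabeled_x = [2.8, 3.0, 3.1, 3.3, 2.9]

weights = density_ratio_weights([x for x, _ in labeled], unlabeled_x)
bags = draw_bags(labeled, weights, n_bags=3, rng=random.Random(0))
```

Each bag would then train one base learner, with predictions aggregated as in ordinary bagging. In this toy setup the labeled point at x = 3.0 receives most of the weight because the unlabeled inputs concentrate there.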
PURPOSE: Prior text analysis of R01 critiques suggested that female applicants may be disadvantaged in National Institutes of Health (NIH) peer review, particularly for renewals. NIH altered its review format in 2009. The authors examined R01 critiques and scoring in the new format for differences due to principal investigator (PI) sex. METHOD: The authors …
When machine learning algorithms are used in life-critical or mission-critical applications (e.g., self-driving cars, cybersecurity, surgical robotics), it is important to ensure that they provide some high-level correctness guarantees. We introduce a paradigm called Trusted Machine Learning (TML) with the goal of making learning techniques more …
BACKGROUND: Women are less successful than men in renewing R01 grants from the National Institutes of Health. Continuing to probe text mining as a tool to identify gender bias in peer review, we used algorithmic text mining and qualitative analysis to examine a sample of critiques from men's and women's R01 renewal applications previously analyzed by …
We propose a hierarchical topic model for the image categorization task. Motivated by standard topic models such as PLSA and LDA, and augmented with prior knowledge extracted from WordNet, our model explicitly specifies the latent topics with emphasis on their semantic relationships. The proposed model offers several advantages over current approaches in the …