This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
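The core geometric operation behind this kind of debiasing can be illustrated with a minimal sketch: project the component of a word vector along an estimated bias direction out of the vector, leaving it orthogonal to that direction. The function name `neutralize` and the unit-norm direction `g` are illustrative assumptions, not the paper's API.

```python
def neutralize(v, g):
    """Remove the component of embedding v along the (unit-norm) bias
    direction g, so the result is orthogonal to g.

    This is a generic projection sketch; real debiasing pipelines also
    decide which words to neutralize and may equalize word pairs."""
    dot = sum(vi * gi for vi, gi in zip(v, g))
    return [vi - dot * gi for vi, gi in zip(v, g)]

# toy example: g is a unit vector along the first axis
g = [1.0, 0.0, 0.0]
v = [0.8, 0.3, -0.5]
v_debiased = neutralize(v, g)
```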

It is possible to use gradient descent without seeing anything more than the value of the function at a single point, and the guarantees hold even in the most general case: online against an adaptive adversary.
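The single-evaluation idea can be sketched as follows: evaluating the function at one randomly perturbed point yields an estimate of the gradient of a smoothed version of the function, which then drives an ordinary descent step. The function names and default parameters here are illustrative assumptions, not the paper's notation.

```python
import math
import random

def random_unit_vector(d):
    # Sample a uniformly random direction by normalizing a Gaussian vector.
    v = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def one_point_gradient_step(f, x, delta=0.1, eta=0.01):
    """One step of bandit gradient descent: the only information used is
    the value of f at a single perturbed point.  The quantity
    (d / delta) * f(x + delta*u) * u estimates the gradient of a
    delta-smoothed version of f."""
    d = len(x)
    u = random_unit_vector(d)
    fx = f([xi + delta * ui for xi, ui in zip(x, u)])
    g = [(d / delta) * fx * ui for ui in u]
    return [xi - eta * gi for xi, gi in zip(x, g)]

# toy step on f(x) = ||x||^2
f = lambda x: sum(xi * xi for xi in x)
x_next = one_point_gradient_step(f, [1.0, -1.0])
```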

It is shown that a very simple idea, used in Hannan's seminal 1957 paper, gives efficient solutions to all of these problems, including a (1+ε)-competitive algorithm as well as a lazy one that rarely switches between decisions.

This work gives a simple approach for doing nearly as well as the best single decision, where the best is chosen with the benefit of hindsight, and these follow-the-leader style algorithms extend naturally to a large class of structured online problems for which the exponential algorithms are inefficient.
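Hannan's idea, in its simplest experts form, can be sketched as follow-the-perturbed-leader: on each round, play the action whose cumulative loss, adjusted by an independent random perturbation, is smallest. The exponential perturbation and parameter names below are illustrative assumptions, not a specific paper's pseudocode.

```python
import random

def fpl_choose(cum_loss, epsilon=1.0):
    """Follow the perturbed leader: pick the action whose cumulative loss,
    minus an independent exponential perturbation, is smallest."""
    perturbed = [L - random.expovariate(epsilon) for L in cum_loss]
    return min(range(len(cum_loss)), key=lambda i: perturbed[i])

# online loop: k actions, T rounds of losses in [0, 1]
k, T = 3, 100
cum_loss = [0.0] * k
total = 0.0
for t in range(T):
    i = fpl_choose(cum_loss)
    losses = [random.random() for _ in range(k)]  # this round's losses
    total += losses[i]
    cum_loss = [c + l for c, l in zip(cum_loss, losses)]
```

The same pattern extends to structured problems (paths, trees, permutations) because the argmin over perturbed cumulative losses can reuse any efficient offline optimizer, which is exactly why it beats maintaining exponentially many weights.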

The algorithm runs in polynomial time for the case of parity functions that depend on only the first O(log n log log n) bits of input, which provides the first known instance of an efficient noise-tolerant algorithm for a concept class that is not learnable in the Statistical Query model of Kearns [1998].

An algorithm that, given n objects, learns a similarity matrix over all n² pairs from crowdsourced data alone is introduced, and SVMs reveal that the crowd kernel captures prominent and subtle features across a number of domains.

We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is…

For the universal algorithm of Cover, a simple analysis is provided which naturally extends to the case of a fixed percentage transaction cost (commission), answering a question raised in Cover, and a simple randomized implementation that is significantly faster in practice is presented.
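A randomized implementation of Cover's universal portfolio can be sketched by Monte Carlo: sample constant-rebalanced portfolios from the simplex and average them weighted by the wealth each would have earned so far. The sampling scheme and function names are illustrative assumptions, not the paper's exact construction.

```python
import random

def sample_simplex(m):
    # Uniform sample from the probability simplex via exponential spacings.
    e = [random.expovariate(1.0) for _ in range(m)]
    s = sum(e)
    return [x / s for x in e]

def crp_wealth(b, price_relatives):
    """Wealth of a constant-rebalanced portfolio b over past price relatives
    (each row: per-asset ratios of closing to opening price)."""
    w = 1.0
    for x in price_relatives:
        w *= sum(bi * xi for bi, xi in zip(b, x))
    return w

def universal_portfolio(price_relatives, m, n_samples=1000):
    """Monte Carlo approximation of Cover's universal portfolio:
    the wealth-weighted average of sampled constant-rebalanced portfolios."""
    num = [0.0] * m
    den = 0.0
    for _ in range(n_samples):
        b = sample_simplex(m)
        w = crp_wealth(b, price_relatives)
        for j in range(m):
            num[j] += w * b[j]
        den += w
    return [x / den for x in num]
```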

This work provides a polynomial-time algorithm for this problem for the case of two Gaussians in $n$ dimensions (even if they overlap), with provably minimal assumptions on the Gaussians and polynomial data requirements, and efficiently performs near-optimal clustering.

One of the advantages of simulated annealing, in addition to avoiding poor local minima, is that in these problems it converges faster to the minima that it finds, and it is concluded that under certain general conditions, the Boltzmann-Gibbs distributions are optimal on these convex problems.
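The Boltzmann-Gibbs connection can be sketched with textbook simulated annealing: Metropolis moves at a slowly cooled temperature T, accepting an uphill move with probability exp(-Δf/T), so that at each fixed T the samples are driven toward the Boltzmann-Gibbs distribution. The cooling schedule and parameters below are generic illustrative choices, not the analysis in the cited work.

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, T0=1.0, cooling=0.999, iters=5000):
    """Minimize f with Metropolis moves under geometric cooling.
    Uphill moves are accepted with probability exp(-(fy - fx) / T),
    the Boltzmann-Gibbs acceptance rule."""
    x, fx = x0, f(x0)
    T = T0
    for _ in range(iters):
        y = [xi + random.uniform(-step, step) for xi in x]
        fy = f(y)
        if fy <= fx or random.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
        T *= cooling  # cool the temperature geometrically
    return x, fx

# toy convex problem: f(x) = sum of squares
x_min, f_min = simulated_annealing(lambda v: sum(vi * vi for vi in v), [5.0, 5.0])
```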