We show how the regularizer of Transductive Support Vector Machines (TSVM) can be trained by stochastic gradient descent for linear models and multi-layer architectures. The resulting methods can be trained <i>online</i>, are much faster to train and test than existing TSVM algorithms, can encode prior knowledge in the network architecture, and ...
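
As an illustration of the linear case, the following is a minimal sketch, not the authors' implementation. It assumes the TSVM regularizer takes the commonly used symmetric hinge form max(0, 1 - |f(x)|) on unlabeled points, which the text above does not spell out, and it omits the class-balancing constraint that TSVM formulations usually add. The function names, the toy data, and the hyperparameters `lr` and `lam` are all hypothetical.

```python
import random


def sgd_tsvm_linear(labeled, unlabeled, dim, lr=0.05, lam=0.2, epochs=200, seed=0):
    """Online SGD for a linear model f(x) = w.x + b.

    Labeled loss:   max(0, 1 - y * f(x))      (standard hinge)
    Unlabeled loss: lam * max(0, 1 - |f(x)|)  (symmetric hinge; an ASSUMED
                                               form of the TSVM regularizer)
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    b = 0.0
    # One mixed stream: unlabeled examples carry the label None.
    stream = [(x, y) for x, y in labeled] + [(x, None) for x in unlabeled]
    for _ in range(epochs):
        rng.shuffle(stream)
        for x, y in stream:  # one example per update, i.e. online training
            f = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y is None:
                # The subgradient of max(0, 1 - |f|) equals that of a hinge
                # loss whose label is the model's own current prediction.
                y_eff = 1.0 if f >= 0 else -1.0
                scale = lam
            else:
                y_eff, scale = float(y), 1.0
            if y_eff * f < 1.0:  # margin violated: take a subgradient step
                step = lr * scale * y_eff
                w = [wi + step * xi for wi, xi in zip(w, x)]
                b += step
    return w, b


def predict(w, b, x):
    """Return +1 or -1 for input x under the linear model (w, b)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: two labeled points and four unlabeled points
# lying in two clusters on either side of the x1 = 0 axis.
labeled = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]
unlabeled = [[1.5, 0.2], [1.7, -0.1], [-1.5, -0.3], [-1.8, 0.1]]
w, b = sgd_tsvm_linear(labeled, unlabeled, dim=2)
```

Because each update touches a single example, the same loop could in principle consume a data stream instead of a fixed list. Replacing the linear scorer with a multi-layer network would keep the two loss terms unchanged and only alter how the gradient is propagated.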
