Machine learning algorithms have been successfully applied to learning classifiers in many domains, such as computer vision, fraud detection, and brain image analysis. Typically, classifiers are trained to predict a class value given a set of labeled training data that includes all possible class values, and sometimes additional unlabeled training data. Little research has addressed the setting where the class variable can take on values that are omitted from the training examples. This setting is important, especially in domains where the class variable can take on many values and the cost of obtaining labeled examples for all of them is high. We show that the key to addressing this problem is not predicting the held-out classes directly, but rather recognizing semantic properties of the classes, such as their physical or functional attributes. We formalize this approach as zero-shot learning and show that, by utilizing semantic knowledge mined from large text corpora and collected from humans via crowd-sourcing, we can discriminate classes without explicitly collecting examples of those classes for a training set.

As a case study, we consider this problem in the context of thought recognition, where the goal is to classify the pattern of brain activity observed with a non-invasive neural recording device. Specifically, we train classifiers to predict the specific concrete noun a person is thinking about from an observed image of that person’s neural activity. We show that by predicting semantic properties of the nouns, such as “is it heavy?” and “is it edible?”, we can discriminate concrete nouns that people are thinking about, even without explicitly collecting examples of those nouns for a training set. Further, this approach discriminates certain nouns within the same category with significantly higher accuracy than previous work.
In addition to being an important step forward for neural imaging and brain-computer interfaces, we show that the zero-shot learning model has important implications for the broader machine learning community by providing a means for learning algorithms to extrapolate beyond their explicit training set.
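The two-stage idea described above, first predicting semantic attributes from a brain image and then matching the predicted attribute vector against the known attribute codes of candidate classes, can be sketched on synthetic data. Everything below (the nouns, the four attributes, and the linear brain-image simulator) is a hypothetical illustration, not the actual data or model used in this work:

```python
import numpy as np

# Hypothetical semantic codes: each noun is described by crowd-sourced
# attributes such as "is it heavy?" and "is it edible?".
# "bear" and "hammer" are zero-shot classes: no brain images of them
# appear in the training set.
semantic_codes = {
    # noun:              [heavy, edible, alive, tool]
    "apple":   np.array([0.0,   1.0,    0.0,   0.0]),
    "dog":     np.array([0.3,   0.0,    1.0,   0.0]),
    "knife":   np.array([0.1,   0.0,    0.0,   1.0]),
    "car":     np.array([1.0,   0.0,    0.0,   0.5]),
    "bear":    np.array([1.0,   0.0,    1.0,   0.0]),   # zero-shot
    "hammer":  np.array([0.6,   0.0,    0.0,   1.0]),   # zero-shot
}

rng = np.random.default_rng(0)
n_voxels = 20
# Hidden mapping from semantic features to "voxel" activations,
# used only to simulate noisy brain images for this toy example.
W_true = rng.normal(size=(4, n_voxels))

def simulate_brain_image(noun):
    return semantic_codes[noun] @ W_true + 0.05 * rng.normal(size=n_voxels)

# Stage 1: learn a linear map from brain images to semantic features,
# using only the *seen* nouns.
seen = ["apple", "dog", "knife", "car"]
X = np.stack([simulate_brain_image(n) for n in seen for _ in range(30)])
Y = np.stack([semantic_codes[n] for n in seen for _ in range(30)])
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Stage 2: classify any brain image, including one of an *unseen* noun,
# by predicting its semantic code and picking the nearest candidate code.
def classify(image):
    pred = image @ W_hat
    return min(semantic_codes,
               key=lambda n: np.linalg.norm(semantic_codes[n] - pred))

print(classify(simulate_brain_image("bear")))
```

Because classification happens in the shared semantic-attribute space rather than over class labels directly, the nearest-code step can select "bear" or "hammer" even though no training image was ever labeled with those nouns.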