This paper presents a syntax-driven approach to question answering, specifically the answer-sentence selection problem for short-answer questions. Rather than using syntactic features to augment existing statistical classifiers (as in previous work), we build on the idea that questions and their (correct) answers relate to each other via loose but ...
We present a novel classifier-based deterministic parser for Chinese constituency parsing. Our parser computes parse trees from the bottom up in one pass, and uses classifiers to make shift-reduce decisions. Trained and evaluated on the standard training and test sets, our best model (using stacked classifiers) runs in linear time and has labeled precision and ...
Many problems in NLP require solving a cascade of subtasks. Traditional pipeline approaches are prone to error propagation and prohibit joint training/decoding between subtasks. Existing solutions to this problem do not guarantee non-violation of hard constraints imposed by subtasks and thus give rise to inconsistent results, especially in cases where ...
A range of Natural Language Processing tasks involve making judgments about the semantic relatedness of a pair of sentences, such as Recognizing Textual Entailment (RTE) and answer selection for Question Answering (QA). A key challenge that these tasks face in common is the lack of explicit alignment annotation between a sentence pair. We capture the ...
Many network attacks forge the source address in their IP packets to block traceback. Recently, research activity has focused on packet-tracing mechanisms to counter this deception. Unfortunately, these mechanisms are either too expensive or ineffective against distributed attacks where traffic comes from multiple directions, and the volume in each ...
NLP models have many sparse features, and regularization is key for balancing model overfitting versus underfitting. A recently re-popularized form of regularization is to generate fake training data by repeatedly adding noise to real data. We reinterpret this noising as an explicit regularizer, and approximate it with a second-order formula that can be ...
We consider a multilingual weakly supervised learning scenario where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide the learning in other languages. Past approaches project labels across bitext and use them as features or gold labels for training. We propose a new method that projects model expectations ...
We describe the Stanford entries to the SANCL 2012 shared task on parsing non-canonical language. Stanford submitted three entries: (i) a self-trained generative constituency parser, (ii) a graph-based dependency parser, and (iii) a stacked dependency parser using the output from the constituency parser as features while parsing. The stacked parser obtained ...
While social interactions are critical to understanding consumer behavior, the relationship between social and commerce networks has not been explored on a large scale. We analyze Taobao, a Chinese consumer marketplace that is the world's largest e-commerce website. What sets Taobao apart from its competitors is its integrated instant messaging tool, which ...