Sean Massung

We propose and study novel text representation features created from parse tree structures. Unlike traditional parse tree features, which include all the attached syntactic categories to capture linguistic properties of text, the new features are defined solely or primarily by the tree structure, and thus better reflect the pure structural ...
META is developed to unite machine learning, information retrieval, and natural language processing in one easy-to-use toolkit. Its focus on indexing allows it to perform well on large datasets, supporting online classification and other out-of-core algorithms. META's liberal open source license encourages contributions, and its extensive online ...
In this paper, we formally define the problem of representing and leveraging abstract event causality to power downstream applications. We propose a novel solution to this problem, which builds an abstract causality network and embeds the causality network into a continuous vector space. The abstract causality network is generalized from a specific one, with ...
In this year's WMT translation task, Finnish-English was introduced as a language pair of competition for the first time. We present experiments examining several variations on a morphologically-aware statistical phrase-based machine translation system for translating Finnish into English. Our system variations attempt to mitigate the issue of rich ...
We prove that log-linearly interpolated backoff language models can be efficiently and exactly collapsed into a single normalized backoff model, contradicting Hsu (2007). While prior work reported that log-linear interpolation yields lower perplexity than linear interpolation, normalizing at query time was impractical. We normalize the model offline in ...