The Penn Chinese TreeBank: Phrase structure annotation of a large corpus
Several Chinese linguistic issues and their implications for treebanking efforts are discussed, approaches to addressing these issues when developing annotation guidelines are presented, and engineering strategies to improve speed while ensuring annotation quality are described.
Gibson Env: Real-World Perception for Embodied Agents
This paper investigates developing real-world perception for active agents, proposes Gibson Environment for this purpose, and showcases a set of perceptual tasks learned therein.
Developing Guidelines and Ensuring Consistency for Chinese Text Annotation
This paper will address several challenges in building the corpus, namely, creating annotation guidelines, ensuring annotation accuracy and maintaining a high level of community involvement.
Improving a Statistical MT System with Automatically Learned Rewrite Patterns
This work proposes to use automatically learned rewrite patterns to preprocess the source sentences so that they have a word order similar to that of the target language.
A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu
This paper describes the simultaneous development of dependency structure and phrase structure treebanks for Hindi and Urdu, as well as a PropBank, to ensure successful conversion.
Parameter Server for Distributed Machine Learning
We propose a parameter server framework to solve distributed machine learning problems. Both data and workload are distributed into client nodes, while server nodes maintain globally shared…
HiKV: A Hybrid Index Key-Value Store for DRAM-NVM Memory Systems
HiKV, a persistent key-value store with the central idea of constructing a hybrid index in hybrid memory, exploits the distinct merits of hash index and B+-Tree index and adopts ordered-write consistency to ensure crash consistency.
Automatic grammar generation from two different perspectives
Two systems that automatically generate grammars are built to solve two major problems in grammar development: the redundancy caused by the reuse of structures in a grammar and the lack of explicit generalizations over the structures in a grammar.
Power-Aware Performance Adaptation of Concurrent Applications in Heterogeneous Many-Core Systems
A novel runtime optimization approach is proposed that aims to maximize power-normalized performance under dynamic variation of workload and application scenarios. It is demonstrated that system configuration can be adapted continuously through a low-cost, linear-complexity runtime algorithm, which can improve IPS/Watt by up to 125% compared to the existing approach.