Chinese Named Entity Recognition with a Sequence Labeling Approach: Based on Characters, or Based on Words?
TLDR
The results show that, without global features, person names and location names are recognized well with the character-based approach, whereas organization names are better suited to the word-based approach; when global features are added, the performance of the word-based approach improves significantly.
Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting
TLDR
A model-agnostic debiasing training framework is proposed by recovering the non-discrimination distribution using instance weighting, which does not require any extra resources or annotations apart from a pre-defined set of demographic identity-terms.
Research on image classification based on a combination of text and visual features
TLDR
This paper considers utilizing description information to help image classification and proposes a novel image classification method focusing on text-image co-occurrence data, which is efficient and enhances the accuracy of image classification.
Attention-Fused Deep Matching Network for Natural Language Inference
TLDR
An attention-fused deep matching network (AF-DMN) for natural language inference that takes two sentences as input and iteratively learns the attention-aware representations for each side by multi-level interactions and adds a self-attention mechanism to fully exploit local context information within each sentence.
Improving Pivot-Based Statistical Machine Translation by Pivoting the Co-occurrence Count of Phrase Pairs
TLDR
This paper presents a novel approach to calculating the translation probability by pivoting the co-occurrence count of source-pivot and pivot-target phrase pairs, and shows that this method leads to significant improvements over the baseline systems.
Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets
TLDR
This paper investigates the problem of selection bias on six NLSM datasets, finds that four of them are significantly biased, and proposes a training and evaluation framework to alleviate the bias.
Improving Pivot-Based Statistical Machine Translation Using Random Walk
TLDR
A novel approach that uses a machine learning method to improve pivot-based statistical machine translation (SMT), applying Markov random walks to connect possible translation phrases between the source and target languages.
A Unified Tagging Approach to Text Normalization
TLDR
A unified tagging approach to perform the text normalization task using Conditional Random Fields (CRF); experimental results on email data cleaning show that the proposed method significantly outperforms both the cascaded-model approach and the independent-model approach.
Image Classification Based on the Combination of Text Features and Visual Features
TLDR
In experiments on the data set extracted from Google Image Search, the benefit of using context to help image classification is demonstrated, and the classifier fusion approach improves the classification accuracy.
Understanding Data Augmentation in Neural Machine Translation: Two Perspectives towards Generalization
TLDR
Inspired by recent theoretical advances in deep learning, the paper examines data augmentation (DA) from two perspectives on the generalization ability of a model: input sensitivity and prediction margin. Both are defined independently of any specific test set and may therefore lead to findings with relatively low variance.