BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
- Chen Liang, Yue Yu, Chao Zhang
- Computer Science, Knowledge Discovery and Data Mining
- 28 June 2020
A new computational framework, BOND, is proposed, which leverages the power of pre-trained language models to improve the prediction performance of NER models and demonstrates superiority over existing distantly supervised NER methods.
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
- Wei Zeng, Xiaozhe Ren, Yonghong Tian
- Computer Science, ArXiv
- 26 April 2021
The experimental results demonstrate the superior capabilities of PanGu-α in performing various tasks under few-shot or zero-shot settings and investigate the effect of model scale on few-shot performance across a broad range of Chinese NLP tasks.
Denoising Multi-Source Weak Supervision for Neural Text Classification
- Wendi Ren, Yinghao Li, Hanting Su, David Kartchner, Cassie S. Mitchell, Chao Zhang
- Computer Science, Findings of EMNLP
- 9 October 2020
A label denoiser is designed that estimates source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels, addressing the rule coverage issue.
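The source-reliability idea can be illustrated with a minimal sketch of attention-weighted aggregation of multi-source weak labels; the array shapes, dot-product scoring, and toy data below are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: soft-attention aggregation of multi-source weak labels,
# conditioned on an instance embedding. All names and shapes are hypothetical.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def denoise(instance_emb, source_embs, weak_labels, n_classes):
    """instance_emb: (d,); source_embs: (k, d); weak_labels: (k,) class ids, -1 = abstain."""
    scores = source_embs @ instance_emb          # conditional relevance of each source
    scores = np.where(weak_labels >= 0, scores, -1e9)  # ignore abstaining sources
    weights = softmax(scores)                    # soft attention over sources
    soft_label = np.zeros(n_classes)
    for w, y in zip(weights, weak_labels):
        if y >= 0:
            soft_label[y] += w                   # aggregate rule votes by reliability
    return soft_label

rng = np.random.default_rng(0)
print(denoise(rng.normal(size=8), rng.normal(size=(3, 8)), np.array([0, 2, 2]), n_classes=3))
```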
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
- Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, T. Zhao, Chao Zhang
- Computer Science, North American Chapter of the Association for Computational Linguistics
- 15 October 2020
This work develops a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision, underpinned by contrastive regularization and confidence-based reweighting, which gradually improves model fitting while effectively suppressing error propagation.
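As a rough illustration of the confidence-based reweighting ingredient, the sketch below drops samples whose pseudo-label confidence falls under a threshold before the next self-training round; the function name, threshold, and data are hypothetical, not the COSINE code.

```python
# Minimal sketch: confidence-based sample reweighting for self-training with weak labels.
import numpy as np

def confidence_weights(probs, threshold=0.8):
    """probs: (n, c) predicted class probabilities. Returns pseudo-labels and sample weights."""
    conf = probs.max(axis=1)                           # model confidence per sample
    pseudo = probs.argmax(axis=1)                      # pseudo-labels for the next round
    weights = np.where(conf >= threshold, conf, 0.0)   # low-confidence samples get zero weight
    return pseudo, weights

probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])
print(confidence_weights(probs, threshold=0.8))
```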
Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
- Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, T. Zhao, Chao Zhang
- Computer Science, Conference on Empirical Methods in Natural Language Processing
- 22 October 2020
The proposed regularized fine-tuning method outperforms existing calibration methods for text classification in terms of expected calibration error, misclassification detection, and OOD detection on six datasets.
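Expected calibration error, the headline metric above, can be computed with a simple binning procedure; the 10-bin scheme and toy data below are illustrative choices, not the paper's setup.

```python
# Minimal sketch of expected calibration error (ECE) with equal-width confidence bins.
import numpy as np

def ece(probs, labels, n_bins=10):
    """probs: (n, c) predicted probabilities; labels: (n,) true class ids."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            total += in_bin.mean() * gap        # weight the gap by the bin's sample mass
    return total

probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.8, 0.2]])
print(ece(probs, np.array([0, 1, 1, 0])))
```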
Learning from Language Description: Low-shot Named Entity Recognition via Decomposed Framework
- Yaqing Wang, Haoda Chu, Chao Zhang, Jing Gao
- Computer Science, Conference on Empirical Methods in Natural Language Processing
- 11 September 2021
A novel NER framework is proposed, namely SpanNER, which learns from natural language supervision and enables the identification of never-seen entity classes without using in-domain labeled data.
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
- Rongzhi Zhang, Yue Yu, Chao Zhang
- Computer Science, Biology, Conference on Empirical Methods in Natural Language Processing
- 5 October 2020
This work proposes a simple but effective data augmentation method to improve the label efficiency of active sequence labeling, SeqMix, which simply augments the queried samples by generating extra labeled sequences in each iteration.
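The mixup idea behind SeqMix can be sketched in a few lines: two queried sequences are interpolated in embedding space, and their tag sequences are mixed into soft labels with the same Beta-sampled coefficient. Shapes and the alpha value are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch: mixing two labeled sequences in embedding space to generate
# an extra soft-labeled sequence for active sequence labeling.
import numpy as np

def seq_mixup(emb_a, labels_a, emb_b, labels_b, n_classes, alpha=8.0, rng=None):
    """emb_*: (seq_len, d) token embeddings; labels_*: (seq_len,) tag ids."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)                 # mixing coefficient
    onehot = np.eye(n_classes)
    mixed_emb = lam * emb_a + (1 - lam) * emb_b
    mixed_lab = lam * onehot[labels_a] + (1 - lam) * onehot[labels_b]  # soft tag sequence
    return mixed_emb, mixed_lab

rng = np.random.default_rng(1)
e1, e2 = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
print(seq_mixup(e1, np.array([0, 1, 1, 2, 0]), e2, np.array([2, 2, 0, 1, 1]), n_classes=3, rng=rng)[1])
```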
SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization
- Yue Yu, Kexin Huang, Chao Zhang, Lucas Glass, Jimeng Sun, Cao Xiao
- Computer Science, Bioinformatics
- 4 October 2020
A new method, SumGNN (knowledge summarization graph neural network), is proposed, enabled by a subgraph extraction module that can efficiently anchor on relevant subgraphs from a KG, a self-attention-based subgraph summarization scheme that generates reasoning paths within the subgraph, and a multi-channel knowledge and data integration module that utilizes massive external biomedical knowledge for significantly improved multi-typed DDI prediction.
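The subgraph-anchoring step can be pictured as a k-hop breadth-first expansion around a drug pair in the knowledge graph; the toy graph and function below are hypothetical stand-ins for the paper's subgraph extraction module.

```python
# Minimal sketch: collect the k-hop neighborhood around a pair of anchor nodes in a KG.
from collections import deque

def k_hop_subgraph(adj, seeds, k=2):
    """adj: dict node -> set of neighbor nodes; seeds: iterable of anchor nodes."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return seen

kg = {"drugA": {"geneX"}, "drugB": {"geneY"}, "geneX": {"pathwayP"}, "geneY": {"pathwayP"}}
print(k_hop_subgraph(kg, ["drugA", "drugB"], k=2))
```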
BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition
- Yinghao Li, Pranav Shetty, Lu Liu, Chao Zhang, Le Song
- Computer Science, Annual Meeting of the Association for Computational Linguistics
- 26 May 2021
A conditional hidden Markov model (CHMM) is proposed that can effectively infer true labels from multi-source noisy labels in an unsupervised way and outperforms state-of-the-art weakly supervised NER models by wide margins.
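For context on the multi-source aggregation problem the CHMM addresses, the sketch below shows the naive per-token majority-vote baseline it improves upon; the tag data is illustrative, and this is not the CHMM itself.

```python
# Minimal sketch: per-token majority vote over several noisy NER taggers (naive baseline).
from collections import Counter

def majority_vote(source_tags):
    """source_tags: list of tag sequences, one per labeling source, all the same length."""
    aggregated = []
    for token_tags in zip(*source_tags):
        votes = Counter(t for t in token_tags if t != "O") or Counter(token_tags)
        aggregated.append(votes.most_common(1)[0][0])   # prefer non-O votes when any exist
    return aggregated

sources = [["B-PER", "O", "O"], ["B-PER", "O", "B-LOC"], ["O", "O", "B-LOC"]]
print(majority_vote(sources))  # ['B-PER', 'O', 'B-LOC']
```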
A Survey on Programmatic Weak Supervision
- Jieyu Zhang, Cheng-Yu Hsieh, Yue Yu, Chao Zhang, Alexander J. Ratner
- Computer Science, ArXiv
- 11 February 2022
A comprehensive survey of recent advances in programmatic weak supervision is presented, identifying several critical challenges that remain under-explored, in the hope of inspiring future research directions in the field.
...
...