Corpus ID: 231942308

Transferability of Neural Network-based De-identification Systems

@article{Lee2021TransferabilityON,
  title={Transferability of Neural Network-based De-identification Systems},
  author={Kahyun Lee and Nicholas J. Dobbins and Bridget T. McInnes and Meliha Yetisgen and Ozlem Uzuner},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.08517}
}
Methods and Materials: We investigated transferability of neural network-based de-identification systems with and without domain generalization. We used two domain generalization approaches: a novel approach, Joint-Domain Learning (JDL), developed in this paper, and a state-of-the-art domain generalization approach, Common-Specific Decomposition (CSD), from the literature. First, we measured transferability from a single external source. Second, we used two external sources and evaluated…

References

Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
TLDR
It is shown that CSD either matches or beats state of the art approaches for domain generalization based on domain erasure, domain perturbed data augmentation, and meta-learning.
Generalizing Across Domains via Cross-Gradient Training
TLDR
Empirical evaluation on three different applications establishes that (1) domain-guided perturbation provides consistently better generalization to unseen domains than generic instance perturbation methods, and that (2) data augmentation is a more stable and accurate method than domain adversarial training.
Learning to Generalize: Meta-Learning for Domain Generalization
TLDR
A novel meta-learning procedure for domain generalization that trains models to generalize well to novel domains; it achieves state of the art results on a recent cross-domain image classification benchmark, as well as demonstrating its potential on two classic reinforcement learning tasks.
Comparing Rule-based, Feature-based and Deep Neural Methods for De-identification of Dutch Medical Records
TLDR
A varied dataset consisting of the medical records of 1260 patients is constructed by sampling data from 9 institutes and three domains of Dutch healthcare, and it is shown that an existing rule-based method specifically developed for the Dutch language fails to generalize to this new data.
Transfer Learning for Named-Entity Recognition with Neural Networks
TLDR
It is demonstrated that transferring an ANN model trained on a large labeled dataset to another dataset with a limited number of labels improves upon the state-of-the-art results on two different datasets for patient note de-identification.
De-identification of patient notes with recurrent neural networks
TLDR
The first de-identification system based on artificial neural networks (ANNs) is introduced; unlike existing systems, it requires no handcrafted features or rules, and it outperforms the state-of-the-art systems.
De-identification of clinical notes via recurrent neural network and conditional random field.
TLDR
A hybrid system is developed that achieves the highest micro F1-scores under the "token", "strict" and "binary token" criteria respectively, ranking first in the 2016 CEGS N-GRID NLP challenge and outperforming other state-of-the-art systems.
Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation
TLDR
The results show that domain adaptation (DA) can increase de-identification performance over the baselines, indicating that it can effectively reduce annotation cost for the target psychiatric notes.
A hybrid approach to automatic de-identification of psychiatric notes.
TLDR
A natural language processing system for automatic de-identification of psychiatric notes that combines machine learning techniques and rule-based approaches is presented; it achieved an overall micro-averaged F-score of 90.74 on the test set, second-best among all the participants of the CEGS N-GRID task.
Hidden Markov model using Dirichlet process for de-identification
TLDR
A new non-parametric Bayesian hidden Markov model using a Dirichlet process to reduce task-specific feature engineering and to generalize well to new data is introduced for the 2014 i2b2/UTHealth de-identification challenge.