Corpus ID: 11309794

Medical Text Classification using Convolutional Neural Networks

  title={Medical Text Classification using Convolutional Neural Networks},
  author={Mark Hughes and Irene Li and S. Kotoulas and T. Suzumura},
  journal={Studies in health technology and informatics},
We present an approach to automatically classify clinical text at a sentence level. [...] Key Method We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.Expand
Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach
This study shows that a supervised learning-based NLP approach is useful to develop medical subdomain classifiers, and the deep learning algorithm with distributed word representation yields better performance yet shallow learning algorithms with the word and concept representation achieves comparable performance with better clinical interpretability. Expand
Deep Learning Classification of Biomedical Text using Convolutional Neural Network
This research will specify the solution focusing in text classification of the biomedical abstracts using the deep learning network, Convolutional Neural Network which is an approach widely used in many domains such as pattern recognition, classification and so on. Expand
A hybrid medical text classification framework: Integrating attentive rule construction and neural network
Experimental results on real-world medical online query data clearly validate the superiority of the proposed approach in selecting domain-specific and topic-related features and illustrates the versatility of regular expressions as a user-level tool for focusing on desired patterns and providing interpretable solutions for human modification. Expand
Enhancing the performance of cancer text classification model based on cancer hallmarks
This paper enhances the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks by implementing a recent word embedding technique in the embedding layer. Expand
Deep neural network for hierarchical extreme multi-label text classification
An analysis of a Deep Learning architecture devoted to text classification, considering the extreme multi-class and multi-label text classification problem, when a hierarchical label set is defined and a methodology named Hierarchical Label Set Expansion (HLSE) is presented. Expand
Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines
Deep learning classifiers can be used to identify EHR progress notes pertaining to diabetes and the CNN-based classifier can achieve a higher AUC than an SVM-basedclassifier. Expand
Multi-label Classification for Clinical Text with Feature-level Attention
  • Disheng Pan, Xi-zi Zheng, +5 authors P. Wang
  • Computer Science
  • 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS)
  • 2020
The proposed FAMLC-BERT (Feature-level Attention for Multi-label classification on BERT) uses feature-level attention with BERT to recognize the labels of EHRs and achieves significant improvements compared to other selected benchmarks. Expand
Clinical Text Classification with Word Embedding Features vs. Bag-of-Words Features
The study showed that the word2vec features performed better than the BOW-1-gram features, however, when 2-grams were added to BOW, comparison results were mixed. Expand
Unstructured Medical Text Classification using Linguistic Analysis: A Supervised Deep Learning Approach
This paper proposes to use a deep learning approach for unstructured medical text classification at the document level based on linguistic features that are extracted from the text and incorporates medical domain-specific terms/keywords as part of the classification feature set. Expand
Comparison of Convolutional Neural Network Architectures and their Influence on Patient Classification Tasks Relating to Altered Mental Status
This experiment evaluated several different convolutional neural network architectures that are typically used in text classification tasks and found that the multi-channel model performed the best with an accuracy of 92%, while the attention and sequential models performed worse with a accuracy of 90% and 89% respectively. Expand


Convolutional Neural Networks for Sentence Classification
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification, and are proposed to allow for the use of both task-specific and static vectors. Expand
A Convolutional Neural Network for Modelling Sentences
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations. Expand
Research and applications: N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit
SVM-based classifiers can accurately identify procedure status and diagnoses among ICU patients, and including n-gram features improves performance, compared to existing methods. Expand
Accurate Cancer Classification Using Expressions of Very Few Genes
  • Lipo Wang, Feng Chu, Wei Xie
  • Computer Science, Medicine
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • 2007
In general, the method can significantly reduce the number of genes required for highly reliable diagnosis. Expand
Distributed Representations of Sentences and Documents
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models. Expand
Automated methods for the summarization of electronic health records
This review examines work on automated summarization of electronic health record (EHR) data and in particular, individual patient record summarization with a particular focus on methods for detecting and removing redundancy, describing temporality, determining salience, accounting for missing data, and taking advantage of encoded clinical knowledge. Expand
Redundancy-Aware Topic Modeling for Patient Record Notes
A novel variant of Latent Dirichlet Allocation topic modeling, Red-LDA, which takes into account the inherent redundancy of patient records when modeling content of clinical notes and produces superior models to all three baseline strategies. Expand
Distributed Representations of Words and Phrases and their Compositionality
This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling. Expand
Return of Frustratingly Easy Domain Adaptation
This work proposes a simple, effective, and efficient method for unsupervised domain adaptation called CORrelation ALignment (CORAL), which minimizes domain shift by aligning the second-order statistics of source and target distributions, without requiring any target labels. Expand
Subspace Distribution Alignment for Unsupervised Domain Adaptation
A unified view of existing subspace mapping based methods is presented and a generalized approach that also aligns the distributions as well as the subspace bases is developed that shows improved results over published approaches. Expand