Corpus ID: 216056344

ktrain: A Low-Code Library for Augmented Machine Learning

@article{Maiya2020ktrainAL,
  title={ktrain: A Low-Code Library for Augmented Machine Learning},
  author={Arun S. Maiya},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.10703}
}
We present ktrain, a low-code Python library that makes machine learning more accessible and easier to apply. As a wrapper to TensorFlow and many other libraries (e.g., transformers, scikit-learn, stellargraph), it is designed to make sophisticated, state-of-the-art machine learning models simple to build, train, inspect, and apply by both beginners and experienced practitioners. Featuring modules that support text data (e.g., text classification, sequence tagging, open-domain question answering), …
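To make the low-code claim concrete, here is a minimal sketch of the text-classification workflow described in ktrain's documentation: load data, build a model, wrap both in a Learner object, and train. The dataset path, sequence length, batch size, and learning rate below are illustrative assumptions.

import ktrain
from ktrain import text

# Load and preprocess a folder of labeled text documents (path is illustrative).
(x_train, y_train), (x_test, y_test), preproc = text.texts_from_folder(
    'data/aclImdb', maxlen=500, preprocess_mode='bert')

# Build a BERT-based text classifier matched to the preprocessing above.
model = text.text_classifier('bert', train_data=(x_train, y_train), preproc=preproc)

# Wrap the model and data in a Learner object that drives training and inspection.
learner = ktrain.get_learner(model, train_data=(x_train, y_train),
                             val_data=(x_test, y_test), batch_size=6)

# Train for one epoch with the 1cycle learning-rate policy.
learner.fit_onecycle(2e-5, 1)

The same Learner object also exposes the inspection utilities referenced below, such as the learning-rate finder.

Citations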
Classifying citizen complaints using pre-trained language models
This thesis shows the effectiveness of state-of-the-art natural language models on a customer-complaint dataset owned by the company Decos, which consists of municipalities, their self-chosen …
A comparative study of deep learning based language representation learning models
This paper highlights the most important language representation learning models in NLP, provides insight into their evolution, and summarizes, compares, and contrasts these models on sentiment analysis, discussing their main strengths and limitations.
Adult Content Detection on Arabic Twitter: Analysis and Experiments
With Twitter being one of the most popular social media platforms in the Arab region, it is not surprising to find accounts that post adult content in Arabic tweets, despite the fact that these …
ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic
This work presents the largest manually annotated dataset of Arabic tweets related to COVID-19, describes the annotation guidelines, analyzes the dataset, and builds effective machine learning and transformer-based models for classification.
Bimodal Music Subject Classification via Context-Dependent Language Models
This work presents a bimodal music subject classification method that uses two different inputs, lyrics and user interpretations of lyrics, suggesting that BERT’s context-dependent features can help machine learning models uncover the poetic nature of song lyrics.
CausalNLP: A Practical Toolkit for Causal Inference with Text
The proposed CausalNLP employs meta-learners for treatment effect estimation and supports using raw text and its linguistic properties as both a treatment and a “controlled-for” variable (e.g., a confounder).
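The meta-learner idea mentioned above can be sketched without relying on CausalNLP's own API. Below is a minimal T-learner in scikit-learn: one outcome model per treatment arm, whose difference in predictions estimates the treatment effect. The synthetic data and model choice are illustrative assumptions, not CausalNLP code.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic example: X stands in for (vectorized) text features, T is a binary
# treatment, and Y is an outcome with a true treatment effect of 2.0.
X = rng.normal(size=(1000, 5))
T = rng.integers(0, 2, size=1000)
Y = X[:, 0] + 2.0 * T + rng.normal(scale=0.1, size=1000)

# T-learner: fit separate outcome models on the treated and control groups.
model_treated = GradientBoostingRegressor().fit(X[T == 1], Y[T == 1])
model_control = GradientBoostingRegressor().fit(X[T == 0], Y[T == 0])

# Individual treatment effects and their average (the ATE estimate).
ite = model_treated.predict(X) - model_control.predict(X)
print('Estimated ATE:', ite.mean())  # should be close to 2.0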
Chatbot Interaction with Artificial Intelligence: Human Data Augmentation with T5 and Language Transformer Ensemble for Text Classification
The Chatbot Interaction with Artificial Intelligence framework is presented as an approach to training deep learning chatbots for task classification; an ensemble of the five best-performing transformer models, combined via logistic regression over their output label predictions, achieved 99.59% accuracy on a dataset of human responses.
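Combining classifiers through logistic regression over their predictions, as described above, is a form of stacking. A minimal scikit-learn sketch follows; the toy data and base models merely stand in for the paper's five transformer models.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Toy data standing in for text-classification features (illustrative).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stacking: base-model predictions feed a logistic-regression meta-model,
# mirroring the ensemble-by-logistic-regression scheme described above.
ensemble = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(random_state=0)),
                ('nb', GaussianNB())],
    final_estimator=LogisticRegression())
ensemble.fit(X_tr, y_tr)
print('Held-out accuracy:', ensemble.score(X_te, y_te))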
Emotion Classification in a Resource Constrained Language Using Transformer-based Approach
Experimental outcomes indicate that XLM-R outperforms all other techniques, achieving the highest weighted F1-score of 69.73% on the test data.
Emotion-Aware, Emotion-Agnostic, or Automatic: Corpus Creation Strategies to Obtain Cognitive Event Appraisal Annotations
This work analyzes two manual annotation settings and evaluates a purely automatic rule-based labeling strategy (inferring appraisal from annotated emotion classes), indicating that it might be possible to automatically create appraisal corpora for every domain for which emotion corpora already exist.
Extraction of mitigation-related text from Endangered Species Act documents using machine learning: a case study
Various industrial and development projects have the potential to adversely affect threatened and endangered species and their habitats. The federal Endangered Species Act (ESA) requires preparation …

References

Showing 1-10 of 17 references
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can then be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
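The "one additional output layer" idea can be sketched with the Transformers library, which also appears among these references: a pretrained BERT encoder is loaded with a freshly initialized classification head. The checkpoint name and label count are illustrative assumptions.

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Pretrained BERT encoder plus a new classification head (the "one
# additional output layer"); the checkpoint name is illustrative.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased', num_labels=2)

# Tokenize a toy input; fine-tuning would proceed from here with a standard
# training loop (or a wrapper such as ktrain) on labeled examples.
inputs = tokenizer('ktrain makes this easy.', return_tensors='pt')
logits = model(**inputs).logits
print(logits.shape)  # (1, 2): one score per label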
A disciplined approach to neural network hyper-parameters: Part 1 - learning rate, batch size, momentum, and weight decay
This report shows how to examine the training and validation/test loss functions for subtle clues of underfitting and overfitting, suggests guidelines for moving toward the optimal balance point, and discusses how to increase or decrease the learning rate and momentum to speed up training.
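ktrain surfaces this report's learning-rate range test and 1cycle policy directly on its Learner object. A brief sketch, assuming a learner built as in the example near the abstract:

# Learning-rate range test: train briefly while increasing the learning rate,
# recording the loss at each step.
learner.lr_find()

# Plot loss versus learning rate; a good maximum rate typically sits just
# before the loss minimum, where the loss is still falling steeply.
learner.lr_plot()

# Train with the chosen rate using the 1cycle policy (value illustrative).
learner.fit_onecycle(2e-5, 3)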
AutoML: A Survey of the State-of-the-Art
A comprehensive and up-to-date review of the state of the art in AutoML methods, organized along the pipeline: data preparation, feature engineering, hyperparameter optimization, and neural architecture search (NAS).
2011
Peoples’ histories have been destroyed at the times of traumatic events in conflicts and wars. In the last decade, we have witnessed a radical transformation of cities in the Middle East and North …
fastai: A Layered API for Deep Learning
The authors used this library to create a complete deep learning course, which they were able to write more quickly than with previous approaches, and the resulting code was clearer.
Decoupled Weight Decay Regularization
This work proposes a simple modification that recovers the original formulation of weight decay regularization by decoupling the weight decay from the optimization steps taken with respect to the loss function, and provides empirical evidence that this modification substantially improves Adam's generalization performance.
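For reference, the decoupled update proposed here (AdamW) applies weight decay directly to the parameters instead of folding it into the gradient. With bias-corrected moment estimates \hat{m}_t and \hat{v}_t, step size \alpha, schedule multiplier \eta_t, and decay coefficient \lambda:

\theta_t \leftarrow \theta_{t-1} - \eta_t \left( \frac{\alpha\, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda\, \theta_{t-1} \right)

In the conventional L2-regularization formulation, the \lambda\, \theta_{t-1} term would instead be added to the gradient before the moment estimates are computed, which is the coupling the paper argues against.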
HuggingFace's Transformers: State-of-the-art Natural Language Processing
The Transformers library is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API, along with a curated collection of pretrained models made by and available for the community.
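ktrain builds its transformer-based text modules on this library. As a standalone illustration, the library's pipeline API loads a default pretrained model and tokenizer behind a single call:

from transformers import pipeline

# One-line access to a pretrained model: the pipeline selects a default
# checkpoint for the task and handles tokenization and decoding internally.
classifier = pipeline('sentiment-analysis')
print(classifier('ktrain makes machine learning more accessible.'))
# e.g., [{'label': 'POSITIVE', 'score': 0.99...}]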
Ludwig: a type-based declarative deep learning toolbox
Ludwig is a flexible, extensible, and easy-to-use toolbox that allows users to train deep learning models and obtain predictions from them without writing code; it introduces a general modularized deep learning architecture called Encoder-Combiner-Decoder that can be instantiated to perform a vast number of machine learning tasks.
A Unified Approach to Interpreting Model Predictions
A unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which unifies six existing methods and presents new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
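As a hedged illustration (the model and dataset are assumptions, not taken from the paper), the shap package computes per-feature attribution values for individual predictions:

import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Fit a simple model on a built-in dataset (both choices illustrative).
data = load_diabetes()
model = RandomForestRegressor(random_state=0).fit(data.data, data.target)

# TreeExplainer computes SHAP values efficiently for tree ensembles; each
# value is one feature's additive contribution to one prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data[:5])
print(shap_values.shape)  # (5, 10): 5 samples x 10 features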
Bag of Tricks for Efficient Text Classification
A simple and efficient baseline for text classification is explored, showing that the fastText classifier is often on par with deep learning classifiers in accuracy while being many orders of magnitude faster for training and evaluation.
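fastText's supervised mode expects one example per line with a __label__ prefix. A minimal sketch using the fasttext Python bindings; the file path and its contents are illustrative assumptions.

import fasttext

# Expected training-file format, one example per line:
#   __label__positive this library is easy to use
#   __label__negative the setup kept failing
model = fasttext.train_supervised(input='train.txt')  # path is illustrative

# Predict the most likely label for a new sentence.
print(model.predict('simple and fast text classification'))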