Corpus ID: 237352906

Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching

Glen M. Hocky and Andrew D. White

Natural language processing models have emerged that can generate usable software and automate a number of programming tasks with high fidelity. These tools have yet to have an impact on the chemistry community. Yet, our initial testing demonstrates that this form of Artificial Intelligence is poised to transform chemistry and chemical engineering research. Here, we review developments that brought us to this point, examine applications in chemistry, and give our perspective on how this may…

Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
It is suggested that the function of few-shot examples in these cases is better described as locating an already learned task rather than meta-learning, which motivates rethinking the role of prompts in controlling and evaluating powerful language models.
Voice-controlled quantum chemistry
ChemVox is an interactive Amazon Alexa skill that uses speech recognition to perform quantum chemistry calculations; it interfaces Alexa with cloud computing and returns the results through a capable device.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
CHEMDNER: The drugs and chemical names extraction challenge
This task allowed a comparative assessment of the performance of various methodologies using a carefully prepared collection of text manually labeled by specially trained chemists as Gold Standard data; the tools and resources resulting from this effort are expected to have an impact on future developments of chemical text mining applications.
Will robots kill chemistry?
I’m just barely a midcareer professional, and tumult is one word that I can use to describe the economic shifts within chemistry that I’ve witnessed. In my short time in the industry, I’ve seen…
Automated Code Generation for Maximizing Performance of Detailed Chemistry Calculations in OpenFOAM
In direct numerical simulation of turbulent combustion, the majority of the total simulation time is often spent on evaluating chemical reaction rates from detailed reaction mechanisms. In this work,…
PySCF: the Python‐based simulations of chemistry framework
The capabilities and design philosophy of the current version of the PySCF package, which is as efficient as the best existing C or Fortran‐based quantum chemistry programs, are documented.
VMD: visual molecular dynamics.
VMD is a molecular graphics program designed for the display and analysis of molecular assemblies, in particular biopolymers such as proteins and nucleic acids. VMD can simultaneously display any…
Attention is All you Need
A new simple network architecture, the Transformer, is proposed; it is based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
Semi-supervised Sequence Learning
Two approaches that use unlabeled data to improve sequence learning with recurrent networks are presented, and it is found that long short-term memory recurrent networks pretrained with the two approaches are more stable to train and generalize better.