Heuristic Adaptation of Scientific Process Models

Abstract

Scientific models are seldom constructed from scratch; more often they are adapted from existing models. In this paper, we present a computational approach to this adaptation task in the context of quantitative process models. We review the paradigm of inductive process modeling and discuss RPM, a recent system that operates on this problem. After this, we describe APM, a new system that adapts a process model to a new setting by revising its parameters or altering its component processes. Next we report experiments that demonstrate the system’s basic abilities and compare its efficiency with that of RPM. We conclude by discussing other research on model revision and outlining plans for additional work.

1. Background and Motivation

Research on computational scientific discovery (Shrager & Langley, 1990; Džeroski, Langley, & Todorovski, 2007) addresses the construction of laws and models in established scientific formalisms. Much work on this topic has dealt with finding empirical relations that describe regularities in data, such as those appearing in the early stages of a field’s development. There has been much less research on the construction of explanatory models that move beyond the data to account for observations at a deeper level. Work in this area incorporates structured representations and multi-step reasoning over these structures, making the task especially relevant to the cognitive systems community.

In this paper, we focus on the problem of inductive process modeling (Langley, Sanchez, Todorovski, & Džeroski, 2002). Here one is provided with multivariate time series for some dynamic system and background knowledge about the types of processes that can occur in the domain. The goal is to generate a quantitative process model, including numerical parameters, that reproduces the observed trajectories and that predicts new values accurately.
Such a model compiles into a set of linked differential equations, but it also accounts for the data in terms of unobserved processes. This distinguishes research in the area from work on differential equation discovery such as that by Džeroski and Todorovski (2008), which describes the data but does not explain them in deeper terms.

However, in many cases, scientists are less concerned with creating a model from the ground up than with adapting an existing model to a new setting. This may occur when they believe the system under study has changed, so that the model no longer fits new observations as well as it fit those for which they devised it. Or they may have developed the model to explain data from one area and find that it does not fare as well on data for an adjacent area. In either case, they may need to revise the model’s parameters to address quantitative changes, or even need to alter the model’s structure by removing, adding, or replacing some of its component processes.

© 2016 Cognitive Systems Foundation. All rights reserved.

In the sections that follow, we describe one approach to the task of adapting a process model to explain data in a new setting. We start by reviewing RPM, a recently developed system for process model induction that is both more reliable and more efficient than earlier approaches. After this, we describe APM, a new system for model adaptation that builds on the ideas that underlie RPM’s successes. Next we report empirical studies designed to show that APM operates as intended and that it offers efficiency gains over inducing a model from scratch. Finally, we discuss previous work on model revision and propose directions for future research.

2. Review of the RPM System

In recent research, we have reported a new approach to inductive process modeling and its implementation in the RPM system (Langley & Arvay, 2015).
The new framework builds on earlier ones (Borrett, Bridewell, Langley, & Arrigo, 2007) but also introduces important new ideas about representation and processing. In this section, we review these two aspects of RPM in turn, along with some experimental results.

2.1 Representation in RPM

Like its predecessors, RPM organizes differential equation models into distinct processes. These identify aspects of the equations that must stand or fall together. For example, ecosystem models include processes such as predation, grazing, growth, loss, and nutrient absorption. However, the system differs from earlier ones by making four key assumptions:

• All processes concern changes over time and effect these changes at a specific rate. For instance, a chemical reaction describes interactions among a set of substances, but its rate of operation can vary over time.
• Each process has one or more associated derivatives that are proportional to its rate. Some variables are inputs to a process, which it consumes and which thus have negative coefficients, while others are outputs, which the process produces and which thus have positive coefficients.
• The rate of each process is determined by a parameter-free algebraic expression. RPM assumes that rates are always positive and inherently unobservable, so it can adopt any measurement scale it likes, avoiding the need for coefficients.
• If a variable appears in the rate expression for a process, then it must also appear as a derivative associated with that process.

Along with the standard supposition that the effects of different processes are additive, these postulates mean that one can compile a process model into a set of differential equations that are linear combinations of algebraic rate expressions. When joined with a fifth assumption, that all variables are observed on each time step, this suggests a novel approach to inducing process models that is both efficient and robust.
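
As a rough illustration of the compilation step described above, the following Python sketch represents each process by a parameter-free rate expression and a set of signed coefficients, then sums the contributions of all processes to obtain each variable's derivative. The `Process` class, the `derivatives` and `violates_fourth_assumption` helpers, and the predator-prey processes with their coefficient values are all hypothetical names and numbers chosen for this example; they are not taken from RPM's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]


@dataclass
class Process:
    """Hypothetical container for one process (not RPM's actual data structure)."""
    name: str
    rate_variables: Tuple[str, ...]   # variables appearing in the rate expression
    rate: Callable[[State], float]    # parameter-free algebraic rate expression
    coefficients: Dict[str, float]    # variable -> coefficient on its derivative


def violates_fourth_assumption(process: Process) -> List[str]:
    """Return rate variables that have no associated derivative in this process."""
    return [v for v in process.rate_variables if v not in process.coefficients]


def derivatives(processes: List[Process], state: State) -> State:
    """Additive compilation: d(var)/dt = sum over processes of coeff * rate."""
    totals = {var: 0.0 for var in state}
    for process in processes:
        r = process.rate(state)
        for var, coeff in process.coefficients.items():
            totals[var] += coeff * r
    return totals


# Illustrative predator-prey model; all coefficient values are made up.
model = [
    # Growth produces prey (positive coefficient).
    Process("prey_growth", ("prey",), lambda s: s["prey"], {"prey": 2.0}),
    # Predation consumes prey (negative) and produces predators (positive).
    Process("predation", ("prey", "predator"),
            lambda s: s["prey"] * s["predator"],
            {"prey": -0.5, "predator": 0.25}),
    # Loss consumes predators (negative coefficient).
    Process("predator_loss", ("predator",), lambda s: s["predator"],
            {"predator": -1.0}),
]

state = {"prey": 4.0, "predator": 2.0}
result = derivatives(model, state)
print(result)  # {'prey': 4.0, 'predator': 0.0}
```

In this sketch the rate expressions carry no parameters, so every numeric parameter of the compiled equations is one of the coefficients, and each derivative depends on those coefficients only linearly.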

Cite this paper

@inproceedings{Arvay2016HeuristicAO, title={Heuristic Adaptation of Scientific Process Models}, author={Adam Arvay and Pat Langley}, year={2016} }