The Design Space of Frame Knowledge Representation Systems

Abstract

In the past 20 years, AI researchers in knowledge representation (KR) have implemented over 50 frame knowledge representation systems (FRSs). KR researchers have explored a large space of alternative FRS designs. This paper surveys the FRS design space in search of design principles for FRSs. The FRS design space is defined by the set of alternative features and capabilities (such as the representational constructs) that an FRS designer might choose to include in a particular FRS, as well as the alternative implementations that might exist for a particular feature. The paper surveys the architectural variations explored by different system designers for the frame, the slot, the knowledge base, for access-oriented programming, and for object-oriented programming. We find that few design principles exist to guide an FRS designer as to how particular design decisions will affect qualities of the resulting FRS, such as its worst-case and average-case theoretical complexity, its actual performance on real-world problems, the expressiveness and succinctness of the representation language, the runtime flexibility of the FRS, the modularity of the FRS, and the effort required to implement the FRS.

INTRODUCTION

In the past 20 years, AI researchers in knowledge representation (KR) have implemented over 50 frame knowledge representation systems (FRSs). KR researchers have explored a large space of alternative FRS designs. The central goal of this paper is to present and elucidate design principles for FRSs, and to note where such principles are lacking, i.e., to identify open problems in KR. The FRS design space is defined by the set of alternative features and capabilities (such as the representational constructs) that an FRS designer might choose to include in a particular FRS. The design space also includes the alternative implementations that might exist for a particular feature.

The foremost principle in any area of design is an understanding of what the design space is.
This paper provides that understanding by surveying the FRS design space, and by providing a road map to the FRS literature. As well as elucidating what the FRS design space is, this survey should help to decrease the frequent duplication of effort where different researchers rediscover the same new points in the design space. In addition, an understanding of the current range of FRS behaviors is of crucial importance to the standardization efforts now under way in the KR community [63]. Ideally, an "interlingua" for knowledge representation would be able to represent any knowledge that can be represented in any existing FRS. More realistically, those representational constructs that are excluded should be excluded intentionally, rather than due to ignorance of their existence. Additional principles should aid FRS designers in choosing the optimal FRS design for a given class of applications.

The FRS design space is quite large; therefore this paper does not attempt to cover all of it. We do not consider FRS features such as graphical user interfaces, context mechanisms, truth maintenance systems, production-rule systems, or declarative query languages. Furthermore, FRSs form a subset of all KR systems. This paper is not concerned with other types of KR systems such as theorem provers, nonmonotonic reasoners, or temporal reasoners. It is concerned with substantial FRS implementations that have been employed in complex applications such as natural language understanding and medical diagnosis.

FRSs are known by a variety of names, including semantic networks, frame systems, description logics, structural inheritance networks, conceptual graphs, and terminologic reasoners. Some researchers may object to my lumping all these types of KR systems together in one survey.
Although differences do exist between different subclasses of FRSs, and the many names for these systems are not completely synonymous, past authors have interchanged these names enough that we would be hard pressed to provide precise, consistent definitions for all of these terms. Terminology aside, it is productive to survey these systems together because they were developed by closely related communities of researchers with similar problem-solving goals, because they have been tested in real-world application domains, because their design spaces overlap to a large degree, and because they share many of the same design principles. These properties do not apply to other KR systems, and a survey of all KR research would not fit in a single paper; therefore my aggregation stops before that of Schubert, who sees a convergence in all the major KR schemes [82].

The intended audience for this paper is quite wide, ranging from experts in KR, to researchers in other areas of AI, to database researchers. This paper should be of great interest to database researchers because FRSs and object-oriented databases are highly similar: the two classes of systems are different variations of essentially the same data model. The paper will also be valuable to FRS users who wish to select the existing FRS whose capabilities best meet the requirements of a problem at hand.

The paper begins with an overview of FRSs in a tutorial style for readers who are not well acquainted with these systems (Section 1). Section 2 discusses in more detail what design principles for FRSs should tell us, and what design principles have thus far been elucidated. Section 3 provides a terse road map to the FRS literature from a historical perspective by listing the major families of FRSs, and by listing the general literature citations for each FRS.
This organization puts most FRS citations in one place and avoids the need to repeatedly list the same citations later in the paper. Section 4 begins the in-depth exploration of the FRS design space by considering the diverse models of the frame that different researchers have explored. Section 5 considers alternative slot designs, and Section 6 considers alternative models of inheritance. Section 7 considers mechanisms for providing persistent knowledge base storage, and Section 8 discusses object-oriented and access-oriented programming systems within FRSs. Finally, Section 9 discusses the classification operation and its role within FRSs.

As a result of researching this survey I identified a number of methodological problems in KR research. To focus the subject of this paper, and because of space limitations, they will be discussed in a separate publication.

1 OVERVIEW OF FRAME REPRESENTATION

This section presents a very brief, simple introduction to frame representation. It begins by describing the general classes of tasks that FRSs are used for. Next it presents the structure of frames. We then consider the services that FRSs provide to users and to application programs. This section simplifies the notion of a frame considerably to set the stage for the detailed analysis that follows in the remainder of the paper. Other introductions to frame representation can be found in [27, 30, 11, 26].

It is important to note what FRSs are not: FRSs are not equivalent to "knowledge-based systems". This term is very general and encompasses a large number of artificial-intelligence techniques. FRSs are a subset of knowledge-representation systems, which in turn are used to build knowledge-based systems.

1.1 The Structure of Frames

A frame is a data structure that is typically used to represent a single object, or a class of related objects, or a general concept (or predicate).
Researchers have used a number of words synonymously for the word frame, including memory unit and unit. Whereas some systems define only a single type of frame, other systems distinguish two or more types, such as class frames and instance frames. The former represent classes or sets of things (such as the class of all computers in the VAX-11 family or the class of all VAX-11/780s) and the latter represent particular instances of things (such as a particular VAX-11/780).

Frames are typically arranged in a taxonomic hierarchy^1 in which each frame is linked to one (or in some systems, more than one) parent frame. A parent of a frame A represents a more general concept than does A (a superset of the set represented by A), and a child of A represents a more specific concept than does A. A collection of frames in one or more inheritance hierarchies is a knowledge base (KB).

Frames have components called slots. The slots of a frame describe attributes or properties of the thing represented by that frame, and can also describe binary relations between that frame and another frame. In addition to storing values, slots also contain restrictions on their allowable values. For example, we might stipulate that the Word Size slot defined in the COMPUTER frame must be an integer between 1 and 100. Slot definitions often have other components in addition to the slot name, value, and value restriction, such as the name of a procedure that can be used to compute the value of the slot, and a justification (in the truth-maintenance sense) of how a slot value was computed. These different components of a slot are called its facets.

Inheritance causes slot definitions to propagate down the taxonomic hierarchy. For example, when we create a frame called VAX-8000 Family as a child of the VAX Family frame, VAX-8000 Family automatically acquires the slot Word Size, and the value of this slot automatically becomes 32. Thus VAX-8000 Family has inherited its Word Size slot from VAX Family.
Similarly, VAX Family might have inherited the Word Size slot from the frame DIGITAL Computer, but we would not have given the Word Size slot the value 32 in DIGITAL Computer because not all computers manufactured by DIGITAL have a word size of 32. Thus, the value 32 was defined locally in VAX Family.

Inheritance is a tremendously useful tool for engineering complex knowledge bases. When a user creates a new frame and that frame inherits slots from its parent, the inherited slots form a template that guides the user in filling in knowledge about the new concept. Because all slot and facet information is available at run time (in contrast to object-oriented programming languages such as C++), it is accessible to a program such as a user interface that guides the user in entering new knowledge. For example, the user interface can directly determine the slot datatype and value restrictions. Inheritance also facilitates systematic changes to complex knowledge. If we discover that all Cray computers can run UNIX in addition to CrayOS, we can encode this information in one place: by altering the value of the Operating Systems slot in the Cray Computer frame.

Some FRSs compute a relation between class frames called subsumption that allows the FRS to automatically determine the correct position of a class in a taxonomic hierarchy (to classify the class). Frame A subsumes Frame B if A defines a more general concept than does B, meaning that every instance of the concept B is an instance of A. For example, because the value of the Manufacturer slot of DIGITAL Computer is DIGITAL, whereas the Manufacturer slot of Computer is constrained only to be an instance of the class Corporation, an FRS could infer that DIGITAL Computer is subsumed by Computer.

^1 Synonyms: generalization-specialization hierarchy, is-a hierarchy, class-subclass hierarchy, AKO (a kind of) hierarchy, and inheritance hierarchy.

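The frame, slot, facet, and inheritance notions of Section 1.1 can be made concrete with a minimal Python sketch. It is not the design of any FRS surveyed here: the names Frame, Slot, define_slot, get_value, and put_value are hypothetical, every frame is assumed to have a single parent, and only two facets (a value and a value restriction) are modeled.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class Slot:
    """A slot with two facets: a local value and a value restriction."""
    value: Any = None
    restriction: Optional[Callable[[Any], bool]] = None


@dataclass
class Frame:
    """A frame with one parent (an assumption; some FRSs allow several)."""
    name: str
    parent: Optional["Frame"] = None
    slots: Dict[str, Slot] = field(default_factory=dict)

    def define_slot(self, slot_name, value=None, restriction=None):
        """Define a slot locally in this frame."""
        self.slots[slot_name] = Slot(value, restriction)

    def get_value(self, slot_name):
        """Return the nearest non-None value on the path to the root."""
        frame = self
        while frame is not None:
            slot = frame.slots.get(slot_name)
            if slot is not None and slot.value is not None:
                return slot.value
            frame = frame.parent
        return None

    def put_value(self, slot_name, value):
        """Store a local value after checking all inherited restrictions."""
        frame, found = self, False
        while frame is not None:
            slot = frame.slots.get(slot_name)
            if slot is not None:
                found = True
                if slot.restriction is not None and not slot.restriction(value):
                    raise ValueError(
                        f"{value!r} violates the restriction on "
                        f"{slot_name!r} defined in {frame.name}")
            frame = frame.parent
        if not found:
            raise KeyError(f"{slot_name!r} is not defined for {self.name}")
        self.slots.setdefault(slot_name, Slot()).value = value


# The running example of this section.
computer = Frame("COMPUTER")
computer.define_slot(
    "Word Size",
    restriction=lambda v: isinstance(v, int) and 1 <= v <= 100)
digital = Frame("DIGITAL Computer", parent=computer)
vax = Frame("VAX Family", parent=digital)
vax.put_value("Word Size", 32)           # defined locally in VAX Family
vax8000 = Frame("VAX-8000 Family", parent=vax)

print(vax8000.get_value("Word Size"))    # inherited from VAX Family
print(digital.get_value("Word Size"))    # no value at this level
```

Because the restriction is stored as data in the slot rather than compiled into a class definition, a program such as a user interface could inspect it at run time, which is the property the text contrasts with languages such as C++.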
1.2 A Functional View of Frame Representation Systems

This section discusses FRSs from a functional perspective by examining two classes of operations that these systems provide to problem-solving programs: direct storage and retrieval of knowledge, and inferential problem solving.

1.2.1 Storage and Retrieval of Knowledge

Most FRSs provide a library of (usually LISP) functions that application programs can call to perform such actions as: adding a new value to a slot, deleting one or more values of a slot, retrieving the current value of a slot, creating a new frame with specified parents, changing the parents of a frame, deleting a frame, renaming a frame, adding a new slot to a class, and adding a facet to a slot.

In addition to the function-call library, some systems allow the user to accomplish these same functions with a graphical user interface. For example, KEE, STROBE, CYCL, and KREME [2] allow the user to create graphical displays on a workstation of both the taxonomic hierarchy of a knowledge base and the slots within a given frame. Items within these displays are mouse sensitive, and can be used to call up menus from which the operations described in the previous paragraph can be selected.

In an approach pioneered by KRYPTON, some FRSs provide a declarative language that users can employ both to query a KB and to assert new facts into a KB. In PROTEUS, for example, the query (Manufacturer ?X:Computers DIGITAL) would return a list of assertions describing all children of the Computers frame whose Manufacturer slot contained the value DIGITAL. Little research has been performed on optimizing the evaluation of FRS queries.

1.2.2 Problem Solving and Inference

Application programs that interact with an FRS typically employ either production rules or classification to perform inference based on knowledge stored in the FRS. Most FRSs provide either production rules or classification, whereas LOOM and CLASSIC support both.
In KEE and CLASS [80], each production rule is itself encoded as a single frame. In KEE and PROTEUS, queries such as those described in the previous section can invoke a backward-chaining production-rule interpreter to derive the queried slot value. Similarly, THEO users can attach PROLOG rules to slots to cause THEO to backward chain to derive queried slot values. KEE and PROTEUS can also invoke forward chaining when new slot values are asserted.

Classification is used to support inference in two different ways. First, the very act of classification can be a problem-solving action; for example, if a system can recognize a description of a patient as an instance of a disease class, it has computed a diagnosis. Second, in the KL-ONE family of FRSs, classification is a key component of the query processor that allows the system to reason about relationships among terms used in a query and terms in a knowledge base. These systems answer a query by translating the query into a concept description, and then classifying that concept to determine its placement in the taxonomic hierarchy. All concepts below the query concept in the hierarchy are subsumed by the query, and thus comprise the answer to the query.

The main principle of KR that has thus far been discovered concerns the complexity of computing subsumption (and therefore, classification). Researchers have compared the cost of computing subsumption in a number of different FRS representation languages, and have found that the more expressive the language, the higher the cost of computing subsumption within that language. This result is called the expressiveness-tractability tradeoff, and will be discussed in more detail in Section 9.1.

Some FRSs leverage their inference capabilities by combining them with context mechanisms and truth-maintenance systems [20].
These facilities are valuable for investigating alternative problem solutions in parallel, and for tracking the dependence of problem solutions on underlying assumptions. Context mechanisms exist in THEO, KEE [29], STROBE, CYCL [36], SRL, LOOM, and CRL; truth-maintenance systems are present in THEO, KEE, CYCL, LOOM, CLASSIC, KL-TWO, and PROTEUS.

2 DESIGN PRINCIPLES FOR KNOWLEDGE REPRESENTATION SYSTEMS

The large size of the FRS design space implies that the designer of an FRS must make many decisions. For example, we will see that she must decide what model of the frame and the slot to utilize, what inheritance mechanism(s) to use, whether to employ classification, and what subsumption algorithm to use if classification is employed.

A comprehensive set of principles of KR should guide FRS designers, and users, through a complex web of choices. FRS users need to know what combination of representational constructs will allow them to quickly build an application that has acceptable performance. They need to know what representational constructs can encode the knowledge in their domain most naturally and succinctly, to yield an application that can be maintained easily as it evolves. Users need to know the theoretical costs and benefits of FRS features, and they need to know how a particular FRS will perform under the demands of their application. KR principles should guide users in choosing the optimal FRS for a particular problem: the one with the maximum benefits and the minimum costs.

When designing an FRS to solve one or more classes of application problems, implementors must address a superset of these issues. As well as anticipating what combination of representational constructs will yield sufficient expressiveness and performance for the applications, the implementors must make a number of engineering decisions. For example, implementors must decide among alternative implementation strategies for the representational constructs they have chosen.
And although we might hope that the implementations of every FRS feature are independent, they often interact to yield an FRS that is not modular and is therefore difficult to develop, debug, maintain, and improve.

I claim that comprehensive FRS design principles are largely lacking. The main principle of KR that has thus far been elucidated is the expressiveness-tractability tradeoff that relates the expressiveness of a representation language with the cost of computing classification within that language. This principle is clearly valuable since it helps users and implementors understand the expressive benefits and the worst-case computational costs of several representation languages. However, this principle describes only the worst-case theoretical impact of one class of representational constructs (concept-definition constructs) on one type of FRS operation (classification). Many additional principles are needed to cover other representational constructs, other FRS operations, other theoretical performance besides worst case, actual performance in addition to theoretical performance, and other criteria besides performance and expressiveness. More specifically:

Classification is only one of many operations that FRSs compute. Some FRSs do not even compute classification. We must know the impact of different representational constructs on other operations such as computing inheritance, storing and retrieving slot values, and production-rule inference within an FRS.

Expressiveness-tractability analyses have not considered representational constructs such as metaclasses, facets, and inheritance across multiple links. Although not all of these constructs will affect the classification operation, they will certainly impact some FRS operations. In general we should know the effects of every representational construct on the performance of every FRS operation.

Worst-case theoretical results are not always representative of the average case, and theoretical results are not always constraining in practice. Theoretical principles concerning average-case behavior of various FRS operations are generally lacking. Engineering principles concerning choices of data structures and algorithms are even fewer.

Performance and expressiveness are not the only factors to consider when choosing a representation. Two representations might have equal expressive power, but one might be much more succinct for a particular application. In addition, there is a tradeoff between the run-time flexibility of the FRS (the degree to which knowledge that the FRS maintains can be altered at run time as opposed to the time of definition), and the performance of the FRS. Also, the modularity of an FRS implementation is affected by the choice of representational constructs and the implementation of those constructs. Other factors concern the effort involved in implementing a particular feature, and the frequency with which that feature is used in different applications; a feature that is difficult to implement but that is hardly ever used should probably be disregarded. We need much more knowledge about the costs and benefits of different FRS features with respect to all of these factors.

3 THE FAMILIES OF FRAME KNOWLEDGE REPRESENTATION SYSTEMS

This section provides a very brief overview of implemented frame representation systems of the past and present. An ideal treatment of the evolutionary relationships between FRSs would differ from the treatment presented here in several ways.
Ideally we would like to know the general characteristics of each family, as well as how and why a given system differs from its parent(s): what shortcomings did an author discern in the parent system, what differences did the author introduce in the child FRS to remedy these shortcomings, were these modifications successful, and what were the costs of these changes in terms of the metrics listed in Section 2? Unfortunately, space restrictions preclude a full treatment of these issues, and, more importantly, authors rarely document these aspects of their work systematically. This practice not only obscures the intellectual history of frame representation, but it makes principles of FRSs more difficult to derive. Therefore this section lists the one or two most influential parents of a number of FRSs, either as identified by the FRS author, or as ascertained by the author of this paper with a high degree of certainty. I have been unable to make this determination for many systems. General literature references for particular FRSs are presented in this section of the paper only; later sections provide some references to support specific points. Figure 1 shows several FRS families.

The UNITS family originated at Stanford University in the late 1970s. Its members include the UNIT Package [89, 93], STROBE [88, 90, 91], CLASS, RLL [35, 34], CYCL [47, 49, 48], ARLO [37], THEO [57], JOSIE [60], OPUS [28], and the commercial systems KEE [42] and KAPPA.

The KL-ONE family originated at Harvard University in the early 1970s. Its members include KL-ONE [15], NIKL [41, 6, 76], KANDOR [65], KL-TWO [98], K-REP [52], KREME [2], BACK [67, 99], MUNIN, SPHINX, KRIS [4], MESON, SB-ONE [43], KRYPTON [13], LOOM [101, 40, 51], and CLASSIC [12, 14, 69]. See [76, 51, ?] for historical overviews of the KL-ONE family.

The SRL family originated at Carnegie-Mellon University in the early 1980s. Its members include SRL [31], FRAMEKIT [64], PARMENIDES [86], and the commercial system KNOWLEDGECRAFT.

The FRL family originated at MIT in the mid 1970s. Its members include FRL [71, 70], HPRL [44, 72], and GOLDWORKS.

Several other FRSs do not exist within a larger family. They include PROTEUS [73, 68], FROBS [58], OZONE [45], KRL [7, 46, 8], BB* [32], LOOPS [94, 17], KB [23], SNePS [84, 85], RHET [3, 56], TELOS [59], PARKA [24], ALGERNON [18, 19], FRAPPE [25], Conceptual Graphs [92], MOPS, ART, and NEXPERT.

[Figure 1: The Unit-Package, SRL, FRL, and KL-ONE families of frame representation systems. Node labels, in extraction order: HPRL, Goldworks, FRL, PARMENIDES, FrameKit, CRL/KnowledgeCraft, SRL, Strobe, Class, JOSIE, OPUS, THEO, ARLO, CycL, RLL, Kappa, KEE, Unit Package, LOOM, KRYPTON, KANDOR, SPHINX, KREP, NIKL, KL-ONE, KL-TWO, SB-ONE, KRS, King Kong, KRIS, BACK, CLASSIC, MUNIN.]

4 FRAMES

Typically, FRSs define two different types of frames. Class frames^2 represent a class or set of things, a general concept, or an abstraction. Examples: the class of all computers, the set of all computers manufactured by IBM, the concept of a father, or of a mother. Instance frames^3 represent individual things: concrete entities that exist in the world. Examples: the particular computer that I am using to type in this sentence, the person who is my father. Different researchers have different views as to the semantics of class and instance frames, and moreover they have defined a variety of other types of frames. This section explores the diversity of frames themselves.

4.1 Link Terminology

Before proceeding with a detailed discussion of frames, this section establishes a standard and comprehensive terminology for referring to the relationships among class and instance frames in a taxonomic hierarchy. Virtually every family of FRSs has its own terminology. The use of redundant and conflicting terminology has hampered communication among knowledge-representation researchers, and has confused researchers in other areas of computer science who often assume that different terms must describe different concepts.
I propose the following terminology, which I developed in the spirit of the presentation by Russinoff [73]. Footnotes translate my terminology to that used by previous researchers.

We are interested in naming the relationships that exist between class frames and instance frames. If a class frame C1 is linked directly above a class frame C2 in the hierarchy, then we say that C1 is a direct-super of C2, and that C2 is a direct-sub of C1;^4 we call the link itself a super-sub link. If a class frame C is linked directly above an instance frame I then we say that C is a template of I, and that I is an instance of C.^5 We define the all-supers relation to be the transitive closure of the direct-super relation, and we define the all-subs relation to be the transitive closure of the direct-sub relation (therefore C1 is an all-sub of C2 if C1 is a direct-sub of C2, or if C1 is a direct-sub of any all-sub of C2).^6 Similarly, we say that an instance frame I is in the all-instances of a class C if I is an instance of C or of an all-sub of C, in which case we would say that C is in the all-templates of I.^7 Finally, we say that A is a parent of B if A is either a direct-super or a template of B (in which case B is a child of A). We define ancestor as the transitive closure of the parent relation, and descendant as the transitive closure of the child relation. Table 1 summarizes this terminology, and also presents a notation that I have developed to express these relationships more succinctly.

^2 Synonyms: concept (KL-ONE), collection (CYCL), specialization (UNIT Package), frame (KANDOR), set, schema, generic (FRL), template, node.
^3 Synonyms: individual (KL-ONE, STROBE, FRL), individual object (CYCL), instantiated frame.
^4 Synonyms: superclass and subclass (KEE), parent and child (PROTEUS), superC (KL-ONE).

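The link terminology of Section 4.1 amounts to a handful of set-valued relations, and a short Python sketch may help fix the definitions. Everything here is an illustrative assumption rather than part of any surveyed FRS: the dictionaries DIRECT_SUPERS and TEMPLATES, the function names, and the instance name my-780 are hypothetical, and the small hierarchy reuses the computer examples of Section 1.

```python
from typing import Dict, Set

# Hypothetical knowledge base: class -> its direct-supers.
DIRECT_SUPERS: Dict[str, Set[str]] = {
    "Computer": set(),
    "DIGITAL Computer": {"Computer"},
    "VAX Family": {"DIGITAL Computer"},
    "VAX-11/780": {"VAX Family"},
}

# Hypothetical knowledge base: instance -> its templates.
TEMPLATES: Dict[str, Set[str]] = {
    "my-780": {"VAX-11/780"},
}


def all_supers(cls: str) -> Set[str]:
    """Transitive closure of the direct-super relation."""
    result: Set[str] = set()
    stack = list(DIRECT_SUPERS.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(DIRECT_SUPERS.get(current, ()))
    return result


def all_subs(cls: str) -> Set[str]:
    """Every class that has cls among its all-supers."""
    return {c for c in DIRECT_SUPERS if cls in all_supers(c)}


def all_templates(instance: str) -> Set[str]:
    """The templates of an instance plus the all-supers of each template."""
    result: Set[str] = set()
    for template in TEMPLATES.get(instance, ()):
        result.add(template)
        result |= all_supers(template)
    return result


def all_instances(cls: str) -> Set[str]:
    """Instances of cls or of any all-sub of cls."""
    return {i for i in TEMPLATES if cls in all_templates(i)}


def ancestors(frame: str) -> Set[str]:
    """Transitive closure of the parent relation for a class or instance."""
    if frame in TEMPLATES:
        return all_templates(frame)
    return all_supers(frame)


print(sorted(all_supers("VAX-11/780")))
print(sorted(all_subs("DIGITAL Computer")))
print(sorted(all_instances("Computer")))
```

The sketch recomputes each closure on demand, which keeps it short; an FRS that answers such queries frequently might instead cache the closures and update them when links change, at some cost in run-time flexibility.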
^5 Synonyms: member-parent-of and member-of (KEE), type-of and instance-of (PROTEUS), individuates (KL-ONE).
^6 Synonyms: superclass and subclass (PROTEUS).
^7 Synonyms: member (PROTEUS).

              Notation     Meaning
Parents       C1 > C2      Class C1 is a direct-super of class C2
              C [?] I      Class C is a template of instance I
Children      C1 < C2      Class C1 is a direct-sub of class C2
              I [?] C      Instance I is an instance of class C
Ancestors     C1 [?] C2    Class C1 is an all-super of class C2
              C [?] I      Class C is an all-template of instance I
Descendants   C1 [?] C2    Class C1 is an all-sub of class C2
              I [?] C      Instance I is an all-instance of class C

Table 1: Notation and terminology for describing inheritance relations.

4.2 The Diversity of Frames

Some FRSs employ only a single type of frame rather than both class and instance frames. NIKL provides for class frames only, because its authors consider NIKL's mission to support the definition of concepts and the relations between them, not to facilitate reasoning about individuals.^8 THEO also defines only one type of frame, because its authors believe that the distinction between classes and instances is sometimes not well defined. Every other FRS includes at least class and instance frames, but some systems employ the following additional types of frames:

^8 NIKL does define "individual" class frames as classes that have only a single member, but classes nonetheless. To reason about individual objects a user must employ an additional system; for example, KL-TWO [98] combines NIKL with a propositional reasoning system called RUP [54].

Metaclasses: PROTEUS utilizes frames called metaclasses to define sets of PROTEUS classes; every PROTEUS class is an instance of the metaclass called CLASS. Users can define other metaclasses as direct-subs of the metaclass CLASS. For example, we could define a metaclass called Computer Family of which VAX Family (a class) is an instance. CLASS is the class of all classes, or synonymously, the set of all sets. LOOPS also employs metaclasses, as does CYCL: every CYCL frame is an instance of either the frame Collection or the frame IndividualObject; the former types of frames are classes and the latter are instances. Thus the frame Collection is really a metaclass that is equivalent to the CLASS of PROTEUS. In CYCL, only class frames (instances of Collection) can have instances, direct-supers, or direct-subs, whereas only instance frames (instances of a class frame) can have parts.

Indefinites: The UNIT Package and the CLASS FRS employ indefinite frames to represent instances whose identities are unknown (similar to the notion of skolem constants). Indefinites allow a user to say that two individuals are the same without knowing their identities [93].

Descriptions: These frames are variablized classes that are employed in the UNIT Package to represent goals in planning problems [93]. CLASS also has descriptions.

Prototypes: KRL, RLL, and JOSIE employ prototype frames to represent information about a typical instance of a class, as opposed to the class itself and as opposed to actual instances of the class. In these systems instance frames inherit default information from prototype frames rather than from class frames (see Section 6).

SlotUnits: In CYCL, slotUnits^9 are frames that encode information about slots themselves (see Section 5).

SeeUnits: CYCL employs seeUnits as "footnotes" or annotations for other frames. They can hold constraints on slot values, dependency information (such as what inference stored a value in a slot), and epistemological information (who believes a slot value to be true).

CompactUnits: KEE's compactUnits are instance frames that consume less storage and are faster to access than are normal instance frames.
A compactUnit must have only one template frame (normal KEE instances can have multiple templates), and the compactUnit must have exactly the same set of slot definitions as its template (in KEE, users can define new slots in an instance frame that were not defined in the template of that frame). LOOM has a similar mechanism; its CLOS instances can provide a more efficient implementation of instance frames.

^9 Synonyms: relation frames (OPUS, JOSIE).

4.3 Discussion

Few KR principles exist to guide our understanding of the preceding features. Generally speaking, each feature adds expressiveness and potential succinctness to an FRS by allowing us to more faithfully render a complex epistemological landscape. Yet we have little knowledge of the performance costs or benefits of these features, of the effort required to implement them, of their effect on system modularity, of the frequency with which they are used, or of the optimal data structures and algorithms for implementing them.

CompactUnits provide faster performance with less expressiveness and flexibility than normal instance frames, but we do not know how much more performance, nor exactly why restricting an instance to a single template is required to provide this speedup. Presumably it allows a fixed mapping from a slot name to a location in an array. Are the same implementation techniques used in the KEE and LOOM implementations, and if not, exactly how do the speedups compare? Are the changes required to implement compactUnits fairly localized, or are they spread throughout the FRS code?

Metaclasses, prototypes, seeUnits, descriptions, and indefinites extend the expressiveness of FRSs, and in ways not considered in past expressiveness-tractability analyses. Lenat and Guha provide a detailed discussion of the types of knowledge that metaclasses can represent [47, p57]. But once again, we do not have a clear picture of the costs of these constructs.
And even their expressiveness benefits are not that clear: when are defaults provided by prototypes preferable to class-based defaults? Nado and Fikes argue that prototypes more clearly separate information about class members from information about the class itself [60]. Schoen reports (personal communication, 1991) that little use was made of indefinite frames in CLASS, whereas description frames were used.

Some insight on an engineering issue comes from the PARMENIDES system. All of the FRSs in the UNIT Package family implement classes and instances in essentially the same manner. In contrast, PARMENIDES implements classes and instances quite differently under the assumptions that many more instances than classes will exist in most KBs, and that instances will be accessed more frequently than classes will. PARMENIDES classes are implemented as association lists, whereas instances are implemented as adjustable arrays. Thus instances are more compact than are classes, and instances are faster to access since no search through an association list is required. However, this approach limits the runtime flexibility of PARMENIDES: although new slots can be created at run time by adding elements at the end of the adjustable array, in order to remove or modify slot definitions the user must restart the FRS to rebuild the knowledge base. Also, we have no data on exactly how much of a performance gain this technique yields.

5 WHAT'S IN A SLOT

Every frame consists of a set of slots, which usually represent properties or attributes of the object or concept represented by the frame (footnote 10). Slots are also used to represent binary relations between their containing frame and another frame. The very simplest model of slots gives every slot a name (such as Manufacturer) and a value (footnote 11), such as Data General. Researchers have embellished this simple model in a number of ways.
Almost all systems specify a few other attributes for each slot besides its value, such as a slot datatype (described in more detail in Section 5.4), and restrictions on the allowable values for the slot (Section 5.5). Some researchers generalized this notion to allow slots to have arbitrary properties called facets, of which name, value, datatype, and value restriction are the usual complement. Other typical facets that we find are: a comment, a measure of belief such as a MYCIN-like certainty factor (used in CYCL), an explanation or justification that specifies what other slot values the current value was inferred from (used in CYCL and in THEO), a non-value (used to represent negation in ALGERNON), a description of what agent believes this slot value (used in CYCL), attached procedures, default values, and a specification of an inheritance mechanism for that class (see Sections 8 and 6).

Footnote 10. The KL-ONE family of FRSs calls slots roles because a slot such as Manufacturer names the entity that "plays the role of" a manufacturer for a given computer.
Footnote 11. Slot values are called fillers in KL-ONE-speak and entries in CYCL-speak.

THEO uses an even more general notion of slots: facets themselves can have facets, to any level. That is, THEO allows an arbitrarily deep nesting of slots within slots within slots. Thus we could define a Comment subslot within the Manufacturer slot of the Computer frame. We could also create a subslot of Comment, perhaps to record the name of the user who created the comment. As well as facilitating the attachment of meta-information, THEO uses subslots extensively to cache information derived by a variety of inference mechanisms. For example, imagine that a user has queried the value of Frank.Children.
If no local value were found, one of the THEO inference mechanisms would cause it to attempt to obtain the value from Frank.Daughters if it could determine (from consulting the slotUnit of Children) that the Daughters relation is a specialization of the Children relation. THEO would locally cache within the Frank frame some of the information computed in the process of answering this query. For example, it would cache the fact that Daughters is a specialization of Children within the subslot Frank.Children.Slotspecs.

The JOSIE system provides a way of viewing slot values as classes, which allows a wider set of assertions to be made about slot values. That is, if the values of a slot comprise a set, this mechanism lets us treat that set like a class, and use the JOSIE class-definition language to make assertions about that set. Section 6.1 elaborates on this idea.

OZONE takes yet another approach: slots are partitioned into separate groups called spaces. Each frame typically has three spaces called the system space (these slots list the name of the frame and the parents of the frame), the function space (these slots contain attached procedures), and the variable space (used for most user-defined slots). Spaces provide two major benefits: they define separate name spaces for defining different types of slots, for example to prevent name clashes between system slots and user slots; they also facilitate incremental loading of frame contents from secondary storage (see Section 7), since system-space slots can be loaded independently of slots in other spaces.

FRSs also vary as to whether users can introduce new slots into an instance frame that did not exist in its template. KEE allows this but the KL-ONE family does not, under the interpretation that because a class frame strictly defines what it means to be an instance of some concept, introducing new slots into an instance frame would violate the definition of the concept.
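The nesting of slots within slots described above for THEO can be illustrated with a small sketch. This is not THEO's implementation; the class names `Slot` and `Frame`, the `slot` accessor, and the stored comment text are hypothetical and chosen only for illustration. Each slot holds a value plus a table of subslots that are themselves slots.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Slot:
    """A slot whose subslots (facets) are themselves slots, to any depth."""
    value: Any = None
    subslots: Dict[str, "Slot"] = field(default_factory=dict)


@dataclass
class Frame:
    """Hypothetical frame: a name plus a table of top-level slots."""
    name: str
    slots: Dict[str, Slot] = field(default_factory=dict)

    def slot(self, *path: str) -> Slot:
        """Return the slot at a path such as ("Manufacturer", "Comment"),
        creating missing slots and subslots along the way."""
        if not path:
            raise ValueError("path must name at least one slot")
        node = self.slots.setdefault(path[0], Slot())
        for name in path[1:]:
            node = node.subslots.setdefault(name, Slot())
        return node


computer = Frame("Computer")
computer.slot("Manufacturer").value = "Data General"
# A Comment subslot of Manufacturer, and an Author subslot of that comment.
computer.slot("Manufacturer", "Comment").value = "check spelling"
computer.slot("Manufacturer", "Comment", "Author").value = "some-user"

print(computer.slot("Manufacturer").value)
print(computer.slot("Manufacturer", "Comment", "Author").value)
```

Because a subslot is just another `Slot`, meta-information and cached inference results can be attached at any depth without a separate facet mechanism, which is the uniformity the THEO design aims for.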
5.1 Slot Notation

This section presents a notation for defining paths through complex frame structures that is an adaptation of the CYCL notation [47]. This notation should prove useful both in expositions and in declarative frame query languages. To refer to the value of the Manufacturer slot of the Computer frame we write Computer.Manufacturer; to refer to the value of its Comment subslot we write Computer.Manufacturer.Comment. Note that since the value of a slot is simply a distinguished subslot, Computer.Manufacturer.value is equivalent to Computer.Manufacturer.

Now imagine that we wish to refer to the number of employees of the manufacturer of the VAX-11/780. Manufacturer is a binary relation that names a frame which describes the manufacturing corporation. To refer to the Number Of Employees slot of that frame indirectly, we write VAX-11/780.Manufacturer->Number Of Employees. Note the difference between this specification and the specification VAX-11/780.Manufacturer.Number Of Employees. The former refers to a slot within the frame Digital whereas the latter refers to a subslot of the slot VAX-11/780.Manufacturer. THEO uses a list notation for the former type of reference only: (Vax-11/780 Manufacturer Number Of Employees).

5.2 SlotUnits

Several FRSs contain a type of frame that the authors of CYCL call a slotUnit. A slotUnit is a frame that holds definitional information about a single slot that describes the use of that slot throughout a KB. A slotUnit might specify the domain and range of a slot S: the domain of Manufacturer is the set of frames in which it makes sense to use this slot (the class Manufactured Objects), and the range of this slot describes its allowable values (instances of the class Corporations). Lenat's experience in CYCL was that it was often desirable to represent a wide variety of information about slots. For example, CYCL (as well as FRAMEKIT and STROBE) uses slotUnits to store inverse definitions.
The slotUnit for the Manufacturer slot might record that the inverse of the Manufacturer relation is the Manufactures relation. Thus, when we record that Digital is a value of Vax-11/780.Manufacturer, CYCL will automatically add the value Vax-11/780 to Digital.Manufactures. More generally, automatic maintenance of inverse links is equivalent to enforcing the constraint $\forall G\,[G \in F.S \iff F \in G.S^{-1}]$, where $S^{-1}$ is obtained from the slotUnit $S$ as $S.\mathit{inverse}$.

Although slotUnits provide a place to store general information about slots, note that some information about a slot must be stored in frames containing that slot. For example, although the slotUnit will specify a domain and range for a slot such as Manufacturer, additional value (range) restrictions that are defined at a class such as Japanese Computer must be stored in the Japanese Computer frame because they are specific to that class. We might want to constrain the Manufacturer of every Japanese Computer to be an all-instance of Japanese Corporation.

Within the Units family, the idea of slotUnits arose in both RLL and OPUS, and is also used in THEO. SRL and FRAMEKIT use slotUnits only for slots that represent a binary relation between two frames. The KL-ONE family creates a hierarchy of slot definitions called a role hierarchy; however, each slot definition in the role hierarchy is implemented not as a frame but with a special data structure. The role hierarchy allows a user to define role differentiations: roles whose potential values are by definition a subset of the potential values of another role. Thus, in KL-ONE we can define CPU Manufacturer to be a role differentiation of Manufacturer because the allowable values of the former are a subset of the allowable values of the latter. Given this definition, if Motorola were a value of SUN-3.CPU Manufacturer, the system could infer that Motorola must also be a value of SUN-3.Manufacturer, given the subset relation between the two slots.
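The two inferences just described, inverse-link maintenance and role differentiation, can be sketched together. The following is a minimal illustration, not the CYCL or KL-ONE implementation: the names `SlotUnit`, `KB`, `define_slot`, and `add_value` are hypothetical, slot values are modeled as sets of frame names, and a differentiated slot is assumed to record the more general slot in a `parent` field.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class SlotUnit:
    """KB-wide information about one slot (hypothetical, simplified)."""
    name: str
    inverse: Optional[str] = None  # name of the inverse slot, if any
    parent: Optional[str] = None   # more general slot that this one differentiates


class KB:
    def __init__(self) -> None:
        self.slot_units: Dict[str, SlotUnit] = {}
        # values[frame][slot] is the set of values stored in that slot.
        self.values: Dict[str, Dict[str, Set[str]]] = defaultdict(
            lambda: defaultdict(set))

    def define_slot(self, name: str, inverse: Optional[str] = None,
                    parent: Optional[str] = None) -> None:
        self.slot_units[name] = SlotUnit(name, inverse, parent)
        if inverse is not None and inverse not in self.slot_units:
            # The inverse of the inverse is the original slot.
            self.slot_units[inverse] = SlotUnit(inverse, inverse=name)

    def add_value(self, frame: str, slot: str, value: str) -> None:
        if value in self.values[frame][slot]:
            return  # already recorded; this also ends the inverse recursion
        self.values[frame][slot].add(value)
        unit = self.slot_units.get(slot)
        if unit is None:
            return
        if unit.inverse is not None:
            # Maintain the inverse link: G in F.S implies F in G.S^-1.
            self.add_value(value, unit.inverse, frame)
        if unit.parent is not None:
            # Role differentiation: a value of the specialized slot
            # is also a value of the more general slot.
            self.add_value(frame, unit.parent, value)


kb = KB()
kb.define_slot("Manufacturer", inverse="Manufactures")
kb.define_slot("CPU Manufacturer", parent="Manufacturer")
kb.add_value("Vax-11/780", "Manufacturer", "Digital")
kb.add_value("SUN-3", "CPU Manufacturer", "Motorola")
```

Both rules are driven entirely by data held in the slot's own descriptor, so adding a new inverse pair or a new differentiation requires no change to `add_value`.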
Note that inheritance within a slotUnit hierarchy can be used to encode role differentiations.

5.3 Own Slots, Member Slots, and Bookkeeping Slots

KEE distinguishes what its developers call member slots from own slots, as do FRAMEKIT and FROBS. Member slots reside only within class frames. They describe properties of instances of that class, and are inherited by children of that class. Own slots, in contrast, can reside within either class or instance frames. Own slots are not inherited by children of a class frame C because these slots represent properties of only the class represented by C, and not its children. (If own slots were employed in an FRS that used metaclasses, the own slots of a class C should be inherited from the template of C (a metaclass) since own slots describe a frame as an instance.) As an example, a slot Fastest within the class Computers that named the fastest known processor should be defined as an own slot because it describes a property of the set of all computers, not a property of the instances of that set. But Manufacturer should be a member slot of Computers because it denotes a property of every individual computer.

In a similar vein, CYCL defines bookkeepingSlots to be slots that describe a frame F itself rather than the concept that F represents; for example, bookkeepingSlots might be used to record the name of the user who created F, and the time of creation.

5.4 Slot Data Types

Most FRSs model slot values as sets (such as CYCL, KEE, and the KL-ONE family). Other systems (such as THEO, OZONE, and the UNIT Package) treat slots as lists of values, the difference between sets and lists being that in lists the order of elements is preserved and duplicate values are allowed. LOOM treats slot values as ordered sets: duplicates are not allowed, but order is significant. Interestingly, virtually no FRS allows the user to explicitly select a slot datatype from among a variety of data types such as sets, ordered sets, lists, and bags.
Exceptions to this rule are K-REP, which provides both sets and bags; PARMENIDES, which provides two groups of slot-access functions that treat slots as sets and lists, respectively; and PROTEUS, which allows the user to specify whether a slot is single valued or multivalued.

Sets have the advantage over lists of simplifying the logic of frames and of having a well-defined subsumption relation. Some systems allow the user to specify the datatypes of the individual values in the set or list; for example, the UNIT Package allows datatypes of atom, boolean, integer, interval, lisp, number, string, table, text, or unit, and KEE provides a similar set of datatypes.

Although not thought of as a datatype, LOOM allows the user to specify whether query operations on specific slots have open or closed-world semantics. CLASSIC provides similar control over the slots in specific instance frames.

5.5 Slot Value Constraints

Researchers have defined mechanisms for specifying constraints on the values of individual slots, and between the values of two different slots. These constraints are viewed as necessary conditions that must hold of the slots to which they are attached. A slot in an instance frame F is inherited from the all-templates of F. Typically the constraints present at a particular class C consist of those inherited from the supers of C, plus additional constraints defined locally at C. C must be a more specific class than its direct-supers; therefore we expect its value constraints to be more stringent.

In this section we discuss the different types of constraints that different FRSs allow. These constraints are typically used for two purposes by FRSs. First, when new values are assigned to a slot the FRS verifies that the values satisfy the constraints; if they are violated then an error is signaled. The second use of these constraints is for classification, which is discussed in Section 9.
The KL-ONE family treats constraints as definitional in nature: the constraints on the slots of a class C specify necessary and sufficient conditions for what it means to be an instance of that class. That is, if a concept is a predicate, the constraints form the operational definition of that predicate.

5.5.1 Constraints on Individual Slot Values

KL-ONE value restrictions allow us to constrain the range of a slot such that its values may only be the names of all-instances of a particular class within the KB. More precisely, value restrictions specify that every value of a slot S in a frame F must name another frame that is an all-instance of a class frame C: $\forall x\,[x \in F.S \Rightarrow x \in C]$ (henceforth we will omit the quantifier and simply write $F.S \in C$). For example, we might constrain the range of Computer.Manufacturer to be an all-instance of the class Corporations.

KL-ONE employs number restrictions (footnote 12) to specify upper and lower bounds on the number of values that a slot can have at one time. Both value restrictions and number restrictions are found in most FRSs.

Footnote 12. Synonym: cardinality restrictions (KEE).

KEE defines a fairly complex language of valueclass specifications that can be used to specify value restrictions. The syntax of valueclass primitives in the KEE language and their meanings are as follows:

(MEMBER.OF class) or class: this specification is equivalent to a KL-ONE value restriction; when applied to slot S of frame F it specifies that $F.S \in \mathit{class}$.
(SUBCLASS.OF class): every F.S must be an all-sub of class ($F.S \subseteq \mathit{class}$).
(ONE.OF atom1 .. atomN): every F.S must take on one of the explicitly specified values ($F.S \in \{\mathit{atom1}, \ldots, \mathit{atomN}\}$).
(NOT.ONE.OF atom1 .. atomN): no F.S may take on any of the specified values ($F.S \notin \{\mathit{atom1}, \ldots, \mathit{atomN}\}$).
KEE-INTERVAL: every F.S must lie within the interval specified over some ordered type such as the integers ($\mathit{low} < F.S < \mathit{high}$).
(MEMBERP fn): every F.S must satisfy the predicate defined by the LISP function fn (TRUEP (fn(F.S))).

The preceding primitives can be combined using the following compositional operators:

(NOT.IN vcspec): every F.S must not satisfy the valueclass specification vcspec.
(UNION vcspec1 .. vcspecN): every F.S must satisfy at least one of the given valueclass specifications.
(INTERSECTION vcspec1 .. vcspecN): every F.S must satisfy all of the given valueclass specifications.

KEE also has cardinality restrictions like those of KL-ONE. CLASSIC has enumerated types like those specified using the KEE ONE.OF operator, and requires each enumerated type to be an existing instance frame. FRAMEKIT provides a mechanism similar to KEE's MEMBERP operator: value restrictions can be specified as LISP S-expressions that FRAMEKIT evaluates on candidate slot values.

LOOM allows several additional types of constraints. As well as having an analog of the KEE MEMBER.OF operator, a LOOM operator can specify that only some value of a slot must be an all-instance of a given class (rather than all values). Constraint expressions can also be formed that compare slot values to constant expressions using operators such as "=". LOOM also allows the user to write arbitrary slot constraints using the LOOM query language.

5.5.2 Constraints Between Slot Values

As well as specifying absolute constraints on the value of a single slot, it is often desirable to specify a relationship that must hold between the values of two slots. For example, imagine that we wish to define a class frame that represents a vertically integrated computer manufacturer.
Such a corporation would manufacture the memory chip, the processor chip, and the disk controller used for its computer (we assume that each is described in a separate frame). In all instances I of such a class the values of I.Memory Chip->Manufacturer, I.Processor Chip->Manufacturer, and I.Disk Controller->Manufacturer must be the same.

The KL-ONE family of FRSs can express such constraints using role value maps [15] (called role constraints in NIKL [76]). These constraints allow users to specify that either an equality or a subset relationship must hold between the values of two slots. Since the slots themselves might exist in different frames (as in the preceding chip example), the KL-ONE implementation uses pointers to explicitly identify the two sequences of slot compositions (or role chains) that are involved in the relationship.

Role value maps are a special case of a more general constraint mechanism in KL-ONE called structural descriptions [15]. Whereas role value maps specify either an equality or a subset relation between the values of two slots, structural descriptions allow the user to specify an arbitrary relation between the values of two slots. This relation itself must be described as another KL-ONE frame: if the user wished to constrain the value of one slot to be less than the value of another slot, the required less-than predicate must be described in a KL-ONE frame [15]. The user specifies a structural description by creating KL-ONE links between the predicate frame and the slots (role chains) that the constraint relates. This approach to specifying constraints is fairly clumsy and yields constraints that are very difficult to understand; the LOOM query language can be used to specify similar types of constraints, but is much more readable.
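The two kinds of constraint discussed in Section 5.5 can be made concrete with a short sketch. It is an illustration under stated assumptions, not KEE or KL-ONE code: a valueclass specification is modeled as a Python predicate on a single slot value, a KB is modeled as nested dictionaries of value sets, and all function names (`one_of`, `interval`, `follow`, `role_value_map_holds`, and so on) and the sample frames are hypothetical. MEMBER.OF and SUBCLASS.OF are omitted because they would need a class hierarchy.

```python
from typing import Any, Callable, Dict, Iterable, List, Set

# Part 1: valueclass-style specifications as composable predicates.
Spec = Callable[[Any], bool]


def one_of(*atoms: Any) -> Spec:
    return lambda v: v in atoms


def interval(low: float, high: float) -> Spec:
    return lambda v: low < v < high


def memberp(fn: Callable[[Any], Any]) -> Spec:
    return lambda v: bool(fn(v))


def not_in(spec: Spec) -> Spec:
    return lambda v: not spec(v)


def union(*specs: Spec) -> Spec:
    return lambda v: any(s(v) for s in specs)


def intersection(*specs: Spec) -> Spec:
    return lambda v: all(s(v) for s in specs)


def satisfies(values: Iterable[Any], spec: Spec) -> bool:
    """Every value of the slot must satisfy the specification."""
    return all(spec(v) for v in values)


# Part 2: a role value map between two role chains.
Frames = Dict[str, Dict[str, Set[str]]]  # frame name -> slot name -> values


def follow(kb: Frames, start: str, chain: List[str]) -> Set[str]:
    """Values reached from `start` along a chain of slots, for example
    ["Memory Chip", "Manufacturer"] for start.Memory Chip->Manufacturer."""
    current = {start}
    for slot in chain:
        current = {v for f in current for v in kb.get(f, {}).get(slot, set())}
    return current


def role_value_map_holds(kb: Frames, frame: str, chain_a: List[str],
                         chain_b: List[str], relation: str = "equal") -> bool:
    a, b = follow(kb, frame, chain_a), follow(kb, frame, chain_b)
    if relation == "equal":
        return a == b
    if relation == "subset":
        return a <= b
    raise ValueError("relation must be 'equal' or 'subset'")


spec = intersection(interval(0, 100), not_in(one_of(13)))
kb: Frames = {
    "Computer-1": {"Memory Chip": {"M1"}, "Processor Chip": {"P1"},
                   "Disk Controller": {"D1"}},
    "M1": {"Manufacturer": {"Acme"}},
    "P1": {"Manufacturer": {"Acme"}},
    "D1": {"Manufacturer": {"Other Corp"}},
}
print(satisfies([5, 42], spec))
print(role_value_map_holds(kb, "Computer-1",
                           ["Memory Chip", "Manufacturer"],
                           ["Processor Chip", "Manufacturer"]))
```

Treating each specification as a closure makes the compositional operators one-liners; a structural description would generalize `role_value_map_holds` by accepting an arbitrary relation over the two value sets instead of only equality or subset.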
5.6 Discussion

One of the few performance studies in the FRS literature is that by Mitchell et al. [57], who studied the effect of their caching mechanism (as well as other learning mechanisms) on the performance of THEO. In one experiment, a series of 300 queries were made to a KB of family relationships, with caching enabled during all queries. Queries early in the series took on the order of 10 seconds, whereas queries later in the series took on the order of 0.5 seconds, showing that the accumulation of cached information did improve performance. This study would be improved if it presented data for the same series of queries with caching completely disabled, so as to control for the cost of performing caching in the early queries. That is, we wish to know not only that the system performs faster as it caches more information, but that on average it performs faster than if caching is not used. Such data is given in an abbreviated form later in their paper for a different set of experiments.

Both facets and seeUnits (see Section 4.2) allow us to annotate slot values, but their relative merits remain to be determined. Perhaps by embedding annotations within the associated slot structure using facets we achieve a locality of reference that increases performance. The meta-information that they provide yields an increase in FRS expressiveness.

Some of the functionality provided by slotUnits could also be achieved by attaching procedures to a slot, or by storing the information in facets. SlotUnits have the advantage of providing a more succinct representation for commonly required information about slots; attached procedures suffer the additional disadvantage of being less declarative. In addition, when using slotUnits, information about slots is treated as a first-class entity within the FRS in the following respects. We can arrange slotUnits in a generalization hierarchy to use the benefits of inheritance in describing slots.
Information about slots is represented globally in slots of the slotUnit, rather than locally (and perhaps redundantly or inconsistently) as facets in every frame that the slot is used in, and therefore a given slot has the same semantics throughout a KB, which increases the understandability and maintainability of the KB. Finally, we can employ existing subsystems of the FRS (such as its query language) to manipulate information in slotUnits.

Similar comments apply to valueclass specifications since rules or attached procedures could be used to implement equivalent capabilities. Researchers have devised a special language that provides a succinct, declarative means of encoding commonly required classes of constraints. In the KL-ONE family, the declarative semantics of these constraints form the basis for classification. Fox et al. note [31] that enforcing valueclass restrictions can slow down a running system significantly (although we are not told how significantly), so SRL provides a way of disabling value-restriction checks; they are typically enabled only during system development and debugging.

6 INHERITANCE

Inheritance is an inference mechanism in which beliefs about a frame in a taxonomic hierarchy are acquired from its parents in the hierarchy. This section explores the inheritance mechanisms present in a variety of FRSs to answer such questions as: When does the computation of inheritance occur in different FRSs? Can conflicts arise when a frame inherits information from multiple parents, and if so, how can these conflicts be resolved? What different semantic modes of inheritance exist? And finally, exactly what information is manipulated during inheritance?

6.1 What Information is Inherited?
Generally speaking, when we define a slot S in a class frame C, inheritance causes all children of C to contain the slot S. That is, to users it will appear that every child of C contains a slot named S that has the same datatype and value constraints as does the S in C (footnote 13). The intended semantics of this operation are that if a certain attribute or relation applies to a class of objects, or to a concept, then that attribute or relation must apply to every child of that class or concept, with value restrictions at least as strict as those that apply to the class or concept.

A fundamental distinction between FRSs is whether or not child frames acquire slot values during inheritance. Systems in the UNITS family and the SRL family do allow the inheritance of slot values. The semantics of this operation is that when a slot value is defined in a class frame, we would like the FRS to assume by default that the same value holds in a child of that class, unless the user explicitly stores a local value in the slot of the child. For example, when we define 32 as the value of the Word Size slot in the VAX Family frame, we wish that value to be the default value of this slot in all children of VAX Family. This inference is a form of default reasoning.

Footnote 13. An exception is own slots, which are never inherited by child frames (see Section 5.3).

Another variation is that some FRSs allow inheritance of properties across other types of relations than the instance or super-sub relations. For example, in OZONE, PARMENIDES, CYCL, HYPERCLASS, FRAMEKIT, and SRL, a user could define inheritance across a part relation so that the parts of an object would inherit information from the containing object. SRL and CYCL users employ a slotUnit to describe how inheritance is computed for each relation.
The slotUnit specifies what slots are inherited across that relation (for example, the owned-by slot might be inherited across the part relation since we expect the owner of an object to also own all of the parts of that object). In SRL this mechanism also allows users to specify a mapping of slot values across a relation; for example, if the relation previous-activity associates frame activity1 with frame activity2, we could specify that the value of activity1.finish time is mapped to the value of activity2.start time [31].

The JOSIE ability to treat slots as classes allows other types of inheritance relationships [60]. This facility would allow us to assert that Digital manufactures all members of the Vax family, that is, that the values of a given slot Digital.Manufactures include all members of a class Vax Family. It allows us to assert that the values of a given slot in a given frame are a subset of the values of some other slot in some other frame. And we can assert that all of the computers manufactured by Digital are manufactured in the United States: Digital.Manufactures.Location=USA.

Inheritance in most of the KL-ONE family of FRSs does not include slot values because the notion of a default conflicts with the definitional view of concepts in the KL-ONE family (see Section 9.2). Thus, systems such as KL-ONE, NIKL, and KRYPTON cannot define default values. The JOSIE system distinguishes necessary defaults (those for which no exceptions exist) from defaults that can be overridden.

6.2 Semantic Modes of Inheritance

The semantics of some slots requires a different interaction between default values and local values than that described in the previous section, where local information overrides inherited information. Therefore, some FRSs allow the user to specify one of several different modes of inheritance for slots.
The KL-ONE and SRL families do not provide multiple types of inheritance, whereas the UNIT Package, KEE, and PROTEUS do, and each defines a somewhat different set of inheritance types. In KEE users define a slot inheritance mode by setting the value of a special facet that is defined for each slot. KEE supports the following inheritance modes to determine the observable value of a slot S in a child frame C ($C.S$) given the value of S in a parent frame P ($P.S$), and the value of S that is stored locally in C ($C.S'$):

OVERRIDE.VALUES: Local values override inherited values, so that $C.S = P.S$ if $C.S' = \mathrm{NULL}$, and $C.S = C.S'$ otherwise.
UNION: Local values are unioned with inherited values: $C.S = P.S \cup C.S'$.
RUNION: Same as UNION but the order of the slot values is reversed.
SAME.VALUES: No differing local values allowed: $C.S = P.S$.
UNIQUE.VALUES: Inheritance is blocked completely: $C.S = C.S'$.
MINIMUM: Minimum of local and parent values (a MAXIMUM mode exists also): $C.S = \min(P.S, C.S')$.
METHOD: Used for attached procedures.
UNION.EACH.VALUE: Analogous to UNION except that the slot value must be a list, and individual elements of the lists are unioned.
VCSIMPLIFY: Used to simplify inherited slot constraints.

OVERRIDE.VALUES is the mode found in most FRSs, but the other modes do have utility. For example, bookkeepingSlots (see Section 5.3) should use the UNIQUE.VALUES mode since information about a frame (such as the name of its creator) should not necessarily be inherited by the children of that frame.

6.3 Conflicts in Inheritance

Early systems such as the UNIT Package allowed a frame to have only a single parent in the inheritance hierarchy. Later systems such as KEE, CYCL, SRL, and the KL-ONE family allow a node to have multiple parents, and therefore to inherit information from more than one parent.
The KL-ONE family does not allow multiple parents to provide conflicting information to their children; this situation would be declared an inconsistency by the classifier (see Section 9). Most FRSs that do allow inheritance of conflicting attributes provide different mechanisms to resolve potential conflicts. FROBS allows the user to explicitly specify which parent the value should be inherited from, thus specifying the "context" from which the value should be taken. By default the STROBE inheritance mechanism uses the first value that it finds in a breadth-first search of the child frame's ancestors, but the user can specify a depth-first search on a per-slot basis. FRAMEKIT allows the user to specify best-first or depth-first search, or an exhaustive search in which the final inherited value is the union of all values encountered during the search. OZONE allows user-defined search functions.

Conflicts in inheritance become an even greater problem when an FRS computes inheritance over multiple relations (such as the part relation discussed in Section 6.1). Now conflicts can occur over multiple relations, in addition to the conflicts from multiple parents over the super-sub relation. SRL users can specify the order in which links are to be searched from a given frame.

6.4 Time of Inheritance

Different FRSs compute slot inheritance at different times. In systems that perform fetch-time inheritance, inheritance occurs at the time an application program requests the value of a slot: the inheritance system climbs the inheritance hierarchy searching for a value to inherit (used by the UNIT Package and KEE, for example). The inherited value is then returned to the application, but is never physically stored at the child frame. Conversely, in systems that perform assertion-time inheritance (footnote 14), when a parent frame is defined or altered or classified, inherited slot information is physically copied to all child frames (used, for example, by the KL-ONE family).
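The contrast between the two timings can be sketched in a few lines. This is a simplified illustration, not the code of any FRS named here: it assumes a single parent per frame and default-value (override) semantics, and the class `Frame` with its `local` and `copied` tables and its method names is hypothetical. The Word Size example follows Section 6.1.

```python
from typing import Any, Dict, List, Optional


class Frame:
    """A frame with at most one parent (an assumption of this sketch)."""

    def __init__(self, name: str, parent: Optional["Frame"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Frame"] = []
        self.local: Dict[str, Any] = {}   # values asserted directly at this frame
        self.copied: Dict[str, Any] = {}  # values copied down at assertion time
        if parent is not None:
            parent.children.append(self)
            # A new child starts with everything currently visible at its parent.
            self.copied.update(parent.copied)
            self.copied.update(parent.local)

    def put(self, slot: str, value: Any) -> None:
        """Assert a local value and copy it down to all descendants."""
        self.local[slot] = value
        self._copy_down(slot, value)

    def _copy_down(self, slot: str, value: Any) -> None:
        for child in self.children:
            if slot in child.local:
                continue  # the child's local value shadows ours for its subtree
            child.copied[slot] = value
            child._copy_down(slot, value)

    def get_fetch_time(self, slot: str) -> Any:
        """Fetch-time inheritance: climb the hierarchy on every read."""
        frame = self
        while frame is not None:
            if slot in frame.local:
                return frame.local[slot]
            frame = frame.parent
        return None

    def get_assertion_time(self, slot: str) -> Any:
        """Assertion-time inheritance: read the value copied in earlier."""
        if slot in self.local:
            return self.local[slot]
        return self.copied.get(slot)


vax = Frame("VAX Family")
v780 = Frame("VAX-11/780", vax)
vax.put("Word Size", 32)
print(v780.get_fetch_time("Word Size"), v780.get_assertion_time("Word Size"))
```

The read path of `get_assertion_time` is a constant-time lookup, but every `put` must visit the whole subtree and every frame stores its own copy; `get_fetch_time` stores nothing extra but pays for a climb on each read. The sketch propagates changes incrementally, which is one possible design and not a claim about how any particular system behaves.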
In many systems that compute inheritance at assertion time (such as NIKL), if a user wishes to change a slot definition in a parent frame, the user must reload the entire KB because the system is unable to incrementally update the cached information. THEO combines these two approaches by allowing users to specify that inherited slot information should be cached in child frames at fetch time (RLL and CLASS provided a similar caching mechanism). THEO also records justifications that describe the dependencies between the cached value and the slots it was derived from, so that cached values can be removed in response to changes in the values that they depend on. PARMENIDES users can specify for every slot whether inheritance is to occur at fetch time or at assertion time. When a user changes the definition of a PARMENIDES slot that is inherited at assertion time, the system immediately propagates the new slot definition to all children of that frame, but information defined locally at a child frame is not overwritten.

6.5 Discussion

Assertion-time and fetch-time inheritance have different speed and space requirements, which are not understood, and they also have different run-time flexibility. Schoen made an initial step towards understanding this tradeoff by measuring that, in a system that performs fetch-time inheritance, each additional direct-super encountered during an inheritance search slowed down inheritance by an additional 17%, assuming 8 slots per class frame [81, p. 203]. By performing inheritance at assertion time, the authors of NIKL eliminate the need for fetch-time searches up the inheritance hierarchy. But they force the user to reload the entire KB when a class definition changes, thus decreasing the run-time flexibility of these definitions, and providing an approach that probably will not scale up to very large KBs (neither of which disadvantages are noted in a recent assessment of NIKL [78]).
Since Schmolze and Mark have stated that a major goal of implementing NIKL was to improve upon the speed of KL-ONE [78], we would expect to find a thorough evaluation of exactly what improvements were made to NIKL, and of the speedup obtained through each improvement. Instead, their paper is an example of vague, imprecise analysis of an implemented FRS. The most detailed timing measurement we are given is: "Overall, the NIKL system was an order of magnitude faster than KL-ONE." [78, p. 9]. What contributed to this speedup? "This decision to trade off more space for less time has borne out well, given the current economics of computation and the needs of most applications" [78, p. 9]. We should like to know exactly how space was traded for time, and we should be given rigorous experimental data that convinces us that part of the speedup was not due to the use of faster hardware. Another secret of the performance improvement is explained as "hashing and other fast schemes were used when possible" [78, p. 9]; in other words, we have no idea what the "other fast schemes" are, or exactly how any of these schemes were implemented.

Footnote 14. Called propagation in PARMENIDES.

PARMENIDES also performs inheritance at assertion time, but it recomputes inheritance when new assertions are made about classes, providing more flexibility but requiring a more complex implementation. We are not told how complex the implementation is.

Similar to slotUnits and valueclass restrictions (see Section 5.6), multiple inheritance modes could also be implemented with attached procedures, but a special language provides a succinct, declarative means of encoding common operations. We lack information about how useful multiple inheritance modes are in practice, or the costs of implementing them. They would surely complicate the computation of subsumption were they introduced into the KL-ONE family.
Similarly, inheritance across multiple link types could be implemented through rules or attached procedures, but may have benefits of succinctness and declarativeness. The ability to explicitly establish an ordering for link searches in SRL is similar to a metarule facility, thus providing an additional type of expressibility. However, rules are such an obvious competitor to this approach that a more detailed analysis is needed. Is it preferable to use inheritance for inference over instance and super-sub relations, and rules for other types of inference, or is inheritance preferable for all these forms of inference?

Although a production-rule facility (more precisely, a default-rule facility) could provide many of the same inferences as described in this section, another key reason to prefer the inheritance mechanisms described in this section is that they optimize the performance of this special type of inference. Using rules to implement inheritance would require backward-chaining searches through a large rule base to compute inherited values, unless some type of compilation were used. The super-sub links in a taxonomic hierarchy can be viewed as a way of compiling rules into chains that support a particular set of inferences. Systems that compute inheritance at run time still search these rule chains, whereas systems that compute inheritance at assertion time have even compiled out the searches along rule chains.

The publications that described alternative ways of resolving conflicting inherited information (such as breadth-first, depth-first, and exhaustive search through the ancestors of a frame) do not report how acceptable the inferences produced by these methods are in practice, nor the computational costs of these methods.

A number of researchers have investigated theoretical aspects of inheritance in network-based KR systems [97, 96, 83, 74].
Although these researchers have obtained interesting results, their utility with respect to the FRSs discussed in this paper is unclear because their research was in the context of toy problems and has not been proven to generalize to real-world problems of the sort addressed by the FRSs discussed herein. In addition, their basic model of inheritance intimately involves a type of relation called IS-NOT-A that is not used by any of the FRSs discussed in this paper; it is not clear how easy it will be for either model to accommodate the other.

7 PERSISTENT KNOWLEDGE BASES

If FRSs are to be employed to build and manage large knowledge repositories, they must provide sophisticated facilities for transferring frame knowledge bases between virtual memory (where all processing of frame information takes place) and persistent secondary storage. Such facilities should use modern database techniques such as transactions to protect data from corruption by events such as operating system crashes or hardware failures.

Currently, the standard FRS approach to saving KBs in persistent form is for the user to explicitly execute an operation that saves an entire KB to disk storage. This approach does not scale as KB size increases because the time required for the save operation is proportional to the size of the KB, not to the number of updates made since the KB was last saved. The standard approach to moving frame data into virtual memory is to load an entire KB into memory before processing begins, which again takes time proportional to the size of the KB rather than to the amount of information that will be accessed. These simplistic approaches to loading and saving frame KBs will become less and less palatable as KBs increase in size. It is not the idea of keeping substantial amounts of frame data in virtual memory that is antiquated; in fact, the trends in the database community toward main-memory databases support this basic model.
Rather, FRSs must be more selective about what data is transferred between virtual memory and persistent storage to make these transfers faster. This section reviews a number of approaches to the storage and retrieval of persistent frame KBs.

THEO and CLASS provided an early mechanism for managing large KBs. Frames in a single KB are partitioned into separate files that can be saved or loaded independently; the system automatically tracks which frames belong in which files. LOOM provides a similar facility: it can automatically save all classes, instances, relations, or methods within a KB to separate files.

THEO and CLASS manage only a single frame name space. In contrast, KEE, STROBE, ARLO, and LOOM provide a multiple KB facility. In a single session, users of these systems can load one or more KBs into virtual memory, where each KB has a separate frame name space, and can be saved to secondary storage (or deleted) independently of the other KBs. Frames in different KBs may reference one another, so that frame F1 in KB1 could be a child of frame F2 in KB2. In SB-ONE, ARLO, and K-REP, KBs themselves are linked in a hierarchy; all of the frames defined in a knowledge base K also appear to the user to be defined in all children of K. The authors of K-REP describe this facility as analogous to the COMMON LISP package system.

The UNIT Package and RLL employed variants of an early frame storage management scheme that provided demand paging of frames [89, p69][87]. These techniques were developed to allow KB size to exceed virtual-memory size on machines of 1970s vintage that had limited (18 bit) address spaces. Frames are read into virtual memory when first accessed, and are saved back to disk on a least-recently-used basis during LISP garbage collections. One limitation of this approach is that the data stored on secondary storage is not protected from corruption.
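The demand-paging scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `PagedKB` is hypothetical, a plain dictionary stands in for secondary storage, and eviction happens when a frame is paged in rather than during garbage collection as in the systems above. Like those systems, the sketch offers no protection against corruption of the stored data.

```python
from collections import OrderedDict


class PagedKB:
    """Hypothetical KB that pages frames in on first access and evicts by LRU."""

    def __init__(self, disk, capacity):
        self.disk = disk               # stand-in for secondary storage: name -> frame
        self.capacity = capacity       # maximum number of memory-resident frames
        self.resident = OrderedDict()  # name -> frame, least recently used first
        self.dirty = set()             # names of resident frames modified since paging in

    def fetch(self, name):
        """Return a frame, reading it from storage the first time it is accessed."""
        if name in self.resident:
            self.resident.move_to_end(name)  # mark as most recently used
        else:
            self.resident[name] = dict(self.disk[name])
            self._evict()
        return self.resident[name]

    def update(self, name, slot, value):
        """Modify a slot in memory; the change reaches storage only on eviction."""
        self.fetch(name)[slot] = value
        self.dirty.add(name)

    def _evict(self):
        """Drop least recently used frames, writing back those that were modified."""
        while len(self.resident) > self.capacity:
            victim, frame = self.resident.popitem(last=False)
            if victim in self.dirty:
                self.disk[victim] = frame
                self.dirty.discard(victim)
```

The cost of each operation in this sketch depends on the frames touched, not on the size of the KB, which is the property the whole-KB load and save operations lack.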
OZONE used similar techniques, but its slot spaces were used to distinguish slots that were read on demand from system slots that were always memory resident (such as slots containing parent-link information).

A recent paper by Mays et al [53] describes storage management facilities for K-REP that are a leap beyond the preceding capabilities. They interfaced K-REP to an object-oriented DBMS to support a versioned KB that can be updated by multiple users via transactions. In addition, Ballou et al interfaced PROTEUS to the ORION object-oriented DBMS [5], Abarbanel and Williams coupled KEE to a relational DBMS [1] to produce KEE Connection, and Peltason et al interfaced BACK to a relational DBMS [67]. McKay et al have developed an Intelligent Database Interface (IDI) [55] with essentially the same architecture as KEE Connection, although they have improved upon a number of the details of the AI/database coupling; for example, the IDI includes an intelligent cache of database information, and it can automatically obtain schema information from the database.

A final point of variability concerns the form in which KBs are stored persistently. Most KL-ONE descendants store KBs as a set of LISP forms (expressions) that, when executed, recreate the KB in virtual memory. One such form would both define a new concept, and then reclassify it into the KL-ONE hierarchy. Thus, all existing members of the KL-ONE family reclassify every concept every time a KB is loaded; the results of classification are not saved persistently. In contrast, most other systems store KBs as data files that contain a more direct representation of the virtual-memory form of the KB, and that are not processed by the LISP evaluator during loading.

7.1 Discussion

A multiple KB facility is invaluable for users and applications that access many different KBs over the long term, but who access few KBs on a day-to-day basis, because it allows them to load only the KBs that they require at a given time.
Users who update only a subset of the KBs that they have loaded also benefit, since they need only save a subset of the frames that they have loaded. Multiple name spaces are useful for operations such as simultaneously processing both a KB and a copy of that KB, which will necessarily contain many frames with the same name.

COMMON LISP packages obviate the need for a multiple KB facility to a large degree since packages provide separate name spaces. But package systems do not provide a way of renaming all symbols to another package, as is required for a KB rename operation. Packages also negate a motivation of OZONE spaces, namely providing separate name spaces for user and system slots.

The paper by Mays et al is notable in providing the best, most precise description of implementation techniques in the FRS literature. Otherwise we have little information about the implementation techniques employed for the capabilities in this section. We have essentially no knowledge of the performance of any of the techniques discussed in this section, such as the cost of providing inheritance among KBs themselves, or the performance of the storage management facilities. An exception is the work by McKay et al, which does give some performance measurements. Since performance is a key property of a storage system, it is virtually impossible to evaluate the relative merits of these alternatives (e.g., Ballou et al versus Mays et al) without comprehensive performance data. Even with such data, a comparison will be difficult since, for example, the Ballou and Mays groups utilized different underlying DBMSs (ORION and STATICE, respectively), and therefore it will be difficult to determine how much of the performance depends on the DBMS itself.

8 OBJECT-ORIENTED AND ACCESS-ORIENTED PROGRAMMING

Some FRSs contain an object-oriented programming (OOP) facility [33], or an access-oriented programming (AOP) facility [95], or both.
As noted in [95], these two programming paradigms are duals of one another.

8.1 Object-Oriented Programming

The UNIT Package provided support for OOP that was developed further in KEE and STROBE. A user of the UNIT Package could attach a LISP function (or method) to either a frame or a slot. Users could invoke a method by sending a message to that frame or slot. For example, we would attach a method called If-Deleted to a frame by creating a slot within that frame called If-Deleted, and storing the definition of a LISP function in that slot. We could send a message to that method by calling a UNIT Package function called UNITMSG whose parameters include the name of the slot and the name of the method. Methods are associated with slots in a similar fashion (they are stored in a facet of the desired slot). Methods are inherited along the generalization hierarchy.

The OOP facilities of STROBE and KEE differ in several respects. Like the UNIT Package, STROBE searches up the generalization hierarchy for a method to execute when a message is sent. But in addition to executing the first method found in this search, the user can specify that STROBE execute all of the methods found in the ancestors of the frame to which the message was sent, in either parent-child or child-parent order. In this way frames can acquire specialized behaviors in addition to the general behaviors they inherit from their ancestors. In addition, if no method definition is found via inheritance, STROBE will attempt to perform datatype rerouting: the message is sent to the frame that describes the datatype of the slot to which the original message was sent. Thus the user can describe how all slots of a given datatype should respond to a particular type of message.

KEE provides a sophisticated mechanism whereby methods can be modified by inheritance such that methods defined in frames high in the hierarchy can be altered by the children of those frames.
Specifically, every method consists of four sections called main code, before code, after code, and wrapper code. The before code and after code sections of a method define LISP code that is executed before and after the main code section, respectively, and which is inherited using union inheritance (see Section 6.2). The before code present in frame F is the concatenation of the before code defined in F and the before code defined in the parents of F. Main code is inherited using override inheritance, so that locally defined main code overrides parental main code. The wrapper code section defines code that is wrapped around the concatenation of a frame's before code, main code, and after code.

8.2 Access-Oriented Programming

Access-oriented programming is more prevalent in FRSs than is OOP, and is used in systems such as the UNITS family, KL-ONE, LOOM, the SRL family, FROBS, and OZONE. An AOP facility allows the programmer to associate programmatic annotations with data, such that the annotations are automatically executed when different classes of operations are performed on the data (in an FRS, these data are slot values). The annotations can be dynamically attached to and removed from the data, and the existence of the annotations is transparent to programs that are not explicitly attempting to manipulate the annotations (i.e., to programs that are only attempting to manipulate the data). Usually the annotations are LISP procedures, but in THEO, for example, annotations can be PROLOG rules.

One common type of annotation function is invoked when a user requests the value of the slot; the function computes the value of the slot and returns the value to the user in a transparent fashion, as if the value were stored in that slot. Another common type of annotation function is invoked when a user modifies the value of a slot. The annotation function might update other slots, or perhaps external databases, whose values depend on the modified slot.
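The two common annotation types just described can be sketched as follows. This is a minimal Python illustration, not the mechanism of any particular FRS: the class `Frame` and the dictionaries `on_get` and `on_put` are hypothetical names, and inheritance of annotations is omitted.

```python
class Frame:
    """Hypothetical frame whose slots may carry get and put annotations."""

    def __init__(self, name):
        self.name = name
        self.values = {}  # slot -> stored value
        self.on_get = {}  # slot -> function(frame) that computes the slot value
        self.on_put = {}  # slot -> function(frame, value) run after a modification

    def get(self, slot):
        """Return the slot value; a get annotation computes it transparently."""
        if slot in self.on_get:
            return self.on_get[slot](self)
        return self.values.get(slot)

    def put(self, slot, value):
        """Store a value, then run the put annotation for that slot, if any."""
        self.values[slot] = value
        if slot in self.on_put:
            self.on_put[slot](self, value)


rect = Frame("rect")
rect.put("width", 3)
rect.put("height", 4)

# Get annotation: "area" is never stored, yet reads as if it were.
rect.on_get["area"] = lambda f: f.get("width") * f.get("height")

# Put annotation: record each change to "width" in a stand-in external store.
audit_log = []
rect.on_put["width"] = lambda f, v: audit_log.append((f.name, "width", v))
```

A caller that only reads and writes slots through `get` and `put` cannot tell whether a slot is annotated, and an annotation can be attached or removed at run time by editing the dictionaries.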
In FRAMEKIT, for example, a user can annotate a slot by storing LISP functions in facets called If-Needed, If-Accessed, If-Added, and If-Erased. The If-Needed function will be invoked if a user attempts to get the value of the associated slot, and if that slot currently has no value; if the slot does have a value then the If-Accessed function (if any) will be invoked instead. The If-Added function will be invoked when the user adds a new value to a slot, and the If-Erased function will be called when the user erases the value of a slot. PARMENIDES provides for additional annotations called Pre-If-Set and Post-If-Set that are invoked before and after a new value is added to a slot, respectively. SRL, the ancestor of both of these systems, has a more general mechanism. An SRL annotation is itself described by a frame that specifies such things as: by what type of slot operation the annotation is to be invoked (e.g., a get or a put operation); whether the annotation should be invoked before or after that slot operation is performed; and what "effect" the annotation has: does it alter the value returned by the slot operation, does it have a side effect that does not alter the returned value, or should the slot operation be completely blocked from occurring?

Other FRSs allow similar types of annotations: STROBE allows annotations that are executed upon creation and deletion of frames and slots, and before and after modification of and access to slot values. KL-ONE allows attached procedures to be invoked before and after the KB operations Individuate, Restrict, Create, and Remove. FRSs that do not support AOP include NIKL, KRYPTON, and PROTEUS.

8.3 Discussion

The authors of STROBE and SRL note that the use of AOP can impose a fairly heavy computational cost [88, 31]. However, no publication contains actual measurements of the performance cost. Both systems allow users to disable this facility [88, 31]. In STROBE, the mechanism is usually disabled [79].
The authors of STROBE and of SRL state that the major component of the cost is performing inheritance searches to check for the existence of annotations whenever a slot value is accessed; this search must be performed even for slots that have no annotations in case inherited annotations exist. Smith and Carando suggest an optimization to avoid repeating these searches when the same slot is accessed multiple times, which is to cache a null value for the annotations locally within a slot when a search yields no annotations. This approach is used by THEO. Another approach would be for the FRS to automatically record in the slotUnit for slot S whether or not S is ever annotated anywhere in the KB; for slots that are unannotated, this global information could provide significant savings.

We have no other knowledge about implementation techniques for OOP or AOP, nor about the frequency with which these capabilities are needed in applications.

9 CLASSIFICATION

9.1 Overview

Intuitively, one class (or concept) definition A subsumes a second class definition B if the concept that A represents is more general than is the concept that B represents. For example, the class man subsumes the class father since all fathers are men. We can describe subsumption more precisely in set-theoretic terms. The extension of a concept is the set of instances of that concept that exist in the world. A subsumes B if the extension of A (written $|A|$) is a proper superset of the extension of B:

$$A > B \iff |A| \supset |B|$$

For example, the set of all fathers is a proper subset of the set of all men.

Members of the KL-ONE family automatically compute whether one class frame subsumes another, based on the definitions of the slots that comprise those classes. The computation of subsumption is the basis of classification. The operation positions a frame A into its proper position in the inheritance hierarchy.
Positioning A as a direct-super of B is proper if and only if A is the most specific subsumer of B, that is, if there exists no concept definition C such that A subsumes C and C subsumes B. A realizer (also called a recognizer) positions an instance frame in the hierarchy. The realizer finds the most specific concept A in the KB such that the instance is in the extension of A.

The definition of a concept C consists of two parts: a list of concepts that are more general than C, and a list of conditions that differentiate C from those ancestor concepts. These conditions are a variant of the slot value restrictions discussed in Section 5.5. For example, in KL-ONE we would define a father as an all-sub of man that has at least one value in the child slot.

At first glance classification may seem redundant since every class definition already contains information about the placement of the concept in the inheritance hierarchy: the definition of father explicitly notes that it is an all-sub of man. Classification is needed because these definitions may not be precise (or even consistent) in a global sense. For example, the definition of father tells us that father is more specific than man, and therefore that father belongs below man in the taxonomy. But the definition does not explicitly tell us that man is the most specific subsumer of father, because father might be subsumed by other concepts that have been defined relative to man.

As another example, the class Husband might be defined as an all-sub of Married Person, whose Wife slot is restricted to store values of type Woman. Husband may not belong directly below Married Person in the hierarchy. A KB describing royalty might define a concept called Royal Husband to take into account the fact that royal children sometimes marry. Royal Husband is defined as an all-sub of Married Person whose Wife slot is restricted to take values of type Female (an ancestor of Woman).
In this KB, a classifier would link Husband as a direct-sub of Royal Husband rather than of Married Person. In a sense the original specification that Husband is an all-sub of Married Person is treated as advice that should be refined by the classifier. FRSs outside the KL-ONE family assume that all subsumption relationships are given by the user at class definition time, and therefore that no additional relationships remain to be discovered.

The basis of classification lies in comparing the intensional concept definitions that give necessary and sufficient conditions for membership in a class. That is, the KL-ONE family of FRSs interprets the slot value restrictions for a given class C as defining necessary and sufficient conditions for recognizing an instance of C. We infer that Royal Husband subsumes Husband by comparing the definitions of their Wife slots.

In practice, this precise definitional view of concepts breaks down when we attempt to define necessary and sufficient conditions for natural kinds such as the concept of a fish. We cannot list necessary and sufficient conditions for what it means to be a fish because the concept is so complex, and so imprecise. Therefore, the KL-ONE family allows users to define a concept as primitive if it is impossible to specify necessary and sufficient conditions for membership in that class; the definition of a primitive concept specifies necessary conditions only. Primitive status tells an FRS that the definition of the class cannot be expressed within the existing language of definitions, and therefore that the subsumption relation cannot be computed between two primitive concepts. However, classifiers can compute the subsumption relation between two concepts that are defined as all-subs of the same primitive concept (e.g., between two nonprimitive all-subs of Fish). See [14, p420] for a more detailed discussion of primitive versus defined concepts.
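The Husband and Royal Husband example can be worked through with a small Python sketch of structural subsumption. This is a deliberately simplified illustration, not the algorithm of any KL-ONE family member: it handles only type restrictions on slots, treats Married Person as primitive (an assumption), and the names `Concept`, `TYPE_PARENTS`, `subsumes`, and `most_specific_subsumers` are hypothetical.

```python
# Hypothetical hierarchy of slot value types: each type maps to its parent.
TYPE_PARENTS = {"Woman": "Female", "Female": "Person", "Person": None}


def type_at_or_below(t, u):
    """True if type t is u or a descendant of u."""
    while t is not None:
        if t == u:
            return True
        t = TYPE_PARENTS.get(t)
    return False


class Concept:
    def __init__(self, name, supers=(), restrictions=None, primitive=False):
        self.name = name
        self.supers = list(supers)              # told (user-asserted) superconcepts
        self.restrictions = restrictions or {}  # slot -> required value type
        self.primitive = primitive

    def primitives(self):
        """Primitive concepts among this concept and its told ancestors."""
        found = {self} if self.primitive else set()
        for s in self.supers:
            found |= s.primitives()
        return found

    def all_restrictions(self):
        """Inherited restrictions, with local ones taking precedence (a simplification)."""
        merged = {}
        for s in self.supers:
            merged.update(s.all_restrictions())
        merged.update(self.restrictions)
        return merged


def subsumes(a, b):
    """a subsumes b if b lies under a's primitives and meets each of a's restrictions."""
    if not a.primitives() <= b.primitives():
        return False
    rb = b.all_restrictions()
    return all(slot in rb and type_at_or_below(rb[slot], t)
               for slot, t in a.all_restrictions().items())


def most_specific_subsumers(c, kb):
    """Subsumers of c that have no other subsumer of c strictly below them."""
    subs = [d for d in kb if d is not c and subsumes(d, c)]
    return [d for d in subs
            if not any(e is not d and subsumes(d, e) and not subsumes(e, d)
                       for e in subs)]


married = Concept("Married Person", primitive=True)
royal = Concept("Royal Husband", [married], {"Wife": "Female"})
husband = Concept("Husband", [married], {"Wife": "Woman"})
kb = [married, royal, husband]
```

Under these assumptions, `most_specific_subsumers(husband, kb)` yields Royal Husband rather than the told parent Married Person, which is the refinement the text attributes to a classifier. Two unrelated primitive concepts are never found to subsume one another in this sketch, since each can only be reached through told links.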
FRSs that employ classification are often called terminological reasoners because they maintain relationships between a set of terms (class or concept definitions). The KL-ONE-style emphasis on term definition, that is, on the manipulation of structured descriptions, led to a system architecture that separates the subsystem that manipulates terms from the subsystem that manipulates assertions with respect to those terms. This distinction arose in KRYPTON and is maintained in KRYPTON's successors. The TBox (terminologic box)

Cite this paper

@inproceedings{Karp1993TheDS, title={The Design Space of Frame Knowledge Representation Systems}, author={Peter D. Karp}, year={1993} }