Reduction of Enumeration in Grammar Acquisition

Abstract

c) non-sentences identified as such.

We note that the availability of structure (item b)) differentiates this model from other current studies on the acquisition of grammar (2)(3). The knowledge of structure, prior to the acquisition of a grammar, can be defended on three grounds. First, a child learning a language has stress and intonation information available to him, and this could be interpreted as a type of structural information. Second, if our grammars describe the base component (deep structure) of a language, then there is an intimate relation between structure and meaning, the structure being a prerequisite for understanding a sentence. The widespread belief that there must be a partially semantic basis for the acquisition of syntax then implies the availability of some structural information to the learner of the language. Third, availability of structure greatly reduces the number of alternative possible grammars, and ensures that the acquired grammar generates sentences with structures consistent with their meanings.

** Work supported by Consiglio Nazionale delle Ricerche. The first part of this study was done in the Dept. of Computer Science, University of California, Los Angeles.

Given the data, a model could consider the enumeration (iii) of attainable grammars, and test their compatibility with the data a), b) and c). The test is possible because of (iv). The model then selects one of the compatible grammars by means of the evaluation measure (v).
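The enumerate, test, and select procedure just described can be illustrated with a minimal sketch. It is not the procedure of this paper: the candidate list, the names `Candidate`, `compatible` and `select_grammar`, and the choice of "number of productions" as the evaluation measure are all assumptions made for illustration, and the structural data b) are ignored for brevity. Each candidate grammar is represented only by a decidable membership test and a cost.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate grammar: a name, a membership test, a cost."""
    name: str
    accepts: Callable[[str], bool]  # decidable membership test (assumption)
    cost: int                       # evaluation measure, lower is better (assumption)


def compatible(g: Candidate, positives: Sequence[str], negatives: Sequence[str]) -> bool:
    """True if g accepts every sentence and rejects every non-sentence."""
    return all(g.accepts(s) for s in positives) and not any(g.accepts(n) for n in negatives)


def select_grammar(candidates: Sequence[Candidate],
                   positives: Sequence[str],
                   negatives: Sequence[str]) -> Optional[Candidate]:
    """Enumerate the candidates, keep the compatible ones, return the cheapest."""
    survivors = [g for g in candidates if compatible(g, positives, negatives)]
    return min(survivors, key=lambda g: g.cost) if survivors else None


def _regex(pattern: str) -> Callable[[str], bool]:
    return lambda s: re.fullmatch(pattern, s) is not None


def _is_anbn(s: str) -> bool:
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


# Toy enumeration; each cost is the production count of a small grammar
# for the language (e.g. S -> aSb | ab has 2 productions).
CANDIDATES = [
    Candidate("(a|b)*", _regex(r"[ab]*"), 3),
    Candidate("(ab)+", _regex(r"(ab)+"), 2),
    Candidate("a+ b+", _regex(r"a+b+"), 5),
    Candidate("a^n b^n", _is_anbn, 2),
]

chosen = select_grammar(CANDIDATES, ["ab", "aabb"], ["abab", "ba"])
print(chosen.name if chosen else "no compatible grammar")
```

With this sample, "(a|b)*" is rejected because it accepts the non-sentences, "(ab)+" because it misses "aabb", and the evaluation measure then prefers "a^n b^n" over the larger "a+ b+" grammar.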
A more accurate model should also explain the gradual development of an appropriate hypothesis, and the continual accretion of linguistic competence, rather than just considering the idealized instantaneous moment of acquisition of the correct grammar. We note with Chomsky that different evaluation measures will assign different ranks to alternative hypotheses regarding the language of which the primary data are a sample. Hence choice of an evaluation measure for grammars amounts to deciding which generalizations about language are significant.

Session No. 13 Computer Understanding II (Representation) 547

The class of possible hypotheses must be limited, if a realistic theory of syntax acquisition is to be developed. However, the limitation must still yield a class of grammars that is adequate in strong (and a fortiori weak) generative capacity. But beyond this, the requirement of feasibility is the major constraint of the model.

I shall have very few words to add to Chomsky's formulation of the problem, mainly for carrying further the formalization. My concern was for (1) delimiting a class of grammars which proved adequate in strong generative capacity, and defining a strategy for (2) reduction of possible hypotheses, and for (3) selecting a unique grammar. There is some similarity between the grammars that we are going to describe and Bar-Hillel's (4) categorial grammars. A model similar to the one that we shall describe, although of reduced scope, proved its feasibility on the computer (5)(6).

Following Gold (7), Feldman (2), Crespi-Reghizzi (5)(6) and Biermann (3), consider a context-free source grammar $G_S$ and a source language $L_S = L(G_S)$.
The parenthesis grammar $[G_S]$ is derived from $G_S = (V_N, V_T, P, S)$ by replacement of every production $A \to u$, where $u$ is a string over $V = V_N \cup V_T$, with the parenthesized production $A \to [u]$, where "[" and "]" are not in $V$. Note that "renaming productions", i.e. productions such as $A \to B$ where $B$ is a nonterminal, are not parenthesized.

A positive sample $S_t = \{s_1, s_2, \ldots, s_t\}$ and a negative sample $M_t = \{n_1, n_2, \ldots, n_t\}$ compose the primary data available to the model at the discrete time $t$. At each time $t$ we are interested in observing the grammar $G_t$ which is output by the grammar acquisition device to account for the samples $S_t$ and $M_t$. $G_t$ must meet two requirements for compatibility with the data:

1) $L([G_t]) \supseteq S_t$;
2) $L([G_t]) \cap M_t = \emptyset$.

If the model has to account for the increasing linguistic skill of the learner, we have to consider the grammars $G_{t_1}, G_{t_2}, \ldots$ at successive time instants $t_1, t_2, \ldots$ corresponding to samples $S_{t_1}, S_{t_2}, \ldots$. We shall expect the guesses of the algorithm to become closer to the source language as the sample is enlarged. The model is said to identify the source grammar $G_S$ in the limit if there is a time $t'$ such that for $t > t'$, and for any information sequences $I$, $N$:

3) $G_t = G_{t'}$;
4) $L([G_t]) = L([G_S])$.

In other words, identification in the limit implies that there are primary data $S_t$, $M_t$, $t > t'$, which cause the model to select a grammar which is not later changed and is strongly equivalent to the source grammar. This must be true for any information sequences for that source grammar.

Two results, due to Gold (7), should be mentioned.

Theorem 1. Let $C$ be a class of decidable grammars (i.e. grammars for which it is decidable whether a string is generated by the grammar). Then, if $G_S$ is in $C$, there is an algorithm which identifies $G_S$ in the limit.

Theorem 2. Let $C$ be a class of grammars which generate all the finite languages and any one infinite language. Assume the primary data consist only of a positive information sequence. Then, if $G_S$ is in $C$, there exists no algorithm for the class $C$ which is able to identify $G_S$ in the limit.

As a special case of Theorem 1, context free languages are identifiable in the limit if sentences and non-sentences (identified as such) are given. If the latter are not available, Theorem 2 implies that not even finite state languages can be identified in the limit. We note that the proof of the first theorem is based on the fact that grammars are enumerable, and on the possibility of testing 1) and 2) for a decidable grammar. Such a proof does not lend itself to a feasible grammar acquisition method for context free grammars, because of the astronomical number of grammars to be tested.

2. A CHARACTERIZATION OF POSSIBLE GRAMMARS

In a previous work (5)(6) we have introduced a subclass of context free grammars, termed free operator precedence grammars, for which identification in the limit is possible when only positive information is available. In this sequel we discuss an extension of these concepts to include more general classes of languages. We shall first define a family of classes of grammars (k-distinct grammars) which cover the entire spectrum of context free languages. When a further restriction on the form of productions is imposed, we obtain the family of k-distinct and k-homogeneous grammars, whose strong generative capacity does not cover the full spectrum of context free languages, although our new grammars are able to account for self-embedding, nesting and other features required for natural languages.
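Before the new definitions, the parenthesis grammar construction and the compatibility requirements 1) and 2) given earlier can be made concrete with a small sketch. Everything here is an illustrative assumption rather than the author's implementation: a grammar is represented as a dictionary from nonterminals to lists of right-hand sides, terminals are single characters, and the hypothetical helpers `parenthesize`, `language` and `compatible` test membership by exhaustively generating all sentences up to a length bound.

```python
def parenthesize(grammar):
    """Wrap every right-hand side in "[" ... "]", except renaming productions A -> B."""
    nonterminals = set(grammar)
    result = {}
    for lhs, alternatives in grammar.items():
        result[lhs] = []
        for rhs in alternatives:
            is_renaming = len(rhs) == 1 and rhs[0] in nonterminals
            result[lhs].append(rhs if is_renaming else ("[",) + rhs + ("]",))
    return result


def language(grammar, start, max_len):
    """All sentences of length <= max_len, found by leftmost expansion.

    Pruning on the length of the sentential form assumes that no nonterminal
    derives the empty string, which holds for a parenthesized grammar.
    """
    seen, frontier, sentences = {(start,)}, [(start,)], set()
    while frontier:
        form = frontier.pop()
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:  # no nonterminal left: the form is a sentence
            sentences.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                frontier.append(new)
    return sentences


def compatible(grammar, start, positives, negatives):
    """Requirements 1) and 2): all positives generated, no negative generated."""
    max_len = max((len(s) for s in list(positives) + list(negatives)), default=0)
    lang = language(parenthesize(grammar), start, max_len)
    return set(positives) <= lang and not (lang & set(negatives))


# Toy grammar S -> a S b | a b; its parenthesis grammar is S -> [aSb] | [ab].
G = {"S": [("a", "S", "b"), ("a", "b")]}
print(sorted(language(parenthesize(G), "S", 8)))
print(compatible(G, "S", ["[ab]", "[a[ab]b]"], ["[a[a]bb]"]))
```

The sample sentences carry their bracketing, so a string with the right terminals but the wrong structure, such as "[a[a]bb]", counts as a non-sentence of the parenthesis grammar.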
For each class of grammars in the family, identification in the limit without knowledge of non-sentences is theoretically possible and practically feasible.

We now introduce some new definitions. Let $G = (V_N, V_T, P, S)$ be a context free grammar, and let $V_P = V_T \cup \{(, )\}$. Define the left profile of order $k$ of a string [...] $= T_B$. In this case we say that $G$ has no duplicated productions. It would be easy to prove that every c.f. grammar has a strongly equivalent grammar with no duplicated productions, and we can restrict our attention to non-duplicated grammars without loss of generality.

Next we prove that any context free grammar with no duplicated productions is k-distinct for some $k > 0$.

Theorem 3. $G$ is k-distinct for some $k > 0$.

Proof. Let $u$ be the shortest string which is in $T_A$ but not in $T_B$, and let $j = |u| + 1$. Then $u\$$ is in $L_j(x)$, for some $A \to x$ in $P$. Since $u\$$ is not in $L_j(y)$ for any other production $B \to y$, it follows that $P_j(A) \neq P_j(B)$. In the same way determine $j$ for any pair $A$, $B$ in $V_N$, and let $k$ be the largest of the integers thus determined. Then $G$ is k-distinct.

Nonterminals of a k-distinct grammar are uniquely characterized by their k-profiles, which can be used as standard names for nonterminal characters.

Cite this paper

@inproceedings{CrespiReghizzi1971ReductionOE, title={Reduction of Enumeration in Grammar Acquisition}, author={Stefano Crespi-Reghizzi}, booktitle={IJCAI}, year={1971} }