This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use:

• This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated.
• A copy can be downloaded for personal non-commercial research or study, without prior permission or charge.
• This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author.
• The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author.
• When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given.

RESOLVING LEXICAL AMBIGUITY IN A DETERMINISTIC PARSER

Robert William Milne

Ph.D
University of Edinburgh
1983

CONTENTS

List of Figures
Acknowledgements
Abstract

1. Introduction
  1.1 The Human Sentence Parsing Mechanism
  1.2 The Initial Framework
  1.3 Summary of Results
  1.4 Limitations of this Research
  1.5 Outline of the Following Chapters

2. Marcus's Work
  2.1 The Definition of Deterministic Parsing
  2.2 Conscious Effort
  2.3 How is Determinism Accomplished
  2.4 Two Deterministic Parsers
  2.5 What did Marcus do about Lexical Ambiguity
    2.5.1 Marcus and the Attention Shift
    2.5.2 The Three Buffers in Marcus's Parser

3. Using Determinism to Predict Garden Path Sentences
  3.1 Garden Path Sentences
    3.1.1 The Garden Path Prediction of PARSIFAL
    3.1.2 Why is it Wrong
  3.2 The New Theory
  3.3 The First Experiment - Testing Garden Path Sentences
    3.3.1 The Pre-Test
    3.3.2 Testing the Predictions
    3.3.3 Subjects
    3.3.4 Procedure
    3.3.5 Analysis of the Data
    3.3.6 Results
    3.3.7 Discussion
  3.4 Summary So Far

4. Non-Syntactic Interaction
  4.1 Each Rule Makes a Semantic Contribution
  4.2 Which Ambiguities need Non-Syntactic Information to Resolve Them
  4.3 Which Ambiguities are Resolved with Non-Syntactic Information
    4.3.1 PP Attachment
    4.3.2 Reduced Relative Garden Paths
    4.3.3 THAT Garden Paths
  4.4 The Theory of When and Why
    4.4.1 When is Non-Syntactic Information Used
    4.4.2 Why Must the Decision use Non-Syntactic Information
  4.5 The New Garden Path Explanation
  4.6 How to Recover from a Garden Path
  4.7 Summary
  4.8 The Second Experiment
    4.8.1 Purpose
    4.8.2 Task
    4.8.3 Subjects
    4.8.4 Examples
    4.8.5 Results
    4.8.6 Summary

5. The Role of Syntactic Information in Handling Part of Speech Ambiguity
  5.1 Syntactic Context
    5.1.1 Word Data Structures
    5.1.2 Morphology
    5.1.3 Disambiguation
    5.1.4 An Example
    5.1.5 The Word "TO"
    5.1.6 Adjective/Noun and Noun/Noun Ambiguity
    5.1.7 Why Does This Work?
  5.2 The Role of Agreement in Handling Ambiguity
    5.2.1 Marcus's Diagnostics
    5.2.2 Handling the Word "TO"
    5.2.3 Handling the Word "FOR"
    5.2.4 Ungrammatical Sentences
    5.2.5 Subject/Verb Agreement
    5.2.6 Plural Head Nouns
    5.2.7 Handling "WHAT" and "WHICH"
    5.2.6 Verb Agreement
    5.2.8 Noun/Modal Ambiguity
    5.2.9 What About "HER"
    5.2.10 Handling "THAT"
    5.2.11 Handling the Word "HAVE"
    5.2.12 The Word "A"
  5.3 Possible Uses for Agreement in English

6. The Psychological Status of Deterministic Parsing
  6.1 Psychological Criteria
  6.2 Kimball's Principles
  6.3 The ATN as a Psychological Model
  6.4 The Sausage Machine
  6.5 Accounting for RA and MA in ROBIE
    6.5.1 Production System Rule Order
    6.5.2 Right Association
    6.5.3 Minimal Attachment
  6.6 Some Predictions
  6.7 Lexical Access
    6.7.1 Swinney, Cairns and Kamerman
    6.7.2 Tanenhaus, Leiman and Seidenberg
  6.8 Learning the Deterministic Parser
  6.9 Timing of the Parser
  6.10 Summary

7. Related Work
  7.1 Deterministic Parsers
    7.1.1 Resolving Noun/Verb Ambiguity
    7.1.2 Church's YAP
  7.2 The ATN
  7.3 Chart Parsers
  7.4 A General Syntactic Parser
  7.5 Steedman and Ades
  7.6 Other Approaches to Parsing
  7.7 The Timing of Semantic Interaction
    7.7.1 At the End of the Sentence
    7.7.2 At the End of the Clause
    7.7.3 Incremental Evaluation
    7.7.4 The On-Line Theory
  7.8 Linguistic Analysis
    7.8.1 Extended Standard Theory
    7.8.2 Phrase Structure Grammar

8. The Parser
  8.1 Overview of the Parser
  8.2 The Grammar
    8.2.1 Grammar Functions
    8.2.2 Closing Nodes
    8.2.3 Attachment
    8.2.4 Implementing the Production System Grammar
    8.2.5 Rule Order
  8.3 The Dictionaries
  8.4 The Packets
    8.4.1 Dotted Rules
  8.5 A Few Notes on the Grammar
    8.5.1 Passive
    8.5.2 Conjunction
    8.5.3 Movement
  8.6 The Agreement Checks
  8.7 The Semantic Database

9. Other Ambiguities
  9.1 Verb Particle Handling
  9.2 HAVE Re-visited
  9.3 Global Ambiguity
    9.3.1 Preferred Reading, then Other Readings
    9.3.2 All at Once
      9.3.2.1 Similar Trees
      9.3.2.2 Similar Trees - Pseudo Attachment
      9.3.2.3 Produced in Parallel
    9.3.3 Only the Right One
  9.4 Why are These not Garden Paths

10. Using Semantic Information for PP Attachment and VP Parsing
  10.1 Semantic Handling of PPs
  10.2 Parsing VPs with Semantic Information
  10.3 Preference Semantics
    10.3.1 Wilks
    10.3.2 Boguraev

11. Conclusion
  11.1 Summary
  11.2 The Parser vs. The Grammar
  11.3 Areas for Future Study

Appendix A: An Annotated Example
Appendix B: Some Example Sentences
Appendix C: Free Text Analysis
Appendix D: The Annotated Grammar
Bibliography

LIST OF FIGURES

Figure 1: Examples from MECHO
Figure 2: Examples from Marcus and Church

ABSTRACT

This work is an investigation into part of the human sentence parsing mechanism (HSPM), where parsing implies syntactic and non-syntactic analysis. It is hypothesised that the HSPM consists of at least two processors.
We will call the first processor the syntactic processor, and the second will be known as the non-syntactic processor. For normal sentence processing, the two processors are controlled by a "normal component", whilst when an error occurs, they are controlled by an "error recovery component". These divisions are based on the observation that human beings are able to bring at least two distinct types of information to bear on a text. The resolution of lexical ambiguity will be used as a vehicle to investigate this hypothesis.

Under control of the normal component, the syntactic processor is unconscious, deterministic and fast, but limited. It is hypothesised that the syntactic and non-syntactic processors work in parallel during the processing of a normal sentence. During processing of some sentences, the syntactic processor will, at key points, ask the non-syntactic processor to make a decision in order to resolve an ambiguity. These key points occur whenever a situation arises in which the syntactic processor can no longer guarantee a correct analysis. A major focus of this research is the identification of those situations in which people use the non-syntactic processor to assist with the resolution of ambiguity and the sentences in which this occurs.

When both the syntactic and non-syntactic processors fail, for example during the processing of a garden path sentence, it is hypothesised that an "error recovery component" is used, which controls both processors and is slower, semi-conscious and non-deterministic.

We are concerned with modeling only the normal use of the syntactic processor. The major test of the psychological validity of such a model is that it fail on precisely those sentences that humans find to be garden paths. We use, as a starting point, Marcus's work on deterministic parsing.
The advances reported here are:

- Reaction time experiments are used to provide a non-subjective classification of sentences as garden paths or not. Using this classification, it is shown that Marcus's parser would succeed on some garden path sentences and fail on some non-garden path sentences.
- This deficiency can be corrected by the use of non-syntactic information for ambiguities which may lead to a garden path.
- Non-syntactic information is to be used to help resolve an ambiguity when the syntactic processor can no longer guarantee a correct analysis. All other ambiguities are to be resolved on the basis of syntactic information.
- An amended parser, ROBIE, is presented which incorporates these conclusions. ROBIE is shown to be compatible with the psychological evidence currently available on human sentence comprehension.
- ROBIE is computationally and conceptually simpler than Marcus's parser.

1. Introduction

1.1 The Human Sentence Parsing Mechanism

People are able to understand most utterances quickly and without effort. Whilst this is obvious, how they understand them is not known. It is assumed [Fodor and Frazier 1978], [Marslen-Wilson 1976], [Kimball 1975] that each person has a sentence parsing mechanism to perform this task, where "parsing" implies syntactic and non-syntactic analysis. The exact nature of this mechanism is an area of active investigation. However, researchers have agreed on several general observations about the Human Sentence Parsing Mechanism (HSPM).

a) It is very fast and efficient and rarely seems to make a mistake.
b) Humans are capable of understanding even very ambiguous sentences.
c) The listener is not aware of the majority of potential ambiguities in a sentence.
d) People are able to accept as syntactically well-formed, semantically anomalous sentences.
e) They are able to understand syntactically ill-formed but semantically meaningful sentences.
f) Some sentences, called garden path sentences, cause normal processing to fail and a semi-conscious process of error recovery takes place.

These observations suggest that the HSPM consists of at least two processors. We will call the first the syntactic processor, and the second will be known as the non-syntactic processor. In the syntactic processor, information about grammatical structure only is used. The non-syntactic processor can use information from the meaning of words, sentences and also information from intonation, discourse, and experience.

In this thesis, we will use the resolution of lexical ambiguity in written text as a vehicle to explore the nature of the HSPM and its interaction with the non-syntactic processor. We will also attempt to present a model of the syntactic processor that is fast, deterministic and of the same power as the part of the human syntactic processor used by the normal component. To build such a model it was first necessary to explore the limitations of the HSPM; in particular, to determine when the syntactic processor might make a wrong decision and when the non-syntactic processor must be used. It is postulated that when the syntactic processor can no longer guarantee a correct analysis, the non-syntactic processor is used.

The non-syntactic processor appears to be able to bring together a large variety of information in a complex way. In this thesis, we will only investigate when this processor is called upon to make a decision and not its exact nature.

It is proposed that for normal sentence processing, these two processors are controlled by the "normal component". It is hypothesised that the two processors work in parallel during the processing of a sentence under control of the normal component. During processing of some sentences, the syntactic processor will, at key points, "ask" the non-syntactic processor to make a decision in order to resolve an ambiguity.
These key points would be whenever an ambiguity arose which the syntactic processor could not guarantee to resolve correctly, because of its limitations. Rather than using the non-syntactic processor only when an ambiguity has taken the syntactic processor astray, the non-syntactic processor is used when an ambiguity arises which might lead the syntactic processor astray.

A major focus of this research is the identification of those situations in which people use the non-syntactic processor to assist with the resolution of ambiguity and the sentences in which this occurs. This research is interested in the timing of this assistance, rather than its specific nature.

It appears that in the majority of cases, the non-syntactic processor will direct the syntactic processor to a correct analysis. We will however see cases in which the non-syntactic processor leads the syntactic processor astray, where, left to itself, it might have found the correct analysis by guessing. We will see that this is a desirable feature for a psychologically plausible model of the HSPM.

When both the syntactic and non-syntactic processors fail, for example during the processing of a garden path sentence, it is hypothesised that an "error recovery component" is used. This component controls both processors and is slower, semi-conscious and non-deterministic. The nature of the error component will not be investigated in this thesis.

A model of normal sentence processing in the HSPM should fail on those sentences which people find difficult to understand and should not fail on those sentences which people have no difficulty in understanding. One type of sentence that people have difficulty understanding is the so-called garden path (GP) sentence. We have chosen to look at lexical ambiguity because it gives us many examples of ambiguities that can lead to a garden path.
A major test of the psychological validity of this model will be whether it fails on all garden path sentences, but does not fail on any non-garden path sentences.

1.2 The Initial Framework

As the initial framework, we will use Marcus's PARSIFAL [Marcus 1980]. Marcus presents a method of parsing that is "deterministic", i.e., that never backtracks or changes a decision. As well as being deterministic, PARSIFAL was primarily syntactic and was intended to be the first stage in a parsing system. Since Marcus's goal was the development of a deterministic parser, he did not investigate any psychological aspects. However, it is generally considered that his approach showed substantial promise as a basis for work in this area.

In this thesis, we will show that PARSIFAL would fail on some non-garden path sentences and not fail on some garden path sentences. We will explore ways of amending Marcus's parser so that it fails on all and only garden paths and also investigate part of speech ambiguity, both areas which Marcus did not consider. In addition, we want to build a computationally simpler parser which rejects the majority of ungrammatical sentences.

We will assume the linguistic analysis of Chomsky [1957,65,73,75b,76,77] and his Extended Standard Theory (EST) throughout this thesis. Marcus's parser used EST exclusively and much of his grammar has been copied into the current parser. However, although the parser uses EST, the majority of the examples in this thesis do not depend on the portion of EST which separates it from the older Standard Theory (i.e., the use of traces). As with Marcus's parser, ROBIE produces an annotated surface structure of the input sentence.

In several situations, the use of EST is incompatible with the limited parsing framework used in this thesis so we have deviated from it.
This deviation is made to illustrate the implications of different grammatical theories in certain situations. These situations and places where deviations from Chomsky's analysis have been made will be noted. It is suggested that the parsing framework necessary to analyse a grammar is an effective criterion on which to judge the psychological validity of various linguistic theories.

The evaluation of linguistic theories has not been a major focus of research in this thesis, although there are some very definite implications for linguistic theories. Some of these implications will be outlined where appropriate. In particular, a favourable comparison will be made between the newer Phrase Structure Grammar (PSG) of Gazdar, Pullum and Sag [Gazdar, Pullum and Sag 1980c], [Gazdar 1979,80,80b] and Chomsky's EST. PSG also helps to illustrate the interesting differences between a transformational and non-transformational approach to grammar. Although it now appears that PSG may have many benefits over EST for this deterministic parser, this potential was recognised too late to permit a conversion of the grammar to PSG. (See Section 11.3).

In this thesis, we will discuss non-syntactic interaction rather than semantic interaction for the following reasons. In formal logic, semantic information is used to refer to that information which establishes the truth or falsity of a sentence. In other words, semantic information is all information other than syntactic information. In computational linguistics, semantic information is often used to refer to the meaning of words, the meaning of the sentence, information from the current discourse (often referred to as pragmatics), etc. In order to avoid confusion, we will use "non-syntactic information" to refer to all types of information other than syntax. This definition will be explained in greater detail in Chapter 4.
1.3 Summary of Results

The advances reported here are:

- Reaction time experiments were used to provide a non-subjective classification of sentences into garden paths and non-garden paths. Using this classification, PARSIFAL is shown to succeed on some garden path sentences and fail on some non-garden path sentences.
- This deficiency can be corrected by the use of non-syntactic information for ambiguities which may lead to a garden path.
- Non-syntactic information is to be used to help resolve an ambiguity when the syntactic processor can no longer guarantee a correct analysis. All other ambiguities are to be resolved on the basis of syntactic information.
- An amended parser, ROBIE, is presented which incorporates these conclusions. It is shown that the situations in which non-syntactic information is to be used can be accurately predicted by ROBIE's two buffer lookahead. ROBIE is shown to be compatible with the psychological evidence currently available on human sentence comprehension.
- ROBIE is computationally and conceptually simpler than PARSIFAL [Marcus 1980].

1.4 Limitations of this Research

This research is based only on written English. It is hoped that a better specification of when other information is needed to assist processing will emerge through the discovery of the limitations of text analysis. No information from speech, such as intonation, has been considered.

As part of this research, a syntactic parser which covers a significant portion of English grammar has been written. The grammar has been designed to cover the mechanics problems of the MECHO system [Bundy et al. 1979b]. Its coverage is illustrated in Appendix B. Nevertheless, WH movement and conjunction have not been thoroughly investigated. Marcus handled many examples of WH movement in PARSIFAL [Marcus 1980] and demonstrated that WH movement can be handled deterministically. [Church 1980] has investigated both of these areas and his work looks promising for a solution to these problems. The focus of this thesis is on the implications of lexical ambiguity on the nature of the HSPM, so WH movement and conjunction are not directly relevant to the discussion here, although a simple form of handling these has been implemented.

Because we are investigating only the syntactic processor of the HSPM, only that portion of the parser has been developed. The non-syntactic portion has been designed simply to handle the problems from the MECHO world.

In order to handle problems such as conjunction, the parser maintains three buffers. The third buffer is only used by the few rules which have been identified in this thesis to be exceptions (i.e., conjunction and the original "have" rule).

This research is not concerned with the exact nature of the non-syntactic tests, but rather, the timing of these tests. As a result, the situations in which a non-syntactic test should be made are identified, but the exact test is not always implemented. As will be seen, several of the tests have been implemented using semantic markers, while other tests are in the form of various heuristics. These heuristics are intended to model what a non-syntactic test should do, but they are not claimed to be theoretically significant in themselves.

Semantic problems of lexical ambiguity such as word sense ambiguity within a given part of speech and pronoun reference ambiguity have not been investigated. There are many proposed solutions to these problems, but all fall outside the scope of this research.

1.5 Outline of the Following Chapters

This thesis is divided into four main sections. In the first section, Chapter 2, "deterministic parsing" is defined and Marcus's deterministic parser, PARSIFAL, is explained. This section establishes the framework which will be used in the rest of the thesis.
The second section, Chapters 3-5, is an investigation into resolving lexical ambiguity, the first half of which concentrates on seeing how one might resolve lexical ambiguities that can lead to garden path sentences. This is done by exploring the consequences of the garden path prediction of Marcus's parser. It is seen what this prediction was and where it might be incorrect. A reaction time experiment is presented to show that this prediction was indeed incorrect.

To develop an improved prediction of garden path sentences, in Chapter 4 we explore an amended version of the parser, ROBIE, incorporating non-syntactic information (in the form of semantic markers). This chapter also looks at the limits of the syntactic processor in an attempt to decide when non-syntactic information is used to resolve lexical ambiguity and why. It is shown that people may use non-syntactic information to overcome the limitations of the syntactic processor.

The second half of our investigation into lexical ambiguity, Chapter 5, focuses on how to resolve lexical ambiguities that do not lead to garden paths. To provide data for this investigation, a second reaction time experiment is presented. In Section 5.1, we see which examples of lexical ambiguity can be resolved in our model by syntactic context alone. In Section 5.2, we will see what other information is needed to resolve part of speech ambiguity. It is shown that number and fixed constituent structure are sufficient to handle examples within the limitations of the syntactic processor.

The third section compares our model, as developed in section two, with other related work. Firstly, we investigate whether the model is psychologically plausible. The relevant literature is examined, in Chapter 6, and it is shown that the model can account for the data presented as well as explaining the principles of Minimal Attachment and Right Association.
In Chapter 7, we look at other parsing proposals in relation to this work and we compare the proposals in this thesis with other theories regarding the timing of non-syntactic interaction.

The final section contains a description of the parser and suggestions for further work. Chapter 8 describes the workings of the parser in detail to provide the reader with a deeper understanding of the model. In Chapter 9, we return to the problems of "have" and global ambiguities and some of the linguistic implications of this work. Finally, in Chapter 10 we will see some details of how non-syntactic information is actually used in the current parser and how it could be added to an improved parser.

2. Marcus's Work

2.1 The Definition of Deterministic Parsing

Central to this thesis is the notion of determinism and deterministic parsing. The following sections explain what deterministic parsing is and the techniques that are used to implement it in a parser.

Marcus [Marcus 1975] first proposed that English could be parsed deterministically. He later stated: "There is enough information in the structure of natural language in general, and in English in particular, to allow left-to-right deterministic parsing of those sentences which a native speaker can analyse without conscious effort." [Marcus 1980, p.204]

The terms "deterministic parsing" and "without conscious effort" must be explained. In relation to the former, Marcus said: "Natural language can be parsed by a mechanism that operates 'strictly deterministically' in that it does not simulate a non-deterministic machine." [Marcus 1980, p. 11]

Marcus did not propose that deterministic parsing implies that natural language could be parsed by a deterministic machine in the automata theoretic sense. He points out that any computational mechanism that physically exists is deterministic in this sense.
The key point of the above statement is that the parser does not simulate a non-deterministic machine.

Why would a natural language parser seem to need to simulate a non-deterministic machine? Kaplan suggests this answer: "Because natural language is ambiguous, a natural language grammar is essentially a characterisation of a non-deterministic machine." [Kaplan 1973, p. 124]

The assertion of deterministic parsing is that a natural language grammar can be essentially a characterisation of a deterministic machine. However, there are two ways a grammar interpreter using a seemingly deterministic grammar can simulate non-determinism. These are backtracking and pseudo-parallelism.

Backtracking can be prohibited by insisting that all grammar substructures are permanent. In a parsing context this means that, if one item is attached to another, this attachment can never be broken, i.e., if a PP is attached to an NP, then the parser cannot break the attachment and attach the PP to, say, the VP. If a word is disambiguated [...]

With pseudo-parallelism, it is possible to follow each permissible transition simultaneously. If one of the paths fails, the parser does not return to a previous state, but, instead, "throws away" any structure built and then terminates that path. In deterministic parsing, building a constituent and then "throwing it away" is not permitted. This technique is therefore also disallowed.

We have two points relating to a deterministic parser. It must neither backtrack nor use pseudo-parallelism. In deterministic parsing, should a transition be made from some state, we are guaranteed that the subsequent state will be on the path to a successful parse, if such a path exists. We shall consider this to be the definition of deterministic parsing.

Most current parsers are clearly non-deterministic. The ATN parser of Woods [Woods 1970] makes extensive use of backtracking.
In a "Chart" parser [Kay 1973], no backtracking occurs, but pseudo-parallelism causing wasted structure is very common. The Chart parser creates all possible connections (edges) between parse nodes. Of all these possible edges, only some are used, and the unused ones violate the definition of determinism. Definite Clause Grammars, as proposed by Pereira and Warren [Pereira 1980, Pereira and Warren 1980], are generally parsed using the backtracking of PROLOG and, hence, are also not deterministic.

2.2 Conscious Effort

This thesis makes some claims about the Human Sentence Parsing Mechanism (HSPM). However, it does not claim that the entire HSPM is deterministic. For the reasons explained in the introduction, I feel that there is a "syntactic processor" for normal sentence processing which is deterministic. Too little is known about the rest of the HSPM to make any statement about it; nor do I claim that all sentences are parsed without conscious effort. I do claim that those sentences which can be parsed by people with no conscious effort can be parsed deterministically by the "syntactic processor", even though the non-syntactic processor may assist. As we are only concerned with the function of the syntactic processor for normal sentence processing, no claim is made as to whether the non-syntactic processor is deterministic.

It is assumed that the syntactic and non-syntactic processors of the HSPM rarely fail. When they do, conscious effort is used to recover from the error and the error component takes control. It may use the syntactic and non-syntactic processors non-deterministically. If we assume that every failure of these processors causes conscious effort, we know exactly when a failure occurs. If the model is unable to proceed at the point where people exert conscious effort, then it will be failing at the same point as the syntactic and non-syntactic processors of the normal component of the HSPM.

What does "no conscious effort" mean and how do we decide which sentences fall into this category? Unfortunately, the answer to this question is not simple. It is assumed that the processing of a normal sentence does not require conscious effort and it is generally agreed that to understand a garden path sentence requires conscious effort. The reader notices a mental "jump" or "block" when reading of the sentence stops and the garden path is consciously realised. Experimentally, conscious effort can be detected by an increase in reaction time to a given task. As an armchair definition, any grammatical sentence that seems abnormal to read requires conscious effort.

Since the line between conscious effort and no conscious effort is unclear, we will concentrate on those examples that clearly require conscious effort. Without a clear definition and understanding of conscious effort, it is impossible to evaluate deterministic parsing for all sentences in a language. More experimental data must be collected in many areas before we can conclude what does and what does not require conscious effort. Throughout this thesis, I will point out when we are assuming that a sentence or fragment requires conscious effort, when we know that it does and when the deterministic parser predicts that it does.

2.3 How is Determinism Accomplished?

The simple technique that makes determinism possible is limited lookahead. "Lookahead" means looking ahead in the input stream before deciding which grammar rule to execute and, hence, which will be the next state. This lookahead is always to the next K constituents after the item the parser is currently constructing (where K may vary from parser to parser). The current grammar is written such that each rule can examine the features of two "buffers" containing constituents, i.e., K=2.
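The limited lookahead just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's implementation: the names `Constituent`, `LookaheadWindow`, `rule_matches` and `word`, and the feature labels `det`, `noun` and `verb`, are assumptions introduced here. The one detail taken from the text is that a rule may examine the features of at most K=2 buffered constituents.

```python
from collections import deque
from dataclasses import dataclass

K = 2  # lookahead limit; the text states K=2 for the current grammar


@dataclass(frozen=True)
class Constituent:
    """A word (or finished phrase) plus its grammatical features."""
    text: str
    features: frozenset


class LookaheadWindow:
    """Lets grammar rules inspect at most `k` upcoming constituents."""

    def __init__(self, constituents, k=K):
        self._pending = deque(constituents)
        self._k = k

    def peek(self, index):
        """Return the constituent in buffer cell `index` (0-based), or None."""
        if not 0 <= index < self._k:
            raise IndexError(f"rules may only examine {self._k} buffer cells")
        if index < len(self._pending):
            return self._pending[index]
        return None

    def take_first(self):
        """Consume the first cell; the remaining input shifts into view."""
        return self._pending.popleft()


def rule_matches(window, pattern):
    """True if buffer cell i carries the feature pattern[i], for every i."""
    for index, feature in enumerate(pattern):
        cell = window.peek(index)
        if cell is None or feature not in cell.features:
            return False
    return True


def word(text, *features):
    """Hypothetical helper for building a lexical constituent."""
    return Constituent(text, frozenset(features))


window = LookaheadWindow([
    word("the", "det"),
    word("block", "noun", "verb"),  # ambiguous between noun and verb
    word("fell", "verb"),
])

print(rule_matches(window, ["det", "noun"]))   # True
print(rule_matches(window, ["noun", "verb"]))  # False: first cell is "the"
window.take_first()
print(rule_matches(window, ["noun", "verb"]))  # True: "block", then "fell"
```

A pattern that asks about a third cell raises `IndexError` rather than quietly widening the window, which is one simple way of keeping the lookahead limited.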
Section 2.5 explains the two-buffer arrangement in full.

If it were possible to look arbitrarily far ahead, then it would be possible to have a buffer cell containing each word in the sentence. A single grammar rule which constructed the correct parse tree could then be written for each possible sentence. Whilst a parser that did this would be deterministic, it would not be psychologically plausible. The reasons for this will be explained in Chapter 4 and in Section 7.3. Unrestricted lookahead would also enable the grammar interpreter to simulate a non-deterministic machine by allowing it to perform "closet backtracking".

Limited lookahead is supported by "wait and see". If it is unclear how a certain word or constituent should be used during the parse, the parser "waits to see" what should be done. Because the parser is not allowed to ignore or throw away any structures which it has built, this should not be construed as "closet backtracking". As [Marcus 1980, p. 24] points out, it is not possible for the parser to simulate what it might do with the input (and hence simulate backtracking) because it cannot build and then discard these structures.

"Wait and see" means that, if the parser is unsure of a situation, it does not make a random guess. Instead it waits until it has enough information to make the decision correctly. Rather than making an arbitrary decision or pursuing several options in parallel, it suspends working on that constituent until it has sufficient information to make the decision correctly. Marslen-Wilson's comment on "securely" indicates that there may be a psychological basis to this approach: "The general claim I want to make is that the human speech understanding system is organised in such a way that it can assign an analysis to the speech input at the theoretically earliest point at which the type of analysis in question can be securely assigned.
What is meant by the term 'securely' here is that the system does not, within limits, make guesses about the correct analysis of the input" [Marslen-Wilson 1980b, p.16]

My claim is that this applies to the syntactic processor as well as the speech recognition system. Not making guesses is central to the theory of deterministic parsing. By using limited lookahead and wait and see, it is possible to analyse many sentence forms correctly, without the need for backtracking.

There is one more important strategy in implementing the deterministic parser. A non-deterministic parser is typically driven top-down. It often tries to start a new constituent before it has seen whether any of the following input words can start that constituent. Whenever the parser discovers that there are no lexical items for that constituent, it must backtrack and attempt to find an alternative constituent. Creating a new node before one sees if there are any lexical daughters for it can cause much backtracking. Creating a PP node at the end of a NP, before the parser has checked to see if the next word is a preposition, is an example of this guessing. By using lookahead this is not necessary. It is possible to check that the appropriate lexical items are present before a constituent is initiated. ROBIE does not begin the construction of a new constituent unless it has a lexical daughter for that constituent. We shall return to this later in Section 6.5. One can see that creating new constituents without regard for the following items is the same as the guessing to which Marslen-Wilson alluded above.

One main difference in principle between a non-deterministic parser and a deterministic parser lies in when decisions are made during the parse. In a non-deterministic parser, a path is usually followed and the structure is built first. After this, checks are made to see that both path and structure were correct; if not, the parser can backtrack and try again.
In a deterministic parser, one asks "Does it make sense to build the item?" before it is built. If it will make sense, then it is built, otherwise it is not. This is one of the key processing principles that distinguishes deterministic parsing from non-deterministic parsing.

2.4 Two Deterministic Parsers

Given this definition of deterministic parsing, we will now turn to Marcus's parser, PARSIFAL, and see how this definition was used by him in parsing. In this section, PARSIFAL, its motivation and structure will be described and contrasted to the Milne parser, ROBIE. The reasons for the modifications of PARSIFAL incorporated in ROBIE will be explained in the subsequent chapters.

Marcus first became interested in deterministic parsing when he watched an ATN parser needlessly backtrack, making the same error over and over again, in a situation where the correct solution was obvious to him (personal communication). He then designed a deterministic parser. To be deterministic, it seemed that the parser would need at least the following three properties:

1) It must be, at least partially, data-driven.
2) It must reflect expectations.
3) It must have some sort of limited lookahead.

The Marcus parser, PARSIFAL, has two main data structures and a grammar interpreter. The first of these is a system of buffers. These are a number of cells containing words or items that have been constructed, but whose grammatical role is unknown. Whilst the buffers can contain any constituent under a single node, for reasons that will be explained below, when a NP is being parsed, the buffers only contain words. The other is a "push-up" stack which contains incomplete constituents. This is called the Active Node Stack. The Active Node Stack is constrained such that the constituents it contains must be dominated by a non-terminal node, i.e., a partially built NP, VP, PP, S, etc.
If it could contain a terminal (word) that was not dominated by a non-terminal node, then it could be used as an extension of the buffers. This would provide arbitrarily long lookahead, which, as previously explained, is forbidden. PARSIFAL placed words on the Active Node Stack in order to implement the Attention Shift. In ROBIE, all node movements are performed explicitly by the grammar rules and these rules are forbidden from placing words on the Active Node Stack. The Active Node Stack in ROBIE is identical to that used in PARSIFAL with this one exception.

Two Main Data Structures:
1) The Active Node Stack - contains incomplete constituents
2) The Buffers - contain words or constituents (complete or incomplete)

Marcus allowed from three to five buffer cells in his parser and these provided the lookahead capacity. For reasons which will be explained in Section 2.5, ROBIE uses two static buffers. (It actually maintains three to handle conjunction, but the remaining grammar rules can only use the first two buffers.) These two buffers are always kept filled and this fact constitutes one of the major differences between these two parsers.

The parsers move from left to right over the input string, building structure as they proceed. They may suspend construction of a constituent by pushing another item up onto the Active Node Stack. The buffers are always the rightmost nodes under consideration by the parsers. They can be considered to be "below" the Active Node Stack, with possibly completed constituents being "dropped" from the Active Node Stack into the buffers. In ROBIE, these buffers are always the next two cells after the bottom of the Active Node Stack. The buffers move right and left as items are pushed up onto and popped down off the stack. The grammar and the rule matcher can only look at the two buffers. (The one exception is conjunction. No one has yet researched a satisfactory conjunction method for a deterministic parser.
There appears to be no solution that can handle a wide range of conjunctions without a special mechanism.)

Marcus's parser was written in LISP, whilst ROBIE is written in PROLOG [Pereira, Pereira, and Warren 1978] and is the natural language front-end to the MECHO project [Bundy et al. 1979b]. Both parsers take, as their input, an English sentence. They then produce a syntax tree of that sentence as output. For example, the output from ROBIE for the sentence:

[1] The shy boy has kissed Mary.

will be:

S-1 [s,major,decl]
    NP-1 [np,def,ns,n3p]
        DET THE [det,def,ns,n3p]
        ADJ SHY [adj]
        NOUN BOY [noun,ns,n3p]
    AUX-1 [aux,past,v3s]
        AUXVERB HAS [auxverb,past,v3s,verb]
    VP-1
        VERB KISSED [verb,en,past,vspl]
        NP-2 [np,name,ns,n3p]
            NAME MARY [name,propnoun,ns,n3p]

This tree is very similar to the output of PARSIFAL. In parsing the sentence "The shy boy has kissed Mary.", the state of ROBIE, which is very similar to the state of PARSIFAL, would be as follows:

Packet: CPOOL
Rule about to run: PROPNAME    pattern: [name]

Active Node Stack:
    2: S    [SS-FINAL,CPOOL]
         NP   det-the adj-shy noun-boy
         AUX  auxverb-has
    1: VP   [SS-VP,CPOOL]
         verb-kissed

B1: Mary
B2: .

Words are first considered by ROBIE as they arrive into the second buffer. For PARSIFAL words arrive into whichever buffer is the rightmost. In both parsers, the grammar may indicate the start of a new constituent, based on the syntactic features of the next word. This constituent will be placed on the bottom of the Active Node Stack. The Active Node Stack grows upward and only the bottom item of the stack is currently active, i.e., the only item being actively constructed at the time. In this example, the VP node is the Current Active Node and work on the S node has been suspended until the VP node is finished. In the diagram, the first buffer contains the word "Mary" and the second buffer contains the word ".".
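The two data structures and the snapshot above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ROBIE's PROLOG implementation: the names Node, ParserState, push and attach_first_buffer are hypothetical, the third (conjunction) buffer is ignored, and the sketch starts just before "kissed" is attached to the VP.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                      # e.g. "S", "VP", "verb"; "word" if undecided
    word: Optional[str] = None      # set only for terminals (single words)
    children: List["Node"] = field(default_factory=list)
    packets: List[str] = field(default_factory=list)


class ParserState:
    BUFFER_CELLS = 2  # two-cell lookahead, as described for ROBIE

    def __init__(self, words):
        self.stack: List[Node] = []     # last element = bottom = Current Active Node
        self.buffers: List[Node] = []
        self.pending = list(words)      # input not yet seen by the parser
        self._fill()

    def _fill(self):
        # Keep the buffers full while input remains.
        while len(self.buffers) < self.BUFFER_CELLS and self.pending:
            self.buffers.append(Node("word", word=self.pending.pop(0)))

    def push(self, node: Node):
        # Bare words are refused, so the stack cannot serve as extra lookahead.
        if node.word is not None:
            raise ValueError("a bare word may not be placed on the Active Node Stack")
        self.stack.append(node)

    def attach_first_buffer(self, label: str):
        # Attach the first buffer to the Current Active Node, then refill.
        node = self.buffers.pop(0)
        node.label = label
        self.stack[-1].children.append(node)
        self._fill()

    def active_packets(self):
        # Only the packets of the Current Active Node are active.
        return self.stack[-1].packets


# Rebuild the snapshot, starting just before "kissed" is attached.
state = ParserState(["kissed", "Mary", "."])
state.push(Node("S", packets=["SS-FINAL", "CPOOL"]))
state.push(Node("VP", packets=["SS-VP", "CPOOL"]))
state.attach_first_buffer("verb")

print([b.word for b in state.buffers])   # ['Mary', '.']
print(state.active_packets())            # ['SS-VP', 'CPOOL']
```

Refusing bare words in push is one simple way to express the constraint that the Active Node Stack must not become an extension of the buffers.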
In both parsers, the grammar consists of a set of production system rules ordered into packets. Each rule consists of a pattern for the head, that serves for the production system pattern, and a body that is executed by the interpreter once the rule has been selected. Every rule has a name, a priority and is a member of a packet. For example, here is the rule from ROBIE for parsing a determiner. Programming details are left out for convenience. (PARSIFAL's rule did not have the semantic interpreter step.)

Rule DETERMINER in packet PARSE-DET:
To analyse a determiner, if you have the feature "det" in the first buffer then:
1) attach the first buffer to the bottom of the Active Node Stack as a determiner.
2) tell the semantic interpreter you have a determiner.
3) deactivate the packet containing this rule, PARSE-DET.
4) activate the packet PARSE-QP-2.
Recursively call the rule matcher.

In both parsers, only rules in active packets can be tested by the interpreter. The parsing mechanism puts no constraint on the number of packets that can be active at any one time. In practice there are rarely more than three packets active. A rule body can activate and deactivate packets. This provides the top-down or "reflect expectation" component of PARSIFAL and ROBIE. The other functions of a rule body in ROBIE are explained in Chapter 8. When a pattern has been matched and a rule body is about to be run, the packet containing that rule is known as the "Current Active Packet".

In PARSIFAL, the patterns of the grammar rules could match any combination of the contents of the three buffers and the bottom of the Active Node Stack (the item currently being built). PARSIFAL could also check features on the lowest S node of the stack. This means that each pattern could inspect up to five nodes before matching. In ROBIE, the patterns are constrained to match only two buffers and it is not possible to access any node in the Active Node Stack except the bottom one.
The reasons for this will be described in the next section. The grammar rules in ROBIE are very similar to Marcus's original rules. In fact, most of the rules have the same names. They have, however, been modified slightly, as will become apparent later.

The production system grammar rules were structured by combining groups of rules into packets, a packet being a collection of rules. These packets can be made active or inactive by the parser. Only active rules can match against the state of the parser. Each node on the Active Node Stack has a list of active packets associated with it. In the above diagram, the packets SS-VP and CPOOL are associated with the VP node, and the packets CPOOL and SS-FINAL are associated with the S node. The only packets which are active are those associated with the Current Active Node. The following diagram illustrates the structure of the grammar. This is similar to [Marcus 1980, p. 19].

The grammar (matched against the buffers):

            Priority   Pattern         Action
PACKET1      5:        [ ] [ ]    ->   ACTION1
            10:        [ ]        ->   ACTION2
            10:        [ ]        ->   ACTION3
PACKET2     10:        [ ]        ->   ACTION4
            10:        [ ]        ->   ACTION5
            15:        [ ]        ->   ACTION6
PACKET3      5:        [ ] [ ]    ->   ACTION7
             5:        [ ] [ ]    ->   ACTION8
            10:        [ ]        ->   ACTION9
            10:        [ ]        ->   ACTION10

THE BUFFERS: [1st] [2nd]

We can now look at the three desirable properties mentioned above and see how both parsers incorporate them.

1) The parsers are data-driven in that the patterns of the rules will match the words as they arrive in the buffers, or other items in the buffers.
2) Both parsers reflect expectations, since only rules in active packets can match. By activating the packets for the expectations we have, the parser provides a top-down component.
3) The buffers give a constrained lookahead such that the contents of a limited number of buffers may be examined before a rule is matched.

More on ROBIE and how it differs from PARSIFAL will be found in Chapter 8.

2.5 What did Marcus do about Lexical Ambiguity?
In the previous section we saw the structure of Marcus's parser. In the next sections Marcus's method of handling lexical ambiguity [Marcus 1980] and several problems with his approach will be explained.

In Marcus's parser, almost all words were defined as only one part of speech. For example, "block" was defined only as a noun, whilst "schedule" was defined only as a verb. This is clearly an over-simplification of the English language. As a result, the following sentences could not be parsed by his parser (example sentences will be marked with a number in square brackets, e.g. [4]):

[2] I lost my schedule.
[3] The car will block the road.

It is easier to parse sentences if one has to deal only with structural ambiguity and not part of speech ambiguity as well. Whilst Marcus's thesis clearly demonstrated that it is possible to parse a wide range of syntactic phenomena, he showed this on a relatively simple, non-ambiguous set of sentences. One might then think that his work is not really a proof of how easy it is to parse deterministically, since there was very little ambiguity to confuse the parser. If words had only one part of speech, then part of speech ambiguities could not occur and the following sentences would either be readily comprehensible or total nonsense depending on which part of speech the ambiguous words (i.e., "block", "will", "can" and "her") were given.

[4] The block will be made of wood.
[5] The plug will block the pipe.
[6] He wrote the will.
[7] The trash can is red.
[8] Mary patted her dog.

If deterministic parsing is to be able to handle a wide range of sentences, then clearly it must resolve part of speech ambiguity. A non-deterministic parser solves this problem by trying one part of speech and, if it is wrong, backtracking to try another part of speech. A deterministic parser, on the other hand, is not allowed to experiment, but must find the correct part of speech at first try.
As the work by [Milne 1978] showed, handling noun/verb ambiguity in Marcus's parser was possible, but other types of part of speech ambiguity were not investigated. Nor did Marcus investigate this problem.

We are interested in a psychologically plausible method of parsing. It seems that people have no trouble with most occurrences of part of speech ambiguity, so it should be stipulated that it must be easy for the parser as well. In general people do not notice all occurrences of lexical ambiguities. If handling part of speech ambiguity in the parser requires much extra effort and machinery, then this would be considered a serious blow to deterministic parsing as a psychological model. Conversely, the ability to handle part of speech ambiguity quite easily would support deterministic parsing as a viable approach to natural language parsing. It would be important if the constraints of the parser influenced the way we handle ambiguity and even more important if the parser design enabled us to predict successfully the effect of ambiguous situations on people.

As has been just explained, Marcus did not investigate part of speech ambiguity. There are two other problems arising from Marcus's research. These are the "Attention Shift" and his use of "three buffers". We will look at each of these in turn in the next two sections.

2.5.1 Marcus and the Attention Shift

When Marcus [Marcus 1980] presented his parser, he assumed that NPs arrived in the buffers by a method which was transparent to the parser. Throughout his work, it is assumed that NPs appear fully parsed. With this assumption, the parser uses three buffer lookahead to do all the rule matching.

In his chapter "Parsing Noun Phrases" [Marcus 1980, Chapter 8], Marcus describes how NPs are parsed. He extends the parser by adding a new class of grammar rules called "Attention Shifting" rules.
These rules work, as the name implies, by shifting the attention of the parser. Normally the parser has its attention on the first buffer, but an Attention Shift can cause the parser to "shift its attention" to any of the three buffers. The buffer on which attention is focused is the new first buffer. This Attention Shift (AS) can occur whenever a word which may start a noun phrase arrives into one of the buffers. For example, if a determiner arrives into the third buffer, then the parser will "Attention Shift" to the third buffer. This will make it the virtual first buffer and have two more buffers available. The parser will then build the NP as if the preceding buffers were not there. When the NP is finished, the parser returns to the original first buffer, leaving the newly built NP in the third buffer. When the Attention Shift is used, PARSIFAL is using 5 buffers.

This method provides the mechanism for NPs arriving in the buffers in a way which is transparent to the rest of the parse. It seems that some similar method is needed if the parser is to have rule patterns of the form:

[have][np]
[to][np]
[for][np][to]

Each of these has an NP in the second buffer. If the parser cannot shift past the first word, it is very difficult to build this NP. For example, if the pattern to start an NP was [ngstart] and the rule could match only one of the buffers, then all three of the following patterns would be needed:

[ngstart]
[ ][ngstart]
[ ][ ][ngstart]

To avoid the need for all three patterns, the Attention Shift rule could match its pattern to any of the three buffers.

The Attention Shift is a very powerful tool, but it has a few undesirable side effects. First, to handle the parsing of NPs, an extra mechanism was added. This seems to be necessary because of the above patterns. Secondly, there seems to be no principle governing the situation in which the Attention Shift should be used. How many buffers are now needed?
Marcus introduced his parser with three buffers. If the parser is shifted to the third buffer, does it still have the three buffer lookahead? The answer to this question is "yes". When the parser is in an Attention Shift, the old third buffer, in the worst case, is now the first buffer and there are two more buffers after it. This means that at the worst, five buffers are needed (the three buffer lookahead and the two buffers that have been Attention Shifted past), not three as before. What happens if we are in an Attention Shift and another determiner arrives in the third buffer? If we could re-attention shift then the number of buffers would be five plus two, i.e., seven. Clearly this is against our goal of a limited lookahead.

Marcus prevented this by allowing only two Attention Shifts at any one time. The second attention shift was restricted to parsing numbers and other semantic items inside the NP. For example, in the fragment "a two hundred pound rock", the words "two hundred pound" would be assembled by a second Attention Shift.

The Attention Shift is undesirable for several processing reasons. Firstly, it increases the number of buffer cells. If the parser was using three cells, with an Attention Shift it could be increased to five cells. This made the claim of limited lookahead weaker. Secondly, it adds a special mechanism to the parser. Finally, note that it is possible to Attention Shift past individual words. We have said that we do not allow the Active Node Stack to contain individual words unless they are dominated by a non-terminal. This was necessary to prevent the Active Node Stack from being an extension of the buffers, and providing an unlimited number of lookahead buffers. By Attention Shifting past individual words, we are violating the spirit of this principle. It seems that it would be desirable to get rid of the Attention Shift.
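The buffer counts quoted in this section (three, five, seven) follow from a simple worst-case calculation, sketched here in Python. The function name cells_needed is hypothetical, and the formula is only a restatement of the argument above: each Attention Shift, in the worst case, lands on the last cell of the current window, so two cells are skipped and a full three-cell window follows.

```python
def cells_needed(lookahead: int, nested_shifts: int) -> int:
    """Worst-case number of buffer cells in use.

    Each shift moves the virtual first buffer to the last cell of the
    current window, skipping (lookahead - 1) cells, after which a full
    window of `lookahead` cells is again available.
    """
    return lookahead + nested_shifts * (lookahead - 1)


for shifts in range(3):
    print(shifts, cells_needed(3, shifts))
# 0 3
# 1 5
# 2 7
```

With no bound on nesting the count grows without limit, which is the sense in which an unrestricted Attention Shift conflicts with limited lookahead.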
2.5.2 The Three Buffers in Marcus's Parser

Many people have raised objections to the use of three buffers. Marcus gave only empirical rather than experimental evidence of the need for three buffers [Marcus 1980]. He also varied the number up to five in some cases and proposed that the number may actually vary from person to person. If three buffers are needed for English, the question also arises, are three buffers needed for all languages? Do some languages need four, and others require only two?

As we will see, looking at the next word only (one buffer) is not sufficient to prevent backtracking. If it was sufficient, the non-determinism of most current parsers would have been eliminated. We will see examples in Chapter 5 where one buffer lookahead is not sufficient. Therefore, two is the minimum number of buffers to prevent backtracking. We shall also see that no matter how many buffers we have, there will still not be enough information to resolve some ambiguities.

The two primary motivations for three buffers were to handle some of the diagnostics and embedded sentence parsing. We will look at the diagnostics in Chapter 5 and show how to re-formulate them without three buffers. The three buffers were needed to start embedded sentences such as [Marcus 1980]:

Marcus's pattern         Marcus's rule name
[for][np][to]            INF-S-START
[np][to][tenseless]      INF-S-START1
[that][np][verb]         THAT-S-START
[verb][np][verb]         SUBJ-QUEST
[have][np][verb]         HAVE-DIAG

Rules of the form X-S-START started embedded sentences with a complementiser of the type X. These are Marcus's only rules (excluding those used to parse time and number phrases) which used the three buffers in their patterns. In order to get the subject of the embedded sentence in the proper place, Marcus's parser waited until the above patterns were filled before it started the embedded sentence.
To avoid problems with ambiguity and get the correct semantic analysis, this seems necessary. In fact, if one uses Chomsky's [Chomsky 1973,75,76,77] analysis, three buffers are needed, but if a different linguistic analysis is used, such as Gazdar's [Gazdar 1979,80a,80b,80c] Phrase Structure Grammar, only two buffers seem to be needed for parsing VPs. The Chomsky analysis is used with MECHO to remain compatible with its semantics. The embedded sentence can be started before all its daughters are present, provided there is no ambiguity about its beginning. It will be seen in the following chapters how each of these can be eliminated and re-formulated with only two buffers. (The rule SUBJ-QUEST above will not be discussed, since it is not relevant to the current grammar. It is not anticipated that this rule will cause further problems.)

This claim, that only two buffers are needed, can be used to investigate linguistic theories. Some theories (Chomsky) seem to need the three buffers to start an embedded sentence. In others (Gazdar), it seems possible to use only two buffers. For example, consider the INF-S-START1 rule above. In Chomsky's analysis, the NP is attached as the subject of the S-. In order to be certain that "to" is part of the VP, it is necessary to check that it is followed by a tenseless verb. If this is true, then the S- can be started and the NP attached as the subject of the new S. It would be wrong to attach the NP to the VP of the current S, as this would produce the wrong semantic meaning. Therefore, the three buffers seem to be necessary.

In Gazdar's analysis, the NP will be attached to the upper VP, not as the subject of the S-. In fact, he does not have an S-, only a VP. Whether "to" starts a PP or a VP, the NP is still attached in the same place. Hence, we can attach the NP first and then a later rule can decide what to do with "to". This approach can be implemented with two buffers.
The resulting structures are:

Chomsky: (NP V (NP to V NP))
Gazdar:  (NP V NP (to V NP))

Either of these could be correct, but the latter requires only two buffers.

In ROBIE we use only two buffers for the pattern matching. It is intuitively obvious that deterministic parsing with one buffer is not possible. We will see several examples of ambiguity in the next two chapters that demonstrate this. I will demonstrate that it is possible to resolve ambiguities with two buffers. This number is proposed to be invariant throughout the parse and possibly across languages. It should be noted that to handle the phenomenon of conjunction three buffer lookahead is still needed, in order to jump over the "and". As I have not investigated conjunction, all that is said here excludes sentences using conjunctions.

It may be noticed that almost all of the rules with three buffers need the Attention Shift as well. Removal of the three buffers and the AS go together. The rules which needed three buffers were either the "diagnostics", which are the subject of Chapter 5, or the VP rules, which I reformulated above. Once they were changed, there was no need for three buffers and no need for the AS, except to build prepositional phrases.

To build a prepositional phrase in Marcus's parser, the pattern [prep][np] -> PP was used. This pattern required that the Attention Shift build the NP before this rule could run. This approach built PPs bottom-up, ignoring Kimball's [Kimball 1973] principle of "function words start new nodes". Kimball's principle stated that the preposition initiates the construction of a PP. But Marcus's approach did not build the PP node until the NP had been built. To rectify this, we can build PPs top-down by using the pattern [prep][ngstart] to start the PP. This follows Kimball's principle and guarantees that the node with the feature "ngstart" will turn into a NP. The parser can start the PP, attach the preposition and then build the NP.
When the NP is finished it can be attached to the PP. This has the same effect as Marcus's rule, but does not require the Attention Shift. The PP is then on the Active Node Stack (ANS) whilst the NP is being built. As a result there are now more items on the ANS, but fewer buffer cells. It is clear that the NP will be attached to the PP when finished, so this could be done at the time the NP is started and hence reduce the number of ANS cells. The current parser is not concerned with limitations on the size of the Active Node Stack, so this is not an issue. The total number of cells in the parser is the same in both systems. It is basically as efficient for the parser to have separate unlinked cells, as it is to have lots of cells, so the memory problem is not really relevant here.

Thus, we have removed the need for the Attention Shift and the third buffer. From now on, we will use neither the third buffer for pattern matching, nor the Attention Shift.

There is one problem Marcus mentioned that we will not look at in this thesis. This is ambiguities involving the word "as". Some of the problems which can arise are illustrated in the following examples:

[9] As many as ten different explanations were proffered.
[10] No one could ever be as big as big bad John.
[11] He left as quickly as he could.
[12] Who could believe he caught as big a fish as that?
[13] Bill offered his advice as an expert in such matters.

[Marcus 1980] used these examples to suggest that perhaps a three buffer lookahead and the Attention Shift are necessary to diagnose the proper use of the leading "as" in each example. The following buffer patterns indicate the type of constituent which should be built, based upon the three buffers. The pattern [np] has been changed to [ngstart], as this is all that is necessary to start an NP.
[as][quant][as]  -->  quant
[as][adj][as]    -->  adj
[as][adv][as]    -->  adv
[as][adj][a]     -->  np
[as][ngstart]    -->  pp

These above patterns demonstrate that the Attention Shift is not actually needed in order to diagnose the different uses of "as", but only three buffers. Furthermore, based on the above data, there is only one ambiguous case. This is exemplified by [10] and [12], when an adjective occurs in the second buffer. But in these examples, the verb subcategorisation provides enough restriction to remove the ambiguity. That is, one cannot say:

[14] *Who could believe he caught as big as big bad John.

It should be noted that Marcus did not implement any grammar rules to handle the word "as". "As" can play many grammatical roles and there is more than one theory of its possible roles. For a discussion of these difficulties, see [Bresnan 1973]. I feel that the current linguistic and psychological evidence is not sufficiently compelling to reject the use of two buffers and no Attention Shift on the basis of these "as" examples alone.

PARSIFAL could handle a wide range of syntactic constructions as illustrated in Appendix B. Marcus also demonstrated that many linguistic "universal" constraints could be accounted for by the parser structure. In this section, we have seen that Marcus did not handle part of speech ambiguity. Other limitations of PARSIFAL will be discussed in subsequent chapters. Firstly, though, we will consider its use in the prediction of garden path sentences.

3. Using Determinism to Predict Garden Path Sentences

We will begin our investigation into resolving lexical ambiguity by looking at those examples which can lead to a garden path.

A model of normal sentence processing for the HSPM should fail on those sentences which people find difficult to understand and should not fail on those sentences which people have no difficulty understanding.
One type of sentence which people find difficult to understand is the so-called garden path (GP) sentence. Lexical ambiguity gives us many examples of ambiguities leading to a garden path. In this chapter, we will look at words which can be either a plural noun or a singular verb. We will investigate the decision as to whether the word is being used as a plural noun or a singular verb and how this can lead to garden path sentences. In the next chapter we will look at other examples of ambiguity which can lead to a garden path.

3.1 Garden Path Sentences

A garden path sentence is one which seems to lead people "down the garden path", i.e., a person seems to analyse incorrectly a portion of the sentence and then, because of later evidence, must go back, reanalyse and correct the mis-analysis. In this section we will explore several definitions of a "garden path". Until we redefine a garden path later in this thesis, we will use Marcus's definition. Marcus [Marcus 1980, p. 202] says garden path sentences are those:

"which have perfectly acceptable syntactic structures, yet which many readers initially attempt to analyse as some other sort of construction, i.e., sentences which lead the reader 'down the garden path'."

The following is a classic garden path:

{15} The horse raced past the barn fell.

In each sentence of this type, there is a point where two analyses are possible, i.e., at "raced". The need to backtrack is a result of selecting an analysis differing from that demanded by the rest of the sentence. For each garden path sentence there is a corresponding sentence which does not require backtracking, e.g.

[16] The horse raced past the barn.

This non-garden path partner has the same two possible readings at the same point, but the analysis selected is that demanded by the rest of the sentence. Such a pair of sentences will be called a pair of potential garden path sentences. Whenever a person encounters a particular potential garden path sentence, that sentence may, or may not, cause a garden path. In that situation, of the pair of potential garden path sentences, one is a garden path and the other is not, although which is the garden path is variable over subjects and from situation to situation. Examples of potential garden path sentences will be marked with curly brackets (e.g. {1}).

Why do these sentences cause problems for people? The proposed answers to this question are in the form of other definitions for a garden path. Crain and Coker [Crain and Coker 1979] note: "Bever as well as Chomsky and Lasnik have argued convincingly that unacceptability of GPs is due to processing difficulty." Chomsky and Lasnik say "garden path sentences result from the omission of all syntactic markers which signal that one is parsing a Complex NP". This explanation suggests that all garden path sentences should be a problem of un-marked relative clauses. For example, in [15], the relative clause marker has been omitted.

Other explanations for the difficulty come from Fodor, Bever and Garrett. Bever says "the first N..V..(N) clause.. is the main clause, unless the verb is marked subordinate" [Bever 1970, Strategy B, p.294]. Fodor, Bever and Garrett [Fodor, Bever and Garrett 1974, p. 356] have the Canonical Sentoid Strategy to account for the unacceptability of GPs. This is a structure independent mapping from the surface syntactic structure to the semantics. This strategy always "takes the verb which immediately follows the initial NP of a sentence as the main verb, unless there is a surface structure mark of an embedding". These explanations account for the difficulty in [15], but, again, suggest that all garden paths are due to the difficulty of an un-marked relative clause.
Woods presents the following definition of a garden path:

"In human parsing, there are clearly cases where, on the basis of local context and the history of the sentence up to a point, a decision is made to follow a particular alternative and all other alternatives are left to be processed later. This type of processing gives rise to the so-called 'garden path' sentence in which the listener is fooled into a false choice among syntactic alternatives and must consciously undo this choice after detecting an inconsistency." [Woods 1973, p. 133]

This definition, like Marcus's, is more general and allows for garden path sentences which do not involve relative clause conflicts. We will use Marcus's definition, until a new definition is presented later in this thesis.

Whilst these definitions seem to account for the examples presented here, do they truly account for all garden path sentences? In order to answer this question, let us look at a simple case of ambiguity and how it can be resolved with a two buffer lookahead. We will then turn to a more difficult example. The sentence fragment:

[17] The toy rocks ....

could be completed as:

[18] The toy rocks are red.
[19] The toy rocks easily.

In sentence [18] the subject NP is "the toy rocks" and "rocks" is a noun, while in [19] the subject NP is only "the toy" and "rocks" is a verb. In order to find the end of the noun phrase correctly, the parser must detect these possibilities and decide which is applicable. A non-deterministic parser with a backtracking capability could always try one analysis first, and if this fails, try the other analysis by backtracking. The only difficulty is which alternative to try first as a matter of efficiency. A deterministic parser, however, is not able to backtrack and hence cannot follow this strategy. Once the deterministic parser "decides" that a word is a noun, it is committed and cannot change its mind.
Hence, the deterministic parser must be able to decide what part of speech the ambiguous word is without making an error.

ROBIE uses two buffer lookahead to see the following word and decides which part of speech the word is being used as. For example, when parsing [18], the parser would have the word "rocks" in the first buffer and the word "are" in the second buffer, i.e.:

[rocks][are]

This pattern indicates that "rocks" was being used as a noun in this sentence. For [19], the buffers would be:

[rocks][easily]

showing the verb usage. By looking at the following word, the parser can handle these examples without the necessity to backtrack. Using two buffers then, it is often easy to decide which part of speech a word is being used as without needing to backtrack. This is a much cleaner and simpler result than the non-deterministic approach explained above.

As was explained in Section 2.4, the Active Node Stack cannot contain words unless they are dominated by a non-terminal node, i.e., a partially built constituent. Because of this, the lookahead buffers will always contain words when the headnoun of a NP is being constructed. The detailed reasons for why this must be true are not relevant to our discussion here, but will become clear later in the text.

Let us now look at a more difficult example. In this case, the two buffer lookahead is not sufficient to resolve the ambiguity.

{20} The building blocks the sun.
{21} The building blocks the sun faded are red.
{22} The building blocks the sun shining on the house.
{23} The building blocks the sun shining on the house faded are red.

In these examples, the second buffer is not sufficient to disambiguate the ambiguous word properly. To distinguish between [20] and [21], it seems we would need four buffers and to distinguish between [22] and [23] we would need eight buffers since we need to see the word "faded" before we could resolve the ambiguity.
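The two buffer decision described above can be pictured with a small sketch. The lexicon entries, the feature names and the two rules are assumptions made purely for illustration; they are not ROBIE's dictionary or grammar rules. The sketch only reproduces the patterns [rocks][are] and [rocks][easily], and the undecidable pattern that arises in the "building blocks" sentences.

```python
# Hypothetical toy lexicon: the feature names are assumptions made for this sketch.
LEXICON = {
    "are": {"verb"},
    "easily": {"adverb"},
    "the": {"determiner"},
}


def decide_part_of_speech(first_buffer, second_buffer):
    """Guess how the noun/verb word in the first buffer is being used.

    Only the word in the second buffer is consulted. Returns "noun",
    "verb", or None when that word does not settle the question.
    """
    features = LEXICON.get(second_buffer, set())
    if "verb" in features:
        # [rocks][are]: a following verb means "rocks" ends the subject NP.
        return "noun"
    if "adverb" in features:
        # [rocks][easily]: a following adverb points to the verb usage.
        return "verb"
    # [blocks][the]: a determiner is compatible with both readings.
    return None


print(decide_part_of_speech("rocks", "are"))     # noun
print(decide_part_of_speech("rocks", "easily"))  # verb
print(decide_part_of_speech("blocks", "the"))    # None
```

The last call returns None: with "blocks" in the first buffer and "the" in the second, one word of lookahead cannot choose between the noun and verb readings.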
If the word "blocks" is currently in the first buffer, we need six buffers containing "the sun shining on the house", plus a buffer containing "faded". This is eight buffers in total. In fact, there can be any arbitrary number of words between "blocks" and the word which indicates whether "blocks" is a noun or a verb, see [Milne 1978]. Therefore, no fixed amount of lookahead will be able to disambiguate the word "blocks" in this situation. In no way, using only syntactic information, could a two buffer deterministic parser handle these examples. As was explained in the previous chapter, the use of an arbitrary number of buffers would make our claim of psychological plausibility vacuous.

The above sentences are in two pairs of potential garden paths. Within each pair, the sentences are the same for most of the string, but differ at the point at which the function of the word in question (blocks) can be ascertained (at faded). For the parser this means that the buffers will contain the same items and the disambiguating word will be beyond the buffers (to the right). [20,21] and [22,23] form two such pairs. For all of these, the buffers contain [blocks][the] and the disambiguating word is too far to the right.

[Milne 1978] explored the handling of noun/verb ambiguity in a deterministic parser and showed that the "building blocks" sentences could lead to garden paths and hence each one is a potential garden path. In fact this paper showed that all situations involving a word that could be either a noun or a verb followed by a word that could be a plural noun or a verb, could lead to a garden path. The above definitions do not explain why these could be garden path sentences. [Milne 1978] made no attempt to handle these situations. Consider the following sentences:

{24} The toy rocks near the child quietly.
{25} The toy rocks near the child are pink.

These are a pair of potential garden path sentences.
They help to demonstrate that, in the situation of a singular noun/verb word followed by a plural noun/verb word, it is always possible to finish the sentence so that it will be a garden path. We will refer to this type of a garden path as a plural garden path. As previously stated, the lookahead may need to be arbitrarily long to handle these examples. In these cases, the deterministic parser cannot disambiguate all of them properly.

At this point, we must return to the motivation for the parser. It was constructed to parse in a psychologically plausible way. This means that we want the parser to perform in exactly the same way as people do. Should the parser parse sentences which people found incomprehensible, then it would not be fulfilling its role as a psychological model of normal human sentence processing. We have noted that, by definition, people fail on garden path sentences, needing to employ conscious effort in their analysis. The parser, then, should also fail whenever it is presented with a garden path. This point was made by Marcus, who stated that any deterministic parser should fail on this type of sentence. He says:

"a deterministic parser ideally should take the garden path and become 'stuck' at exactly the point at which people become conscious that they have been misled." [Marcus 1980, p. 204]

PARSIFAL was not built to handle garden path sentences, therefore it failed on some, but not on all.

In order to limit the model to parsing successfully only non-garden path sentences, those sentences on which we wish the parser to fail must be identified. Since garden paths are defined as sentences on which people fail, this should be possible by experimentation. Assuming that the parser is limited in such a way, it can then be used to predict which sentences will cause problems in humans and, hence, give a definition of a garden path sentence as one which cannot be parsed deterministically by the model.
The experiments which follow were designed to test the model against human performance, in its ability to distinguish and fail on garden path sentences.

3.1.1 The Garden Path Prediction of PARSIFAL

Let us look at the garden path prediction of Marcus's parser. PARSIFAL consisted of an Active Node Stack, where partially built items resided, and three buffers. Each buffer contained a word or constituent that could be represented by a single node. A buffer can then hold a word, NP, PP, VP, etc. NP stands for Noun Phrase, VP for Verb Phrase, PP for Prepositional Phrase and S for a Sentence; S- is an embedded S, toVP is an embedded VP with the auxverb "to". In the best situation, an ambiguous word will be in the first buffer and the lookahead will be two items (Buffers 2 and 3). When an NP is being built, these two items of lookahead will be single words and never whole NPs or larger items. At the time the word "rocks" is being analysed in the fragment "the toy rocks ...", the parser's state would be as below. (The symbol [noun] will mean a buffer which contains an item with the syntactic feature "noun"; symbols such as [rock] will mean a buffer which contains the word rock.)

Active Node Stack: NP the toy
Buffers: [rocks] [the] [child]

A sentence is predicted to be a potential garden path if the lookahead is insufficient to disambiguate the word correctly. Marcus did not recognise the concept of a potential garden path sentence. Instead of the pair of sentences we have called potential garden path sentences, he considered one a garden path and the other an ordinary sentence. For the examples in this chapter, the lookahead is not sufficient to resolve the ambiguity. We have said that each time a person encounters a pair of potential garden path sentences, one and only one is a garden path. Marcus's method says that one will be a garden path, but cannot tell which it will be.
Because of this, his parser would arbitrarily choose one case (i.e., the noun usage) as preferred. For example, in the case of a word that could be a plural noun or a singular verb, his parser would always choose the plural noun usage. This would be correct for:

{26} The granite rocks by the seashore are eroded.
{27} The granite rocks by the seashore with the waves.

In [26], "rocks" is used as a noun, and Marcus's approach would correctly predict [27] to be a garden path. But in:

{28} The statue stands in the park.
{29} The statue stands in the park are rusty.

where "stands" is used as a verb in [28], his approach would incorrectly predict [28] to be a garden path and not [29]. Therefore, this approach would predict some sentences to be garden paths which are not.

Marcus's prediction of garden path sentences also tells us how "short" a garden path must be. In sentence [22], the disambiguating word is outside the three buffers. What happens when the word is in one of the three buffers? Marcus's prediction would say that it was not a garden path. If the parser used all the information in the three buffers, then for a sentence to be a garden path, there would need to be at least three words between the ambiguous point and the disambiguating word. Are all garden path sentences this long? This uncertainty suggests some possible counter examples to the garden path prediction of the deterministic parser.

3.1.2 Why is it Wrong?

In fact, the well-known garden path:

{30} The prime number few.

is a counter example to Marcus's prediction. When PARSIFAL is analysing the word "number" the state will be:

Active Node Stack: NP the prime
Buffers: [number] [few] [.]

Since the entire sentence fits into the three buffers, all information to analyse the sentence is available, but people do garden path on this sentence. The following may also be considered counter examples:

{31} The granite rocks during the earthquake.
{32} The jeep rocks are large.
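The counting argument behind this prediction can be checked with a short sketch. It treats the prediction purely as a window test (a garden path is predicted only when the disambiguating item lies beyond the buffers), which is a simplification and does not model PARSIFAL's grammar rules. The function names are hypothetical, and taking the final full stop as the last item needed for {30} is an assumption based on the buffer state shown above.

```python
def buffers_needed(tokens, ambiguous, disambiguating):
    """Count the buffers spanning the ambiguous word through the disambiguating item."""
    return tokens.index(disambiguating) - tokens.index(ambiguous) + 1


def window_predicts_garden_path(tokens, ambiguous, disambiguating, n_buffers=3):
    """Predict a garden path only if the disambiguating item falls outside the buffers."""
    return buffers_needed(tokens, ambiguous, disambiguating) > n_buffers


s21 = "The building blocks the sun faded are red .".split()
s23 = "The building blocks the sun shining on the house faded are red .".split()
s30 = "The prime number few .".split()

print(buffers_needed(s21, "blocks", "faded"))             # 4
print(buffers_needed(s23, "blocks", "faded"))             # 8
print(window_predicts_garden_path(s21, "blocks", "faded"))  # True
print(window_predicts_garden_path(s30, "number", "."))      # False
```

The counts of four and eight agree with the buffer counts given earlier for the "building blocks" sentences, while the window test reports no garden path for {30}, which is the mismatch with human behaviour that this section describes.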
Even though the number of words read, before the error is realised, is very small, it seems that people are aware of some confusion whilst analysing these sentences. Again, all the information for proper analysis is contained in the three buffers and it is not predicted to be a garden path by PARSIFAL. We will later see experimental evidence that these sentences cause problems for a reader and, therefore, must be considered garden paths.

This shows that Marcus's garden path prediction is inadequate. While PARSIFAL will correctly tell us some sentences are garden paths, it will also predict some sentences are garden paths which are not and judge as acceptable some sentences which are garden paths. So how can we predict whether a sentence will be a garden path? What is happening that causes the garden path? Can our model be extended to answer these questions?

3.2 The New Theory

I feel that the previous definitions are inadequate because they fail to incorporate non-syntactic information, where non-syntactic information includes information such as the meaning of words, intonation and pragmatics. (We will discuss this definition more fully in the next chapter.) To account for the difficulty of garden path sentences, the following hypothesis is proposed to describe what people do when they encounter a potential garden path.

Semantic Checking Hypothesis: "When a person encounters a situation which syntactic context implies might lead to a garden path, they decide which alternative to pursue based on non-syntactic information, instead of using [...] rocks" since it is very difficult to imagine the complex item (jeep rock). Finally for the case represented by [28] and [29] (toy rocks), both constructions are equally possible, so some people would garden path on [28] and some on [29]. It is also very easy to bias this last case with context, etc., altering the prediction.
Marcus's approach to predicting garden path sentences was often correct because it seemed that one member was the garden path more often than the other. Although the non-syntactic choice can go either way, in many cases it tends to go the same way each time the sentence is encountered and, hence, one of the pair causes a garden path more often than the other. This fact explains why the previous garden path definitions seemed correct. The Semantic Checking hypothesis predicts that

{33} The sentry stands on guard.

will not be a garden path, but

{34} The sentry stands are red.

will be. This new theory makes definite predictions of what will be a garden path sentence, based on a person's preference for complex headnouns. Lookahead is predicted to have no effect on these examples, as the decision is being made on a non-syntactic basis alone. In the next sections, an experiment designed to test this will be presented.

3.3 The First Experiment - Testing Garden Path Sentences

The "Semantic Checking Hypothesis" makes definite testable predictions. According to this hypothesis whether a person chooses to combine a pair of nouns is dependent upon their preference to pair these words. We can model this preference for certain noun/noun pairs by using semantic markers. Our predictions can then be modeled by these markers.

An experiment was conducted to test the above predictions. The purpose of the experiment was to show that, of a pair of potential garden path sentences, one was a garden path whilst the other was not, and also that subjects do not use lookahead to resolve the ambiguity leading to the potential garden path. Remember, a garden path sentence is one which seems to lead people "down the garden path". As our theory in the previous section predicts, for these sentences:

[35] The chestnut blocks are red.
[36] The chestnut blocks the sink.

one will cause a garden path and the other will not.
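The way such markers could drive the prediction can be sketched as follows. The table and function names are hypothetical, and the single entry simply encodes the preference implied by the prediction for {33} and {34}; this is an illustration of the hypothesis under those assumptions, not the marker scheme used in the parser.

```python
# Hypothetical marker table: the preferred reading of the second word of each pair.
PREFERRED_READING = {
    ("sentry", "stands"): "verb",
}


def predicts_garden_path(pair, required_reading):
    """Predict a garden path when the sentence demands the non-preferred reading.

    Returns None when no preference is recorded for the pair, i.e. the
    outcome is expected to vary between people and contexts.
    """
    preferred = PREFERRED_READING.get(pair)
    if preferred is None:
        return None
    return preferred != required_reading


# {33} The sentry stands on guard.  ("stands" must be a verb)
print(predicts_garden_path(("sentry", "stands"), "verb"))  # False
# {34} The sentry stands are red.   ("stands" must be a noun)
print(predicts_garden_path(("sentry", "stands"), "noun"))  # True
# A pair with no recorded preference gives no fixed prediction.
print(predicts_garden_path(("toy", "rocks"), "noun"))      # None
```

Note that the following word plays no part in the decision, which mirrors the claim that lookahead has no effect on these examples.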
Remember, a garden path sentence is one in which the reader initially mis-analyses a portion of the sentence and must exert conscious effort to correct this mis-analysis. In Section 2.2, we stated that this effort is normally detected by an increase in reaction time to the garden path sentence. Hence, it is predicted that the garden path sentence will lead to a longer reaction time than its non-garden path partner. If one sentence has a longer reaction time when compared with its partner, we can assume that the sentence which took longer caused the subject to garden path [Crain and Coker 1979].

In both the above sentences, looking just one word ahead is sufficient to resolve the ambiguity as the second buffer contains the disambiguating word. So, a person looking at the second buffer could resolve the ambiguity and the reaction times should be the same for both sentences. If a person uses lookahead to resolve the ambiguity, then the person will not need to garden path on these sentences. Hence, if one sentence has a longer reaction time, we can conclude that the person garden pathed and did not use lookahead.

If the theory is wrong, then responses to both sentences will take the same length of time. If the theory is correct, the response to one of the sentences will take longer than to the other. As the hypothesis states, context may have a strong effect on the understanding of these sentences. Therefore, it is predicted that the sentence of the pair which requires the longer time may vary from context to context for any one person and also from one person to another.

3.3.1 The Pre-Test

First, it was important to decide whether there was a generally preferred reading for the noun/noun combinations which were to be used in the experiment. This data would also be used to establish the semantic marker pairs for the parser. To collect examples, a written survey was conducted which consisted of 21 fragments, as below.
Each subject was asked to complete the series of words such that they formed a complete sentence. The examples were presented in two different orders to control for order effects and 50 subjects participated. The examples were:

[37] the grappling hooks
[38] the aluminum screws
[39] the granite rocks
[40] the map pins
[41] the top hooks
[42] the truck handles
[43] the sentry stands
[44] the boy screws
[45] the cook handles
[46] the jeep rocks
[47] the statue stands
[48] the sniper pins
[49] the arm hooks
[50] the chestnut blocks
[51] the toy rocks
[52] the bike handles
[53] the plastic blocks
[54] the building blocks
[55] the cover screws
[56] the flower stands
[57] the book pins

The part of speech use of the noun/verb/plural word was then checked. A tally was made each time the word in question was used as a noun and as a verb. This provided an indication of the preference for these word pairs. For example, if almost all subjects completed "grappling hooks" using "hooks" as a noun, then this combination was considered to be preferred. If most of the subjects completed "boy screws" using "screws" as a verb, it was concluded that the verb reading was preferred.

The above examples were then divided into three groups. The first group contained pairs which were strongly preferred as noun/noun combinations, the second, pairs which were not preferred as noun/noun combinations and the third group were examples showing an equal split among subjects, or no bias.
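The division into three groups can be sketched as a threshold on the tallies. The 0.7 cut-off and the function name are assumptions chosen for this sketch; the criterion actually used to form the groups is not stated. The tallies in the usage lines are taken from the results table in this section.

```python
def classify_pair(noun_uses, verb_uses, threshold=0.7):
    """Assign a noun/noun pair to one of the three pre-test groups.

    threshold is a hypothetical cut-off on the share of completions
    that used the target word as a noun (or as a verb).
    """
    total = noun_uses + verb_uses
    if total == 0:
        return "equal bias"
    if noun_uses / total >= threshold:
        return "noun/noun preference"
    if verb_uses / total >= threshold:
        return "verb preference"
    return "equal bias"


print(classify_pair(49, 0))   # the grappling hooks -> noun/noun preference
print(classify_pair(32, 12))  # the truck handles   -> noun/noun preference
print(classify_pair(13, 37))  # the chestnut blocks -> verb preference
print(classify_pair(30, 20))  # the building blocks -> equal bias
print(classify_pair(20, 30))  # the cover screws    -> equal bias
```

With this cut-off the five pairs fall into the same groups as in the table, but other thresholds between 0.6 and roughly 0.72 would do so equally well for these rows.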
The results by group are as follows:

                              Noun Uses   Verb Uses
Noun/Noun Preference
[37] the grappling hooks         49           0
[38] the aluminum screws         50           0
[39] the granite rocks           48           0
[40] the map pins                45           5
[41] the top hooks               34           9
[42] the truck handles           32          12

Verb Preference
[43] the sentry stands            0          44
[44] the boy screws               0          50
[45] the cook handles             0          50
[46] the jeep rocks               0          44
[47] the statue stands            8          42
[48] the sniper pins              7          32
[49] the arm hooks                8          42
[50] the chestnut blocks         13          37

Equal Bias
[51] the toy rocks               26          21
[52] the bike handles            25          23
[53] the plastic blocks          25          25
[54] the building blocks         30          20
[55] the cover screws            20          30
[56] the flower stands           18          23
[57] the book pins               20          28

3.3.2 Testing the Predictions

The first experiment was based on the above data and collected reaction times to test the hypothesis that, of the pair of potential garden paths, one is a garden path. The above marker pairs lead to the following predictions, which we can now test.

Subjects were asked to read the series of words presented and decide whether they were a complete sentence, or just a fragment. The noun/noun combinations used were taken from the above list of pairs. The examples were in two groups, those with a strong non-syntactic bias and those with no non-syntactic bias.

The first group will be called the "biased examples". Examples were picked which had a strong preference for or against the combination, but could be used as both a noun/noun combination and a noun/verb combination. A sentence was constructed for each combination, using it in the non-preferred way. A partner sentence was then constructed which was matched in syllables and words, but used the combination of words in the preferred usage. For example:

[58] The sentry stands are green.
[59] The sentry stands on guard.

It was predicted that [58] would be a garden path. The pre-test has shown that "sentry stands" is not preferred as a noun/noun combination, although it is used as such in the sentence.
According to the theory, the subject would attempt to use "stands" as a verb rather than as part of the headnoun combination. This would lead to the wrong analysis, and a garden path would result. In each sentence, the features of the word following the target word (the word with the features noun, verb, plural, i.e., "stands") were sufficient to disambiguate the target word. It was predicted that [58] would take longer to understand than [59].

However, the difference in reaction time might simply have been the result of the different syntactic structures of the two sentences. Therefore, for each sentence of the above pair, a control sentence was constructed. This sentence was matched in syllables, but the word preceding the target word was changed to make the sentence a definite noun use (for example by switching to an adjective); similarly, for the second sentence. This produced controls of the form:

[60] The pencil stands are green.
[61] The army stands on guard.

It was predicted that the control for the non-garden path sentence, [61], would take the same time as the non-garden path member of the potential garden path pair, [59]. To ensure that the longer reaction time for the verb usage was not due to structural and processing differences, the two controls ([60] and [61]) were compared. If they required the same time, then it could be concluded that the verb usage did not require a longer time to process. If [61] took more time than [60], i.e., the verb usage took longer to process, then it was necessary to check that the effect was greater than would be due to structural differences. If three of the sentences required the same time to read and one of the test sentences required a longer time, then the predicted result would have been achieved.

The above prediction was only valid if the subject judged both examples of the test pair to be a sentence.
If the subject judged [59] to be a fragment then, we can conclude the noun reading of "stands" was used in both sentences. In this situation, the reaction time for both sentences ([58], [59]) should be the same.

For each test combination, we now have four sentences. To keep the task valid, four fragments were also included so that the subject was presented with an equal number of sentences and fragments. The test pairs used were as follows:

[62] The prime number few.
[63] The bold number few.
[64] The sniper pins were rusty from the rain.
[65] The sniper pins the victim in the woods.
[66] The sentry stands are green.
[67] The sentry stands on guard.
[68] The chestnut blocks the sink.
[69] The chestnut blocks are red.
[70] The granite rocks during the earthquake.
[71] The granite rocks were by the seashore.
[72] The map pins onto the wall.
[73] The map pins are bright red.

Each pair above is a pair of potential garden path sentences. The first of each pair was predicted to be a garden path sentence on the basis of the results of the pre-test.

The second test group, known as the "non-biased examples", consisted of sentences with no strong non-syntactic bias as in:

[74] The toy rocks the child.
[75] The building blocks the sun.
[76] The book pins the author.
[77] The bike handles in the boy's hands.

Each of these was accompanied by four complementary sentences with a definite noun use, a definite verb use, a definite fragment, and a definite complete sentence. For example:

[78] The toy rocks are soft            definite noun use
[79] The toy rocks when hit            definite verb use
[80] The toy rocks the child           complete or fragment
[81] The toy rocks the child pulls     definite fragment
[82] The toy rocks the child gently    definite sentence

Each of these sentences, [74-77], will be interpreted as a fragment or a sentence depending on the reading given to the noun/noun combination.
The results of the pre-test indicate that across subjects, there was no agreed non-syntactic bias for these examples. However, this finding may be interpreted in two ways:

a) It may be that each subject has no preferred interpretation for e.g. "toy rocks". If this is so, then both readings for [74] will be available to him. In the face of this ambiguity, he may take longer to decide it is a complete sentence than, say [79] or longer to decide that it is a fragment than [81]. In addition, whichever decision he makes for [80], he will not produce a reaction time difference between [78] and [79], or between [81] and [82], greater than that due to differences in processing times between the two types of sentences or between sentence and fragment since there is no interference in these cases from non-syntactic preference.

b) On the other hand, it may be that each subject does have a preferred reading for [80], but that the preferences differ between subjects. Those who prefer to interpret "rocks" as a noun will quickly (relative to [79]) interpret [80] as a fragment and those who prefer to interpret "rocks" as a verb will interpret [80] as a complete sentence as quickly as [79], but will garden path on [78].

The sentences were presented in two different orders to control for order bias. The biased examples were presented before the non-biased examples in both orders. Each subject was given one order and all types of sentences were randomly ordered. All subjects were tested on all sentences.

3.3.3 Subjects

Forty-seven undergraduate students from Edinburgh University participated in the experiment. Most were from the Psychology department. All were unpaid volunteers and native speakers of "British English". Approximately half the students were tested on each order.

3.3.4 Procedure

A Commodore Pet micro computer was used to collect reaction times for the subjects.
Each person sat in front of the display screen (VDU) and the instructions were read aloud by the experimenter. The sentences were then displayed on the center of the VDU. The subject was asked to decide whether the series of words presented was a complete sentence, or just a fragment. If the subject thought the series of words was a fragment, he pressed a key with his right hand. If he felt it was a sentence, he pressed a key with his left hand. Reaction times were measured from the presentation of the sentence until the response. The subject was told that all series of words would be syntactically and semantically well formed. He was first given sixteen practice sentences. Sentences were presented in groups of twenty with a short rest between each group and there were eighty-six examples in total. After the test, each subject was asked how he had done. Those reporting they did badly, or had trouble, were noted for later analysis.

3.3.5 Analysis of the Data

The reaction times and judgements of all the subjects were checked for irregularities. Remember that for each test pair in the biased examples, 4 sentences and 4 fragments were constructed. The sentence/fragment judgements of each subject were checked for the biased examples and the number of judgements differing from this design were tallied. The average number of different judgements per subject was 5 sentences (10%). Sentences with different judgements were not removed from the analysis. 18% of the subjects had differences in judgement on more than 11 of the sentences. For example, some of these subjects listed all examples as complete and others as all fragments. As the error rate for these subjects was over 20%, they were removed from the analysis.

If the test word was used as the same part of speech in both examples, then no "garden pathing" should have occurred and both examples should require the same time to read.
Therefore, if one of the examples was judged to be a fragment, the prediction was that both sentences of the pair would require the same length of time. Very few subjects made this altered judgement and their results were not separated from the others. As a result some of the variances are slightly larger than they would be if these people were checked separately. Exceptionally long or short reaction times were regressed to their group's mean but without losing their significance or relative ordering.

For the "granite rocks" examples, approximately half the subjects considered one of the test sentences a fragment. It was only predicted that one of these would be a garden path when both examples were judged to be complete sentences. In order for this to happen, the noun/verb/plural word would have to be used as a noun in one sentence and a verb in the other sentence. If the example was judged to be a fragment, then it is assumed that the noun/verb/plural word was used as the same part of speech in both sentences. For the "granite rocks" examples, the responses were split into two groups and the same analysis as above was performed on each group. The time reported below is for the test pair which was predicted to be significantly different. The result of the F test [Snedecor and Cochran 1967] among the sentences which were predicted to be the same was less than three, indicating that there was no significant difference among them.

A separate analysis of variance with repeated measures was performed for each set of four sentences in the biased group (i.e., the G.P., non-G.P. and 2 controls). For the groups which the F test showed were significantly different, the Newman-Keuls test was used to determine which pairs were significantly different. These results are reported below.

A Student's t-test was performed on each of the test pairs and across their corresponding controls. These results also supported the findings presented below.
A further analysis of variance was performed using the model

Time = Reading Rate(Person) * Difficulty(Sentence) * Error.

This model was transformed to

log(Time) = log(Rate) + log(Difficulty) + log(Error).

The times for the sentences were fitted against this linear model and the standard error was then checked for significance. A plot of the error was performed which confirmed that it was normally distributed. This model supported the findings presented below. These tests were made on each order and across both orders.

3.3.6 Results

The results of the test sentences are shown below. I will present each group of four examples and the mean reaction time in hundredths of a second for all examples combined from both orders. The star (*) indicates that the sentence was predicted to take longer. All results are at the 1% level of significance, meaning that there is less than a 1% chance that the sentence will be found to be significantly different by the F test if they are, in fact, of equal [...]

[...] sentences were presented in monotone through earphones to the subjects, who were then asked to judge the "truth value" of a possible paraphrase. Reaction times were collected for the judgement on the paraphrase to be made. Examples were of the above form, all of which were intended to be full sentences. If no non-syntactic decision should be made, then they felt the subjects would garden path on most of the examples. If the decision was made on a non-syntactic basis, then only half of the examples would be garden paths. Their results showed that only about half of these examples (46%) did cause garden paths. This indicates that the explanations of Fodor, Bever and Garret [1974] in Section 3.1 are inadequate.

Their experiment showed that [128] was not a garden path sentence, whilst [129] was. They then concluded that the subject perceived whether the initial NP would fit the subject slot of the verb.
If it would, the subject would accept the verb as the main verb and continue. If, however, the initial NP did not fit the subject slot, then the subject would make it a reduced relative clause. The subject would garden path if this non-syntactic decision led to the wrong choice. Horses normally race, so "raced" is accepted as a main verb. Likewise, boys read more often than they are read to. Professors instruct normally, but students get instructed, so the analysis here is different and people do not garden path on both of [128] and [129], nor both of the following:

[130] The tenant delivered junk mail threw it in the trash.
[131] The postman delivered junk mail threw it in the trash.

According to Crain and Coker's theory, not all potential garden paths will be actual garden paths, but only some of them. The prediction of which will be garden paths is based on the non-syntactic "fit" of the NP into the subject slot. No strategy based on limited lookahead alone can always resolve this ambiguity. We have seen that people seem to use non-syntactic information in this situation. This gives us another example of non-syntactic information probably being used to assist decisions during the parse, which fits our current theory.

4.3.3 THAT Garden Paths

The next examples we will look at are based on the ambiguity involved in the word "that".

[132] I told the girl that I liked the story.
[133] I told the girl that I kissed the story.
[134] I told the girl that I know the apple.
[135] I told the girl that I liked in 1978 the story.

The problem in these examples is that the word "that" can be either a complementiser for an embedded sentence or a relative pronoun for a relative clause. If the reader takes the word "that" to start the wrong type of clause, an error can result.
The first three examples shown here differ from the previous types in that, no matter which part of speech one makes "that", it is possible to analyse the sentence syntactically, even though it may be semantically anomalous. The problem arises when one version has been chosen, where semantically the other is required.

Sentence [135] often leads to a garden path. In this sentence, if "that" is used as a complementiser, then the PP "in 1978" will be in a syntactically unacceptable position. Notice that the two buffer lookahead is insufficient to choose the correct reading. It may be that readers choose the "complementiser" version and, because the PP is unacceptable, they garden path on this example.

Both readings are possible in [132]. This means that, like the PP attachment examples, the syntactic processor, even with all the information in this sentence, cannot decide which reading to use. It is accepted that the reading eventually chosen depends on the discourse. If discourse can affect how the sentence is interpreted, then the non-syntactic processor must make the decision, not the syntactic processor. If the syntactic processor were making the decision, it would always make the same decision and discourse would not affect which reading was chosen.

There is little data on the amount of processing difficulty people have with these examples. As suggested above, should the wrong choice be made, the sentence could still be finished, even though the "rule by rule" semantic interpretation may have rejected it. For these examples, the garden path will only be noticed if the non-syntactic processor stops the parse when it makes no sense. If the non-syntactic processor is continuing, despite being unable to make sense of the sentence, these examples will be noted as strange, but will not cause garden paths. Similarly, for people, these examples will seem odd, but not cause a garden path.

Normal syntactic parsers have great difficulty with this problem.
There seems to be no syntactic strategy that will always decide which use of "that" to use. The explanation seems to be that people resolve this case using non-syntactic information. Since a syntactic parser uses only syntactic information, it has difficulty with this problem. People seem to be able to resolve this ambiguity easily because they do so on the basis of non-syntactic information. We will return to problems of "that" in Chapter 5.

To find out whether [132] causes processing difficulty, it was tested in the second experiment. The reaction times showed that people took less time to analyse this than for an unambiguous relative clause reading (see Section 4.8). The subjects informally questioned afterwards said that they took "that" in [132] to start a sentence, rather than a relative clause. These subjects said they were not aware of another reading. This suggests that [132] does not cause processing difficulty and the embedded sentence reading is preferred with no context.

Again, no strategy based on limited lookahead alone can always resolve this ambiguity. Although it seems that non-syntactic information is used to resolve this ambiguity, we do not have enough evidence to establish whether it is non-syntactic information alone.

4.4 The Theory of WHEN and WHY

4.4.1 When is Non-Syntactic Information Used?

We have now seen several cases where a choice between alternatives is probably made by a non-syntactic processor. These are PP attachment, reduced relative garden paths, the "plural garden paths" that were the subject of the last chapter, and "that" garden paths. For PP attachment, the question is: "does the PP modify the nearest NP or some other NP, or something else?". In the reduced relative, the problem is: "do we have a relative clause, or a main verb?". For the "plural garden path" case, the problem is: "do we have the end of the NP or the main verb?".
In the "that" examples, the problem is: "do we have a relative clause or an S-?". Notice that all of these are concerned with finding the end of the noun phrase.

For the examples we have covered so far, these are the only situations where there is a choice of alternatives. All of these examples could also be characterised by the question "Do we have the end of the NP, or shall we make a larger NP?". This is exactly the "end of NP problem". Hence, the answer to the WHEN question is: "when finding the end of the NP". All other situations that the current parser handles can be resolved without the non-syntactic processor making a decision.

4.4.2 Why must the Decision use Non-Syntactic Information?

For each of the cases we have just seen, there exists a pair of potential garden path sentences. For each of these, we stated that there is no strategy, based on syntactic information and limited lookahead alone, to resolve all possible cases correctly. It is also not possible to predict which of each pair will be a garden path based on this information. To be able to resolve the ambiguity and predict the garden path sentence for all examples, the lookahead would have to be arbitrarily long. Hence, for each of these, there is no strategy that a deterministic parser could follow, based purely on syntactic information, to get these examples correct.

This, then, explains the WHY part of our question. Non-syntactic information may be used to resolve the ambiguity because the syntactic processor, using only syntactic information and limited lookahead, cannot choose the "correct" meaning for all situations.

Type                Modifies       Syntax Help?   Non-Syntax Decides?
PP Attachment       NP or other    no             yes
Reduced Relative    NP or other    no             yes
Noun/Verb/Plural    End of NP      no             yes
That                NP or other    no             yes

Therefore, the theory of When non-syntactic information is used to resolve ambiguity, and Why at those times, is this:

WHEN - When syntactic information with limited lookahead is not sufficient to determine whether an NP should be closed or not.
WHY - Because the syntactic processor could err.

ROBIE can handle all the areas I have investigated, without the use of non-syntactic information to make decisions, except in the above cases. It was predicted that non-syntactic information would be needed in those situations which cannot be resolved with two windows of lookahead. Conceptually the parser uses the non-syntactic processor to decide when to end an NP. ROBIE does not identify these cases automatically; rather, in the current implementation, these checks have been manually inserted into the grammar rules which require them. These rules are the ones responsible for the decisions in this chapter.

This also indicates the limits of the syntactic processor. It handles all situations that its lookahead is sufficient to resolve in a deterministic way. If the lookahead is not sufficient to resolve the ambiguity in general, then the non-syntactic processor chooses between the alternatives. It is only these sentences that are eligible to be garden paths.

We can now make a prediction of the cases in which non-syntactic information can influence the parse. If it is possible to build a minimal pair such that the distinguishing point is outside the two buffers, it is predicted that the decision will be based on non-syntactic information.

4.5 The New Garden Path Explanation

We now have a new explanation of a garden path. Previous explanations were based on syntactic information alone and have been shown to be inadequate. The garden path arose when the reader built the wrong structure and then had to "backtrack" to correct it.
The proposed explanation for a garden path is this:

If a situation is encountered, in the course of reading a sentence, such that there is more than one alternative, but syntactic information, with limited lookahead, is not sufficient to guarantee the correct choice, then the reader uses non-syntactic information to choose one of the alternatives. The reader does this without regard to the following words in the sentence. If subsequent processing leads to an analysis differing from that demanded by the remainder of the sentence, the reader will "block" later in the sentence.

This explanation means that a garden path is now based on the necessity for a non-syntactic preference. Thus the prediction of what is and what is not a garden path is more sensitive to context than the previous syntactic accounts and explains why not all potential garden paths cause a garden path.

4.6 How to Recover from a Garden Path

Given the above explanation and account of how people garden path, how are garden paths detected and how do people recover from a garden path? The proposed answer will be given in the context of the parser. As the parser analyses a sentence, it processes the words, building structure and activating the appropriate packets. At some point the sentence will be detected as a potential garden path. This is realised when two rules of equal constraint match. The non-syntactic processor is then called to select which of the two rules to execute. Later in the parse, the parser may discover it has made an error, as no rule matches. This is the point of detection when the psychological "block" is struck. How is recovery then made?

When subjects read a garden path sentence during the experiment, it was surprising how much trouble they had. In experiment two, many subjects told me they could not "figure out" the sentence when it was a garden path. This shows that recovery from a garden path is, indeed, very hard.
If recovery from garden path sentences was as simple as syntactic backtracking, then the subject should not have so much trouble finding the correct reading. I feel this difficulty may occur because the subject is very reluctant to reverse the decision at the non-syntactic level.

To recover, the error component takes control of the two processors. This research has not specifically investigated this component, hence it is not possible to say what it does. However, the error component could go back to the last non-syntactic choice point. Since it knows of potential garden paths, it may have saved the state at the time the non-syntactic choice was made. All non-syntactic choices are binary decisions and the error component may then simply re-start the analysis from that point, but with the opposite decision. The parse would then continue as before.

In the second experiment, for the subjects that only had a small amount of trouble, the time to read the garden paths was about 1.5 times the time for the normal sentence. Also, the point at which the ambiguity was incorrectly resolved for these sentences was approximately in the middle of the sentence, so according to the above theory, the person would have to re-scan the second half of the sentence at least. This is consistent with the time they have taken.

4.7 Summary

In this chapter we have answered the questions of WHEN and WHY non-syntactic interaction may be used by people to make parsing decisions. This has led to a different prediction of what a garden path sentence is and when it will occur.

In the previous chapter, we investigated a case of lexical ambiguity that could lead to a garden path. This investigation suggested that non-syntactic interaction is required to resolve the ambiguity. In this chapter, we have seen that this suggestion also applies to other end of constituent ambiguities, many of which could also lead to a garden path.
We have also modified the model to account for these results. Whenever an ambiguity arises which the syntactic processor cannot guarantee to resolve correctly because of its limitations, the non-syntactic processor could choose which alternative to use. This is contrasted with a theory that suggests that the non-syntactic processor is only used when the syntactic processor has incorrectly resolved an ambiguity. The main difference is that they interact when the syntactic processor might incorrectly resolve an ambiguity, rather than when it has incorrectly resolved an ambiguity.

In the next two chapters, we will investigate lexical ambiguities that do not lead to garden paths and see if the model, as it is now developed, is adequate. However, before we continue our investigation, it is necessary to present experimental data on some of the sentence examples which will be presented. The following experiment was designed to provide non-subjective data on how people react to certain sentences to be used in the rest of the thesis, as well as to provide some interesting data for future investigations.

4.8 The Second Experiment

4.8.1 Purpose

The purpose of this experiment was to test several types of sentences which were believed to be unusual and for which little experimental data was available, especially the garden path effect. The test sentences were divided into several groups, some being examples of global ambiguity, some classic garden paths, and some curious examples, reactions to which were of interest. Many of these examples of ambiguity are discussed in Chapter 5, while the garden path effect has been discussed in Chapters 2-4.

As in the first experiment and as was explained in Section 2.2, if the reader encounters an ambiguity which leads to a mis-analysis or confusion, it is believed that more conscious effort will be required than to read a sentence without this ambiguity.
This use of conscious effort can be detected by an increase in reading time for this task.

4.8.2 Task

This experiment was a simple collection of reading times, as used by [Cirilo and Foss 1980], [Graesser, Hoffman, and Clark 1980], [Just and Clark 1973]. The subject sat in front of a Visual Display Unit (VDU) and instructions were presented on the screen of the VDU. The subject was then presented with a series of sentences. When he had read and understood each sentence, the subject pressed a key on the keyboard. The instructions were as follows:

    You will be shown a series of sentences. You should read each one, and WHEN you have READ it AND understood it, press the RETURN key - it is near the right of the keyboard. Doing so will show the next sentence in the series. All sentences are syntactically and semantically well-formed. Press RETURN when you are ready to start ...

    (The sentences will appear on this line)

A T[...] micro-computer was used to collect reaction times. The time was measured from the presentation of the sentence until the key was pressed. The next sentence was presented immediately.

4.8.3 Subjects

Twenty-two undergraduate students from the University of Edinburgh participated in this experiment. All were unpaid volunteers. None of the subjects were familiar with the notion of global ambiguity or garden path sentences.

4.8.4 Examples

The examples were arranged in two orders to control for sequence effects. Each subject was given one order and each order contained all sentences, the sequence of the sentences being random. Between the test orders, the sequence of each sentence with its control was reversed. The predictions varied with each group of sentences. These will be discussed with the results. The subject was given ten practice sentences before the test sentences appeared.
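The two-order arrangement described above can be illustrated with a minimal sketch. It assumes, for simplicity, that every test sentence has exactly one control (several of the groups tested are in fact triplets), and the function name `build_orders` and the seeded random generator are hypothetical choices for illustration, not a record of the program actually used.

```python
import random


def build_orders(pairs, seed=0):
    """Build two presentation orders from (test, control) sentence pairs.

    Order A is a random sequence of all sentences. Order B keeps the same
    positions but swaps each sentence with its partner, so the relative
    order of every test sentence and its control is reversed.
    Assumption: all sentences are distinct strings.
    """
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    order_a = [sentence for pair in pairs for sentence in pair]
    rng.shuffle(order_a)

    # Map every sentence to the other member of its pair.
    partner = {}
    for test, control in pairs:
        partner[test] = control
        partner[control] = test

    order_b = [partner[sentence] for sentence in order_a]
    return order_a, order_b
```

Each subject would then be assigned one of the two returned lists; any effect that depends on seeing a sentence before or after its control shows up as a difference between the two orders.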
The sentences tested were as follows:

Global Ambiguity
[136] They can fish.
[137] They can walk.
[138] They can fruit.
[139] He looked up the street.
[140] He looked up the address.
[141] He looked up the hill.
[142] They are flying planes.
[143] They are flying fish.
[144] They are flying home.
[145] Kissing aunts can be boring.
[146] Kissing aunts are boring.
[147] Kissing aunts is boring.

Garden Paths
[148] The boat floated down the river sank.
[149] The boat floated down the river quietly.
[150] The horse raced past the barn fell.
[151] The horse raced past the old barn.
[152] The cotton clothing is made of grows in Alabama.
[153] The cotton clothing is made in sunny Alabama.

"That" Ambiguity
[154] I told the girl that I liked the story.
[155] I told the girl whom I liked the story.
[156] I told the girl the story that I liked.

"Have" Ambiguity
[157] Have the students take the exam.
[158] Have the students taken the exam.
[159] Make the students take the exam.
[160] Did the students take the exam?
[161] Have the students taken in the back room finished yet.
[162] Have the students taken in the back room finished off.

Sentence Initial "That"
[163] That deer ate everything in my garden surprised me.
[164] That deer ate everything in my garden last night.
[165] The deer ate everything in my garden last night.
[166] That birds ate everything in my garden surprised me.
[167] That bird ate everything in my garden last night.

"What" Ambiguity
[168] What little fish eat is worms.
[169] What little fish eat big worms?
[170] The little fish eat big worms.

PP Attachment
[171] I saw the man on the hill with the telescope.
[172] I saw the man on the hill with the large tree.
[173] I saw the man with the large hat on the hill.
[174] I know the man on the hill with the telescope.
[175] I know the man on the hill with the large tree.
[176] I know the man with the large hat on the hill.

Noun/Modal Ambiguity
[177] Let the paper note be read.
[178] Let the paper will be read.
[179] Will the paper can be re-used?
[180] Will the paper bin be re-used?

4.8.5 Results

A Student's t-test was performed on each pair of sentences, for each order and for the combined results. Listed below are the mean times (in hundredths of a second) for all sentences. The results are presented in the tables below with the sentence numbers referring to the previous examples. Since there are so many examples, I will discuss them in their logical groupings. As in the first experiment, we chose the .01 level of significance for this experiment.

In the column on the right is the number of rules that ROBIE ran to analyse the same sentence. In Section 6.1, we will see that the relative parsing times of a model of the HSPM should match the relative times of human subjects. These figures are given to assist the reader in making this comparison. The phrase "failed at X" means the parser "blocked" or garden pathed at that point. Only the relative ranking of this measure should be compared and no claims or further comment will be made on this measure.

Sentence    Mean Time    Rules Run by ROBIE
[136]       214          11
[137]       229          11
[138]       327          16

This was the first set of global ambiguity examples and will be discussed in Chapter 9. It was predicted, for reasons which will be explained, that there would be no significant difference between the first sentence and each of the other sentences in the triplet. Sentence [138] was semantically odd to many subjects, indicating the preference for the use of "can" as an auxiliary verb. In fact, several of the subjects chuckled when they read it. There was no significant difference between [136] and [137]. However, [138] was significantly different from [137] and from [136] at the p < .05 level. This level of
significance was not accepted for this experiment, but further experimentation may produce a different result. Since the times for all sentences were statistically the same, a theory that automatic backtracking is used to find all readings of an ambiguous sentence should be rejected, since backtracking should require an increase in reading time. The experiment did not test which reading or readings the subject perceived.

Sentence    Mean Time    Rules Run by ROBIE
[139]       248          18
[140]       298          18
[141]       262          18

There was no significant difference in reaction times among the three sentences.

Sentence    Mean Time    Rules Run by ROBIE
[142]       255          17
[143]       272          17
[144]       268          17

Again, there was no significant difference in reaction times among the three sentences nor, as before, was any difference predicted.

Sentence    Mean Time    Rules Run by ROBIE
[145]       304          12
[146]       323          11
[147]       212          11

Sentence [145] had one more word than the other two sentences, making a comparison difficult. Even so, there was no significant difference between [145] and [146]. Sentence [147] was significantly faster at the 1% level than either of the other two.

Sentence    Mean Time    Rules Run by ROBIE
[148]       938          failed at "sank"
[149]       371          25
[150]       1013         failed at "fell"
[151]       305          25
[152]       923          failed at "grows"
[153]       404          22

These are the garden path sentences which were discussed in Chapters 2-4. It was predicted that the first sentence of each pair would cause a garden path, but not the second. The first of each pair was significantly different from the second at the 1% level. These sentences proved to be extremely difficult for the subjects. After looking at the garden path examples for several seconds, many subjects asked what to do if they could not understand the sentence! The experiment did not test whether the subject had really analysed the sentence properly.
A few subjects admitted afterwards that they couldn't understand the sentence, so went on to the next sentence. These show that the first of each pair causes a garden path, while the second sentence of each pair does not.

Sentence    Mean Time    Rules Run by ROBIE
[154]       310          36
[155]       450          36
[156]       365          36

These examples were to test the difficulties with "that" as a relative pronoun versus "that" as a complementiser. In Section 5.2.10, we will see that this is one of the major ambiguities in which "that" can be involved. The results here will be important to our discussion in that section. The ambiguous version ([154]) was significantly faster at the p < .05 level than the two relative clause readings. The level of significance which can be used for a particular experiment is dependent upon the total number of examples in that experiment [Snedecor and Cochran 1967]. Because of the large number of examples in this experiment, we cannot accept this level of significance. This may indicate that the globally ambiguous example is, in fact, easier to process than the relative clause example. The subjects questioned afterwards said that they did not notice the possible relative clause reading for the first sentence. The experiment was not designed to test which reading the subject obtained, but I feel this supports the theory that the subject preferred the embedded sentence reading.

Sentence    Mean Time    Rules Run by ROBIE    Type
[157]       369          34                    Imperative
[158]       268          24                    Yes-No-Question
[159]       364          53                    Imperative
[160]       279          23                    Yes-No-Question

In Section 5.2.11 and Section 9.2 we investigate ambiguities involving the word "have". In these sections, it is important whether certain examples involve an increase in reading time. It was predicted that these examples would show that there was no time difference between reading "have" as an imperative versus "have" as a yes-no-question.
There was no significant difference between [157] and [158]. There was a slight order effect in that the first sentence of each pair presented took longer than the second sentence. Sentences [159] and [160] showed the processing difference of imperatives versus yes-no-questions (significantly different at the p < .05 level). In both sets, the Imperative was one second slower than the Yes-No-Question. This suggests that the "have" examples were not different from the normal examples.

Sentence    Mean Time    Rules Run by ROBIE
[161]       832          40
[162]       930          40

These sentences were included to provide data for the discussion in Section 9.2. There was a very definite order effect with these examples. The times given are those obtained when the example occurred first in the order. However, when the example occurred second, the times were 300 and 490 respectively. The difference between the sentences was not significant. It seems that when the subject saw the first sentence of the pair, irrespective of which sentence it was, they had a considerable amount of trouble. One can see that the time was almost exactly the same at the first exposure. These times showed that the subject had problems with these sentences the first time, but then learned most of the example, which assisted the understanding of the similar sentence the next time. It is not possible to tell whether the problems were due to the reduced relative, or "have".

Sentence    Mean Time    Rules Run by ROBIE
[163]       1263         failed at "surprised"
[164]       367          34
[165]       393          34
[166]       665          42
[167]       363          34

These examples also relate to "that" ambiguities as discussed in Section 5.2.10. Marcus [Marcus 1980] suggested that [163] was a garden path. The reader should use "that" as a determiner, rather than start the initial embedded sentence. Sentences [163] and [164] were significantly different at the .01 level. [166] also had the sentential
subject problem and the longer time may be because of the structural processing difficulty in this type of sentence. In the parser's analysis, the phrase "birds ate everything in my garden" is first analysed as a sentence, and then made into a subject NP. This analysis requires extra rules for the parser, and may be part of the reason people also required a longer time to read this sentence. It was significantly different from [165] and [167] at the .01 level. [164] and [167] were predicted to be identical in processing time. [165] should also have the same time and is not significantly different from [167].

Sentence    Mean Time    Rules Run by ROBIE
[168]       432          31
[169]       479          23
[170]       404          23

These examples illustrate potential difficulties with the word "what", as discussed in Section 5.2.7. These examples were to test whether the first sentence caused a garden path. These times showed that the first example was not a garden path. ROBIE degraded the initial sentence, "what little fish eat", into the subject of the main verb "is". This analysis predicts more rules will run and one can see that ROBIE's results reflect this. There was no significant difference between the three. The third sentence occurred immediately after a garden path example in the second order. This increased its time in that order. In the first order, the second sentence occurred after a garden path, and this seems to have increased its time as well.

Sentence    Mean Time    Rules Run by ROBIE
[171]       358          37
[172]       372          38
[173]       427          38

These sentences are globally ambiguous and they are generally considered to have several potential readings. The purpose of these examples was to see whether this global ambiguity caused an increase in processing time for the sentences. Although these results are interesting, they will not be discussed elsewhere in this thesis. There was no significant difference between these examples.
One can see that the times overlap with the ones below (main verb "know").

Sentence    Mean Time    Rules Run by ROBIE
[174]       346          37
[175]       443          38
[176]       439          38

Sentence [174] was significantly faster than the other two, p < .01. This result was unexpected and may be due to the different subcategorisation of "know". These examples were included out of curiosity only and will not be discussed. ROBIE's simple use of semantic markers for PP attachment cannot fully model how people presumably decide upon PP attachments, and no attempt was made to accurately model this. As a result, ROBIE may have produced a different analysis than the subjects.

Sentence    Mean Time    Rules Run by ROBIE
[177]       292          28
[178]       9163         failed at "will"

This and the following pair illustrate the difficulties of noun/modal ambiguities. In Section 5.2.8, we shall see why these are predicted to be garden paths. It was predicted that the second example would cause a garden path because of the fragment "will be". The reaction time indicates that the subjects did garden path on this example, as did ROBIE. [177] and [178] were significantly different at the .01 level.

Sentence    Mean Time    Rules Run by ROBIE
[179]       623          failed at "can"
[180]       413          28

These are essentially similar to the above pair, but this time the problem was "can be". Again, the garden path effect is evident, though not as strong. Sentences [179] and [180] were significantly different at the p < .05 level.

4.8.6 Summary

This experiment has provided non-subjective reading times for many sentences. Several of these results will be important in the following chapters:

a) Garden path sentences do require a much longer reading time relative to their non-garden path partner. For most examples the reading time for the garden path sentence was much greater.

b) The "have" examples did not cause a garden path effect or show any significant difference in reading time.

c) The noun/modal examples exhibited the garden path effect.
[...] for discussion later in the text.

It should be noted that in the case of a globally ambiguous sentence, the parser returns a parse for only one reading. The reading which is selected can be considered to be arbitrary but is influenced by the principles of Minimal Attachment and Right Association as in Section 8.5.

The Role of Syntactic Context in Handling Part of Speech Ambiguity

In the previous chapters we have looked at resolving cases of lexical ambiguity which can lead to a garden path. This has suggested a model of the syntactic processor with limited lookahead. In this chapter we will look at cases of lexical ambiguity which do not lead to a garden path and investigate whether our model, as developed so far, can handle these cases. In doing this, we will see which examples of local ambiguity can be resolved on a purely syntactic basis and look at resolving part of speech ambiguity in greater detail.

Although many high level constituents can be "moved" in English, the lower level structure of some constituents is relatively fixed. For example, after a determiner, one expects a noun rather than a verb. In this chapter we also wish to ask, "How might this low level fixed order assist in the resolution of ambiguity?" We will not give a definite answer to this question, but will see that it is extremely useful in the resolution of ambiguity.

The examples of ambiguity shown in this chapter seem to cause no apparent problems to a person reading them. That is, all of these examples read easily and certainly do not exhibit the garden path effect. If ROBIE is to be psychologically plausible, then it is desirable that it handle these examples in such a way as to explain why people have no apparent difficulty with most sentences, despite the inherent ambiguity in them.

Occam's Razor tells us that, if there are two solutions to the same problem, then the simpler solution is preferred.
If it is possible to handle all the examples of local ambiguity presented here, with no additional mechanism, device or feature than is needed for ordinary sentence parsing, then our goal above can be considered met. One possible explanation for people not noticing local ambiguities may be that there is no special mechanism needed for them, so that nothing differing from normal parsing is necessary. Conversely, if it is necessary to add special mechanisms and routines to the parser just to handle these examples of ambiguity, then this will not explain how people can understand these examples so well and can be considered a weakness in the model.

To say part of speech ambiguity can be handled deterministically, but with the use of special mechanisms, would be no surprise and not very important. To say one can handle part of speech ambiguity deterministically with no special mechanisms is a more significant claim. In this chapter it is indeed suggested that many cases of part of speech ambiguity can be handled by the parser with no special mechanisms.

It should be noted that a non-deterministic parser does not need to tackle the problem of local part of speech ambiguity. If it should make an error, then it can backtrack and correct it. However, to handle ambiguity deterministically, we must never make an error. We will also see that many cases of ambiguity can be resolved using standard techniques which have been applied to non-deterministic parsers.

5.1 Syntactic Context

5.1.1 Word Data Structures

As a first approach to handling ambiguity, I asked, "If we construct a compound lexical entry for each word composed of the features of each part of speech the word can have and make no alterations to the grammar, how wide a coverage of examples will we get?" This approach was used by [Winograd 1972] and was found to be very effective for the following reasons.
If words have all the possible relevant features, then the tests for all possible parts of speech which a word can be used as will succeed. In this way, all applicable rules will match. It may be that often only one rule will match, or that the first rule tried is the correct rule. The question is, how often will the rule which matches be the correct rule? In Section 5.1.3, I will explain the answer to this question. Firstly though, it is necessary to explain how the word definitions were altered.

All words in ROBIE are defined in the syntactic dictionaries. Each word has a compound lexical entry incorporating all the features for all the possible parts of speech which the word could be. This is exactly as was done by [Winograd 1972]. For example, "block" is defined as a noun and a verb, "can" is defined as a noun, auxiliary verb, and verb and "hit" is defined as a noun and a verb. The features for each of these parts of speech are kept in the dictionary and, when the word is looked up, they are returned as a single ordered list of features. These features are sub-grouped according to the part of speech they are associated with. Hence, when the word "block" is looked up, the result returned is both the noun and the verb definition. In this way, all possibilities are returned. Below is an example of a dictionary entry:

word: block
noun features: noun, ns, n3p, ngstart
verb features: verb, v-3s, tenseless

In the above example, the word "block" has the features noun, noun singular, noun third person, verb, tenseless verb and is marked that it can start a noun phrase and as a verb it agrees with any noun which is not 3rd person and singular. This multiple meaning is then carried during the parse, until the word is disambiguated.

In the English language, most words can be several parts of speech. This fact must be reflected in a parser of English and we do this with the multiple meanings above.
When the parser has enough information to decide which is the correct one, it ignores (removes) the other possibilities. In this way, we have not built structure which is later thrown away; rather we have reflected an inherent parallelism of language. In Chapter 6 we will examine the psychological basis for this approach.

5.1.2 Morphology

The first part of the disambiguation process takes place in the morphology. When ROBIE identifies a word which has a morphological ending, the morphology must adjust the features of the word. For example, when "blocked" is identified, the feature "ed" must be added to the list of features for "block". At the same time, a portion of the disambiguation takes place. If "block" is defined as both a noun and a verb, then "blocked" is not a noun. The morphology causes some features to be added, such as "ed, past" and some features to be removed such as "tenseless". As features which are no longer applicable are removed, so also are parts of speech and their associated features which are no longer applicable. For "blocked", the features "noun, ns, n3p" will be removed and the features "adjective, ed, past" will be added.

Similarly for the word "blocking". This cannot be a noun, so only the verb and adjective definitions are carried forward. When the morphology causes adjustments to the features for "block", by adding the features "en" and "adjective", it must also alter the tense and number. At this time, other features that are no longer appropriate such as "noun" will be removed. The morphology will identify words such as adverbs, adjectives and verbs in a similar way.

The morphology which is used is very similar to that of [Winograd 1972], [Dewar, Bratley and Thorne 1969] and the part of speech additions and deletions are taken from [Marcus 1980].

5.1.3 Disambiguation

The dictionary and morphology now give all the possible features for each word.
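The compound entry for "block" and the adjustment made for "blocked" can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the names LEXICON, lookup and apply_ed are hypothetical and do not come from ROBIE, and the sketch leaves out the adjustment of number features that the text also mentions.

```python
# Hypothetical sketch: LEXICON, lookup and apply_ed are illustrative names,
# not part of ROBIE. Feature names follow the dictionary entry for "block".
LEXICON = {
    "block": {
        "noun": ["noun", "ns", "n3p", "ngstart"],
        "verb": ["verb", "v-3s", "tenseless"],
    },
}


def lookup(word):
    """Return a copy of the compound entry, grouped by part of speech."""
    return {pos: list(feats) for pos, feats in LEXICON[word].items()}


def apply_ed(entry):
    """Adjust a compound entry for the "-ed" ending.

    The noun reading is dropped, "tenseless" is removed from the verb
    reading, and an adjective reading is added, as described for "blocked".
    """
    adjusted = {}
    if "verb" in entry:
        verb = [f for f in entry["verb"] if f != "tenseless"]
        adjusted["verb"] = verb + ["ed", "past"]
        adjusted["adjective"] = ["adjective", "ed", "past"]
    return adjusted


entry = lookup("block")      # both the noun and the verb readings
blocked = apply_ed(entry)    # verb and adjective readings only
```

Keeping the features grouped by part of speech means that a whole reading can be discarded in one step, which is what the morphology does here with the noun reading.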
We will see in this chapter that having the dictionary return the definitions in this way greatly assists our handling of ambiguity. The psychological motivation for this is discussed in Chapter 6. The next step is to add a method for disambiguating the words. This is done by pattern matching of the grammar rules as follows.

Each rule matches the features of one or two buffers. If the word "block" is in the first buffer, then a pattern [noun] or a pattern [verb] will match. These patterns do not relate to the other possible definitions of a word. If a rule pattern has matched on the feature "noun" in the first buffer, then ROBIE assumes that this word is a noun. It would then be appropriate to disambiguate the word as a noun. This is exactly as in [Winograd 1972].

For every grammar rule which disambiguates a word to a certain part of speech, that part of speech is the same as the part of speech which the pattern of that rule assumed the word was. This means that the technique of having the patterns disambiguate the words is the only one necessary for ROBIE. (Although this is how the parser's disambiguation works in principle, in the actual implementation, the grammar function ATTACH performs the disambiguation. This will be explained in detail in Chapter 8.)

In a non-deterministic parser it is not essential to find the correct rule first. If the parser runs an incorrect rule, the parser may backtrack and change the category assignment. But in a deterministic parser, there will never be any backtracking and this solution cannot be used.

Since ROBIE does not backtrack, there is little danger of the pattern matching disambiguation making an error. Once a rule runs assuming a buffer contains a certain part of speech, it must be used as such in the parser. The general disambiguation scheme is: if a full pattern matches a word as a certain part of speech, then it is disambiguated as that part of speech.
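The general scheme just stated can be sketched in a few lines. This is a hedged illustration only: the function names are hypothetical, a rule is reduced to a name and a single-feature pattern on the first buffer, and the real parser performs the disambiguation inside ATTACH rather than in a separate step.

```python
# Hypothetical sketch of pattern-matching disambiguation; the function names
# and the (name, pattern) rule format are assumptions made for illustration.

def matches(pattern, entry):
    """True if any part-of-speech reading of the word carries the feature."""
    return any(pattern in feats for feats in entry.values())


def disambiguate(pattern, entry):
    """Keep only the readings that carry the matched feature."""
    return {pos: feats for pos, feats in entry.items() if pattern in feats}


def run_first_match(rules, entry):
    """Try the active rules in priority order; the first match runs."""
    for name, pattern in rules:
        if matches(pattern, entry):
            return name, disambiguate(pattern, entry)
    return None, entry


block = {
    "noun": ["noun", "ns", "n3p", "ngstart"],
    "verb": ["verb", "v-3s", "tenseless"],
}

# Only the noun rule is active, so "block" is taken as a noun and the verb
# reading is discarded without ever being considered.
rule, reading = run_first_match([("NOUN", "noun")], block)
```

Which reading survives depends entirely on which rules are active when the word reaches the buffer, so the choice follows from the grammar itself and needs no separate disambiguation routine.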
The compound lexical entries and pattern-matching disambiguation alone will handle many examples of ambiguity. In the rest of this chapter we will see just what this can do for us.

5.1.4 An Example

Given the above mechanisms, multiple definition and disambiguation by the pattern matching, let us see how a few examples are handled. Consider:

[181] The falling block needs painting.

We will look only at the words "falling" and "block" in this example. The word "falling" is defined as a verb and an adjective in the dictionary and "block" is defined as a noun and a verb. Whilst parsing this example, after the word "the" has initiated an NP and been attached to it as a determiner, the rules to parse adjectives are activated. The rule ADJECTIVE has the pattern [adj], and matches the word "falling". "Falling" is then attached and disambiguated as an adjective. Recognition of "falling" as a verb does not occur. As there are no more adjectives, ROBIE will activate the rules to parse the headnoun. The rule NOUN with the pattern [noun] will match on the word "block" and it will be attached as a noun. Hence "block" will also be disambiguated without the verb use being considered by ROBIE. Other ambiguities inside the noun phrase will be handled in a similar way.

This approach alone will cover the situation of singular head nouns, verb/adjective ambiguity and most other pre-nominal ambiguities. This works because the noun phrase has a very strict word order. When an ambiguous word is found, only one of its meanings will be appropriate to the word order of the noun phrase at that point. This approach can be thought of as an extension of the basic approach of the Harvard Predictive Analyzer [Kuno 1965].

This strategy will also often disambiguate main verbs. For example in:

[182] Tom hit Mary.
[183] Tom will hit Mary.
[184] The will gave the money to Mary.

In [182], "hit" is the main verb.
In the dictionary, "hit" is also defined as a noun (as in card playing). The parser will attach "Tom" as the subject of the sentence and then activate the rules for the main verb. Since "hit" has the feature "verb", it will match that rule and be attached and disambiguated as a verb. Again other possible parts of speech are not considered.

The word "will" could be a noun or a modal as sentences [183] and [184] demonstrate. In [183], "will" cannot be part of the headnoun with "Tom", so the NP will be finished as above. The rules for the auxiliary will then be activated and the word "will" then matches the pattern [modal] and is attached to the AUX. In [184], the word "will" is used as a noun. Since it follows the determiner, the rules for nouns will be activated. The word "will" then matches the pattern [noun] and attaches to the NP as a noun.

The same approach will also disambiguate "stop" and "run" in the following sentence:

[185] Stop the run.

Since "stop" is sentence initial and can be a tenseless verb, the rule IMPERATIVE will match and it will be disambiguated as a verb. The word "run", which can be a noun or a verb, will be handled as "will" in [184].

5.1.5 The Word TO

Now let us consider a more difficult example, the word "to". "To" is defined as a preposition and an auxiliary verb in ROBIE, as illustrated by these sentences:

[186] I want to kiss you.
[187] I will go to the show with you.

In [186], "to" is the infinitive auxiliary, whilst in [187] "to" is a preposition. This analysis is based on that of [Marcus 1980, p. 118]. Our two buffer lookahead is sufficient to disambiguate these examples. The buffer patterns for the above sentences are:

[to][tenseless] -> embedded VP
[to][ngstart] -> PP

By looking at the following word, "to" can be disambiguated. In [187], the word "the" cannot be a tenseless verb, so the first pattern does not match.
In [186], the second buffer does not have the feature "ngstart", so the second rule does not match.

However, the above patterns will accept ungrammatical sentences. To reject ungrammatical sentences, we can use verb subcategorisation as a supplement to the above rules. One cannot say:

[188] *I want to the school with you.
[189] *I will hit to wash you.

In English, only certain verbs can take infinitive complements. "To" can only be used as an auxiliary verb starting a VP when the verb can take an infinitive complement. Hence, by activating the rules to handle the VP usage only when the infinitive is allowed, the problem is partly reduced. Also by classifying the verb for PPs with the preposition "to", the problem is simplified. This is merely taking advantage of subcategorisation in verb phrases. To allow "to" to start a VP, when it is not allowed, would be ungrammatical. Taking advantage of this fact greatly reduces, but does not eliminate, the possible conflict. In ROBIE, the subcategorisation of verbs for infinitive complements has been implemented, but the verbs are not fully marked as to the type of PPs they will accept.

We have seen what to do if the verb will only accept a toPP or a VP. The final difficult situation arises whenever the following three conditions are true:

1) the verb will accept a toPP and a toVP,
2) the item in the second buffer has the features "tenseless" and "ngstart", and
3) the toPP is a required modifier of the verb.

It seems that there are very few verbs which have this subcategorisation [Gazdar, personal communications] and the distribution of words with toVPs and toPPs seems different, so this problem rarely arises. When it does, the principles of Right Association and Minimal Attachment apply as discussed in Chapter 6.

A free text analysis done on a cover story in TIME magazine [TIME 1978] resulted in 55 occurrences of the word "to".
The two rules mentioned above in conjunction with verb subcategorisation gave the correct interpretation of all of these. These rules were also checked on the MECHO corpus (Appendix B) and the ASHOK corpus [Martin, Church and Patil 1981]. There were no violations to these rules in either of these. For a full explanation, see Appendix C.

5.1.6 Adjective/Noun and Noun/Noun Ambiguity

Some readers may wonder how adjective/noun ambiguity and noun/noun ambiguity are handled in ROBIE. As stated in the introduction, this research has not investigated semantic problems within a single part of speech. Therefore this research has not investigated noun/noun ambiguity. ROBIE does handle adjective/noun problems for the MECHO world and this approach is explained below.

Theoretically, a noun phrase can have an infinite number of adjectives. In order to implement this in the parser, the packet PARSE_ADJ is activated after the determiner for the NP has been attached, or at the start of the NP if there is no determiner. The packet has essentially two rules. The first rule will attach an adjective in the first buffer to the partial NP which is the current active node. This rule may then apply again if there is more than one adjective. When there are no more adjectives in the first buffer, the second rule matches. This rule is of low priority and has the pattern [t]. It will deactivate the packet PARSE_ADJ and activate the packet PARSE_NP.

Adjective/noun ambiguity can be characterised as: should the next word be attached as an adjective, or should the packet be deactivated? Adjective/noun ambiguity is handled in a simple minded way. If the word following the ambiguous adjective/noun word can be a noun, then the ambiguous word is used as an adjective. In other words, all conflicts are resolved in favour of the adjective usage. This problem arises in these examples:

[190] The plane is inclined at an angle of 30 degrees above the horizontal.
[191] A block rests on a smooth horizontal table.

In [190], "horizontal" is a noun, while in [191], it is an adjective. The above algorithm handles these cases.

For sentences such as [192], with noun/noun ambiguity, the syntax in ROBIE is a flat structure.

[192] The soup pot cover handle is hot.

In ROBIE, the semantic interpretation receives an ordered list of headnouns. This ambiguity is then left to be resolved by the semantic interpretation component. The semantic representation looks like this:

headnouns for NP1: soup, pot, cover, handle

The semantic inferencing can then do whatever is needed for the application of the system. Since the non-syntactic processor is extracting the meaning of the sentence in parallel with the syntactic analysis for normal sentences (see Section 4.1), the semantic interpretation of the headnouns can be performed as the list of headnouns is built. In the current parser, the MECHO semantics does this as a second stage. This mechanism is the same as the markers used to perform the evaluation needed by the Semantic Checking Hypothesis.

ROBIE makes no attempt to resolve all cases of adjective/noun ambiguity and often treats an ambiguous word as part of the complex headnoun. I feel that a better understanding of what people seem to do with adjective/noun ambiguity needs to be gained before it can be dealt with in a psychologically plausible way. For this reason the approach outlined above is used, although it is not intended to be psychologically plausible.

5.1.7 Why Does This Work?

Most ambiguities are not recognised by people because only one of the potential ambiguities is grammatical. In many situations, when fixed constituent structure is taken into account, other uses of an ambiguous word are not possible and probably not even recognised. Since fixed constituent structure rules out most alternatives, we have been able to handle the examples in this chapter without any special mechanisms.
In the introduction to this chapter, it was stated that a clean and simple method of handling ambiguity was desired. I feel that this goal has been met for these examples.

5.2 The Role of Agreement in Handling Ambiguity

Using the simple techniques presented in the last sections, we can handle many cases of part of speech ambiguity, but there are many examples we cannot resolve. For example, the second of each pair of sentences below would be disambiguated incorrectly.

[193] I know that boy is bad.
[194] I know that boys are bad.
[195] What boy did it?
[196] What boys do is not my business.
[197] The trash can be smelly.
[198] The trash can was smelly.

Many people wonder what role person/number codes and the relatively rigid constituent structure in the verb group play in English. Linguists can describe these, but cannot produce an adequate explanation of why they are there. In this section we will look at these mysterious items from a processing viewpoint. We will explore their role by attempting to answer the question, "What use is the fixed structure of the verb group and person/number codes?" First, however, let us now look at how Marcus's parser handled a few more examples of ambiguity.

5.2.1 Marcus's Diagnostics

Marcus [1980] did handle some part of speech ambiguities. The words "to", "for", "what", "which", "that", "a", and "have" could all be used as several parts of speech. For each of these words he also used a "Diagnostic" rule. These Diagnostic rules matched when the word they were to diagnose arrived in the first buffer position and the appropriate packets were active. Each diagnostic would examine the features of the three buffers and the contents of the Active Node Stack. Once the diagnostic decided which part of speech the word was being used as, it either added the appropriate features, or explicitly ran a grammar rule. Marcus did not
give each word a compound lexical entry as we have done here.

Most of the grammar rules in his parser were simple and elegant, but the diagnostics tended to be very complex and contained many conditionals. In some cases they also seemed rather ad hoc and did not meet the goal of a simple, elegant method of handling ambiguity. For example the THAT-DIAGNOSTIC (ignore the details of the grammar code):

[that][np] -> in the packet CPOOL
"If there is no determiner of second and there is not a qp of second and the nbar of 2nd is none of massn, npl and 2nd is not-modifiable then attach as det else if c is nbar then label 1st pronoun, relative pronoun else label 1st complementiser."
[from Marcus 1980, p. 291]

Notice that if the word "that" were to be used as a determiner, then it would be attached after the NP was built! This is his primary rule for disambiguating the word "that". Marcus's parser also had three other rules to handle different cases.

The WHICH-DIAGNOSTIC was more simple, but still a special case:

[which] -> in the packet CPOOL
"If the NP above c is not modified then label 1st pronoun, relative pronoun else label 1st quant, ngstart, ns, wh, npl"

It seems that these rules did not "elegantly capture generalisations" as did the rest of his parser. I consider these rules undesirable and feel that they should be corrected to comply with my criteria for simple and elegant techniques in resolving ambiguity. I wanted a method which used no special mechanism, or routine, other than that needed to parse grammatical sentences. These diagnostics are certainly special mechanisms and do not meet this goal. Can we cover the same examples in a more simple and principled way?

In this section, we will look at each of these diagnostics in turn and show how they have been replaced in the newer model. We will also look at a few other examples of ambiguity which Marcus did not handle, but are related to our discussion here.
5.2.2 Handling the Word TO

Marcus's diagnostic handling of "to" can be replaced by the method outlined in Section 5.1.5. This method was motivated to handle grammatical sentences and meets our criterion for a simple approach.

5.2.3 Handling the Word FOR

The problem with the word "for" is in deciding whether it is a preposition or a complementiser, as the standard Chomsky analysis would have it, in the following examples:

[199] ?I want for John to go.
[200] ?I want for John to hit Mary.
[201] I preferred for John to go.
[202] I preferred for John to hit Mary.
[203] I wanted a flower for Mary.
[204] I want the paper for Wednesday.

In [199] and [200], "for" is a complementiser with "to", whilst in [203] and [204], it is used as a preposition.

A conflict could arise in the use of this word only when the verb is subcategorised for both a forPP (a PP with the preposition "for") and a For-To embedded sentence, i.e., both the following sequences are possible:

V forPP toVP
V forPP toPP

There do not seem to be any verbs in this class. In "British English", sentences of the form [199, 200] are unacceptable. This case would be the most difficult to handle for ROBIE of the cases we have discussed. It is interesting that although it is allowed in some dialects of "American English", it is not allowed in "British English". The only time when a conflict seems to arise is when the verb appears in the context:

V NP forPP
V NP forPP toVP

for example:

[205] I want a horse for John.
[206] I want a horse for John to ride.

To handle this case, Marcus built the NP (John) with an Attention Shift before he combined the NP with "for" into a PP. Before he ran the PP rule, his parser tried to match the following rule:

[for][np][to]

This rule could then clearly disambiguate the word "for". The solution worked well with the use of the Attention Shift and a three buffer lookahead. Neither is allowed in ROBIE.
In fact, given the Chomsky analysis that these are complementisers, there is no way to distinguish between the use of "for" as a complementiser or as a preposition unless we look past the NP to see if the word "to" is present. It seems that there is no way to handle this case without three buffers of lookahead and the Attention Shift, but this linguistic analysis is not the only one possible.

Several linguists have suggested that "for" is actually a preposition in both situations, e.g. Gazdar [Gazdar, Pullum, and Sag, 1980c]. He treats the forPP in [199] as a PP and part of the VP for "want", not as an embedded sentence. By doing this, "for" is not used as a complementiser and the ambiguity disappears. Other linguists, e.g. Chomsky [Chomsky 1965], suggest that the PP is part of the verb phrase.

Let us assume that "for" is always a preposition and see how these examples may be handled. (As we shall see, this is a description of what ROBIE currently does.) It is very easy to treat the forPP as part of the verb phrase, so we will look at how to build the embedded sentence analysis.

The parser will make the word "for" a preposition and start the PP as a normal PP. After the PP has been fully built and dropped back into the first buffer from the Active Node Stack, the state of the buffers will be:

[forPP][to]

At this point the rule deciding the attachment of the PP will be active and the NP ("a horse") will be the bottom item in the Active Node Stack. Simultaneously, the rule that would have run to detect the For-To complementiser above would be active. We need only to alter this rule so that it has the pattern:

[PP & for][to]

The embedded sentence will now be started and the NP of the PP will be the subject. The semantic interpretation of the embedded sentence uses the "for" in the subject as it sees fit. Analysis is then possible without the need for an Attention Shift, or the three buffers.
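The two-buffer pattern [PP & for][to] can be illustrated with a small sketch. It is a hedged illustration and not ROBIE's rule format: buffers are modelled as dictionaries holding a set of features, and the names matches, FOR_TO_PATTERN and start_embedded_sentence are hypothetical.

```python
# Hypothetical sketch of the [PP & for][to] rule. The buffer representation
# and all names are assumptions made for illustration only.

def matches(pattern, buffers):
    """Each pattern slot is a set of features the matching buffer must have."""
    if len(buffers) < len(pattern):
        return False
    return all(need <= buf["features"] for need, buf in zip(pattern, buffers))


FOR_TO_PATTERN = [{"pp", "for"}, {"to"}]


def start_embedded_sentence(buffers):
    """Start an embedded sentence whose subject is the NP inside the forPP."""
    if not matches(FOR_TO_PATTERN, buffers):
        return None
    return {"type": "S", "subject": buffers[0]["np"], "marker": "for"}


for_john = {"features": {"pp", "for"}, "np": "John"}
to_word = {"features": {"to", "prep", "auxverb"}}
full_stop = {"features": {"finalpunc"}}

# [206] I want a horse for John to ride.  -> buffers: [forPP][to]
embedded = start_embedded_sentence([for_john, to_word])
# [205] I want a horse for John.          -> buffers: [forPP][.]
plain_pp = start_embedded_sentence([for_john, full_stop])
```

Because the completed PP occupies a single buffer cell, the word "to" is visible in the second cell, which is why two buffers are enough in this analysis.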
This is one of the cases in Section 2.5.2 for which Marcus required three buffers. If we adopt the analysis of Gazdar above, we can handle this situation with only two buffers, whereas Chomsky's analysis is incompatible with ROBIE. It is interesting that both analyses are possible for ROBIE, but the Chomsky analysis requires two additional items which the Gazdar analysis does not need, i.e., three buffers and the Attention Shift.

For ROBIE, we have adopted Gazdar's analysis of the word "for". Following this approach, "for" is always a preposition, making the resulting grammar more simple, and the need for Marcus's For-Diagnostic disappears.

5.2.4 Ungrammatical Sentences

Before we proceed, let us look at an assumption Marcus made in his parser, that it would be given only grammatical sentences. This assumption makes life easy for someone writing a grammar, since there is no need to worry about grammatical checking. Hence he did not cater for ungrammatical sentences and the original parser accepted such examples as:

[207] *a blocks are red.
[208] *the boy hit the girl the boy the girl.
[209] *are the boy run?

This simplification causes no problems in most sentences, but can lead to trouble in more difficult examples. If the parser's grammar is loosely formulated because it assumes it will be given grammatical examples only, then ungrammatical sentences may be accepted. If the syntactic analysis accepts ungrammatical sentences as grammatical, then it is an error. In the next sections we will look at the consequences of this assumption as well as those of rejecting ungrammatical sentences.

5.2.5 Subject/Verb Agreement

We know that the verb group has a complicated but relatively fixed constituent structure. Although verbals have many forms, they must be mixed in a certain rigid order. We also know that the first finite verbal element must agree with the subject in person and number. That is, one cannot say:

[210] *The boy are run.
[211] *The boy will had been run.
[212] *The boys had are red.
etc.

Whilst Marcus's parser enforced these observations to some extent, he did not follow them throughout his parser. We want to enforce this agreement throughout ROBIE. Checking the finite or main verb, to be sure that it agrees in number with the subject, will lead to the rejection of the above examples. This was done by adding the agreement requirement into the pattern for each relevant rule as will be explained later.

Buffers 1 and 2 must agree before a rule relating the subject and verb or two verbs can match. This check looks at the number code of the NP and the person/number code of the verb and checks whether they agree. The routine for subject/verb agreement is very general and is used by all the subject/verb rules. The routine can only check the grammatical features of the buffers and could be done by expanding the buffer feature patterns. See Section 8.6 for the full details of this agreement check.

5.2.6 Plural Head Nouns

We saw in Chapter 3 that the case of a word which can be both a noun, plural and a verb, singular (noun/verb/plural word) after at least one singular headnoun can lead to a garden path by the end of the sentence. Let us go back and look at a simpler case of this. We saw only one subcase of all the logical possibilities for combinations of words which can be both a noun and a verb. We will now look at these possibilities and see that these cases can be disambiguated by simple rules using subject/verb agreement. The following examples illustrate all the possibilities:

[213] The soup pot cover handle screw is red.
[214] The soup pot cover handles screw tightly.
[215] *The soup pot cover handles screws tightly.
[216] The soup pot cover handle screws tightly.
[217] The soup pot cover handle screws are red.

Each of the words "pot, cover, handle, screw" can be either a noun or a verb.
The "end of constituent" problem is to find out which word is used as the verb and which words make up the complex headnoun. The possible distributions of plural and singular among two words gives us four cases. We will deal with each of these in turn.

Case 1: In [213] each noun is singular. For this case all ambiguous words must be nouns and part of the headnoun. Due to subject/verb agreement, a singular noun must match a 3rd person singular (v3s) verb, i.e., one with the letter "s". This case excludes that possibility since none of the words have an "s" at the end. Hence they must all be nouns.

Case 2: In [214] "handles" is a plural noun and each word before it must be a noun. When a singular noun/verb word follows "handles", the word (screw) must be a verb and "handles" is the last of the headnouns. It is not possible to use "handles" in this situation as a verb, and "screw" as a noun because of subject/verb agreement.

Case 3: Sentences with two consecutive plural nouns as in [215], where both words have noun/verb ambiguity, are often ungrammatical. (Do not confuse plural "s" with possessive "'s".) The following is an example of two consecutive plural head nouns, the first of which is not noun/verb ambiguous ("sells").

[218] He sells plows to farmers.

This case is grammatical and not relevant to the discussion here, because "sells" is not noun/verb ambiguous. For the case where both words are noun/verb ambiguous, whenever the first plural is the main verb, then the second plural will be a noun and is dealt with by ROBIE later. For example:

[219] He cuts trees on the weekends.

If the first plural is a noun, then the second one cannot be a verb, unless it is part of a different constituent. An example of this is: (Sentences beginning with "?" are considered grammatical but unacceptable to most readers.)

[220] ?The soup pot machine handles screws easily.
[221] The soup pot machine handles screw easily.
[222] Which years do you have costs figures for?
[223] Do you have a count of the number of sales requests and the number of requests filled?

(The last two are from [Martin, Church and Patil 1981].) Because there is a non-plural headnoun followed by a plural headnoun, this situation leads to the next type of example.

Case 4: Sentences [216] and [217] both have the same word initial string until after "screws", but in [216] "screws" is a verb while in [217] "screws" is part of the headnoun. In this situation, where the final word in a series is plural, each word before it must be a noun. The word itself can be either a noun or a verb, depending on what follows. These can be recognised as a pair of potential garden path sentences, as discussed in Chapter 3. Therefore, this is the case to which the Semantic Checking Hypothesis applies and the predictions of Chapter 3 apply.

Mass nouns are defined as ambiguous between singular and plural in ROBIE. This is because they appear in singular or plural noun positions. If the mass noun occurs in a location where it could be plural, it is treated as such. Hence if there is an ambiguity of the type discussed here with a mass noun, it is treated as a plural noun and the semantic check we have described earlier applies.

Due to number and subject/verb agreement, these facts have a linguistic base. They rely on the fact that a final "s" marks a plural noun, but a singular verb. If the verb is v3s (verb agrees with a 3rd person, singular noun, as with the "s"), then the subject of the verb must be singular, or else the sentence is ungrammatical. This is why all the words before the v3s word must be nouns. If any of these words were used as a verb, then subject-verb agreement would be violated. This is why [215] is ungrammatical. If the verb is v-3s (agrees with any noun phrase except 3rd person, singular i.e., no "s"), then the subject cannot be singular.
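The agreement facts just stated can be checked mechanically. The sketch below is a deliberately crude illustration, assuming third person subjects and regular forms in which a final "s" alone marks a plural noun or a v3s verb; the function names are hypothetical and this is not the agreement routine of Section 8.6.

```python
# Hypothetical sketch of the number agreement facts for noun/verb ambiguous
# words. It assumes regular forms, where a final "s" is the only marker.

def noun_number(word):
    """A final "s" marks a plural noun."""
    return "plural" if word.endswith("s") else "singular"


def verb_code(word):
    """A final "s" marks a v3s verb; otherwise the verb is v-3s."""
    return "v3s" if word.endswith("s") else "v-3s"


def agrees(head_noun, verb):
    """A v3s verb needs a singular subject; a v-3s verb needs a plural one."""
    return (noun_number(head_noun) == "singular") == (verb_code(verb) == "v3s")


# The last two ambiguous words of [213] to [216], with the final word tried
# as the verb and the word before it as the last headnoun.
checks = {
    "[213] handle screw": agrees("handle", "screw"),
    "[214] handles screw": agrees("handles", "screw"),
    "[215] handles screws": agrees("handles", "screws"),
    "[216] handle screws": agrees("handle", "screws"),
}
```

Under these assumptions the check succeeds only for the pairs in [214] and [216], which are the two cases where the text allows the final ambiguous word to be the verb.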
[213] has no plural subject and so cannot have a v-3s verb. In [214] "handles" provides a plural subject, so "screw", which is v-3s, can agree.

In this section, we have looked at resolving a simple case of noun/verb ambiguity. In order to resolve this ambiguity, it was necessary merely to exploit agreement between the subject and verb in number and person.

5.2.7 Handling WHAT and WHICH

For both "what" and "which", the ambiguity lies between a relative pronoun and a determiner. The following examples show various uses of both words:

[224] Which boy wants a fish?            det
[225] Which boys want fish?              det
[226] The river which I saw has fish.    rel. pron.
[227] What boy wants a fish?             det
[228] What boys want is fish.            rel. pron.

There is some debate about the part of speech to be assigned to the word "which". Some linguists consider it to be a quantifier [Chomsky 1965], whilst others consider it to be a determiner [Akmajian and Heny 1975, Chapter 8]. We shall adopt the determiner analysis, making the problems for "what" and "which" similar. To determine the correct part of speech for these two words, Marcus used the following diagnostics:

[which] -> in the packet CPOOL
    "If the NP above C is not modified
     then label 1st pronoun, relative pronoun
     else label 1st quant,ngstart,ns,wh,npl."

[what][t] -> in the packet NPOOL
    "If 2nd is ngstart and 2nd is not det
     then label 1st det,ns,npl,n3p,wh; activate parse-det
     else label 1st pronoun,relpron,wh."
[Marcus 1980, p. 286]

These diagnostics would make the word in question a relative pronoun if it occurred after a headnoun, or a determiner if the word occurred at the start of a possible noun phrase.

If we follow the approach in the last section, and give each word a compound lexical entry composed of the determiner and relative pronoun features, we find that these words are always made determiners unless they occur immediately after a headnoun.
In other words, the "which" examples are all parsed correctly, but [228] is parsed wrongly. This happens because the determiner rule will always try to match before the rule for WH questions can take effect. This simple step gives the correct analysis if the ambiguous word is to be a determiner, but will still err on [228].

The rule to parse a relative pronoun and start a relative clause is active only after the headnoun has been found. At this time, the rule for determiners is not active. Therefore, if the word "what" or "which" is present after a headnoun, the only rule which can match is the rule to use it as a relative pronoun, and it will be used as a relative pronoun. We have resolved the simple case of "what" as a relative pronoun using only the simple techniques of the last section. For these sentences:

[229] What block is red?
[230] Which boy hit her?
[231] Which is the right one?

ROBIE produces the correct analysis, but still errs on [228]. This error arises because "what" is being used as a relative pronoun, but it does not follow a headnoun.

Without any additional changes to the parser, we get two things. Firstly, if the word occurs after the headnoun, then the NP-COMPLETE packet rules are active and it will be a relative pronoun. In fact, since relative clauses can occur only after the end of an NP, this correctly resolves the relative pronoun uses. Secondly, if the word occurs at the start of an NP, then it will be made a determiner.

This approach has exactly the same effect and coverage as did Marcus's diagnostics, but we have not needed any special rules to implement it. It will now provide the correct interpretation for "which", but will make some errors for the word "what".

Marcus's "what-diagnostic" will treat "what" as a determiner whenever the item in the second buffer could start an NP. This is usually correct, but "what" will be treated as a determiner in all of the following:

[232] What boys want is fish.
[233] What blocks the road?
[234] What climbs trees?
[235] What boys did you see?
[236] What blocks are in the road?
[237] What climbs did you do?

In this thesis, we are adopting the following analysis for WH clefts such as [232]. The initial WH word, "what", is a relative pronoun and is attached as the WH-COMP of the subject S node. The subject is the phrase "What boys want". The main verb of the sentence is "is" and the object is "fish". The exact details are not important, only that the word "what" or "which" is not a determiner at the start of a WH cleft. The following tree illustrates this analysis:

[Tree diagram for [232]: "what" attached as the WH-COMP of the subject S "What boys want", with main verb "is" and object NP "fish".]

In sentences [232-234], the word "what" is not used as a determiner. In the analysis we are using, it is a relative pronoun and is used as the WH-COMP for the S. In sentences [235-237], the word "what" is used as a determiner. Marcus admits that his diagnostic produces the incorrect result in this case [Marcus 1980, p. 286]. His diagnostic will make "what" a determiner in all of these examples, as will my analysis.

One can also see that each of the above pairs is a pair of potential garden path sentences. For each pair, the two buffers contain the same words. Hence our two buffer lookahead is not sufficient to choose the correct usage of the word "what". There is no way to make "what" a relative pronoun in the case where the headnoun is plural, but a determiner in the case where the headnoun is singular, for all arbitrary sentences using only two or three buffers.

With regard to the Semantic Checking Hypothesis, then, it is suggested that this decision is based on non-syntactic information. I believe that intonation is critical in these examples. Unfortunately there is insufficient experimental evidence to determine for certain whether this is true.
Furthermore, if one adopted a linguistic analysis where "what" was used as a relative pronoun in all these examples, then the problem would not exist. Finally, the problem of "what" and "which" as sentence initials, with no noun in the second buffer, seems to arise very rarely. I have found no examples of this problem in free text analysis (see Appendix C).

The current parser (ROBIE) cannot obtain the extra information which is provided by intonation to help resolve this case. As a result it follows Marcus's diagnostic and makes "what" a determiner in each of the above cases. This is because "what" is defined as a determiner which can agree with either a singular noun or a plural noun, as it was in Marcus's parser.

5.2.8 Noun/Modal Ambiguity

We will now consider noun/modal ambiguity as demonstrated by "can" and "will". Both can be either a noun or a modal (i.e., could, should, would, can, will, might, etc.):

[238] The trash can was taken out.
[239] The trash can be taken out.
[240] The paper will was destroyed.
[241] The paper will be destroyed.

Each of these words is entered in the dictionary both as a noun and as a modal. Due to agreement requirements, the modal/noun word can only be grammatically used as a modal if the word following it is a tenseless verb, i.e., the pattern:

[modal][tenseless] -> modal usage

applies.

Handling noun/modal ambiguity can be quite easy: when the noun/modal word appears in the first buffer, one merely has to look at the contents of the second buffer to see if it contains a tenseless verb. This can be complicated, though, if the auxiliary is inverted or the sentence is an imperative. The following examples show how this can arise:

[242] Let the paper will be read.
[243] Will the paper can be re-used?

In sentence [242] the fragment "Let the paper" implies that "will" can only be used as a noun, as the sentence already has one tensed verb.
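The two buffer check just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the feature names and the function resolve_noun_modal are hypothetical and are not taken from ROBIE's grammar.

```python
# Hypothetical lexicon: each word maps to a set of grammatical features.
# The feature names are illustrative and are not ROBIE's actual lexicon format.
LEXICON = {
    "can": {"noun", "modal"},
    "will": {"noun", "modal"},
    "be": {"verb", "tenseless"},
    "was": {"verb", "tensed"},
}


def resolve_noun_modal(first_word, second_word):
    """Choose noun or modal for the word in buffer 1 using only buffer 2."""
    first = LEXICON[first_word]
    second = LEXICON[second_word]
    if not {"noun", "modal"} <= first:
        raise ValueError(f"{first_word!r} is not noun/modal ambiguous")
    # Pattern [modal][tenseless] -> modal usage; otherwise treat it as a noun.
    return "modal" if "tenseless" in second else "noun"


if __name__ == "__main__":
    for pair in [("can", "was"), ("can", "be"), ("will", "was"), ("will", "be")]:
        print(pair, "->", resolve_noun_modal(*pair))
```

Because the sketch inspects only the two buffers, it also returns the modal reading for "will be" in [242], where "will" can only be a noun.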
In the parser, the noun/modal word is first encountered inside the NP packets and the parser must decide whether to use the word as part of the headnoun or to leave it in the buffer to be used as a modal verb. These rules do not know whether a verb has been found previously. Hence, not all information from the sentence is used. If all the information were available at the time the noun/modal ambiguity is being resolved, these sentences would be unambiguous and people would have no trouble reading them.

Subjects were asked to read the above examples in the second experiment. The results showed convincingly that they are potential garden paths. Many naive readers had considerably more difficulty with them than with their more straightforward counterparts. This was predicted for reasons that will be explained below.

This result seems surprising. If the subjects used all the information available at the time the noun/modal word was encountered, then they should have had no trouble with these sentences. The fact that these are garden paths indicates that the readers did not use all the information available to them. Notice also that the ambiguity can be reformulated as: "Do we have the end of a noun phrase, or a complex headnoun?"

We have already seen a case where people do not seem to use all the information available to them. In Chapter 4 we saw several end of NP problems which could lead to a garden path. In each of these, we showed that the ambiguity was resolved on the basis of non-syntactic information, without regard to the following words in the sentence. In other words, we saw that the reader did not use all the information available.

There is one crucial difference, though. In the previous cases, they used non-syntactic information because the syntactic processor with its limited lookahead was sometimes unable to choose the correct alternative. In this case, the information necessary has already been absorbed by the parser.
This suggests that the choice of alternatives is made locally inside the NP parsing rules, without regard to information about the type of sentence being parsed. In other words, the two buffer pattern applies regardless of the rest of the sentence. This assumes that a noun/modal word followed by a tenseless verb is being used as a modal.

Let us look at why this might be true in the parser. When the parser starts to parse an NP, it creates a new NP node and pushes it to the bottom item of the Active Node Stack. This operation makes the NP node the Current Active Node, and parsing of the old Current Active Node is suspended. If the parser is parsing an S node, for example at the start of the sentence, then work on this node will be suspended until the NP node has been completed and dropped into the buffer. (See Appendix A for an example.)

Remember also that the pattern matcher for the grammar rules is allowed only to inspect the grammatical features of the two buffers. This means that the parser is unable to examine the contents of the Active Node Stack and, hence, the information that a tensed verb has already been found is unavailable to the NP parsing rules. This then suggests that the ambiguity will be resolved on the basis of local information only.

The structure of the parser gives us a computational explanation for the difficulty of these sentences. From a processing viewpoint, the trouble with these examples is understandable, whilst from a descriptive viewpoint, there seems no a priori reason why these sentences should cause difficulty.

This ambiguity is an end of NP problem and the choice of alternatives is made on the basis of limited and local information. This suggests that non-syntactic information may be used to resolve the ambiguity. There is one further possibility. The semantic choice mechanism is attempting to find the end of an NP. So far it has asked the question, "Can this item be part of the NP?"
However, the end of NP problem can be reformulated as, "Is it better to use this as part of the NP, or as the start of the verb group?" It is conceivable that the end of NP mechanism uses "will" as the start of the verb group in the majority of occurrences, hence leading to the apparent modal preference in these examples:

[244] The trash can hit the wall.
[245] The paper will hit the table.

The exact question asked by the syntactic processor of the non-syntactic processor can probably never be determined. These sentences have prompted several suggestions. Due to lack of data, it is not clear exactly what people do, and this would seem to provide an interesting area for further investigation.

5.2.9 What About HER

Another problem is the word "her", which can be used as a pronoun or as a possessive pronoun. Note that we can say:

[246] Tom kissed her.
[247] Tom kissed her sister.

Clearly in [246] "her" is a pronoun and in [247] "her" is a possessive determiner. When multiple part of speech definitions were added to ROBIE and the simple disambiguation method used, ROBIE always made "her" a possessive determiner.

This difficulty arose in Marcus's parser because the rule to start an NP was ordered before the rule to parse a pronoun. These rules were copied directly into ROBIE's grammar. Since the word "her" has both the features "ngstart" and "pronoun", it could match both rules. Unfortunately, as Marcus's rules were stated, it always matched the NP starting rule, and hence was never made a pronoun. This indicates one problem that can arise in the writing of a parser grammar.

To handle possessive determiners, PARSIFAL and ROBIE have a rule with the pattern:

[poss-np]

This rule will match a possessive pronoun after it has been made into an NP. It will also match any possessive NP, such as "the boy's" or "the boy's mother's". The rule then adds the feature determiner to the NP, making it eligible for the NP starting rule.
By degrading the possessive NPs to determiners, both parsers easily handle examples of left branching such as:

[248] The boy's mother's brother is his uncle.

Another problem arose in [246] because the possessive NP rule was not sufficiently constrained. It is possible to use "her" as a determiner only where the next word can be part of a noun phrase with that determiner. To enforce this, the second buffer is checked to be certain that its contents will take the determiner. Using this approach, "her" in [246] would not be converted to a possessive determiner. The rule DETERMINER can run only if the next item will "take a determiner". This check is made by the syntactic category of the following word, rather than by a specially marked feature. This check could be done by having a list of all the possible categories as the pattern of the second buffer. As an implementation detail, this is in the form of an agreement check, merely to simplify this rule and to show its generality.

The only remaining problem occurs when the verb can take one or more objects and the item after the word "her" can be either the second object, or an NP with "her" as a determiner. For example:

[249] I took her grapes.
[250] He saw her duck.
[251] I gave her food for the dog.

The examples presented above are all examples of global ambiguity, which will be discussed in Chapter 9. In these cases the check of "will the next word take a determiner?" may or may not lead to the wrong analysis. This problem also interacts with the top-down component of verb phrase parsing and the semantic restrictions presented by it, which will be discussed further in Chapter 10.

The conflict between the pronoun and possessive usage can be modelled as a conflict of rule priorities. If the possessive use is preferred, then this rule should match first. Conversely, if the object use is preferred, then the object rule should match first.
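A minimal Python sketch of this conflict of rule priorities is given here. It rests on several assumptions that are not in the original grammar: the rule names, the feature names and the convention that a lower number means a higher priority are all hypothetical, and the "takes a determiner" check is reduced to asking whether the second buffer can be a noun.

```python
# Each rule is (name, priority, pattern). Assumption: a lower number means a
# higher priority, so that rule is tried first.
def make_rules(possessive_priority, object_priority):
    return [
        ("POSSESSIVE-DETERMINER", possessive_priority,
         # Simplified "takes a determiner" check: buffer 2 can be a noun.
         lambda first, second: "poss" in first and "noun" in second),
        ("OBJECT-PRONOUN", object_priority,
         lambda first, second: "pronoun" in first),
    ]


def first_matching_rule(rules, first, second):
    """Return the name of the highest-priority rule whose pattern matches."""
    for name, _priority, pattern in sorted(rules, key=lambda rule: rule[1]):
        if pattern(first, second):
            return name
    return None


# Illustrative feature sets for the two buffers.
HER = {"pronoun", "poss"}
SISTER = {"noun"}
DUCK = {"noun", "verb"}
FINAL_PUNCTUATION = {"punct"}

if __name__ == "__main__":
    possessive_first = make_rules(possessive_priority=5, object_priority=10)
    object_first = make_rules(possessive_priority=10, object_priority=5)
    print(first_matching_rule(possessive_first, HER, DUCK))
    print(first_matching_rule(object_first, HER, DUCK))
```

With the possessive rule ordered first, "her duck" in [250] is read as a noun phrase; swapping the two priorities yields the object reading for the same pair of buffers.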
Any error in reading these examples would be due to one rule having priority over the other, when the reverse should be the case. Finally, notice that with no help from either intonation or context, either analysis is possible. That is, there is not enough information in the sentence to determine a unique interpretation. In Chapter 9, I will provide a more satisfactory explanation of these examples.

5.2.10 Handling THAT

In ROBIE, "that" is defined as a singular determiner, a pronoun, a relative pronoun and a complementiser. Marcus had four diagnostics to handle the word "that". We have seen one of these at the start of this section. In this sub-section we will see how these four diagnostics can be replaced in a simple way.

Let us consider how to handle the uses of "that" one at a time. Firstly, as a determiner. The following sentences illustrate the problem in identifying this usage.

[252] I know that boy should do it.
[253] I know that boys should do it.

I have stated before that Marcus [Marcus 1980] assumed grammatical sentences. If determiner/number agreement is not given to a parser, then it will, incorrectly, make "that" a determiner in [253], producing the wrong analysis. The way to prevent this is to enforce number agreement in the rule DETERMINER by insisting that the determiner agree with the noun in number. The determiner usage will be grammatical only when the headnoun has the same number. If we make this a condition for the rule to match, then "that" will not be made a determiner in [253] and we will get the correct parse.

How is this number agreement implemented? The rules are matched by the features of the first two buffers. To make agreement more transparent, another constraint has been added to the patterns of the rules, the "agreement requirement". It is possible to specify in the pattern of each rule that certain buffers must agree in certain ways. For the above examples, buffers 1 and 2 must agree in number.
The agreement check is restricted to the same grammatical feature checking that the rest of ROBIE uses. For this case, the agreement check would make sure that one of the following patterns is true:

[ns] [ns]
[npl] [npl]

Rather than having the following two patterns, only one pattern is needed and the agreement check acts as a "macro" to provide the two tests above.

[det,ns] [noun,ns]
[det,npl] [noun,npl]

The rule DETERMINER can run only if, as in the above case, there is a noun in the second buffer agreeing in number with a determiner in the first buffer. See Section 8.6 for a further explanation of this check.

Another way to interpret the agreement check is to imagine the rule checking that the result will be grammatical before it is run. A non-deterministic parser would run the rule, discover the lack of number agreement and then have to backtrack. By checking for grammaticality before the rule can match, backtracking is avoided.

We have just added an extra check to the rule matching. This seems to go against our goal of simple and restricted checking of rules. However, this is not a violation of our goal, because we have restricted it to using only the items which the normal pattern matching can check. This check is really only a "macro". The patterns of the rule could be rewritten quite easily so that they contained these features. This was not done because it would lead to more complex patterns. I feel it is more transparent and general to have the agreement check separated.

In some grammar formalisms, this agreement check will be easier to implement. If one assumes an analysis where the agreement features are part of the main features, then the agreement would happen automatically, since it is part of the rule patterns.

These two cases are handled properly because number agreement blocks the interpretation of "that" in [253] as a determiner.
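The agreement "macro" can be illustrated with a short Python sketch. The representation of a buffer as a set of feature names, and the function names agree_in_number and determiner_rule_matches, are assumptions made for the illustration and are not ROBIE's own notation; the feature sets given for the words are simplified.

```python
# The two number features used in the patterns above.
NUMBER_FEATURES = {"ns", "npl"}


def agree_in_number(first, second):
    """True if the two buffers share at least one number feature."""
    return bool(first & second & NUMBER_FEATURES)


def determiner_rule_matches(first, second):
    """Pattern [det][noun] plus the agreement requirement on buffers 1 and 2."""
    return "det" in first and "noun" in second and agree_in_number(first, second)


# Illustrative (simplified) feature sets.
THAT = {"det", "pronoun", "relpron", "comp", "ns"}  # singular determiner
WHAT = {"det", "ns", "npl"}                         # agrees with either number
BOY = {"noun", "ns"}
BOYS = {"noun", "npl"}

if __name__ == "__main__":
    print(determiner_rule_matches(THAT, BOY))   # as in [252]
    print(determiner_rule_matches(THAT, BOYS))  # as in [253]
```

A single shared-feature test stands in for the two explicit patterns, so a word carrying both number features, such as "what", passes the check with either a singular or a plural noun.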
This approach leads to the correct preference when there is an ambiguity, and accounts for the difficulty in [254] vs. [255]:

[254] That deer ate everything in my garden surprised me.
[255] That deer ate everything in my garden last night.

Experiment two showed that [254] is a garden path sentence, while [255] is not. In both sentences, it is believed the subject uses the word "that" as a determiner. "Deer" is both singular and plural, so it fits the above rule. In [254], it must be used as a complementiser to make the sentence grammatical. The approach outlined above will use "that" as a determiner in an ambiguous case such as this.

These two simple techniques, word order and agreement, are sufficient to handle all the examples we have just presented. In addition, free text analysis has shown no violations of this approach (see Appendix C).

"That" can only be a complementiser when a "that S-" is expected. Hence the rules using "that" to start an embedded sentence are only activated when the verb has the feature "THAT-COMP". The rules in "THAT-COMP" will fire when "that" is followed by something which can start an NP. This ensures that the S- will have a subject and means that "that" will be taken as a pronoun in the following sentences:

[256] I know that hit Mary.
[257] I know that will be true.

but it will be taken as a complementiser in these sentences:

[258] I know that boys are mean.
[259] I know that Tom will hit Mary.

It seems that unless the S- has a subject, the pronoun use of "that" is preferred. Otherwise one would have a complementiser followed by a trace, rather than an unmarked complementiser followed by a pronoun. The rule to handle pronouns in general is of low priority and will only fire after all other uses have failed to match. "That" is treated in the same way.

"That" will be identified as a relative pronoun only if it occurs after a headnoun and the packet NP-COMPLETE is active.
This situation will be handled in the same manner as the usual relative clause rules and will then cover:

[260] I know the boy that you saw.
[261] I know the boy that hit you.

The most difficult case for "that" is when the verb is subcategorised: V NP S-. That is, it can take an NP object, followed by a "that" S-. For these examples, ROBIE may have to decide if the series of words following "that" is a relative clause or an embedded sentence. Examples of this are:

[262] I know the girl that Tom knows hit Mary.
[263] I told the girl that Tom was running alone.

In [262], we have a relative clause, whilst [263] contains an embedded sentence. It seems necessary to distinguish between these because of movement. If the NP "the girl" is to be moved, the movement mechanism needs to note this fact. If the mechanism tries to move the NP, but cannot find a place to put it, the parser has made an error. If the parser did not try to move it and then discovered that it should have, it has again made an error.

There is no way that a deterministic parser with three buffer lookahead, let alone two buffer lookahead, can "see" enough to handle this case. In the following sentences, the lookahead would have to be more than three buffers. (Brackets indicate words in the buffers. The last word is the disambiguating word.)

[264] I told the girl [that][the][boy] hit the story.
[265] I told the girl [that][the][boy] will kiss her.

It can be seen that in these sentences, the disambiguating word is outside our three buffers. How do people handle these and what should our parser do? In Chapter 4 it was shown that when the syntax could not resolve the ambiguity with its two buffer lookahead, the decision of which interpretation to use might be made using non-syntactic information. In Chapter 4, we stated that if context can affect the interpretation of the sentence, then non-syntactic information is being used to select the interpretation.
The reader can experiment for himself and see that context does affect the interpretation of these sentences. Therefore we predict that non-syntactic information is being used to interpret these sentences, and this problem should not be resolved on a syntactic basis, but on a non-syntactic one.

This explains why some of these examples cause difficulty and others do not. The psychological evidence from cases using "that" is scant and I feel no conclusions can be reached here. My theory predicts that context will strongly affect these examples and, if they are strongly biased to the incorrect reading, a garden path should result. One well known example in this area is [266]:

[266] I told the girl that I liked the story.
[267] I told the girl whom I liked the story.
[268] I told the girl the story that I liked.

These examples were tested in the second experiment. The results suggested that [266] was read faster than the other two examples. Many of the subjects were questioned informally after the experiment about their interpretation of the sentence. All reported only one meaning: the S- reading. None of the subjects said that they noticed the relative clause reading, hence the result. The experiment, however, was not designed formally to distinguish these.

[266] provides another example of global ambiguity. The second experiment has suggested that in a conflict the S- reading is preferred. Another possible explanation is that the "complementiser" usage is preferred syntactically. We will return to the question of preference in Section 6.5.

Syntactic defaults have been mentioned for the other examples we have considered here. Since the syntactic processor is deterministic, it will never backtrack. It is, therefore, not possible for it to try a preferred reading and then an alternative reading.
If a situation arises where the syntactic processor, with its limited lookahead, cannot resolve an ambiguity in general, we have stated that the non-syntactic processor chooses one alternative. If the non-syntactic processor does not have a preference, then it may be possible for it to choose a "syntactic default". We will return to the question of syntactic defaults and preferences in Section 6.5.

To handle the examples we have seen in this section, Marcus had four diagnostics, one of which was very complicated. I have just shown how to handle all four cases of "that" without any special rules, merely substituting enforced agreement and rejecting ungrammatical sentences.

5.2.11 Handling the Word HAVE

Let us now look at the elimination of Marcus's HAVE-DIAG in relation to the use of agreement we have been discussing in this section. The problem with "have" is illustrated by the following sentences:

[269] Have the students take the exam.
[270] Have the students taken the exam?

In these, we must decide if "have" is an auxiliary verb or a main verb and whether the sentence is a yes-no-question or an imperative. The sentences have the same initial string until the final morpheme on "take". To handle this case, Marcus used this rule [Marcus 1980, p. 211]:

"{RULE HAVE-DIAG PRIORITY: 5 IN SS-START
  [have, tenseless][np][t] ->
  If 2nd is ns,n3p or 3rd is tenseless
     then run imperative next
  else if 3rd is not verb
     then run yes-no-q next
  else %if not sure, assume it's a y/n-q and%
     run yes-no-question next.}"

This rule seems to be necessary in order to distinguish between the question and the imperative. If one tries to ascertain exactly what occurs, the apparent complexity is revealed. Note also that Marcus defaults to a yes-no-question twice in this diagnostic. The following sentences illustrate the distinction this rule makes.

[271] Have the boy take the exam.
[272] Have the boy taken the exam.
[273] Have the boys take the exam.
[274] Have the boys taken the exam?

It can be seen that YES-NO-QUESTION should run only when the NP following is plural and the verb has "en" (i.e., "taken"). For example, in [274] "the boys" is plural and the verb is "taken". None of the other examples above has both "boys" and "taken".

This can also be understood as: the sentence is an imperative if the item in the 2nd buffer is not plural or the verb is tenseless. [271-273] are imperatives because either the noun ("boy") is singular ([271] and [272]) or the verb is tenseless ([273]). The second part of the rule takes care of the fact that the third buffer must contain a verb for the imperative, as this would be the main verb of the embedded sentential object.

Let us look more closely at the reason for only [274] being a question. Firstly, if the sentence is a yes-no-question, then aux-inversion must occur. When this happens, "have" will be adjacent to the verb which was in the third buffer. In order for ROBIE to continue, the verb must have an "en" ending, or "have" and the next verb will not agree in aspect. This is the basis for discrimination in the earlier examples [271-274].

Secondly, in [271] and [272], the noun phrases are singular and both sentences are imperatives. This is because, if the sentence had been a yes-no-question, "have" would need to agree with the subject, which must then be plural.

Hence, in effect, Marcus's rule checks for number agreement between the subject and verb and that the fixed order of the verb group is obeyed. Let us now look at other situations where this is necessary. PARSIFAL would accept the following ungrammatical strings:

[275] *Are the boy running?
[276] *Has the boys run?
[277] *Has the boy kissing?
[278] *Has the boy kiss?

For a yes-no-question, the inverted auxiliary must agree with the verb after it has been inverted. To stop these ungrammatical items we must enforce verb agreement.
The pattern for the rule YES-NO-QUESTION should be:

[auxverb][np][verb], agree(auxverb,verb), agree(verb,np).

This constraint enforces agreement of the verb and auxiliary verb, and of the subject and verb. Again this check is based only on the linguistic features of the buffers. See Section 8.6 for further details.

Such a constraint effectively blocks the ungrammatical items. (The parser will fail if the auxiliary has been inverted, since the auxiliary will not be parsed.) Also the subject NP must agree with the auxiliary verb, so we can also add "agree(auxverb,np)" to the rule, as we did with the HAVE-DIAG. So, by fixing the yes-no-question rule, the HAVE-DIAG is redundant.

The reader will notice that the HAVE-DIAG handling of the agreement as stated requires an NP in the second buffer and three buffers. This NP can only arrive in the second buffer via an Attention Shift. Since we have removed the Attention Shift and use only two buffers of lookahead, it is not possible to implement this rule in ROBIE. We will return to this problem in Chapter 9.

5.2.12 The Word A

The final diagnostic of Marcus's we will look at is the A-HUNDRED-DIAGNOSTIC. This diagnostic was developed to select the correct usage of "a" in the following sentences.

[279] A hundred boys are in the class.
[280] A hundred pound rock is in the car.

Marcus considered "a" to be a determiner in [279] but part of the number phrase in [280]. His diagnostic used a second Attention Shift and all three buffers to select the correct interpretation. Whilst this diagnostic was often correct, Marcus admits [Marcus 1980, p. 214] that it would fail on the second of this pair.

[281] John lifted a hundred pound bag.
[282] John lifted a hundred pound bags.

Marcus then presented informal experimental evidence to show that [282] is a garden path for at least half the subjects he tested [Marcus 1980, p. 214]. This diagnostic has several things in common with the HAVE-DIAGNOSTIC in the last section. It requires three buffers and the Attention Shift, and is not always correct. As with the HAVE-DIAGNOSTIC, we will not use the A-HUNDRED-DIAGNOSTIC.

Marcus also demonstrated that there is no strategy based on three buffer lookahead which can always resolve this ambiguity correctly. We have seen a situation similar to this in Chapter 4. In that chapter we saw several examples of ambiguity which we could not disambiguate on the basis of limited lookahead. For each case we saw that the ambiguity was resolved on the basis of non-syntactic information. The cases we looked at in Chapter 4 were all end of NP problems. However, they suggest that the general statement of WHEN non-syntactic information is used is:

WHEN - When syntactic information with limited lookahead will not always be sufficient to choose between alternatives.

One can now see that this statement applies to the A-HUNDRED-DIAGNOSTIC examples, which suggests that this ambiguity could be resolved on the basis of non-syntactic rather than syntactic information. Unfortunately there is insufficient psychological data on these examples to confirm that this suggestion is, in fact, true. Although Marcus required three buffers to resolve this ambiguity, our theory suggests that this decision may be made based upon non-syntactic information, rather than on the purely syntactic basis which Marcus suggests.

We have now shown how to replace all the diagnostics Marcus used. In doing this, we enforced number and verb agreement on the rules before they could run. This was motivated by the need to reject ungrammatical items, rather than by the handling of ambiguities. Whilst there are still a few problems, to which we will return in Chapter 9, the approach reported here has the same coverage as Marcus's diagnostics, and provides a better explanation of why people have trouble on certain sentences.
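As an illustration of enforcing agreement before a rule may run, the yes-no-question pattern of Section 5.2.11 might be sketched in Python as follows. This is a simplified sketch under stated assumptions: the tables AUXILIARIES, NOUN_NUMBER and VERB_FORMS and the function names are hypothetical, only third person subjects are modelled, and the two checks implemented are agree(auxverb,verb) and agree(auxverb,np).

```python
# Hypothetical feature tables covering only the words in [271]-[278].
AUXILIARIES = {
    "have": {"subject_number": "npl", "selects": "en"},
    "has": {"subject_number": "ns", "selects": "en"},
    "are": {"subject_number": "npl", "selects": "ing"},
}
NOUN_NUMBER = {"boy": "ns", "boys": "npl"}
VERB_FORMS = {
    "take": {"tenseless"},
    "taken": {"en"},
    "run": {"tenseless", "en"},
    "running": {"ing"},
    "kiss": {"tenseless"},
    "kissing": {"ing"},
}


def agree_aux_verb(aux, verb):
    """The auxiliary must select the ending carried by the following verb."""
    return AUXILIARIES[aux]["selects"] in VERB_FORMS[verb]


def agree_aux_np(aux, noun):
    """The inverted auxiliary must agree in number with the subject NP."""
    return AUXILIARIES[aux]["subject_number"] == NOUN_NUMBER[noun]


def yes_no_question_matches(aux, noun, verb):
    """Pattern [auxverb][np][verb] with both agreement requirements."""
    return agree_aux_verb(aux, verb) and agree_aux_np(aux, noun)


if __name__ == "__main__":
    print(yes_no_question_matches("have", "boys", "taken"))  # [274]
    print(yes_no_question_matches("has", "boys", "run"))     # [276]
```

Under these assumptions the pattern matches only [274] among [271-274] and rejects each of the starred strings [275-278], without any separate diagnostic rule.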
5.3 Possible Uses for Agreement in English

Linguists can describe the use of verb agreement and person/number codes. It is quite clear that these must be enforced in grammatical sentences, but most linguists do not offer an explanation of why we enforce number agreement and the fixed order of the verb group. In other words, our understanding of agreement is descriptive, but not explanatory. What role do number agreement and the fixed order of the verb group play?

In this section we have seen a possible explanation of this puzzle. We have seen several occurrences of ambiguity, for each of which we have found a parallel situation that could lead to acceptance of ungrammatical sentences by ROBIE. We then used person/number codes or the fixed structure of the verb group to block these unacceptable readings. Most of our ambiguity problems were also handled by this method. Although this has been used before with non-deterministic parsers, we did not know how well it would work in a deterministic parsing environment.

In this section we have looked at language from a processing viewpoint. From a linguistic viewpoint, word order and agreement could describe language, but could not explain it. But from a processing viewpoint, we saw that word order and agreement were essential to the resolution of ambiguity in language and help to explain why people can resolve ambiguity so easily.

Once person/number codes are taken into account, the ambiguity problems are reduced, for in each case, only one of the ambiguous possibilities was grammatical. It seems that person/number codes might reduce the ambiguity in natural language parsing and this hypothesis needs further investigation.

Marcus had a few rules to resolve part of speech ambiguity, but they were ugly and ad hoc. We have seen that we can replace these rules very simply by merely exploiting agreement.

This now concludes our investigation into the resolution of lexical ambiguity. We have seen how to handle cases of lexical ambiguity which can lead to a garden path and cases which do not lead to a garden path. We have developed and improved our model to make it psychologically plausible. In the next chapter we will investigate how compatible the model, as developed so far, is with the relevant psycholinguistic literature.

6. The Psychological Status of Deterministic Parsing

6.1 Psychological Criteria

At various points in the thesis, arguments for certain strategies to make ROBIE psychologically plausible have been put forward, but how does it stand in comparison with the psychological evidence? In this chapter many of the important results on psychological plausibility and their main points will be explained. We will then ask, "can our model account for or at least agree with these results"?

While it is claimed that the model presented here is a psychological model of the Human Sentence Parsing Mechanism (HSPM), it is not claimed that it is a complete model. My claim is a lesser one: that the overall design of ROBIE can describe, and explain, much of the data collected on how people perform in various linguistic contexts. This claim only applies to the areas this research has specifically addressed. Many details of ROBIE are clearly incorrect and for these no claims are made. Only the major principles such as determinism and limited lookahead, and the timing of the interaction of syntactic and non-syntactic information are considered relevant.

It is known that the current model is not a sufficient model of the entire HSPM, but it is felt that it is closer than previous models. The success of this model will provide a foundation on which to build a better model. The shortcomings of the model will show which areas to develop in order to build that better model.
To evaluate this model, we can start with the criteria of Fodor and Frazier [Fodor and Frazier 1980], who say a model of the human sentence parsing mechanism must answer the following questions:

a) "Does it succeed in parsing all and only the sentences and non-sentences of the language that the native speakers succeed in parsing?"
b) "Do its relative parsing times for different sentences match the relative parsing times of human subjects?"
c) "When it makes parsing errors, do these resemble the errors made by human subjects?"

Clearly these criteria are barely adequate, but as no working parser today can meet all of them, they are a start. It is very important that the model not be capable of too much and out-perform people. If the model can process sentences which people are not able to understand, then it is too powerful. Conversely, any parser that does not have a grammar covering all of the language will not meet the above criteria. No current computer model has the full range of senses, knowledge, information and capabilities which people have. Because of this, there is no way to design successfully a psychologically real parser today.

One reason that it is impossible to show that a given parser is psychologically real is that no one is sure exactly how the HSPM works and what its design is. In fact, we may never be sure how it works. This same problem puts limits on the extent to which we can disprove a model. If ROBIE does something clearly wrong, then it can be disqualified, but if it does not do anything clearly wrong, it must be considered correct until it can be shown in error by a better model. More and better data need to be collected before better models can be designed.

It will be shown throughout this chapter that the limitations of ROBIE are very close to the limitations actually observed in people.
6.2 Kimball's Principles

We will begin our discussion by looking at the general principles which the human parsing mechanism seems to follow as described by Kimball [Kimball 1973]. He proposed seven principles of Surface Structure Parsing. We will examine each of these to guide our discussion. In the following section, we will first look at his principle, then a brief explanation of it and finally how this principle fits the current parser.

1. (Top Down) Parsing in natural language proceeds according to a top-down algorithm.

Top-down parsing implies that a parser has expectations about what will follow, even before it has seen the data. The point most important to our discussion of top-down parsing is the need to reflect the expectations of possible following items. ROBIE is partially top-down. When Marcus [Marcus 1980] designed his parser, he motivated the need to reflect expectations in the parse. Packets were introduced for this and provide the top-down component. This top-down component is essential to ROBIE. We have already seen several cases of ambiguity where these expectations are essential (such as handling the words "to" and "that"). Marcus has motivated the need to also be bottom-up, which PARSIFAL and ROBIE are as well. By design, these parsers meet this principle.

2. (Right Association) Terminal symbols optimally associate to the lowest non-terminal node.

This principle reflects the fact that right branching structures seem to be preferred in natural language. Kimball's evidence is based on sentences of the form:

[283] Joe figured that Susan wanted to take the cat out.
[284] The girl took the job that was attractive.
[285] Joe called the man who smashed his new car up.
[286] Joe said that Martha expected that it would rain yesterday.
[from Kimball 1973]

In each of these examples, the final modifying item can be associated
with more than one location, but the preferred reading for each of these is to attach it to the lowest and rightmost item. This principle is used to explain the preferred reading for sentences of this type. Bever [Bever 1970] had tried to explain these examples as part of memory limitations and Church [Church 1980] has explored this problem from the same viewpoint. This principle has started much discussion and will be dealt with separately when we discuss the "Sausage Machine".

3. (New Nodes) The construction of a new node is signaled by the occurrence of a grammatical function word.

This principle is fairly self explanatory. The main intent of this principle was to explain why the lack of function words can make processing more difficult when compared to the same sentence that has the function words added. If it is interpreted more strictly, this has problems (see [Frazier and Fodor 1978], [Church 1980]), but we can satisfy the basic intent. Kimball's examples are:

[287] He knew the girl left.
[288] He knew that the girl left.

While it is true that grammatical function words do start new nodes, it is not true that nodes are started only by the presence of a function word.

The parser follows Kimball's principle of "New Nodes" closely. Prepositions start new PPs, an auxiliary verb starts the AUX, and determiners start NPs. As Kimball predicted, the parser has difficulty when the grammatical function words are not present. In Marcus's parser, prepositions did not start PPs; instead the PP was built after the NP was parsed by the Attention Shift. But when the parser was modified to follow this principle, the extra mechanism of an Attention Shift was no longer needed (see Section 2.5.2).

This is one of the improvements of ROBIE when compared with Marcus's parser. His parser did not follow this principle, and was conceptually more complicated. ROBIE follows this principle and is conceptually simpler.

4.
(Two Sentences) The constituents of no more than two sentences can be parsed at the same time.

Kimball motivates this rule with the following pairs of sentences:

[289] That Joe left bothered Susan.
[290] ?That that Joe left bothered Susan surprised Max.
[291] The boy the girl kissed slept.
[292] ?The boy the girl the man saw kissed slept.

Church [Church 1980] has fully explored this principle. He introduces the A-over-A closure principle:

The A-over-A early closure principle: Given two phrases in the same category (e.g. noun phrase, verb phrase, etc), the higher closes when both are eligible for Kimball closure. That is, (1) both nodes are in the same category, (2) the next node parsed is not an immediate constituent of either and (3) the mother and all obligatory daughters have been attached to both nodes. [Church 1980]

Church has a very convincing discussion on how his deterministic parser explains this principle. Neither Marcus nor I have investigated this principle and the center embedded sentences it explains in detail. However, [Cowper 76] presents a theory of this phenomenon which is consistent with the model presented here.

5. (Closure) A phrase is closed as soon as possible, i.e., unless the next node parsed is an immediate constituent of that phrase.

In ROBIE, a node is closed by ATTACH, when the node is attached to its mother. This node, then, is closed when we are sure it is finished. This approach follows roughly the above principle. This principle interacts with Right Association (above) and Kimball wonders whether it is actually distinct. Church [Church 1980] has an extensive discussion of this point along with the counter proposals of Frazier and Fodor [Frazier and Fodor 1978]. My account for this point will be covered in the discussion of the Sausage Machine below.

6.
(Fixed Structure) When the last immediate constituent of a phrase has been formed and the phrase is closed, it is costly in terms of perceptual complexity ever to have to go back to reorganise the constituents of that phrase.

Kimball uses this principle to explain why garden paths are so difficult to handle. This is the essence of determinism. Once a decision is made, it cannot be reversed. The current parser has not been designed to recover from mistakes, so violation of this is impossible.

7. (Processing) When a phrase is closed, it is pushed down into a syntactic (possibly semantic) processing stage and cleared from short term memory.

In previous chapters, the "rule by rule" approach to semantic note taking has been motivated. It has been proposed that the semantic interpretation stage happens continuously, not only after a node has been closed. In ROBIE, the node is closed when it is attached to its mother. Because of this, the node will no longer be in the Active Node Stack, or any of the buffers. Once a node has been attached, it is not possible syntactically to examine its structure. This restriction, then, meets this principle.

In summary, the seven principles of Kimball can all be met in ROBIE. Most of the principles are necessary results of ROBIE's design and principles. The remainder are necessary results of Church's work on memory limitations.

6.3 The ATN as a Psychological Model

The ATN was proposed as a psychological model in [Kaplan 1972]. In this next section, we will not explore the validity of the ATN as a psychological model. Instead, we will use the criteria which Kaplan established and used in that paper to see how well ROBIE fits them.

Several experimenters, Mackay and Bever [Mackay and Bever 1973], Wanner [Wanner 1968] and Bever [Bever 1970], performed experiments to establish the reality of the deep structure-surface structure distinction.
Kaplan observes: "These experiments suggest that an adequate model of sentence comprehension must incorporate some mechanism for recovering a deep structure representation of a given stimulus word string" [Kaplan 1972, p. 80]

Kaplan presents three other requirements for adequacy which our model must meet. These are:

1) "A perceptual model must process strings in essentially temporal or linear order, for this is the order in which sentences are encountered in conversation and reading."

ROBIE is a left to right parser just as Kaplan's ATN was. Whilst there are parsers which do not observe this, many parsers do, including mine.

2) "It must process strings and provide appropriate analysis in an amount of time proportional to that required by human speakers. For example, since perceptual difficulty does not rapidly increase in length as the length of the sentence increases, the amount of time required by the model should be at most a slowly increasing function of sentence length."

The time taken by ROBIE to parse a sentence is roughly linear in the number of words in that sentence. The parse time should not be an exact function of the number of words in a sentence, but rather a function of the complexity of the sentence. In ROBIE, the time varies, within sentences of the same length, according to their complexity. This will be fully explained in Chapter 8. The complexity of a sentence is difficult to measure, so we will assume that, if the parse time grows roughly linearly with the number of words, subject to the following point, then it meets this requirement. ROBIE certainly meets this point.

3) "The model should discover anomalies and ambiguities where real speakers discover them and for ambiguous sentences the model should return analyses in the same order that speakers do." [Kaplan 1972, p. 80]

This point has been the main motivation for all the work in this paper.
We have tried to demonstrate, for each ambiguity handled, that the parser does as people do. We have used the performance of people in an ambiguous situation to decide what strategy ROBIE will use. We have also shown how and why ROBIE will fail on the same sentences as people fail on, i.e., the garden paths. For some ambiguities, psychologists cannot agree on what people do, but there is no reason ROBIE cannot do whatever it is people do once it is agreed. In this situation, requirement 3) is applicable. Kaplan assumes that all readings of an ambiguous sentence are perceived and in a certain order. I feel that there is insufficient evidence for this conclusion. This point will be discussed in full in Chapter 9 and an alternate explanation proposed.

Kaplan's first examples are:

[293] The dog bit the cat because the food was gone.
[294] Because the food was gone, the dog bit the cat.
[295] The editor authors the newspaper hired liked laughed.
[296] The editor the authors the newspaper hired liked laughed.
[Bever 1970, 24a-b, 27a-b]

In these examples, [294] is supposed to be more difficult to process than [293], and [295] is more difficult to process than [296], even though Kaplan admits that both are exceedingly difficult since they are center embedded. For [293] and [294], Kaplan's account varies by one step in the ATN. The grammar has not been extended to deal with these specific examples, but the fronted phrase will require several additional steps in the syntax and an extra inference at least in the semantic interpretation. ROBIE should have as much extra difficulty as the ATN.

In [295], there is the noun/verb/plural-end of clause problem which was discussed at length in Chapter 3. Because [296] has a determiner before "authors", it does not have the noun/verb/plural ambiguity problem and hence will not require a semantic check, making it easier to process. Kimball's Principle number 4, "Two Sentences", explains why these are difficult and Church's account of what to do in this situation shows these are unparsable for his parser. It is accepted that people also have trouble here. These examples illustrate the power the ATN has and how it can out-perform people in some situations.

The next examples Kaplan uses are these:

[297] They are fixing benches.
[298] They are sleeping monkeys.

Kaplan then demonstrates that six more arcs need to be traversed before a successful parse for [298] than [297]. This account is based on the strategy that "sleeping" is intransitive whilst "fixing" is transitive. This seems right, but the alternative reading for both sentences is possible as these two illustrate:

[299] They are fixing agents.
[300] They are sleeping pills.

This account would predict that these alternative readings are not possible or should cause greater perceptual difficulty, which they do not seem to do. Our next examples are:

[301] The red plastic box ...
[302] *The plastic red box ...
[303] The large red box ...
[304] *The red large box ...
[Bever 1970, 67a-d]

Kaplan accounts for these by adding "nounness" to the adjectives. This approach is certainly not a feature of the ATN only. If desired, nounness could be added to the current parser to handle this. In the current parser, the non-syntactic processor can abort any of these examples should it make no sense. If the non-syntactic processor is in "accept anything" mode, then these are acceptable. I feel this accounts better for the performance of people on these examples.

Finally Kaplan shows that his model can predict the preferred reading for sentences such as:

[305] They are frightening monkeys.
[306] The Irish water boils.
[307] The French bottle smells.

This prediction is based on the ordering of arcs. By adjusting the order in which arcs are attempted, it is possible to get the desired readings.
Unfortunately, his account does not explain why the preferred reading of [305] differs from the preferred reading for [297]. It also does not explain how the same person can have different preferences in different contexts. Tyler [Tyler and Marslen-Wilson 1977] has convincingly demonstrated that prior context has a major effect on the preferred reading of these examples. Kaplan's arc order account, as presented, does not account for these facts. In Chapter 9, it will be shown how our theory of interaction with the non-syntactic processor can account for these more satisfactorily.

Sentences [306] and [307] show the noun/verb/s problem which was the subject of Chapter 3. It was explained that a non-syntactic decision, based on knowledge, intonation and context, is what affects this reading. Kaplan's simple arc order account is not adequate to cover the data presented in Chapter 3 and Experiment One (Section 3.4). It also seems that the arc order account is unable to explain how preferences change, but the non-syntactic decision account explains how this can be.

The other difficulty with the ATN model is in garden path sentences. In Section 4.8, it was shown that people have considerable trouble trying to understand garden path sentences. The backtracking account of garden pathing does not provide an adequate explanation of this. If recovery from a garden path sentence were as simple as backtracking, then people would not have as much trouble recovering from a garden path as we discussed in Section 4.8.

In summary, Kaplan presented several very interesting examples and showed how his ATN parser could account for them. We have seen how ROBIE can also explain these same examples. Whilst this alone is insufficient evidence that ROBIE is psychologically plausible, ROBIE does better than Kaplan's ATN.
6.4 The Sausage Machine

Lyn Frazier and Janet Fodor (FF) in [Frazier and Fodor 1978] propose a two stage model of the Human Sentence Parsing Mechanism (HSPM), called the Sausage Machine (SM). This model has sparked a debate between its supporters and those of the model with which it is compared, the ATN. In the next section, the advantages of the SM and how well FF's data fit ROBIE will be looked at.

FF proposed that the syntactic analysis of sentences by hearers or readers is performed in two stages. The first stage combines words in phrasal nodes as they are received. They call this the "Preliminary Phrase Packager" (PPP) or the "Sausage Machine". The second stage combines these phrases into sentences. This stage is called the "Sentence Structure Supervisor" (SSS).

They did not propose a specific mechanism for parsing and did not even make specific proposals of exactly how the devices should work. Their entire discussion was very general and hence it misses many important points. For example, they claimed that the PPP can see several words at a time. They did not give a number, guessing that it may be seven plus or minus two. All the NPs they used in the paper were of simple enough structure so that typically an NP with a PP seemed to fit into the PPP's range.

They have been accused of not having said enough to make their theory testable. In [Fodor and Frazier 1980], they answered this accusation by saying that this did not show that "the model, in so far as it is specified, is false". They defended themselves by saying that they had not said enough to be shown wrong yet. It could be argued, as others have, that they have not said enough to allow others to decide whether their model is correct or not. It would be fair to let them not specify detail as long as they did not criticise other models to a level of detail which they have not resolved.
However, in this paper, they criticise the ATN on very specific details, details that are not even thought out for the SM yet. The fine interaction of various components of the system is very important to any model; that is partially why we build working computer programs. Because of this lack of detail, I feel this model cannot be fully evaluated.

Their first main point is the Principle of Right Association (RA). This is a slightly different version of Kimball's Right Association. FF feel that there is some data that Kimball's RA cannot account for. The difference is the way it interacts with their second point. We will return to this later. Right Association states that "terminal symbols optimally associate to the lowest non-terminal node". This predicts the preferred interpretation of:

[308] Tom said that Bill had taken the cleaning out yesterday.
[309] Joe called the friend who smashed his new car up.
[310] John read the note, the memo and the letter to Mary.
[311] The girl took the job that was attractive.
(from [FF 1978, p. 297])

In each of these sentences, the preference is to attach the final modifier to the lowest right node. This is what their principle would predict. This principle also predicts the difficulty in the following sentences.

[312] Joe looked the friend who had smashed his new car up.
[313] John read the note, the memo and the newspaper to Mary.
[314] The girl applied for the jobs that was attractive.

FF then tried to explain how this principle might be true in their parser. Their explanation was complex and depended on memory limitation. Church [Church 1980] has answered their memory limitation argument and revealed several problems in their explanation. He then provided a much more satisfactory account. His parser has a strict limit on the number of items which can be on the Active Node Stack.
It is not possible for Church's parser to parse the sentences which violate FF's principles, because those sentences require too many incomplete constituents to be stored on the Active Node Stack. We will accept Church's explanation for the time being, but will look at an alternative explanation below.

Their other main point is what they called "Minimal Attachment" (MA). This says "Each lexical node (or other node) is to be attached into the Phrase marker with the fewest possible number of non-terminal nodes linking it with the nodes which are already present". [FF 1978, p. 320] This principle accounts for the preferred attachment of "for Susan" to the VP in:

[315] John bought the book for Susan.

They suggested that this accounted for the preference for the conjunctive analysis of NP NP in center embedded sentences and the preference for the first clause to be a main one, as we have seen in Chapter 3. They also feel that this accounts for the preference [Wanner, Kaplan and Shiner 1975] for "that" as a complementiser rather than as the start of a relative clause when it follows an NP. It even predicts the use of "that" as a determiner over the "comp" usage. Even though I disagree with the evidence they presented to justify this, let us accept it for now.

However, not everyone has accepted the claims of this paper as true. Wanner has replied to their claims and defended the ATN as a model of human sentence parsing [Wanner 1980]. In this paper he defended the ATN against the arguments of FF, saying that they said that he coul

[...]            -> <action1>
[to][tenseless]  -> <action2>
[that]           -> <action3>
[noun][noun]     -> <action4>
[noun,npl]       -> <action5>
[verb]           -> <action6>
[t]              -> <action7>

A pattern with one word is more constrained than a pattern with two features, since there is only one lexical item which can match the first, but several lexical items which may have, say, "tenseless". A rule with no pattern [t] will always be tried last.
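
The priority ordering just described can be sketched as a sort over rule patterns. The following Python is an illustrative assumption rather than ROBIE's actual rule scheduler: the scoring (count specific words first, then features, with the pattern-less default rule scoring zero), the WORDS set used to tell words from features, and the rule names are all hypothetical. They are chosen only so that a single word outranks any number of features and default rules come last.

```python
# Illustrative sketch only: the scoring scheme and the word/feature split
# below are assumptions, not ROBIE's actual rule scheduler.

WORDS = {"to", "that"}  # hypothetical: pattern items naming a specific word

def specificity(pattern):
    """Score a pattern as (number of words, number of features).

    A pattern is a list of buffer descriptions, each a list of items.
    The default pattern [t] is written as an empty list and scores (0, 0).
    """
    items = [item for buf in pattern for item in buf]
    words = sum(1 for item in items if item in WORDS)
    return (words, len(items) - words)

def order_rules(rules):
    """Most constrained rules first; default (pattern-less) rules last."""
    return sorted(rules, key=lambda rule: specificity(rule[0]), reverse=True)

rules = [
    ([], "default"),                       # [t]
    ([["verb"]], "verb"),                  # [verb]
    ([["noun"], ["noun"]], "noun-noun"),   # [noun][noun]
    ([["that"]], "that"),                  # [that]
    ([["to"], ["tenseless"]], "to-inf"),   # [to][tenseless]
]

for pattern, name in order_rules(rules):
    print(name, specificity(pattern))
```

Comparing the scores as tuples means the word count is decisive and the feature count only breaks ties, so trying rules in the sorted order never lets a general rule pre-empt a more constrained one.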
This ordering is necessary to handle many ambiguity issues. For, if the rules were tried in the opposite order, the more constrained rules would never be matched. I emphasise, this same principle says that all default rules (rules with no pattern) will have lower priority than any other rule.

6.5.2 Right Association

ROBIE does not have the types of arcs which were listed for the ATN earlier, but the rules can be divided into several roughly similar groups. The equivalent of the SEND and JUMP arcs would be the default rules in a packet. If something is optional, typically a packet has a rule to handle the marked case and a default rule to handle the unmarked case; that is, the default rule has no pattern. All equivalents of the SEND and JUMP arcs will have no pattern in the current parser. Hence, according to our above ordering, these rules will be tried last. Thus Wanner's explanation of Right Association is a necessary result of ROBIE's design.

6.5.3 Minimal Attachment

The deterministic parser has no SEEK arcs. A grammar rule in ROBIE with the pattern [np] does not create an NP. Instead, this pattern will match only if an NP node has already been started. But in the ATN, the arc with the NP on it will cause a push to the NP subnetwork and try to build an NP. Ordering this SEEK arc is the problem under discussion here. On the subject of no SEEK arcs, Marcus states:

"The pattern that triggers on a specific constituent, say a NP or an S, does not initiate parsing of a constituent of that sort. Instead, the pattern will only trigger if a constituent of that sort is already in the specified buffer." [Marcus 1980, p. 22]

If a pattern has the feature NP, this does not make ROBIE try to parse an NP. Instead the pattern will match only if a node with that feature has already been built. This can be contrasted with the SEEK arc of the ATN. The SEEK arc tries to build a node of the type which was specified on it. SEEK arcs are like recursive subroutine calls.

Because ROBIE does not have SEEK arcs, the problems of ordering them are not relevant. The CAT and WORD arcs will be scheduled first as Wanner has shown necessary. MA as characterised by Wanner states that essentially the parser should be data driven and should reflect the incoming words.

Another way to understand the principle of Minimal Attachment is that the word should be used locally if it fits. Since ROBIE has no access to the Active Node Stack, except for the active packets, it is unable to see if the word could be used higher up. If the word could be attached to the lower node, then the grammar rules must be written to handle it there. If these rules are there, then the optional use will be grabbed and this will behave exactly as Minimal Attachment.

One can see that ROBIE explains RA and MA as necessary side effects of the handling of some types of ambiguity. FF and Wanner are both unable to show, simply, why these principles are true. In this section we have seen that they must be true in ROBIE.

6.6 Some Predictions

Chapter 5 explained that words have a compound lexical entry in the dictionary incorporating each part of speech definition for that word. When the word is looked up, this compound lexical entry is returned. This was introduced purely to make the handling of ambiguity automatic. How well does this fit the psychological explanations?

We are interested in this question for two reasons. Firstly, as a psychological justification of the lookup routine. We have seen that it was necessary to have all meanings arrive at once so that the automatic handling of ambiguity will work. We will see that the method that is necessary for ROBIE to use is psychologically valid and the same method that the lexical access literature prescribes.

Secondly, to check the prediction of the two buffers.
The parser predicts several things in relation to buffer timings and lexical access. It predicts that all meanings are accessed at once and for how long the multiple meanings will be around. If ROBIE is to match two buffers at once, then some words will retain their multiple meanings until the next word has been perceived.

For example, in handling the word "to", I have shown that the correct approach is to check the word following it to see if it is a tenseless verb or is something that can start a noun group. This means that "to" will not be disambiguated until the next word has been perceived. Since the parser does not have the Attention Shift and words are not allowed to be placed on the Active Node Stack unless they are dominated by a non-terminal, the word "to" must be disambiguated when the following word has been perceived. The parser has many similar examples, so in many cases a word will not be disambiguated until the next word is perceived.

Not all examples of ambiguity need both buffers to be filled before the word is disambiguated. For example, the first word following a determiner that could be a noun will be attached as a noun and the second buffer does not need to be filled. It should be noted that the grammar may change as it is altered and expanded, so a rule that needs two buffers now may be re-formulated in the future. This makes it very hard to say exactly which rules make the two buffer prediction. Let us look at the implications of the two buffer prediction and see how well they fit the psychological data.

6.7 Lexical Access

We will now look at the work done on Lexical Access. The research in this area has been trying to find the answer to the question: "When a word is first perceived, how is it looked up in the human "dictionary"? Are all its possible meanings returned, or just the appropriate meaning?"

There are two main schools of thought on the lexical access question. Any
time the prior syntactic and semantic context will affect the result returned by the lexicon, these two schools of thought make different predictions. The Prior Decision Hypothesis predicts that in a heavy biasing context only the appropriate meaning would be returned, while the Post Decision Hypothesis would predict that all meanings are accessed in all situations, but the non-appropriate meanings are eliminated soon afterwards.

6.7.1 Swinney, Cairns and Kamerman

A supporter of the Post Decision Hypothesis is Swinney. He performed two experiments to test this hypothesis using a cross modality task. In the first experiment, target words were presented simultaneously with the occurrence of an ambiguous word. The results of this experiment showed that words related to both meanings of the ambiguity were facilitated in both weak and strong biasing contexts relative to unrelated words, whether or not the related meaning was consistent with the biasing context.

In the second experiment, he presented the target word later in time. In this experiment, the test phonemes were presented three syllables after the test word (about 650 msec). This experiment showed that after the three syllables had passed, only the appropriate meaning was facilitated. This then provides evidence that all meanings are accessed at once, but after about 650 msec, only the appropriate meaning remains. His conclusion was "immediately following the occurrence of an ambiguous word, all meanings for that word seem to be momentarily accessed during sentence comprehension" [Swinney 1979, p. 653].

Given that a multi-syllable word is recognised in its first syllable [Marslen-Wilson 1980b], the next word will have been figured out by the third following syllable. In this case, the two buffers will be filled and the disambiguating process can happen, as the word will have been fully parsed. Therefore Swinney's experiments support the compound lexical entry.
However, since all the ambiguities in Swinney's examples were within the same part of speech, we cannot judge whether this experiment has any bearing on the two buffer lookahead.

Cairns and Kamerman [Cairns and Kamerman 1975] further investigated this hypothesis. They were, again, trying to decide whether one meaning is accessed in context, or all possible meanings. They tested this in two different ways, a phoneme monitoring experiment and a sentence completion experiment. They presented two hypotheses: the Short Term hypothesis says that all lexical information is retrieved and stored in working memory, probably until clause end, and the Immediate Memory hypothesis says that the lexical decision is made immediately after the retrieval of information and only one meaning is carried on. Both of these fall under the Post Decision Hypothesis.

They tested these two hypotheses with the phoneme monitoring experiment. They concluded that, "The results of the phoneme monitoring experiment support the immediate decision hypothesis rather than the short term hypothesis." They said "the process which produces increased monitor latency following an ambiguous lexical item is completed roughly two words later in the sentence". Again these results support the use of compound lexical entries, but all their examples were within the same part of speech, so we cannot judge whether their results have any bearing on the two buffer lookahead.

6.7.2 Tanenhause, Leiman and Seidenberg

[Tanenhause, Leiman and Seidenberg 1979] investigated the lexical access question from a slightly different viewpoint from Swinney. The test words in all of Swinney's examples always had the same part of speech. In the work of the above authors, lexical ambiguity was investigated in syntactically biasing contexts where the part of speech of the word was actually different in the different examples.
Swinney's work was not as interesting to us because there was no syntactic ambiguity, but this work may have a bearing on our buffer timing question. They tested words such as "watch" in the following examples:

[319] I bought the watch.
[320] I will watch.

This was done using a variable delay naming paradigm to see which meanings of a word were facilitated at various times after its occurrence. The subject heard the sentence over headphones. At the end of the sentence, the test word (for example "look") was presented on the screen and the subject pronounced the word out loud. If the word was facilitated, they predicted that reading it would be faster than reading an unrelated word. The time delays were 0 msec, 200 msec, and 600 msec. They predicted, as ROBIE would, that all meanings were accessed at 0 msec, but by 600 msec, only the contextually appropriate meaning would be facilitated.

Their experiment showed that all meanings were facilitated at the 0 msec time, but at 200 msec and 600 msec, only the relevant meaning was facilitated. This is consistent with the results obtained by Swinney. They concluded: "Both noun and verb readings of the ambiguous word were initially accessed with the appropriate reading selected within 200 msec on the basis of syntactic context". This is the result ROBIE would predict. All possible readings are looked up initially and when the word is disambiguated, only the relevant meaning remains.

All of their examples were biased by the syntax in such a way that there was no need to see the word following the target word to disambiguate it. All target words were also at the end of the sentence, so there were no words following them. For example in [319], the word "watch" was immediately after the determiner, so it had to be a head noun. Similarly, in [320], the word followed the modal, so lookahead would not be used to resolve the ambiguity.
Therefore, ROBIE would disambiguate all of these examples as soon as the word was recognised.

We can now decide if their experiment had anything to say about the lookahead predictions. They showed that all meanings were facilitated at 0 msecs but the facilitation is over 200 msecs after the onset of the target word. We know that it takes 200 msec to identify a word [Marslen-Wilson 1980b]. Therefore the disambiguation has taken place as soon as the target word was identified and without using the next word. This implies that the disambiguation took place before the second buffer could have been filled.

This seems to contradict the lookahead prediction, but, as I stated above, all the examples they used can be resolved using the techniques in Chapter 5. In that chapter, we saw that the rules to resolve these cases do not use the second buffer. Since the lookahead is not used, the word can be disambiguated as soon as it is identified. This is exactly what their results suggest. Unfortunately this means that their experiment has nothing to say about our use of lookahead. To resolve the lookahead question with this approach, further experiments are necessary.

An example of the type of experiment which needs to be performed is as follows. In ROBIE, both meanings (auxiliary verb and preposition) of the word "to" would be available until the next word has been identified. I explained in Chapter 5 that this ambiguity is not resolved until the features of the second buffer have been checked. The ambiguity of the word "that" between a complementiser and a relative pronoun will remain until the features of the next word have been checked in the following sentences:

[321] I know that boy did it.
[322] I know that boys did it.

We have also seen (experiment two, Section 4.8) that the fragment "will be" can lead to a garden path. This is based on the fact that lookahead is used to disambiguate the word "will".
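The two buffer prediction for "to" can be illustrated with a small sketch. It is a minimal illustration only, assuming a toy lexicon and invented feature names; none of the identifiers below are taken from ROBIE itself. Lookup returns the compound lexical entry (every reading at once), and the readings of "to" are narrowed only once the features of the second buffer are available.

```python
# Illustrative sketch only: the lexicon, feature names and function names
# are hypothetical and are not taken from ROBIE.

# Compound lexical entries: lookup returns every reading of a word at once.
LEXICON = {
    "to": {"auxverb", "prep"},
    "go": {"verb", "tenseless"},
    "the": {"det"},
    "school": {"noun"},
}

# Features assumed (for this sketch) to be able to start a noun group.
NOUN_GROUP_STARTERS = {"det", "noun", "pronoun", "adj"}


def lookup(word):
    """Return a copy of the compound lexical entry for a word."""
    return set(LEXICON[word.lower()])


def disambiguate_to(first_buffer, second_buffer):
    """Resolve the readings of 'to' from the features of the next word.

    second_buffer is None while the next word has not yet been perceived;
    in that case every reading is retained.
    """
    if second_buffer is None:
        return first_buffer
    if "tenseless" in second_buffer:
        return {"auxverb"}
    if second_buffer & NOUN_GROUP_STARTERS:
        return {"prep"}
    return first_buffer
```

With this sketch, `disambiguate_to(lookup("to"), None)` keeps both readings, while passing `lookup("go")` or `lookup("the")` as the second buffer leaves only the auxiliary verb or the preposition reading respectively.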
It is also predicted that all meanings for "will" are present until the next word has been perceived. That is, all meanings of "will" are present until the next word ("be" or "was") is recognised in the following sentences:

[323] The paper will be destroyed.
[324] The paper will was destroyed.

We have investigated these works to see if there is any experimental evidence that would support or reject the predictions of the two buffer lookahead. While the predictions from ROBIE are compatible with the data presented here, no conclusions can be drawn as there is no data on the key situations where disambiguation rests on the following word.

6.8 Learning the Deterministic Parser

Many psychologists believe that if a parser is to be psychologically plausible, the grammar must be learnable. I have not investigated this, but Berwick [Berwick 1979] has. It is his opinion [personal communications] that determinism is definitely a boon to acquisition for this reason: during a deterministic parse, if the parse fails, the parser knows it was correct up until that point. One can then build a rule in the style of [Berwick 1979, 81] that will fix this situation.

When a non-deterministic parser fails, it could be in one of two situations. Firstly, it may have failed as above and a rule needs to be added to fix the situation, or secondly, it failed because it was pursuing the wrong branch of the non-deterministic computation. If it had failed for the latter reason, then backtracking is used and a rule should not be added. When a non-deterministic parser fails, it is not possible to tell which of the above cases applies. Hence learning is much more difficult.

Berwick has shown that it is possible to learn most (70%) of the "core-grammar" of Marcus. The part that his system could not learn was the diagnostics. I have eliminated these, so the learning should be simpler.
Unfortunately, Berwick did not investigate how to learn number agreement, which is vital to our approach.

In summary, learning is not easy, but it is certainly easier to learn a grammar when it is being used by a deterministic parser.

6.9 Timing of the Parser

In Section 6.3, Kaplan stated that the parse time should be at most a slowly increasing function of sentence length. In this section we will see whether this is true for ROBIE.

Parsing times were collected on the parses of 130 different sentences. These times were collected on a PDP-10 timesharing computer. Because of the timesharing, there is an unknown amount of time spent during the parsing of each sentence on system and PROLOG overheads. Whilst this means that the times are slightly misleading, the time does give a good indication of the total effort required by the parser and the resulting times are very consistent. Based on this, ROBIE's speed was 50 msecs per word (.05 sec per word) plus 100 msec per sentence of overhead. This includes the syntactic and non-syntactic analysis. The analysis presented here was purely informal and no special statistical tests were used.

The sentences tested were from two groups. Neither group contained garden path sentences, as they cause the parser to fail and hence do not provide parse times.

First were the MECHO examples in Appendix B. The mean number of words in this group was 14. The shortest sentence was 3 words and the longest was 32 words. The average over 60 sentences was 51.72 msec per word with a standard deviation of 11.18 msec, plus the 100 msec overhead. The time to parse a sentence grows roughly linearly in proportion to the number of words, although it must depend on the complexity of the sentence as well. For example, the mean time taken to parse sentences of length 12 words was 47.67 msec per word, with a standard deviation of 6.5 msecs.
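The timing figures quoted above amount to a simple linear model: a fixed overhead per sentence plus a roughly constant cost per word. The following is a minimal sketch of that arithmetic, assuming the rounded figures of 100 msec overhead and 50 msec per word; the function and constant names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a linear reading of the reported timings.
# The names below are hypothetical; the constants are the rounded figures
# quoted in the text (100 msec per sentence, 50 msec per word).

OVERHEAD_MSEC = 100.0
PER_WORD_MSEC = 50.0


def estimated_parse_time(n_words, per_word=PER_WORD_MSEC, overhead=OVERHEAD_MSEC):
    """Estimated total parse time in msec for a sentence of n_words words."""
    return overhead + per_word * n_words


def per_word_time(total_msec, n_words, overhead=OVERHEAD_MSEC):
    """Per-word time in msec after the fixed overhead has been removed."""
    return (total_msec - overhead) / n_words
```

For a sentence of 14 words (the mean length of the MECHO group), `estimated_parse_time(14)` gives 800 msec under these assumptions, and `per_word_time(800.0, 14)` recovers the 50 msec per word figure.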
To parse sentences of length 17 words, the mean was 57.33 msecs per word, with a standard deviation of 7 msecs. This shows the deviation due to complexity and the growth in length. It is not the case that all examples over 17 words average longer than 57 msec/word to parse. In fact, most do not.

Below is a graph of the above data. The vertical axis is the time taken in msec per word in the sentence. This is computed after the 100 msec overhead is removed. The horizontal axis is the number of words in the sentence.

Graph to show Parse Time Per Word against Number of Words in a Sentence for examples from the MECHO problems.

[Figure 1: Examples from MECHO. Scatter plot; vertical axis: parse time in msecs/word (20 to 90); horizontal axis: words/sentence (5 to 20).]

I also collected timings for 70 examples from Marcus and Church [Marcus 1980], [Church 1980]. The average length of these sentences was 7 words, the shortest having three words and the longest 11 words. The average parse time for these examples was 48.2 msec per word with a standard deviation of 17 msec. These examples were chosen for complexity and linguistic coverage. For sentences of length 6 words, the mean time was 45 msecs and the deviation was 10 msecs. Therefore the timing is roughly 50 msec per word.

Below is a graph of these examples. The vertical axis is the time taken in msec per word in the sentence. This is computed after the 100 msec overhead is removed. The horizontal axis is the number of words in the sentence. One can see that the parser time grows slowly as a function of the number of words in the sentence.

Graph to show Parse Time Per Word against Number of Words in a Sentence
for examples from Marcus and Church.

[Figure 2: Examples from Marcus and Church. Scatter plot; vertical axis: parse time in msecs/word; horizontal axis: words/sentence (4 to 11).]

6.10 Summary

This is the end of our survey of related work in psychology. As the reader can judge for himself, the current parser does very well when compared in this way. We have shown that it can easily account for the principles of Minimal Attachment and Right Association, the Seven Principles of Kimball, and the relevant lexical access literature. The examples presented here do not prove that ROBIE is a psychological model of the HSPM, but I feel they show that it has many strengths and is better than other existing models.

In summary the most important psychological elements are: definitions return all possible features, there are no special ambiguity handling rules, automatic disambiguation, the non-syntactic processor's interaction with the syntactic processor, the production system grammar, and only two buffers.

This now completes the development of our model of normal processing in the HSPM. This model consists of two processors, a syntactic processor and a non-syntactic processor. We have seen that conceptually the two processes work in parallel during the processing of a normal sentence. During processing of some sentences, the syntactic processor will, at key points, ask the non-syntactic processor to make a decision in order to resolve an ambiguity. We have seen that these key points are those situations which the syntactic processor cannot guarantee to resolve correctly with its two buffer lookahead.

The rest of this thesis is devoted to further details and areas for further investigation. The next chapter will describe other approaches to parsing and Chapter 8 will provide more details of how ROBIE works.
Some readers may wish to skip these two chapters and continue with the discussion of problems for future investigation, including "Have" and global ambiguity, in Chapter 9.

7. Related Work

In this chapter, we will look at other approaches to parsing. The work described in this chapter has had some influence on my own work, even though the influence of some works is very slight. These works are presented to enable the reader to gain a perspective on my work in relation to other works.

7.1 Deterministic Parsers

7.1.1 Resolving Noun/Verb Ambiguity

The first attempt to handle part of speech ambiguity in a deterministic parser was by [Milne 1978]. This concerned itself with noun/verb ambiguity only. Certain words were defined as both a noun and a verb in the dictionary and these two definitions were returned. When such a word arrived in a buffer, the parser would Attention Shift and activate a special packet of rules for disambiguating the word. The grammar then had special rules to handle the possible cases. For example, to handle:

[...] I want to kiss you.

the parser had a rule with the pattern:

[to][noun|verb]

and the action:

make 2nd a verb.

This rule would make any noun/verb ambiguous word following "to" into a verb.

These rules were simulated on free text and had an extremely high success rate. Even though this method was very effective, it had a major deficiency as well. The objection to this approach is that it used very special purpose rules to handle ambiguity. For each case of ambiguity, there was a special rule to decide what to do. The desire for a more elegant method of handling ambiguity motivated most of the work in Chapter 5. This approach is completely surpassed by the work in this thesis.

7.1.2 Church's YAP

Church [Church 1980] designed a parser called YAP (for Yet Another Parser). This is a deterministic parser in the form of a finite state machine.
The main focus of Church's work was limiting the memory which the parser can use and hence keeping the number of possible internal states finite. It is possible to avoid backtracking by having infinitely many states, one for each possible step of each possible path. If there are an infinite number of states, it is not possible to determine the unique successor configuration from a given state and hence the system would not be deterministic. Church's work limited the number of states to avoid this problem.

Church's YAP was designed using the features of Marcus's parser. Like mine, it is very similar in design to Marcus's parser. Church, however, added a few essential differences. YAP's central feature is the "==WALL==". This Wall divided the upper and lower buffers in the parser. Below the Wall were the three lookahead buffers. Church called these "down1, down2 and down3". Above the Wall, Church had an "upper-buffer1". This replaced the Active Node Stack used both in Marcus's parser and in ROBIE. Both buffers grow towards the Wall. The upper buffer built constituents downward, that is, mothers looking for daughters. The lower buffers built constituents upward, that is, daughters looking for mothers.

Here is a snapshot of Church's parser [Church 1980, p. 44]:

sentence: I am a boy.
input [...]

[...] it is ideal. All meanings are available simultaneously. For this reason the chart parser is important.

[Martin, Church, and Patil 1981] have built a large, efficient chart parser which is designed to return all possible parses. As the main emphasis in this work has been on pure parsing, it is not directly relevant here. However, this system has shown the large number of possible parses to be found if all parses are sought. This helps to demonstrate that the chart is a useful method for parsing although it does not meet the criteria laid down in this thesis.
7.4 A General Syntactic Processor

Kaplan [Kaplan 1973] explains that the key principles of both Woods's ATN and Kay's Chart can be encompassed under the General Syntactic Processor (GSP). This is an abstract machine which maps strings of trees to strings of trees.

GSP uses a chart and has four variables indicating the current edge, its tail and other information about it. Kaplan shows that the chart can be implemented with only a few, very primitive constructions and that, using grammar compilers, many grammars can be implemented.

This system was presented as a general super parser, which by adjusting a few parameters could be made to model an ATN or a Chart. In order to accomplish this, many special global variables and functions are added to the system. Whilst this system is interesting, it has had no influence on the work described here. It is presented to illustrate that there is a similarity between the ATN and the Chart parser. Whether GSP could model a deterministic parser is not known.

7.5 Steedman and Ades

Steedman and Ades [Steedman and Ades 1980], [Ades and Steedman 1980] have proposed a psychological parsing model consisting of a stack and five simple grammar rules, which they have demonstrated on a few simple examples.

Their categories are of the form "X/Y", which means "an X lacking a Y" [Ades and Steedman 1980, p. 14]. For example, S/NP means "an S node lacking an NP". Their rules provide an automatic way of combining items of class Y with a node of type X/Y. They also explain the "stacking constraint". This states that movement in sentences happens as if the items being moved were stored on a push-down stack.

Their parser consists of a single stack on the top of which new items arrive.
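The "X/Y" notation and the single stack described above can be sketched as follows. This is a deliberately simplified illustration, assuming only one combination rule (an X/Y immediately followed by a Y yields an X) and plain strings for categories; it does not reproduce Steedman and Ades's five rules, and the function names are hypothetical. The categories used in the usage note are those of the sentence "I will marry her" discussed in this section.

```python
# Illustrative sketch only: one forward-combination rule over string
# categories. This is not Steedman and Ades's actual rule set.

def combine(left, right):
    """Return X if left is 'X/Y' and right is exactly 'Y', otherwise None."""
    if "/" in left:
        x, y = left.split("/", 1)
        if y == right:
            return x
    return None


def parse(categories):
    """Push categories onto a stack, combining the top two items when possible."""
    stack = []
    for category in categories:
        stack.append(category)
        # Keep reducing while the top two items of the stack combine.
        while len(stack) >= 2:
            merged = combine(stack[-2], stack[-1])
            if merged is None:
                break
            stack[-2:] = [merged]
    return stack
```

Taking "I will" as S/VP, "marry" as VP/NP and "her" as NP, `parse(["S/VP", "VP/NP", "NP"])` first combines VP/NP with NP into VP and then S/VP with VP, leaving the single item `["S"]` on the stack.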
Sentences are parsed by combining items until there is only one item left. They claim that their simple rules and the stack nature of the parser account for the movement phenomenon in English and several syntactic constraints.

This parser is attractive because of its simplicity, even though it has been tested on a very small grammar. The main psychological attractions are the simple grammar rules and the stack.

With a few extensions, this stack parser is almost isomorphic to the parser we have seen here. Our Active Node Stack is equivalent to their stack, with the two buffers being the top items of the stack. All their rules combine the top two items of the stack. Because ROBIE has two static buffers, the second and third items of the stack are usually combined.

Here is a comparison when parsing the sentence "I will marry her" at the time the NP "her" is about to be added to the VP. On the left is the state of the Steedman and Ades parser, on the right, ROBIE.

Steedman and Ades            ROBIE
cell3: NP - her              B2:
cell2: VP/NP - marry         B1:  NP - her
cell1: S/VP - I will         ANS: VP - marry
                                  S - I will   SS-VP

Notice that ROBIE has an extra stack cell, the 2nd Buffer. After the next step, the two states will be:

Steedman and Ades            ROBIE
cell2: VP - marry her        B2:
cell1: S/VP - I will         B1:  VP - marry her
                             ANS: S - I will   SS-VP

One difference between the parsers is that ROBIE can combine nodes on other than the top of the stack, e.g. it can combine the two buffers. This is rarely necessary and might be eliminated in a different grammar. Steedman and Ades have not committed themselves to a deterministic or non-deterministic parser. I feel that if items were combined between the second and third stack cells, rather than the top two, their parser would be very similar to mine.

The grammar for ROBIE was based on Chomsky's [Chomsky 1973, 76, 77] EST theory.
There is no simple way to automatically combine rules using the feature of EST as is [...] popular approaches to parsing. While knowledge of the following systems has influenced my thinking in some way, none of the following systems were designed to be psychologically plausible and none had a major influence on this work.

a) Semantic grammars as prop[...] of PROLOG, small but impressive systems can be built rapidly. These parsers have been compared with the ATN [Pereira and Warren 1980] and shown to be as efficient and conceptually cleaner.

c) Conceptual parsing as done by [Schank 1973]. This approach is very interesting but does not use any form of syntax. This work is not related to the syntactic processor, but rather the non-syntactic processor.

There are several other top-down, depth first search algorithms for non-deterministic parsing. None of these have had more than historical influence on my work. This includes: [Winograd 197?], [Sager 1973] and [Charniak 1973].

7.7 The Timing of Semantic Interaction

In Chapter 4, we saw when non-syntactic information was extracted by ROBIE during the parse of a sentence and when non-syntactic information was used to make decisions. We noted that each rule makes a contribution to the semantic interpretation of the parse. The non-syntactic processor can stop the parse if the contribution makes no sense. At points where the syntactic processor, with limited lookahead, finds two rules of equal constraint matching, non-syntactic information is used to decide which rule to run. This happens at the end of a NP and in examples of global ambiguity. In the following sections, other major approaches to the timing of semantic interaction will be discussed.

We will look at four other approaches to the timing of semantic interaction. These are: at the end of the sentence, at the end of each clause,
and two approaches to [...] comparison, we can look at two well known systems and two newer approaches and see when non-syntactic information is used. These are Woods's LUNAR system [Woods 1972], Winograd's SHRDLU [Winograd 1971, 72], Mellish's "incremental evaluation" [Mellish 1981] and Marslen-Wilson's on-line approach [Tyler and Marslen-Wilson 1977], [Marslen-Wilson and Tyler 1980].

7.7.1 At the End of the Sentence

In LUNAR, the semantic analysis did not start until the entire sentence had been parsed. If the semantic analysis failed to work, then the parser backtracked [...] contribution that will be made by the rule, as part of the rule matching. In the on-line theory, the rule could not match if the semantic contribution made no sense. There is a fine distinction between the semantic test that may be made to run a rule and the semantic contribution made by the rule. It must surely be more efficient to separate these. In the on-line approach, they have added an extra step of semantic interpretation, even though the next step may fail. In my theory, the rule would match and then the parse would fail. The timing difference between these approaches is so fine that they can probably never be distinguished.

Marslen-Wilson says [Marslen-Wilson 1980b]:

"Note that this (experiment) does not completely rule out the possibility that syntactic processing is autonomous, since it is possible to imagine a system in which the intermediate products of autonomous syntactic analysis are made immediately available to subsequent interpretative processes, and that it is the effects at this level the experiment is tapping. In fact, this alternate proposal can probably never be excluded on the basis of experimental data."

This means that these two explanations can probably not be distinguished experimentally.

Marslen-Wilson and Tyler have done several experiments to test this theory. The first is the speech-shadowing paradigm [Marslen-Wilson 1973, 75]. In this experiment,
the subjects were asked to repeat a sentence while they heard it (shadowing). They were then given sentences which contained errors and mispronounced words. In some cases the subjects would correct the error and restore the mispronounced word. Marslen-Wilson interpreted these results as showing that the on-line theory was valid.

The experiment involved speech comprehension, understanding and production. With so many major functions interacting, it is hard to be sure that the experiment was measuring an effect of comprehension. The results of this experiment are consistent with the theory I have explained and do not really help to distinguish the theories.

Tyler [Tyler and Marslen-Wilson 1977] tested the on-line theory for examples of the form:

[327] If you walk too near the runway, landing planes ...
[328] If you've been trained as a pilot, landing planes ...
[329] If you watch them as they swoop down for the kill, hunting eagles ...
[330] Since it's forbidden by law, hunting eagles ...
[Tyler and Marslen-Wilson 1977]

This experiment demonstrated convincingly that prior context influenced which reading was preferred for the ambiguous example. I will return to this experiment in Chapter 9 and give details of it in that chapter. These are particularly good examples on which to test this theory, since they can lead to global ambiguity.

In [Marslen-Wilson and Tyler 1980] two experiments are presented to investigate the on-line theory. Both of these used word-monitoring tasks in which the subject was told to react to a target word. Examples in the first experiment were of three forms: Normal Prose, Syntactic Prose, and Random Word Order. For example:

[331] The church was broken into last night. Some thieves stole most of the lead off the roof.
[332] The power was located into great water. No buns puzzle some in the lead off the text.
[333] Into was power water the great located. Some the no puzzle buns in lead text the off.
[Marslen-Wilson and Tyler 1980]

In each of these, the target word is "lead". The sentences were presented over headphones and the subject was asked to respond to the test word. There were then three types of task. In the Identical task, the subject was told in advance which word to expect. In the Rhyme and Category tasks, the subject was told to look for a word that rhymed with, or was in the same category as, the given word. For example, in the Rhyme task, the word might be "bread" and in the Category task, "a kind of metal".

This experiment was designed to test word-recognition as well as the on-line theory of semantic interaction. Marslen-Wilson's investigation is primarily focused on speech recognition. In this thesis, I have not investigated this and can make no comment on it. The speech recognition aspects of this experiment will be ignored. It is difficult to separate which phenomena are due to speech recognition and which to the parsing stage. Marslen-Wilson has shown [Marslen-Wilson and Welsh 1978], [Marslen-Wilson 1980b] that semantic interaction has an important role to play in speech recognition. I feel that there is nothing in this theory that is incompatible with the theories I have presented here.

The results of the experiment showed that the times for the Normal Prose are faster than those for the Syntactic Prose, which are again faster than those for the Random Word Order (372 msecs for Normal Prose, 407 msecs for Syntactic Prose, and 439 msecs for Random Word Order). Similarly, the times taken to perform the three tasks increased. These results were interpreted as supporting the theory that syntactic and semantic information is brought to play in spoken-word recognition.

Marslen-Wilson felt that this data supported the on-line theory. He postulated that the increasing times for the three tasks were due to the lack of effective semantic information to assist the parse at each step.
The amount of work required to perform these tasks is not known and this experiment did not exclude the possibility that [...] processing must be performed before each task can be performed. It is assumed that the Identical task can be responded to before phonetic analysis, the Rhyme task after phonetic analysis and before syntactic analysis, and the Category task only after full analysis. The time difference for the three tasks would then show how long each stage of processing took. These assumptions are not checked by the experiment. It could be that the subject performed full analysis of the sentence before any of the tasks were performed. The longer reaction times would then show the relative difficulties of the tasks.

For example, if the Identical task could be performed after word recognition, but before syntactic analysis, this would mean that the word-recognition stage was totally independent of the syntactic stage. This is not necessarily true. It could be that full syntactic and semantic analysis is obligatory. This being the case, all three tasks will be responded to after the sentence has been fully processed. Again, the experiment does not preclude other models.

The main result of interest is that the time taken to perform each task increased with the increasing complexity of the sentence. The relative time differences within the tasks were relatively constant. Marslen-Wilson felt that the increase in reaction time for the three types of sentences was due to lack of syntactic and semantic information to assist these decisions.

My explanation for the increasing times is as follows. In Normal Prose, the speech recognition, parsing and response to the task processes are all normal. For Syntactic Prose,
the semantic interpretation fails on a rule by rule basis, as has been explained. Marslen-Wilson assumed that the sentences were parsed syntactically, so the semantic processor would have to be in the mode where even items that made no sense were accepted. It is fair to assume that, for the semantic processor to decide the item made no sense and then agree to continue, more time will be needed than for it to decide the word makes sense and continue. This is assumed because I believe that the semantic processor will do its best to make sense of the item before it gives up, hence taking more time. Since this is so, Syntactic Prose should take longer, because the semantic processor will be continually trying to make sense out of nonsense.

Marslen-Wilson assumed the tasks took place before the full analysis was finished. If we assume that full syntactic and semantic processing is obligatory and that it is not possible to respond to any of the three tasks until after the sentence has been fully processed, then for Syntactic Prose the semantic interpretation will not be totally coherent, so the responses to the tasks will take longer to perform. For Random Word Order, the syntax and the semantics will be continually failing and, hence, very time consuming, so a longer reaction time is predicted on this basis. If the tasks are based on the results of the analysis, then the semantic interpretation will not be coherent enough to make decisions.

As a final complication, the examples of Syntactic Prose are supposed to be syntactically, but not semantically, well-formed. Sentence [332] is not syntactically well-formed. The main verb of the test sentence is "puzzle" and it has two PPs, one with "in" and one with "off", in addition to the quantifier "some". If it is truly Syntactic Prose, then it should be possible to substitute modifiers in this sentence, without changing the main verb, to make it well formed,
assuming that the categorisation for modifiers depends on the main verb, and that changing the main verb also changes the acceptable modifiers. For [332], it is not possible to substitute other modifiers because "puzzle" does not take the two PPs as modifiers, i.e., the following is unacceptable:

[334] *The boys puzzle some in the class off the teacher.

Marslen-Wilson has indicated [pers. comm.] that many of the other examples used in the experiment also have the problem that they are not syntactically well-formed. Because of this, the increased reaction times may result from processing difficulties.

This experiment does show that semantic information has an effect on the processing of sentences, but I feel it does not preclude my explanation.

Crain and Coker [Coker and Crain 1979] tested the "on-line" theory with the naming paradigm explained in Chapter 4. The examples they used to test this were the same reduced relative clause examples that we have discussed in Chapter 4. They concluded that semantic decisions are made for this case. They then generalise over all of language and say that this shows it applies to all rules in the parser. They did not distinguish, as I have, between rules that have a choice to make and rules that do not. This is in common with the error of Tyler and Marslen-Wilson.

Clark [Clark 1973] called this "the language-as-fixed-effect fallacy". He pointed out the danger of testing a specific case of language and generalising over all of language. These experiments have tested special cases, but have not distinguished the cases where I have said that non-syntactic information does not play a role. As a result, their experiments do not refute my explanation and they support my theory of when interaction is used.

Hence, although the "on-line" theory is very attractive,
I feel my theory helps to distinguish the two cases: where the non-syntactic processor chooses between alternatives, and where it has no alternative but to reject the sentence.

7.8 Linguistic Analysis

There are two major linguistic views which have influenced this work. The first is a transformational approach to linguistic analysis [Chomsky 1957, 65, 73, 75, 75b, 76, 77] and the second a non-transformational approach [Gazdar 1979, 80, 80b, 80c]. In this section, a brief explanation of the main points in these two theories which are relevant to this work will be presented.

7.8.1 Extended Standard Theory

The current parser uses the Extended Standard Theory (EST) analysis of sentences for three reasons. Firstly, this is the system I was taught when I learned linguistics, and hence the system with which I was most familiar. Secondly, it is the most widely described. The major area in EST with which we are concerned ... NP VP. In an embedded sentence, the subject NP may have been moved to a higher sentence (raised) as the subject of a higher sentence. For example, in the sentence:

[335] Mike seems to have left.

The underlying structure is:

[336] It seems that Mike has left.

In EST, it is said that the word "Mike" has been raised from the lowest sentence to a higher sentence. Transformations have the overall effect of changing the word order between the surface structure and the deep structure.

Another example of a transformation is "auxiliary inversion". This transformation maps sentence [337] into sentence [338].

[337] Is the boy at the movie?
[338] The boy is at the movie.

One can see the change in word order. The use of transformations greatly increases the number of possible sentences which can be generated from a grammar and greatly increases the parsing problem. We will return to this below.
Chomsky's analysis of the auxiliary, roughly, is

AUX -> (MODAL) (HAVE EN) (BE ING)

This analysis is very descriptive and, when the above transformation is taken into account, provides a good explanation of the English auxiliary system. This will be contrasted with Gazdar's analysis below. In EST, non-terminal symbols have no internal structure. They are merely the name of a node, e.g. S, NP, VP, PP, etc. It is assumed that the reader is familiar with the other details of EST.

7.8.2 Phrase Structure Grammar

Gazdar has proposed the theory of Phrase Structure Grammar (PSG). All movement rules, bounded and unbounded, and all rules making reference to identity of indices in EST are removed. Gazdar's grammar is further restricted to be context-free. Gazdar argues that this yields two main advantages. Firstly, the class of grammars is restricted, enhancing learnability. Secondly, the resulting grammar can be parsed in a time which is proportional to the cube of the length of the sentence or less. This is not true for the recursive or recursively enumerable sets of grammars which include a transformational component.

There are several major differences between Gazdar's system and Chomsky's that are of importance here. The first is the lack of transformations. Whilst word order may change in EST, in PSG it does not. The above pairs of sentences are not considered to be related in the syntax. From a parsing viewp[:;NV~ [[aj v V= 1 .... > [+AUX]

(In this rule, V- roughly means a verb phrase. The feature [a] must be the same for both sides. The phrases will be initially marked +FINite and +AUXiliary. After the rule has run, the V- will be marked +INVerted.)

The features +FIN and +AUX belong to the V- node. This metarule will produce a new rule, as on the right, from the rule on the left for all symbols "a" in the grammar. These rules are not transformations, but merely a mapping of rules to rules.
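The point that a metarule maps rules to rules, rather than trees to trees, can be illustrated with a toy sketch. Everything named here is an assumption made for illustration: the `Rule` type, the function names, and in particular the change made to the right-hand side (inserting an NP after the first daughter). It is not Gazdar's formulation, only an instance of a function that derives new context-free rules, marked +INV, from rules marked +FIN and +AUX.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """A context-free rule whose left-hand category carries features (hypothetical format)."""
    lhs: str
    features: frozenset
    rhs: tuple


def inversion_metarule(rule: Rule) -> Optional[Rule]:
    """Derive a +INV rule from a finite auxiliary VP rule; return None if it does not apply."""
    if rule.lhs != "VP" or not rule.rhs:
        return None
    if not {"+FIN", "+AUX"} <= rule.features:
        return None
    first, *rest = rule.rhs
    # Assumed right-hand-side change: the subject NP follows the first daughter.
    return Rule(rule.lhs, rule.features | {"+INV"}, (first, "NP", *rest))


def apply_metarule(rules):
    """Return the base rules followed by every rule the metarule derives from them."""
    derived = [new for new in map(inversion_metarule, rules) if new is not None]
    return list(rules) + derived


base_rules = [
    Rule("VP", frozenset({"+FIN", "+AUX"}), ("V", "VP")),  # auxiliary rule: metarule applies
    Rule("VP", frozenset({"+FIN"}), ("V", "NP")),          # main-verb rule: metarule does not apply
]
grammar = apply_metarule(base_rules)
for rule in grammar:
    print(rule.lhs, sorted(rule.features), "->", " ".join(rule.rhs))
```

Since the mapping is applied to the grammar once, before any sentence is parsed, the result is still a plain set of context-free rules.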
In this chapter we have seen several alternative approaches to parsing and descriptions of several works which have influenced my work. At this point we have finished developing the model, and we have seen that it compares favourably with the literature. In the remaining chapters, we will see a few details of the parser for those readers interested in the workings of the parser.

The Parser

In Section 2.4, we saw in general how ROBIE differs from PARSIFAL and how it works. This chapter will provide an introduction to the parsing mechanism itself and allow better understanding of the claims made in the thesis. This chapter will also give the details of deterministic parsing in general and the implementation of the principles discussed in this thesis. This chapter is intended only for those readers who desire a more detailed description of the parser. Some readers may prefer to proceed to the following chapter.

There are several major differences between Marcus's PARSIFAL and ROBIE. These include: ROBIE's syntactic processor has the non-syntactic processor select whether to run a rule or not in certain key situations, ROBIE calls the semantic interpreter at each rule, ROBIE does not use the Attention Shift, ROBIE does not use any diagnostics ... optimised in this way. ROBIE's stack size does not grow during the parse above the size of the actual constituents being built. Hence it is just as efficient on memory as if it were iterative.

There are currently 104 grammar rules in 29 packets. The vocabulary of the system is 700 words, but can be easily expanded. The morphology greatly increases the number of words recognised (to over 2,000 words).

I will use the following rule as an example of the grammar rules. This rule is expressed in English-like notation to make it easier to read.

Rule DETERMINER in packet PARSE-DET: To analyse a determiner,
if you have the feature "det" in the first buffer then:
1) attach the first Buffer to the bottom of the Active Node Stack as a determiner.
2) tell the non-syntactic processor you have a determiner.
3) deactivate the packet containing this rule, PARSE-DET.
4) activate the packet PARSE-QP-2.
Recursively call the rule matcher.

The rule "determiner" will match if the parser state was:

Active Node Stack:
2: <open> S     [SS-START, CPOOL]
1: <open> NP    [PARSE-DET, NPOOL]
B1: the
B2: shy

and leave ROBIE in the state below.

Active Node Stack:
2: <open> S     [SS-START, CPOOL]
1: <open> NP    [PARSE-QP-2, NPOOL]
      det-the
B1: shy
B2: boy

The above grammar rule looks only at the features of B1. The rule bodies are very simple, each rule being able to perform only a few functions.

The program runs on a PDP-KL10-91S. The parser occupies 45K of core and the dictionaries, with a vocabulary of 700 words, add another 10K. The parser runs on "DEC-10 PROLOG version 3", which occupies another 35K of core. Marcus's parser occupied over 300K of core, including MACLISP, and the parse time was approximately .1 sec/word, including a case-frame interpreter. This is over twice the size of ROBIE, and only half as fast.

8.2 The Grammar

A grammar rule is restricted in the following two ways. Firstly, it can only perform a combination of the seven grammar functions listed below within the rule body. Secondly, the pattern matching for a rule is restricted to the syntactic features of the first two buffers. The only exception to this pattern matching restriction is when a non-syntactic test is made before a rule can match.

It should be noted that the "agreement check" follows these restrictions. An agreement test can only use the syntactic features of the first two buffers. These tests were only separated from the normal pattern to reduce the number of rules by taking advantage of similar patterns.
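The DETERMINER rule and the before and after parser states shown earlier in this chapter can be mirrored in a short sketch. This is an illustration only: ROBIE itself is written in PROLOG, and the `Node`, `Word` and `State` classes, the `log` list standing in for the non-syntactic processor, and the features given to "shy" and "boy" are all hypothetical. The buffer is modelled as a simple list of upcoming words, of which the rule inspects only the first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    category: str
    daughters: list = field(default_factory=list)
    packets: list = field(default_factory=list)  # packets active for this node


@dataclass
class Word:
    text: str
    features: set


@dataclass
class State:
    stack: list    # Active Node Stack, current active node last
    buffers: list  # upcoming items, B1 first
    log: list = field(default_factory=list)  # stand-in for the non-syntactic processor


def determiner_rule(state: State) -> bool:
    """Run DETERMINER if its packet is active and B1 has the feature 'det'."""
    current = state.stack[-1]
    if "PARSE-DET" not in current.packets:
        return False
    if not state.buffers or "det" not in state.buffers[0].features:
        return False
    word = state.buffers.pop(0)
    current.daughters.append(("det", word.text))        # 1) attach B1 as a determiner
    state.log.append(("DETERMINER", current.category))  # 2) tell the non-syntactic processor
    current.packets.remove("PARSE-DET")                 # 3) deactivate PARSE-DET
    current.packets.insert(0, "PARSE-QP-2")             # 4) activate PARSE-QP-2
    return True


state = State(
    stack=[
        Node("S", packets=["SS-START", "CPOOL"]),
        Node("NP", packets=["PARSE-DET", "NPOOL"]),
    ],
    buffers=[
        Word("the", {"det", "ns", "npl"}),
        Word("shy", {"adj", "verb"}),   # assumed features
        Word("boy", {"noun", "ns"}),    # assumed features
    ],
)
ran = determiner_rule(state)
print(ran, state.stack[-1].daughters, state.stack[-1].packets)
```

The rule body never looks below the top-level features of the first buffer, which is the restriction on pattern matching described above.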
8.2.1 Grammar Functions

There are seven grammar functions, each of which is explained below.

a) activate a packet: This is used to reflect expectation and to control the parse. The form of the function is: activate <packet name> in <packet list> to get <new packet list>.

b) deactivate a packet: This is much like a POP or return from node building would be. It indicates that the current packet is no longer needed. Deactivate can only deactivate the current packet. It has the same arguments as "activate".

c) attach: This function will be fully explained in a later section. It takes four arguments: attach <Buffer> to the <Current Active Node> as a <part of speech> and call the result <New Current Active Node>.

d) new_node: This generates a new node with the features specified or with no special features. An example of its use would be: make a new noun phrase node with the features "np" and "name" and return the result as "NP1". The features can be left out if inappropriate.

e) lookup: This is used when transformations insert specific lexical items into the buffer. For example: look up the word "you" in the dictionary and call the result B1.

f) addfeats: This is used to add certain features to a node. For example: add the features "major" and "decl" to the Current Active Node and return the result C1. It is rarely used, and the features it adds are seldom checked by the rule matcher. The features it commonly adds are to assist a reader of the output.

g) semantics: This isn't really a function, but is an operation performed by every grammar rule. The non-syntactic processor is given the name of the rule which is executing, the current active node and the two buffers. The processor then extracts whatever information it desires. It should be noted that this function does not return a result and could operate in parallel with the above functions.

There are two additional grammar functions.
There are two calls to "coerce" and one call to "percolate" in the grammar rules. Coerce automatically disambiguates a word, whilst percolate is used to transfer features to the AUX node if it has no lexical daughters. The uses of coerce will be explained in Section 8.2.3. The percolate call is a special form of transfer to set the tense of the auxiliary and verb phrase. These are the only functions the grammar rules can perform.

The buffers are automatically kept filled so that they are the next three items below the Active Node Stack. (Three solely to accommodate conjunction.) This is done by an explicit shift of the next unseen word into the third buffer. This could be done by a hidden mechanism, but is easy to do explicitly.

The rule matcher can look only at the features of the first two buffers to decide which rule to run. Also, once a node is attached to its mother, it is never again looked at. This has an interesting implication. The parser is never able to look inside a buffer at its internal structure. It can only look at its top-level features. This means that the syntax tree could be thrown away after the semantic interpretation has extracted the information it needs! The syntax tree does not really need to be carried around in its entirety, only the top-level features of each constituent.

8.2.2 Closing nodes

It is believed that the closure of nodes is of special significance psychologically [Kimball 1973]. Once a node is closed, it is very expensive computationally to re-open and alter it. As a result, the point at which a node is closed reflects when the parser is certain that the node is complete.

The question of closure is not explicitly addressed in this thesis. Since I am not concerned with the size of the Active Node Stack, there is no need to close items to conserve space. A node is open when first created. This means that daughters can be attached to it. When an open node is attached to another node, the open node is closed.
In this way, closure of nodes is automatic and no separate node closing mechanism is needed.

Marcus's parser did not distinguish between words, open nodes (items under construction) and closed nodes (completed items). In ROBIE, there are three types of nodes: a word node, a closed node, and an open node. This distinction helps to make the closure of nodes more clear.

Open nodes are distinguished from closed nodes by the presence of a "hole". Each open node has one "hole" and when the node is closed, the hole is "plugged". Each node has a list of daughters, the last of which is a variable. This variable is the "hole". When a daughter is attached to a node, it is unified with the hole, hence becoming the next daughter.

The use of holes is an efficient implementation detail to save recursively building trees in PROLOG. Whenever we attach an open node to another open node (for example when attaching a PP to some NPs), these holes can be selected to choose where constituents are attached. For example, when a second modifier is attached to a NP with a modifier (i.e., an NP with a PP), the second modifier may modify the NP or the PP attached to it. The "attach" function can choose with which hole (the NP hole or the PP hole) it should be unified.

The generalisation of this technique could be used to implement pseudo-attachment [Church 1980] in ROBIE. For example, Church used pseudo-attachment to attach a PP to all its possible mothers. In ROBIE, for each node to which a PP could be attached, there is one "hole". Instead of pseudo-attaching the PP to all possible open nodes, one could associate the PP with a list of holes. At some later point ROBIE could decide to which hole to attach the PP and then unify them. In this way, it is possible to keep the PP attachment options open. Pseudo-attachment could be done by unifying all these holes and the PP together as the same item.
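The "hole" representation of open nodes described above can be imitated outside PROLOG with a small sketch. Python has no logic variables, so the `Hole` class below is an assumed stand-in for an unbound variable that may be filled exactly once; the class and method names are hypothetical and this is not ROBIE's code. Attaching a daughter fills the current hole and leaves a fresh one to its right; closing a node fills the hole with "nil".

```python
NIL = "nil"


class Hole:
    """A single-assignment slot, standing in for an unbound PROLOG variable."""

    def __init__(self):
        self.value = None
        self.bound = False

    def bind(self, value):
        if self.bound:
            raise ValueError("hole already filled")
        self.value = value
        self.bound = True


class OpenNode:
    """A node whose daughter list ends in exactly one unfilled hole while it is open."""

    def __init__(self, category):
        self.category = category
        self.first = Hole()      # start of the daughter chain
        self.hole = self.first   # the unfilled slot at the right edge

    def attach(self, daughter):
        """Unify the daughter with the hole; a new hole becomes the right edge."""
        rest = Hole()
        self.hole.bind((daughter, rest))
        self.hole = rest

    def close(self):
        """Plug the hole with nil so that no further daughters are accepted."""
        self.hole.bind(NIL)

    @property
    def is_open(self):
        return not self.hole.bound

    def daughters(self):
        out, cell = [], self.first
        while cell.bound and cell.value != NIL:
            daughter, cell = cell.value
            out.append(daughter)
        return out


noun_phrase = OpenNode("NP")
noun_phrase.attach("det-the")
noun_phrase.attach("noun-block")
noun_phrase.close()
print(noun_phrase.daughters(), noun_phrase.is_open)
```

Choosing between the NP hole and the PP hole, as discussed above, would then amount to choosing which node's `attach` is called for the second modifier.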
PROLOG structure sharing would make the PP the daughter of all the nodes. I thank Lawrence Byrd for pointing this out. This would also provide the list of "possible ... two are isomorphic for the following reason. Each time a buffer matches a pattern (for example [noun] or [ns]), the parser is assuming that that buffer is a particular part of speech (i.e., noun, verb, auxverb, etc.). For example, if the parser matches the pattern [ns], then it assumes the word is a noun. For every pattern in the grammar, the corresponding grammar rule will later attach the buffer which matched that pattern as the same part of speech as the pattern assumed the word was. So instead the ATTACH function disambiguates the word when it is attached. It does this by disambiguating the word to the part of speech it is attached as. This method is the same as that used by [Milne 78]. Although different in implementation, it is isomorphic to the approach described in Chapter 5.

It can be noted that, with one exception, the current parser performs exactly as if the pattern matching performed the disambiguation. The one exception is the IMPERATIVE rule. This has the pattern [tenseless] and is active at the start of a sentence. The rule body does not attach the tenseless verb, but instead inserts the word "you" into the first buffer. The parser will make this the subject NP. A later rule will attach the tenseless verb as the main verb. Since the IMPERATIVE rule ran assuming the word was a verb, the word is disambiguated to a verb by a special function (coerce) in the rule IMPERATIVE.

In the above section I have indicated when nodes are closed. The action of closing a node is simply to unify the "hole" which is its rightmost daughter with "nil". The node is then closed and, in the current implementation, will not accept any additional daughters.

The automatic transfer of features can be illustrated by the NP building rules. When the determiner "the" is attached to a NP,
the number features "ns, npl" are transferred to the NP node by ATTACH. When the headnoun is attached to the NP, ATTACH will try to transfer the number features of the noun to the NP by intersecting them with the number of the NP already present. This approach was suggested by Marcus [personal communication]. For example, if the headnoun is "block", then the feature "ns" will be intersected with the features "ns, npl" already there, resulting in the feature "ns" on the NP node. This intersection of features is similar to the approach used by Church [Church 1980].

The agreement check will ensure that the features agree before a grammar rule is run, but this check does not change the features of the nodes. ATTACH will then update the number, as described above, when the rule runs. This automatic transfer allows features to percolate up the syntax tree and has the same effect as the explicit transfers in Marcus's parser.

One note for future improvement. Currently in ROBIE, nodes are only attached after they are completed. That is, even though the mother is known, they are attached to their mother only when all the daughters have been found. This is purely an implementation detail. Often the mother of a node is known before it is completed. For example, whilst parsing a PP, the next NP started will be attached to the PP once it is built. It could be attached to the PP while it was being built and save a space on the stack. This seems desirable as it would reduce the size of the Active Node Stack by one item. Also, AUX nodes could be attached to the current S node while they are being built, and various relative clauses could be attached to the NP they will modify whilst being built. In terms of the current implementation, this approach would probably take more memory cells, rather than fewer.
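The number transfer performed by ATTACH, described earlier in this section, can be sketched as a set intersection. This is a minimal illustration under stated assumptions: features are modelled as Python sets, the function name `transfer_number` is hypothetical, and only the two number features "ns" and "npl" mentioned in the text are handled.

```python
NUMBER = frozenset({"ns", "npl"})


def transfer_number(node_features, daughter_features):
    """Return the node's features after a daughter's number features are transferred."""
    incoming = set(daughter_features) & NUMBER
    if not incoming:
        # The daughter says nothing about number, so the node is unchanged.
        return set(node_features)
    present = set(node_features) & NUMBER
    # First number-bearing daughter: copy. Later ones: intersect with what is there.
    # An empty intersection should not arise if the agreement check has already passed.
    merged = incoming & present if present else incoming
    return (set(node_features) - NUMBER) | merged


np_features = {"np"}
np_features = transfer_number(np_features, {"det", "ns", "npl"})  # attach "the"
after_determiner = set(np_features)
np_features = transfer_number(np_features, {"noun", "ns"})        # attach "block"
print(sorted(after_determiner), sorted(np_features))
```

In this sketch the determiner "the" leaves the NP compatible with either number, and the headnoun "block" then narrows it to "ns", as in the example in the text.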
Items in the Active Node Stack are stored as a list, so the addition of an item means the addition of one pointer in the list. If it were attached to its mother, then a pointer would be needed to indicate its mother, as well as a pointer to indicate that the mother has one or more incomplete daughters.

8.2.4 Implementing the Production System Grammar

Production systems of the form used in the current grammar (one antecedent) are very easy to implement in PROLOG. Although the exact details are not important to the claims made here, I will explain this briefly. Each rule has a pattern, consisting of the rule's packet name, priority, the features which must be present on the two buffers, the agreement check, and the name of the rule. For example, the pattern for the rule DETERMINER is:

packet: PARSE-DET
rule name: DETERMINER
priority: 10
Buffer1: det
Buffer2: t
Agree: t

Based on the current active packet and the priority of the rule ("10" in this rule), the rule matcher will automatically match this pattern. The pattern gives the features which must be true before this rule can run. In this case, Buffer1 must have the feature "det". If this condition is met, the rule matcher runs the body of the rule, which is called "DETERMINER". The "t" means that the rule matcher doesn't care what is present. If it does not match, then a fail is generated. This then causes the pattern matcher to find the next pattern and the ... represented by "dotted rules". These are best explained in [Martin, Church and Patil 1981]. The next part of the discussion will be based on this paper [ibid. p. 8].

A dotted rule is defined as a "context-free grammar rule, with a dot inserted to indicate how much of it has been parsed". For example, if we have the context-free rule:

[339] VP -> V NP PP

we can add a dot showing how much of the item has been built. At the start of the VP, the dot will be inserted before the "V". After the verb has been found,
the dot will be as follows:

[340] VP -> V . NP PP

After the object has been found, the dotted rule will be:

[341] VP -> V NP . PP

Using this notation, we have a method of keeping track of the item we are building and how much of it has been built. It has been claimed that the packets are merely a less transparent notation for the same structure. For example, if we are parsing a VP and do not have the main verb yet, corresponding to [339], we will have the packet PARSE-VP active. After the verb has been found, [340], we will have the packet SS-VP active. After the object has been found, the packet OBJECT will be deactivated; this is the same as situation [341].

It has not been explored thoroughly, but I believe there will be a one-to-one correspondence between the packets and the dotted phrase-structure rules above. Therefore ROBIE could do without packets, and instead the rules would match on three items: the two lookahead buffers and the dotted phrase-structure rule of the bottom of the Active Node Stack.

These dotted rules have one other advantage over the packets. The dotted rules provide a better formalism for the top-down component of the parser and make subcategorisation more explicit, hence helping with gap finding. This then assists with WH movement and conjunction. Church uses these rules for this purpose and they work very well.

In ROBIE, the rules need to match three items: the two lookahead buffers and a notion of "state". This notion of state could be implemented either by the packets, or by the dotted phrase-structure rules. The possible benefits of dotted rules were discovered too late, and the advantages were too small to warrant implementing them in the current parser.

8.5 A Few Notes on the Grammar

There are three areas of the grammar which this research has not investigated in detail. These areas have been the focus of research by other authors, and the approaches I have adopted have been based on their work.

8.5.1 Passive

The approach to passive used in ROBIE has been copied directly from [Marcus 1980]. The pattern [be][en] indicates that the sentence is a passive, and a grammar rule in the packet BUILD-AUX adds the feature "passive" to the main verb. When the VP is started, the packet PASSIVE is activated. The rule in this packet inserts a trace into the first buffer. This trace is then bound to the current syntactic subject by the non-syntactic processor. The non-syntactic processor will note that the sentence is passive and use the logical subject as required. The justification and implications of this approach are in [Marcus 1980].

8.5.2 Conjunction

This research has not investigated the problem of conjunction. ROBIE has a very simple and slightly ad-hoc method for handling conjunctions that works for many simple cases, but is incorrect in general. The method used will handle:

[342] The boy and the girl hit Mary.
[343] The boy hit the girl in the park and in the head.
[344] The boy hit and kissed the girl.
[345] The boy hit the girl in the park and street.

but not:

[346] The boy hit Sue and kissed Mary.

When the parser encounters a conjunction, it is pushed onto the Active Node Stack and the packets CPOOL and PARSE-VP are activated. These packets contain the rules to begin noun phrases and verb phrases. When the constituent following the conjunction has been fully parsed, the conjunction word is dropped from the Active Node Stack into the second buffer. At the same time, the constituent before the conjunction word is dropped into the first buffer. The rule X_AND_X will then attempt to match. If the contents of the first and third buffers are syntactically the same, they will be conjoined. This technique can conjoin single words, NPs, VPs, APs, PPs, etc. It will be wrong in the case where the first constituent after the conjunction is a sub-constituent of a larger item, as in [346]. This approach
does not use any non-syntactic information or verb subcategorisation.

As pointed out in the introduction, no one has yet researched a satisfactory method of parsing conjunctions with fewer than three buffers. Church [Church 1980] has developed a promising approach to conjunction for a deterministic parser based on verb subcategorisation.

8.5.3 Movement

The investigation of movement phenomena has not been a focus of this research. A method for handling subject-auxiliary inversion, relative clauses and WH-questions has been implemented, but it is partially ad-hoc and will not be described in detail in this thesis.

To handle movement, a "movement stack" similar to the ATN "HOLD List" [Woods 1973] has been added to the parser. Items are pushed onto the movement stack when they are first parsed and inserted into the buffer when the appropriate gap is detected. This approach is very similar to that used by [Charniak 1981]. This movement stack seems necessary, as a place is needed to hold the node undergoing movement until the gap is found.

The author feels that this approach has limited implications for the two versus three buffer issue, but may have interesting implications regarding the use of non-syntactic information. The method used is adequate to handle the examples of movement which appear in Appendix B, but would fail on more difficult examples.

For a discussion of movement phenomena and their implications in a deterministic parser, see [Marcus 1980].

8.6 The Agreement Checks

Throughout the earlier chapters, we have seen several "agreement checks" ... It is needed for Auxiliary inversion examples.

[have]    [en]          or
[be]      [en or ing]   or
[modal]   [tenseless]   or
[do]      [tenseless]

Verb-Noun agree is used to ensure that the subject and main verb agree in number and person. This test is only a simple clarifying of the person-number codes.
[ns, n3p]             [v3s]      or
[not(ns or n3p)]      [v-3s]     or
[ns, not(n2p)]        [v13s]     or
[t]                   [vspl]     or
[n1p, ns]             [v1s]      or
[npl or (n1p, ns)]    [vpl-2s]

These are all the tests, and one can see that they really are only a syntactic shorthand.

8.7 The Semantic Database

In ROBIE, semantic interpretation has been added in two ways: the semantic interpretation and the non-syntactic checks. As described in Chapter 4, each rule has a call to the semantic interpreter. In Marcus's parser, the semantics was done purely by demons. For psychological reasons, I feel my approach is better. The second usage is to decide which rule to run when two or more rules match. This is used more rarely, as was explained in Chapter 4.

The semantic database is really a cross between a representation of the syntax tree as predicate calculus assertions and some semantic information. Most of the information in it is derivable from the syntax tree. This database provides a universal middle representation of the sentence, which is suitable for almost any application. Although many of the items in it were developed for the MECHO system, the representation could equally serve for translation or database query.

It should be noted that this semantic interpretation is only partial. It provides the interface to a larger system, and a much more elaborate semantics could be added. However, for the current application, the MECHO front end, the semantic interpretation is as large as was necessary. The database is used by the semantic inferencing program for MECHO [Mellish 1980] and has been designed to meet its specific needs.

Other Ambiguities

We have now finished developing our model of the syntactic processor and have seen that it compares favourably with the related literature. In the two remaining chapters, we will look at several areas for future investigation. Our intention is to explore the theoretical implications of these problems and, except where noted,
none of the techniques discussed here have been implemented.

We have seen how to handle many classes of ambiguity automatically, but there are some problems raised in the earlier sections that need to be rectified. In Chapter 5 we discussed the handling of "have", but our solution required three buffers. Since we have said that we will use only two buffers, this is not satisfactory. In this chapter, we will discuss global ambiguity and the method by which verb particles are handled as well.

9.1 Verb Particle Handling

The parser uses a fairly simple approach to parsing verb particles (prepositions used to modify the verb) and separating them from Prepositional Phrases. The approach outlined here is implemented, and works well in the MECHO world, but is not generally adequate. As I explain the basic technique, I will explain its limitations and necessary improvements.

For each verb, there is a list of each particle that, if it occurs with the verb, must be used as a particle. For example:

[347] The robot picked up the block.

To handle this, there is a high priority rule in the packet SS-VP with the pattern "[particle] semchk(particle)". Before this rule can apply, the semantic check routine checks the obligatory list of particles associated with the semantics of the main verb. This is implemented as a set of semantic markers. If the particle is on this list, then the rule will match and attach the particle. Otherwise, the rule will not match and the particle will be left in the buffer to be picked up by the PP rules.

This check is based only on a list of verb-particle pairs. To be correct, the list should include the optional and obligatory prepositions that the verb can take and make a more intelligent decision based on these possibilities. The current system of verb subcategorisation does not include this.

If the above rule does not match, then the particle will remain in the first buffer.
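The high priority particle rule and its semantic check can be sketched as a lookup in a table of verb-particle pairs. This is an illustration only: the table contents, the function names and the returned labels are hypothetical, and ROBIE stores the obligatory particles as semantic markers on the verb rather than in a dictionary.

```python
# Hypothetical obligatory-particle lists, keyed by the main verb.
OBLIGATORY_PARTICLES = {
    "pick": {"up"},
    "put": {"down"},
}


def semchk(verb, particle):
    """Return True if the particle is on the verb's obligatory list."""
    return particle in OBLIGATORY_PARTICLES.get(verb, set())


def particle_rule(verb, word, features):
    """Decide what the high priority rule in SS-VP does with the first buffer."""
    if "particle" in features and semchk(verb, word):
        return "attach-as-particle"
    # The rule does not match; the word stays in the buffer for later rules.
    return "leave-in-buffer"


print(particle_rule("pick", "up", {"particle", "prep"}))
print(particle_rule("walk", "up", {"particle", "prep"}))
```

Because the check consults nothing but the verb-particle table, it shares the limitation noted above: it cannot weigh optional against obligatory prepositions.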
If the particle is followed by a word which can start a noun phrase, a PP would eventually be built and handled in the normal way. Finally, if there is no element of a PP after the particle, then it will still be in the buffer. At this point, a low priority rule with the pattern [particle] will match and attach it to the verb phrase as a particle. The same rule will pick up the "stranded" particle if the particle has been moved. No attempt has been made to ensure that this does not attach particles wrongly. Better verb subcategorisation is needed as well as selectional restrictions on the verbs. Neither of these is more of a problem for a deterministic parser than for a non-deterministic parser.

9.2 HAVE re-visited

In Chapter 5, Marcus's "Have Diagnostic" was replaced by number and verb participle agreement. With this new method, the pattern for the rule YES-NO-QUESTION should be:

[auxverb][np][verb] agree(auxverb, verb), agree(verb, np).

This means the rule YES-NO-QUESTION should only be run if the subject and verb agree in number and the auxiliary verb can be placed before the main verb in the verbal cluster. If this rule fails to match, then the rule IMPERATIVE will run instead and the sentence will be parsed as an imperative. In that chapter, I showed why this is needed and how it works. In order to handle this rule, we need the pattern:

[have][np][verb]

This rule pattern presents two problems. Firstly, the pattern uses three buffers in order to check the verb features as well as the NP features. Secondly, in order to have the NP built in the second buffer, we need to perform an Attention Shift. The three buffers seem to be necessary in order to "see" the word "Have" in the first buffer, the NP that it must agree with in the second buffer and the verb it must agree with in the third buffer. This is a violation of the two buffer constraint. If this rule is correct, then we are unable to use only two buffers throughout the parser.
We have also eliminated the Attention Shift from the parser. In no way can the NP be built in the second buffer, given the current framework. Conversely, it seems to be necessary to build the entire NP before the decision is made, since we must know the number of the NP. This is the only rule in ROBIE that seems to need three buffers and also the only rule that needs the Attention Shift! (Except, of course, conjunction.) In order to handle this one phenomenon two special mechanisms would have to be added to the parser. Clearly, either the rule is incorrect or the three buffers and the Attention Shift are necessary to the parser. Is it possible to find counter examples to our Have-Diagnostic? Consider:

[359] Have the boys taken the exam?
[360] Have the boys take the exam. (Our original sentences)
[361] Have the boys taken the exam. (Yes, an Imperative.)
[362] Have the eggs broken?
[363] Have the eggs broken. (Both are acceptable!)
[364] Have the students put it back?
[365] Have the students put it back. (Again both are acceptable.)

Sentence [361] can be paraphrased as: The speaker instructs the listener to make a third person deliver the exams to the boys.

If the YES-NO-QUESTION rule were correct, none of the imperative sentences above would be possible. Although some are obscure, they are all clearly acceptable. It is certainly easy to produce counter examples to this rule. The imperative reading of [359] is clearly possible as [361] demonstrates. Even though [361] is acceptable, very few people notice this reading before it is explicitly pointed out. The rule YES-NO-QUESTION correctly predicts the preference for these sentences, but the rule does not say how to get the imperative reading.

Our re-formulation of Marcus's HAVE-DIAGNOSTIC into the YES-NO-QUESTION rule shares something with the original rule. It is not always correct. But what would the correct rule look like and where did this rule go wrong?
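To make the difficulty concrete, the agreement-based rule can be sketched as below. This is a minimal illustration in Python under stated assumptions: the small feature tables, the feature names and the two helper functions are invented for the example, and the number test is a simplification of the agreement checks in the rule pattern.

```python
# Illustrative feature assignments; the names and values are assumptions.
NP_NUMBER = {"the boys": "plural", "the boy": "singular"}
VERB_FORM = {"taken": "past_participle", "take": "base"}


def agree_aux_verb(aux, verb):
    """The auxiliary "have" must be followed by a past participle."""
    return aux == "have" and VERB_FORM[verb] == "past_participle"


def agree_number(aux, np):
    """Plain "have" (as opposed to "has") needs a plural subject here."""
    return aux == "have" and NP_NUMBER[np] == "plural"


def classify(buffers):
    """Run YES-NO-QUESTION if both agreement tests pass, else IMPERATIVE."""
    aux, np, verb = buffers  # note: three buffer cells are inspected at once
    if agree_aux_verb(aux, verb) and agree_number(aux, np):
        return "YES-NO-QUESTION"
    return "IMPERATIVE"
```

The sketch returns exactly one answer per input, so the words of [359] and [361], which are identical, can only ever be classed as a yes-no-question. It also has to look at three cells to decide, which is the violation of the two buffer constraint discussed above.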
Consider the following:

[366] Have the students who missed the exam take it.
[367] Have the students who missed the exam taken it?
[368] Have the students who missed the exam taken to the room.
[369] Have the students who missed the exam been taken to the room?
[370] Have the students who missed the exam taken in the back room.
[371] Have the students who missed the exam taken in the back room go home.
[372] Have the students who missed the exam taken in the back room gone home?
[373] Have the students who missed the exam taken in the back room finished yet?
[374] Have the students who missed the exam taken in the back room finished off.

Before continuing we must ask which, if any, of these sentences caused us to garden path and where we had trouble when reading them. Finally, was there any trouble with the word "have"?

Each of these examples meets the criterion of a potential garden path. The distance from the ambiguous word ("have") to the disambiguating word is far more than our three buffers can encompass. If these are truly potential garden paths, the garden path effect should have been felt in at least one of each pair. However, many people reading these examples do not feel that they are garden paths.

These examples were tested in the second experiment to determine whether they caused the garden path effect. The sentences which were tested are as follows:

[375] Have the students take the exam.
[376] Have the students taken the exam?
[377] Make the students take the exam.
[378] Did the students take the exam?
[379] Have the students taken in the back room finished yet?
[380] Have the students taken in the back room finished off.

The results of this test will be discussed in two parts.
First the simple sentences:

Sentence   Mean Time   Type
[375]      369         Imperative
[376]      268         Yes-No-Question
[377]      364         Imperative
[378]      279         Yes-No-Question

It was predicted that these examples would show that there was no significant difference in reaction time between reading "have" as an imperative versus "have" as a yes-no-question. There was no significant difference between [375] and [376]. There was a slight order effect in that the first sentence of each pair presented took longer than the second sentence. Sentences [377] and [378] showed the processing difference of imperatives versus yes-no-questions (significantly different at the p < .05 level). In both sets, the Imperative was one second slower than the Yes-No-Question. This suggests that the "have" examples were not different from the normal examples.

Sentence   Mean Time
[379]      832
[380]      930

There was a very definite order effect with these examples. The times given are those obtained when the example occurred first in the order. However, when the example occurred second the times were 300 and 490 respectively. The difference between the sentences was not significant. It seems that when the subject saw the first sentence of the pair, irrespective of which sentence it was, they had a considerable amount of trouble. One can see that the time was almost exactly the same at the first exposure. These times showed that the subject had problems with these sentences the first time, but then learned some part of the example, which assisted the understanding of the similar sentence the next time. It is not possible to tell whether the problems were due to the reduced relative. Unfortunately we have gained no useful data from this second set, but the results of the first set show that these do not cause the garden path effect.

The examples in this section show an extra complication with this diagnostic. The pattern [np][verb, ed] must be present for the word "have" to move between and have constituents to agree with. This is also the pattern
for the reduced relative clause which we saw in Chapter 4 and which can also lead to a garden path. Therefore, this diagnostic must interact with the potential garden path, making the situation much more complex, but helping to explain why our rule cannot be correct.

Our simple formulation of the rule with number and verb agreement does not account for the data on this type of problem. Therefore, the old diagnostic must be incorrect. This diagnostic required two special mechanisms (Attention Shift and 3 buffers) to be added to the parser. If we keep the parser as originally formulated then all we lose is an incorrect rule. Henceforth, we will not use this and will not need the three buffers nor Attention Shift.

To see how the parser should handle the word "have", it is important to see how people handle it. There does not appear to be any experimental data relevant to this question. Its implications are important and would make a subject for further research. The following are purely SPECULATIVE explanations of what may be nearly the correct approach. These accounts are given to show the important implications they could have. There are several possible solutions to this problem.

1. Let the usage of "have" as an imperative be illegal. In some cases of "British English", this is true. Many of my British informants tell me that "have" is never used as an imperative. The use of "have" as an imperative seems to be a rare problem. If it did not exist, then our theory would be simpler. One could say that it is not used in "British English" because of processing difficulties. It seems the uses of this are mainly [...]

[...] the parser would have to commit itself to one tree or the other before the verb was seen. If the parser committed itself to the wrong analysis, when it discovered it is the wrong one, extensive changes of the tree would have to be made to correct the mistake.
IMPERATIVE
[S [NP you] [VP [V have] [NP [S [NP the boys] [VP [V take] [NP the exam]]]]]]

YES-NO-QUESTION
[S [NP the boys] [AUX have] [VP [V taken] [NP the exam]]]

As these trees show, if Aux-Inversion does indeed operate, the two examples have very different structures. Given this analysis, in no way can the two structures look the same until the choice between imperative and yes-no-question is made. It seems that the only solution to this problem, given this analysis, is that the Attention Shift [Marcus 1980] is correct.

With the Attention Shift and three buffers, the NP can be built before the auxiliary verb is moved. The parser can then examine all the features and decide whether the sentence is a yes-no-question or an imperative. If it did not examine all this information, it would make an error on half of these examples. Because of the extensive change that would be necessary, some conscious effort should be noticed. But I have already stated that no conscious effort seems to be involved here.

Since we do not accept the Attention Shift and the three buffer lookahead, the Aux-Inversion analysis is not possible using this parser. However, if the constraints we have assumed here are correct, then this analysis is wrong. Only if another viable analysis can be found to fit our constraints can this type of sentence be successfully parsed by ROBIE. Such an analysis could have a significant effect on current opinion concerning transformational versus non-transformational grammars.

Let us assume that a person creates the same structure for both possibilities until the disambiguation takes place. Then the only difference will be the features and it is simple to eliminate incorrect features. In dealing with ambiguity in Chapter 5, we gave each word all the syntactic features it could use. We then disambiguated it by removing all the incorrect features. The same method applies here.
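The idea of carrying every reading as a feature and then removing the incorrect ones can be sketched as follows. This is a speculative illustration in Python: the labels "aux" and "imp" for the two readings of "have", the verb form names and the compatibility table are all assumptions made for the example.

```python
# Hypothetical feature labels: "aux" for the yes-no-question reading of
# "have" and "imp" for its imperative reading.
HAVE_READINGS = {"aux", "imp"}

# Which reading of "have" each verb form is compatible with (assumed).
COMPATIBLE = {"past_participle": "aux", "base": "imp"}


def disambiguate_have(verb_forms):
    """Keep only the readings of "have" that some form of the verb supports.

    verb_forms is the set of forms the verb could be, so a verb that is
    itself ambiguous leaves both readings of "have" in place.
    """
    supported = {COMPATIBLE[form] for form in verb_forms}
    return HAVE_READINGS & supported
```

Nothing in the tree has to be rebuilt in this scheme: the node for "have" stays where it is, and only its feature set shrinks once the verb is seen.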
This would explain why the long "have" examples do not cause garden paths, or indeed any trouble at all. It also provides an approach compatible with the two buffer constraint.

Many linguists have felt that a non-transformational approach to language is better than the transformational approach of Chomsky. A non-transformationalist believes that the inverted auxiliary verb remains in the front of the subject in the surface structure. We shall use Phrase Structure Grammar (PSG) [Gazdar 1979, 1980, 80b, 80c] as an example of this school of thought. These two examples come under the following rules in this system:

The Imperative rule [Gazdar l~ p. ::?13]:

    <16, [V'' V V''], ...>
    with the features [+IMP] and [+BSE]

The AUX Inversion Metarule [ibid. p. 46]:

    SAI: <[V' V V'], ...> => <[V'' V V''], ...>
    with the features [+INV], [α], [+FIN], [α] and [+AUX]

Using these rules, the tree analyses for [300] and [377] are as follows:

IMPERATIVE
[V''[+IMP] [V[+IMP] have] [V''[+BSE] [N'' the boys] [V'[+BSE] [V[+BSE] take] [N'' the exams]]]]

YES-NO-QUESTION
[V''[+INV] [V[+INV,+ASP] have] [V''[+PSP] [N'' the boys] [V'[+PSP] [V[+PSP] taken] [N'' the exams]]]]

Ignoring the features, these two trees are identical. This is exactly as suggested. Given this analysis, a person could keep both features on the word "have" and hence the tree nodes, as for ambiguous words, until the verb is found. This analysis seems to fit the psychological facts better. The same will be true for any analysis that does not move the Auxiliary verb in the surface structure.

These examples may provide an interesting area in which to investigate the psychological implications of Chomsky's Extended Standard Theory (EST) versus Gazdar's Phrase Structure Grammar. The results of the second experiment suggest that the proposals here may be correct. Before this can be confirmed, however, data need to be collected on these examples to resolve the question of
conscious effort. Further investigation may show that one linguistic analysis is wrong.

To implement this in the current parser would be impossible, but a complete re-write to PSG may make this possible.

In conclusion, we have seen that our reformulation of Marcus's Have-Diagnostic required two extra mechanisms and violated ROBIE's constraints. [...]

[...] If people seem to perceive both meanings at once, it may be necessary to expand the typical notion of deterministic parsing to allow this. Unfortunately the answer to these questions is unknown. I feel that this area is a thesis by itself. The explanation offered here is purely SPECULATIVE. I will not offer a firm answer to these questions, but will give several possible solutions and my theoretical predictions.

As we are not sure exactly what people do when they read these sentences, let us look at all possible options. There are essentially three options: a) Return a preferred reading and then the other readings, b) Get both readings at once, c) Return one and only one reading. Let us look at these in turn.

9.3.1 Preferred Reading, then Other Readings

At first only one reading is perceived and then the other possible readings are found by backtracking. This would mean that the preferred [...] Does this mean that context will dynamically re-order the arcs to reflect the change in preference? If context does not cause some sort of re-ordering, how can the parser reflect the different preferences? Whilst it seems possible to design a system that is capable of handling dynamic re-ordering, I feel that an arc ordering approach to preference seems unable easily to reflect how the preference can change. We have met this problem before in Chapter 6. We will assume that this option is not correct.

9.3.2 All at Once

It may be that all possible readings are, somehow, produced at the same time. If the theory predicts that all meanings can be simultaneously perceived then it must explain how this could occur and when each
should be produced. Remember that we have forbidden parallelism for our deterministic parser. If it can be shown that people produce multiple readings in parallel, it may be necessary to expand the traditional notion of deterministic parsing.

Martin's parser [Martin, Church and Patil 1981] shows that there are so many possible readings for sentences that, clearly, all possible readings cannot be perceived. One of his examples has 958 parses! Consider these two sentences:

[385] What are the production costs as a percentage of sales?
[386] List prices of single unit prices for both 72 and 73.
(from [Martin, Church and Patil 1981, p. C2D])

For [385], Martin's parser finds 14 parses and for [386], it finds 85 parses! Clearly a person does not perceive all 85 parses of [386] one at a time by backtracking. I do not believe that, even without backtracking, as many meanings are found as a Chart parser can produce in parallel. If each possible parse started a separate parallel process, there would be a very large number of parallel processes. So we must assume that the use of preference and non-syntactic information eliminates most of these possibilities.

Finding all meanings seems possible only by using some form of pseudo parallelism. This could be done by:

9.3.2.1 Similar Trees

One possibility is that each reading has the same tree up to some point. This would be an extension of the "have" case we have just seen. For example in the sentence:

[387] Have the balls hit the wall.

In Gazdar's system the word "hit" would have both the feature [+PSP] and the feature [+BSE]. This means that the trees would be identical, and neither set of features could be eliminated and hence both readings encoded into it. Unfortunately, from the currently accepted linguistic analysis, most of these examples seem to have too great a variation for this to apply. Psycholinguistically, this might be the easiest and most direct approach.

Here is another example of how this may work. Consider:
[388] He looked up the street.
[389] He looked up the address.
[390] He looked up the mountain.

In these examples, it is generally accepted that [388] has two possible readings; in [389], "up" is a particle and in [390], "up the mountain" is a prepositional phrase. Notice that the crossed meanings are all possible given the correct context. How can this be done?

One approach is to interpret [388-390] as "He (looked up) NP", that is, with the word "up" as a particle, and allow the non-syntactic processor to decide which reading is relevant while it interprets the verb phrase. But consider:

[391] *Up the address he looked.
[392] Up the mountain he looked.
[393] He looked the street up.
[394] *He looked the mountain up.

In [392] the PP can be fronted, but not in [391]. This data is used to show that, in one analysis the word "up" is a particle, whilst in the other it is a preposition.

The unacceptability of the starred examples above is not based only on syntax. Instead it could be because the selectional restrictions of the verb are violated and it is on this basis that these examples are rejected. Assuming they are different before the ambiguity in these examples can be resolved, the semantic meaning of the NP must be [...]

[...] The accepted syntactic analysis for these sentences in most cases is too different to make this approach possible. Short of a major revolution in linguistics, it does not seem possible to have identical trees for all examples of global ambiguity until disambiguation can be made.

Conversely, this shows an area to investigate the psychological plausibility of various linguistic systems. The method used by people to handle these examples may help to guide linguists in the development of theories of Universal Grammar. This should make an interesting area for future investigation.

9.3.2.2 Produced in Parallel

Finally, it may be that the two readings are produced in parallel.
When the ambiguous point is detected by two rules matching at once, then two processes are spawned. From this point both analyses are produced in parallel. Each possibility will then proceed as an independent process.

We can detect the ambiguous point with the "rule matcher". In production systems, the parser runs the first rule to match. Often there is a conflict with a rule of a less constrained pattern and a rule with a more constrained pattern. By running the first rule to match we resolve the conflict, but what happens if two rules of equal constraint match? To handle this problem, both rules must be run. This would be easy to implement on a parallel machine. We would need our system to find all rules of equal constraint that match and run them all in parallel. Each rule then calls the "rule matcher" recursively and independently. In this way both paths can be parsed at once. On a small machine, we would have to simulate the parallelism in the traditional ways.

The non-syntactic processor can fail either path at any point. Remember that each rule makes a contribution to the semantic interpretation of the sentence and each path now builds its own semantic interpretation. If one of these paths fails to make semantic sense, then the non-syntactic processor for one of the rules will abort the parse and the path will die. Context will then have its effect just after the process has spawned.

This is the most simple method of accounting for global ambiguity for me. There are however a few remaining problems. The effect on memory load is not clear. Several parallel processes may be in violation of a limited memory view. The biggest problem in the theory is one of determinism. This approach is exactly the approach of the Chart parser and as such is the non-deterministic approach we are trying to avoid. To accept this would change most of the theory. There is, however, one further option.

9.3.3 Only the Right One

We have now explored the possibility that a person produces a preferred reading, then other readings, and that a person may somehow produce all possible meanings in parallel. The final possibility is that given a natural situation, only one meaning is perceived. That is, in context only one meaning is noticed. This means that context must feed directly into the handling of these examples so that it may influence the [...]

[...] appropriate word. The judgements written down by the subjects also showed agreement with the interpretation of the sentence. For ambiguous examples, as above, the naming latency for the appropriate word was 519 msecs while the inappropriate word took 555 msecs. For the unambiguous examples, the times were 554 msecs and 581 msecs respectively.

Tyler [Tyler 1977] then claims that the ambiguity of the fragment "flying planes" is resolved with the aid of the prior context. We will accept this conclusion and see how it fits my theory.

When we defined global ambiguity, we said that, given the sentence in isolation, there is no way to tell which is the desired meaning. This is just another way of saying that syntax cannot decide which of several possible analyses to choose. However, we have already decided on a method to use when there is a choice of alternatives between which the syntactic processor cannot choose. We let the non-syntactic processor decide. If discourse can affect these examples, then the non-syntactic processor is making the decision here too. Therefore, my theory of the "when and why" of non-syntactic interaction is applicable to this case.

For globally ambiguous sentences then, the parser must first detect that two rules of equal constraint can match. Rather than executing each in parallel, the non-syntactic processor decides which of the two it should run. This path then runs to completion. This explains how people can always get a preferred reading and the preferred reading can change with time.
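The selection step just described can be sketched as below. This is a minimal illustration in Python, not the parser's rule matcher: representing the degree of constraint of a pattern as a number, and passing the non-syntactic processor in as a function, are both assumptions made for the example.

```python
def choose_rule(matching_rules, non_syntactic_choice):
    """Pick the single rule to run from those whose patterns match.

    matching_rules: list of (name, constraint) pairs, where a larger
        constraint value stands for a more constrained pattern.
    non_syntactic_choice: function that is given the tied rule names and
        returns one of them; it is called only when syntax cannot decide.
    """
    best = max(constraint for _, constraint in matching_rules)
    tied = [name for name, constraint in matching_rules if constraint == best]
    if len(tied) == 1:
        return tied[0]  # the syntactic processor alone settles the choice
    return non_syntactic_choice(tied)  # context decides; only one path is run
```

Only one rule is ever returned, so no parallel processes are spawned. A different context function can return a different rule for the same input, which is one way to picture the preferred reading changing with context.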
Since the non-syntactic processor makes the decision, it will always pick the contextually appropriate reading. Therefore, these do not normally cause the garden path effect in people.

9.4 Why are These not Garden Paths?

We have defined a garden path sentence as one in which the non-syntactic processor makes a decision; this decision was incorrect and the parser failed to complete the parse. In a garden path sentence, if the wrong path is taken, it is impossible to finish the parse. In an example of global ambiguity however, there are two possible paths to a successful parse. The non-syntactic processor merely decides which path to pursue. Either is guaranteed to be successful; so it is not possible to make the wrong choice.

The approach to global ambiguity which is the most compatible with the work in this thesis is: The parser first finds the potential globally ambiguous fragment. This could be done by having two rules match at once in the production system grammar. When both match, the non-syntactic processor decides which rule to run. This is how context affects these examples and is the same as the theory of non-syntactic interaction presented in Chapter 4.

10. Using Semantic Information for PP Attachment and VP Parsing

In the previous chapter, we have seen suggestions for future investigation into the implications of the word "have" and global ambiguity. In this chapter, we will look at areas for further investigation involving non-syntactic information.

I have alluded to the non-syntactic check that is made before a PP is attached, for complex headnouns and reduced relative clauses. In this chapter we will look at how non-syntactic information is used in the "MECHO world" for making decisions in PP attachment and how non-syntactic information could be used in parsing verb phrases.

[...], and intonation and discourse information.
These last two items of information are not used in the current system.

The following example will be used to explain how this is done:

[403] The particle hit the wall with velocity 12 ft/sec.

In this sentence we are concerned with the PP (with velocity 12 ft/sec) and the NP (the wall). The semantic check first isolates the preposition and the two NPs involved. In this example it will use: (with) (the wall) and (velocity 12 ft/sec).

ROBIE uses a simple set of semantic markers to make its decision. For this example, it will use a set of semantic markers called "can_have". For each word, there is a list of markers reflecting dimensions that it can have. For example:

can_have(wall, [height, length, width, mass]).
can_have(particle, [mass, velocity]).
can_have(spring, [length, mass, constant, elasticity]).

These markers state that a wall can have height, length, width and mass, but cannot have velocity. Similarly a particle does not have length or width. The semantic check then sees if the wall "can have" the headnoun of the NP, "velocity" in this case. It cannot, so the PP attachment does not take place.

[404] A mass is hung from the spring with a constant of 8 lb/ft.

In [404], springs can have constants so the PP would be attached to the spring. This particular test does not consider the main verb of the sentence nor the preposition associated with the PP.

This is only one of several strategies ROBIE uses for PP attachment. Some of the other strategies used include always attaching "of" PPs and, if an Adjectival Phrase is the head of the PP and it tries to attach to a NP, then it is always attached. This would occur in a sentence such as:

[405] A stone is dropped from a cliff 700 meters above the sea.

Even though this method looks very simple, it makes the correct decision for all the PP attachment problems in Appendix B.
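The "can_have" test and the "of" strategy described above can be sketched as follows. The marker table is taken from the three entries given in the text; the Python rendering, the function name and its argument layout are assumptions made for illustration, and the Adjectival Phrase strategy is left out.

```python
# Marker table copied from the three can_have entries in the text.
CAN_HAVE = {
    "wall": {"height", "length", "width", "mass"},
    "particle": {"mass", "velocity"},
    "spring": {"length", "mass", "constant", "elasticity"},
}


def attach_pp_to_np(np_head, preposition, pp_head):
    """Decide whether a PP should be attached to the preceding NP."""
    if preposition == "of":
        return True  # "of" PPs are always attached
    # Otherwise attach only if the NP's head noun can have the PP's head noun.
    return pp_head in CAN_HAVE.get(np_head, set())
```

For [403] the call `attach_pp_to_np("wall", "with", "velocity")` rejects the attachment, and for [404] the call `attach_pp_to_np("spring", "with", "constant")` accepts it, matching the decisions described in the text.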
The majority of the PP attachment strategies were developed jointly with McKay [McKay 1981]. This approach was two fold. Every word has a "semantic definition" consisting of semantic markers. The non-syntactic processor checks to see if one head noun is a semantic feature of the other. This is an extended version of "can_have" described above. The second technique is to have a semantic "specialist" for each preposition. For example, a PP with the preposition "at" will be attached to an NP if the NP is a physical object and the NP of the PP is a location.

These techniques are very simple and counter examples to these strategies are not hard to find. None the less, they produce the correct result for the MECHO problems in Appendix B. A proper PP attachment scheme should take into account the main verb and what types of PP modifiers it needs. It should also take into account the other possible places to attach the PP and make a best fit according to all factors.

Woods used a very good method in Lunar [Woods 1973] for this, Selective Modifier Placement (SMP). The SMP facility computed the possible positions of a modifier. For example, if one has to attach a PP, then the SMP will compute all the possible attachment points. Woods does this by a special form of POP arc called SPOP. The SMP looks up the pushdown stack of the ATN and returns all the items on this stack. It then asks which of the items on the stack could take the modifier to be attached. For each of these stack elements, the SMP computes how much it needs the modifier. This assessment was based on a technique similar to the "paraplates" of [Wilks 1975]. The preferred alternative then is "the closest item that needs the modifier the most".

In ROBIE, the elements of the Active Node Stack are items under construction, but work on them has been suspended until a node further to the right is built.
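The selection criterion quoted above, "the closest item that needs the modifier the most", can be sketched as follows. This is not Woods's SMP code: the numeric need score, the reading of the criterion as "highest need first, then closest", and the list representation of the stack are all assumptions made for the example.

```python
def place_modifier(stack, need):
    """Return the stack item chosen to receive the modifier, or None.

    stack: items ordered from the closest (most recent) to the furthest.
    need: function giving a numeric score for how much an item needs the
        modifier; a score of 0 means the item cannot take it.
    """
    candidates = [(need(item), -distance, item)
                  for distance, item in enumerate(stack)
                  if need(item) > 0]
    if not candidates:
        return None
    # Highest need wins; among equal needs the closest item wins.
    return max(candidates, key=lambda c: (c[0], c[1]))[2]
```

The same function could be applied to any list of suspended nodes ordered from nearest to furthest, given some way of scoring how much each node needs the modifier.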
When construction begins on a new node, it is pushed onto the Active Node Stack and work is stopped on the old node. This operation has the same effect on the stack as a PUSH. So the alternatives found by the SMP will be the same as the items in the Active Node Stack. If a mechanism was added to ROBIE to assess how much each node needed a modifier, we would have an equivalent of the SMP. But this would involve looking at the Active Node Stack, which we have stated we will not do.

Our point is merely that the SMP is similar to the approach outlined in this section, if this restriction was lifted. However, it is not clear that lifting this restriction is necessary in order to resolve the PP attachment problems.

In Church's YAP [Church 1980] he did not use semantic checking. Instead he "pseudo attached" the modifier to all its [...]

[...] -> verb
[noun/verb][of] -> noun
no headnoun or singular -> noun
[t] -> noun

The last three rules, which will use the word as a noun, are really handled as a default. They are inserted only to make these cases explicit. If there is no headnoun for the NP so far, or the determiner was singular, or if an "of" is next indicating a PP follows, then the word is made a noun. If the word in the second buffer is "ngstart", "prep", "adverb" or "pronoun", then it will be used as a verb. Otherwise, it will be used as a noun. This heuristic is very accurate and is used only because the current non-syntactic interpretation is insufficient to model this phenomenon effectively.

In Chapter 4, we saw that the reduced relative garden path case is resolved by a semantic test. In the current parser, the non-syntactic interpretation is only the minimum necessary for the MECHO world [Bundy et al. 1979b]. Hence it is not sufficiently powerful to make the decisions that Crain and Coker have proposed for this. Instead the parser uses a heuristic which says: "one may only have a reduced relative if the sentence has a main verb".
This is exactly the strategy of Bever [Bever 1970] as was explained in Section 3.1.

10.2 Parsing VPs with Semantic Information

We can say:

[406] I told a story.
[407] I told Bill a story.
[408] I told Bill to kiss Mary.
[409] I told Bill that Tom hit Mary.

but not:

[410] *I told a rock.
[411] *I told a story Bill.
[412] *I told Bill the rock.
[413] *I told a story to kiss Mary.
[414] *I told the rock that Tom hit Mary.

The first set of sentences are acceptable because "told" is subcategorised for the following list of NPs and complements:

told NP
told NP NP
told NP VP
told NP S-bar

This subcategorisation is very important for several reasons. It can be used to reject ungrammatical sentences, help to find movement gaps, and assist with handling ambiguity. If a verb does not accept two objects ("kiss" for example), but the sentence has two NPs after the verb, then we know that the sentence would be ungrammatical if the second NP was an indirect object. This is one reason the following sentence is unacceptable:

[415] *The boy kissed the girl a check.

Verb subcategorisation can also assist with finding gaps. If the proper number of obligatory objects is not present and one has a WH-comp that has not found a home, then it can take the place of the missing constituent.

[416] What did Rob give Val?

"Give" requires two objects. Since only one object is present and there is a WH-comp being moved, it can be placed in the gap. This is a simplification of the problem but clearly this subcategorisation is very useful for detecting gaps. It is essential for distinguishing between various types of verbs.

Finally subcategorisation is useful for resolving ambiguities. We have already used it this way in our earlier discussions. We have seen that if a sentence is not subcategorised for a VP and the word "to" occurs after the verb, it cannot be an auxiliary verb. Similarly for "that". If it occurs after a NP in the VP and the verb will not accept "that" complements,
we know it is not a complementiser. We can use subcategorisation for detecting relative clauses. If the sentence is:

    [417] I picked up the coin the boy dropped.

We know that "picked up" does not take an indirect object in this case, so "the boy" must be part of a relative clause.

It is also believed [Pulman 1980] that subcategorisation can account for many of the syntactic constraints that have been proposed. It has been suggested [Pulman ibid], [Gazdar 1980] that the Specified Subject and Tensed S constraints are merely side effects of proper verb subcategorisation. Clearly verb subcategorisation is very important and can assist parsing.

So far we have only subcategorised verbs on the basis of syntactic constituents. This is not the whole story. [410-414] are ungrammatical sentences but meet the subcategorisation that we have given. All of these examples are unacceptable for semantic reasons. In [407], the first NP must be a listener and the second NP must be something to be related, such as a story. The reason each of these is ungrammatical is that the NP must also meet certain semantic restrictions. In other words, verbs should be subcategorised semantically as well as syntactically. This is commonly known as "selectional restrictions".

The current parser does not make extensive use of selectional restrictions, which constitutes a major defect. A truly psychologically plausible parser should have selectional restrictions in the rule patterns. In the current parser, these sentences are rejected by the semantic analysis as soon as the rule is run. This gives the same result as having the restriction in the pattern; i.e. the parse fails. In other words, the pattern for "told" could be:

    told [NP.listener]
    told [NP.listener] [NP.story]
    told [NP.animate] [VP.action]
    told [NP.listener] [that, fact]

Investigation of this area is beyond the scope of this thesis, but there is other work in this area that
I feel should be added directly onto the current parser to remedy this deficiency.

10.3 Preference Semantics

10.3.1 Wilks

Wilks [Wilks 1972, 75a, 75b, 78] has proposed a very good system for semantically parsing sentences based on preference. Wilks's system was aimed at machine translation and designed explicitly to handle the problems of word sense. The system did not perform a syntactic analysis, even though it did use some syntactic information. Instead it was based on the concept of "preference semantics", where "preference" is distinguished from "strict requirement".

Wilks used several levels of structure, the lowest being "semantic primitives". He had 80-100 primitives, which formed the basic building blocks of the "semantic formulas". Each formula expressed the meaning of the word with which it was associated. Every sense of each word has a separate semantic formula. These formulas had a rigorously defined syntax. The next level was the "bare template". Each bare template represented a specific "underlying message", having ACTOR, AGENT, OBJECT, etc. positions. These represented skeletal propositions. These bare templates were then expanded to "full templates" by filling these positions in the template with the semantic formulae of the corresponding words. These templates corresponded roughly to simple sentences.

When Wilks's program analysed a sentence, the initial word string was first fragmented into template boundaries. The fragments of the surface text were then expanded into full templates; these gave the possible combinations between the word-sense formulae and the surface strings. Links were then established between the main elements of the templates and their possible dependents. For example in "John drinks water", "drink" expects an animate subject and a fluid object. Likewise for "sly fox": "sly" can modify only animate entities. The bare template for "drinks" would be [*pot cause *ent].
(*pot is potential actor and *ent is an entity.) This bare template would match the above sentence. The template having the most links is the preferred interpretation. For example [Wilks 1975]:

    [418] The policeman interrogated the crook.

The formula for "interrogate" defined the preference: *pot force *ent. There were two possible templates (MAN FORCE MAN) and (MAN FORCE THING) produced by this sentence. The primitive "*ent" could be associated with "MAN", but not "THING", so the template (MAN FORCE THING) was eliminated. In this way, the meaning of "crook" as a "walking stick" was eliminated. It can be seen that this is very similar to selectional restrictions.

It is this part of Wilks's work that is of interest here. This approach gives a semantic representation for the sentence and at the same time enforces selectional restriction and handles word sense ambiguity. Wilks used syntactic information only in his fragmentation routine to split the fragments. I suggest that ROBIE could find the fragments and words so that a modified version of his templates could then be used to remove extra word senses and handle selectional restrictions.

10.3.2 Boguraev

Boguraev [Boguraev 1979] used Wilks's theories as the starting point in his work, but added syntax and several other changes. His system was designed to paraphrase English sentences to demonstrate its understanding of word sense ambiguity. In this section we are only concerned with the analysis part of his program.

Boguraev's semantics look very similar to Wilks's semantics and many of the same levels of semantic structure are identifiable. Examples of Boguraev's definitions for "grasp" and "crook" are: [Boguraev 1979, p.
3.3]

    grasp1 ("grasp the block"):
        ((*ani subj) ((*physob obje) (((this (man part)) inst) (touch sense))))
    grasp2 ("grasp the idea"):
        ((*hum subj) ((sign obje) (true think)))
    crook1 ("person"):
        ((((notgood act) obje) do) (subj man))
    crook2 ("object"):
        ((((((this beast) obje) force) (subj man)) poss) (line thing))

An explanation of the details of this diagram would be extremely complex. For each possible word sense, Boguraev had a "semantic formula" like those above. The process of disambiguation was one of choosing the correct dictionary entry.

We will use an example from [Boguraev 1979, p. 3.24] to illustrate his method of disambiguation. Consider the sentences:

    [419] John asked Mary a question.
    [420] John asked Mary to come with him.
    [421] John asked Mary for the book.

The dictionary entries for "ask" are:

    ask1 (inquire):
        ((man subj) ((*ani obje) ask))
    ask2 (want):
        ((man subj) ((act obje) ((((man (please feel)) cause) goal) ask)))
    ask3 (request):
        ((man subj) ((*ent obje) ((*hum from) want)))

It can be seen that each sentence above fits a different definition of "ask". By matching the bare templates of the words with the dictionary definition of the verb, it is possible to arrive at a single meaning of the sentence. If two meanings are possible, then both templates would be matched and both meanings returned.

Boguraev used an ATN-style parser to recognise syntactically well-formed constituents and coordinate the semantic routines which constructed semantic representations for the semantically valid ones. The major addition to Wilks's approach is the addition of "contextual verb frames". This is a semantic pattern that operates on the semantically represented constituents that have been recognised by the ATN. These are patterns as below and, based on the constituents, helped to guide the parse and remove possible templates. For example the frames for "ask" would be: [Boguraev 1979, p. 3.
34]

    frame1: *hum ASK (*hum) (@sign) (ABOUT *ent)
    frame2: *hum ASK *hum TO *do (@act)
    frame3: *hum ASK (*hum) FOR *ent

Here upper case words are keywords ("words as they appear in the text") and brackets denote optionality. These frames are the central part of the parser and guide the parse.

These frames correspond very closely to verb subcategorisation, but with selectional restrictions on the NPs. As stated in the previous section, this is desirable for VPs. Instead of subcategorising verbs as I have done in ROBIE, I feel that it would be beneficial if the system described here were fitted onto it. The rejection of selectional restriction violations could happen just as it does in Boguraev's system. Word sense definitions would be resolved at the semantic level, as Boguraev does in his system. The syntactic portions would remove the part of speech ambiguity and the semantic routine could decide on the sense of the word. Whilst I am not committed to any particular semantic system, the addition of a system such as this shows that it is possible to build appropriate semantic analysis programs.

Conclusion

This work was an investigation into part of the human sentence parsing mechanism (HSPM), where parsing includes syntactic and non-syntactic analysis. It was proposed that the HSPM consists of at least two processors. We called the first processor the syntactic processor, and the second the non-syntactic processor. The syntactic processor is unconscious, deterministic and fast, but limited. The resolution of lexical ambiguity was used as a vehicle to investigate this hypothesis.

We then saw that the two processors could work in parallel during the processing of a normal sentence, with the non-syntactic processor "listening" to the syntactic processor. During processing of some sentences, the syntactic processor, at key points, could ask the non-syntactic processor to make a decision in order to resolve an ambiguity.
These key points occur whenever a situation arises in which the syntactic processor can no longer guarantee a correct analysis. A major focus of this research was the identification of those situations in which people use the non-syntactic processor to assist with the resolution of ambiguity. It was shown that these situations can be correctly predicted using the two buffer lookahead of the syntactic processor.

Only the syntactic processor has been investigated. A major test of the psychological validity of this model was that it failed on precisely those sentences that humans find to be garden paths.

As a starting point, we used Marcus's work on deterministic parsing. The advances reported here were:

- Reaction time experiments were used to provide a non-subjective classification of sentences as garden paths or not. Using this classification it was shown that Marcus's parser would succeed on some garden path sentences and fail on some non-garden path sentences.

- This deficiency can be corrected by the use of non-syntactic information for ambiguities which may lead to a garden path.

- Non-syntactic information is to be used to help resolve an ambiguity when the syntactic processor can no longer guarantee a correct analysis. All other ambiguities are to be resolved on the basis of syntactic information.

- An amended parser, ROBIE, was presented which incorporates these conclusions. It was shown that the situations in which non-syntactic information is to be used can be accurately predicted by ROBIE's two buffer lookahead. ROBIE was shown to be compatible with the psychological evidence currently available on human sentence comprehension.

- ROBIE is computationally and conceptually simpler than Marcus's parser.

Perhaps the most significant result of this thesis is the identification of the situations in which the syntactic and non-syntactic processors interact.
These occur whenever an ambiguity arises which the syntactic processor cannot guarantee to resolve correctly, because of its limitations. This is contrasted with a theory which might suggest that the non-syntactic processor is used only when the syntactic processor has been led astray. Rather than using the non-syntactic processor only when an ambiguity has taken the syntactic processor astray, the non-syntactic processor is used when an ambiguity arises which might lead the syntactic processor astray.

11.1 Summary

We first looked at the types of lexical ambiguity that can lead to garden paths. In the first few chapters, we investigated the garden path prediction of Marcus's PARSIFAL. We saw that it was ina[...]

[...]es a yes-no-question. We again saw that it was impossible to formulate the EST analysis in a two buffer deterministic parser. We discovered that the PSG analysis was compatible with our two buffer lookahead.

These two problems illustrate how the parser mechanism can constrain the grammar. For both of these examples, the EST analysis could not be implemented in a two buffer deterministic parser. These considerations provide a new ground on which to evaluate grammar theories. The type of parsing mechanism needed to parse a particular grammar is very important. In this thesis it is suggested that some grammars can be parsed with a two buffer deterministic parser, whilst others cannot.

Once the grammar is written, the parser is only a simple device that interprets it. Assuming a parser contains only legal rules, then violations of the principles presented here are not possible. The parsing mechanism ensures that there are only legal rules. Hence, whilst the parsing mechanism influences the grammar, the grammar is everything.

11.3 Areas for Future Study

There are still many important problems that need to be investigated and areas that should be fertile for future research. This thesis has considered only the English language.
A very interesting question is: "Is it possible to parse all other languages deterministically?" The answer is not known. It would be very interesting to try to write a grammar using this deterministic parser for several languages and observe how well the principles will hold true.

One must remember that for many languages the linguistic analysis is not necessarily agreed upon. This paper does not consider the exact linguistic analysis of other languages. There are three key points, discussed in this thesis, which may be valid in other languages:

1) the production system type grammar and some form of top-down parsing subcategorisation.

2) the parser structure, that is a stack of partially built constituents and the two lookahead buffers.

3) the timing of non-syntactic interaction at those situations in which the syntactic processor cannot choose the correct alternative because of its limited lookahead.

It would be interesting to design parsers for several languages, whilst obeying these three points.

In Chapter 5, it was suggested that the amount of ambiguity in a language is inversely related to the strictness of word order and complexity of inflections in that language at both the clause and constituent level. This interesting hypothesis could also be investigated.

It would be interesting to write a grammar based on the Phrase Structure Grammar of [Gazdar, Pullman and Sag 1980c]. This may reveal several things. First, whether it is possible to write a more elegant grammar in this way. Second, by using the category features, handling conjunction and WH movement could be much easier. Thirdly, could the packets be eliminated or reduced by the category grammar?

The problems of global ambiguity also need further research and more data needs to be collected. It is clear that context can affect how people deal with global ambiguity, but do people really come up with two readings in a situation
where the non-syntactic processor has no preference and do these situations cause any garden path effect? It was proposed that the non-syntactic interaction theory should apply in this case. This hypothesis needs to be tested as do the "have" theories and the predictions in Section 6.7.

In Section 6.7.2, an experiment was suggested which would test the prediction of the two buffer lookahead as it relates to lexical access. In order to establish the validity of the two buffers, this and similar experiments should be performed.

An adequate technique for WH movement and gapping in a deterministic parser needs to be found. Perhaps this could be tied in with using Phrase Structure Grammar. It is often simple to know that movement has taken place, but what is the best way to move the trace and what is the best way to detect gaps?

In Section 4.6, a method to recover from a garden path situation was outlined. The ideas presented there need to be developed further and the nature of the error recovery component needs to be investigated. Deterministic parsing depends on the existence of some form of error component, making this topic especially important.

Finally, any psychologically plausible system needs to make some comments about handling fragments and ill-formed utterances. I have relied heavily in this paper on the use of grammaticality and rejecting ungrammatical utterances, but people can understand ungrammatical utterances. Little is known about this area and data needs to be collected.

Appendix A: An Annotated Example

To help the reader understand how the parser works, I will go step by step through the parse of "The shy boy has kissed Mary." I will show a "snapshot" of the parser before each rule is applied. These snapshots are taken from an actual parse. In this form of tracing, the features of the words and nodes are not printed, as they would clutter the snapshots.

Each snapshot is taken just before the parser runs the rule mentioned.
These show the Active Node Stack, with "1:" being the bottom, or the Current Active Node. Beneath it are the two buffers, noted by B1: and B2:. The item <open> indicates that the node with which it is associated is still "open", that is, it may not have all its daughters. To the right of each node on the Active Node Stack are the packets that are active when that node is the Current Active Node. On the right of each rule name is the pattern for that rule.

To parse a sentence the user first types "go." The system will then prompt with "Sentence:" The user then types in the sentence as seen below. The packet and rule names have been copied from [Marcus 1980]. These rules are very similar to his original rules.

    I- go.
    Sentence: The shy boy has kissed Mary.

    Packet: CPOOL
    Rule about to run: MARKED-STARTNP        pattern: [ngstart] agree(det)
    Active Node Stack:
    1: <open> S                              [SS-START,CPOOL]
    B1: the
    B2: shy

Initially the parser generates an S node and activates the packets SS-START and CPOOL. The buffers are the next two nodes from the bottom of the Active Node Stack, in this case the next two words. The first rule to match is the rule MARKED-STARTNP. This matches the feature "determiner" in the first buffer and will create a new NP node.

    Packet: PARSE-DET
    Rule about to run: DETERMINER            pattern: [det]
    Active Node Stack:
    2: <open> S                              [SS-START,CPOOL]
    1: NP                                    [PARSE-DET,NPOOL]
    B1: the
    B2: shy

One can now see that the NP node is at the bottom of the Active Node Stack. The rule MARKED-STARTNP created the NP node. The rule about to run will attach the word "the" as the determiner. The packet NPOOL contains rules to locate number phrases inside the NP.

    Packet: PARSE-QP-2
    Rule about to run: DETERMINER-DONE       pattern: []
    Active Node Stack:
    2: S                                     [SS-START,CPOOL]
    1: NP det-the                            [PARSE-QP-2,NPOOL]
    B1: shy
    B2: boy

The parser now deactivates the packet PARSE-QP-2 which contains rules for various pre-noun modifiers.
Since there are none, a default rule runs, deactivating this packet and activating the packet PARSE-ADJ.

    Packet: PARSE-ADJ
    Rule about to run: ADJECTIVE             pattern: [adj]
    Active Node Stack:
    2: S                                     [SS-START,CPOOL]
    1: NP det-the                            [PARSE-ADJ,NPOOL]
    B1: shy
    B2: boy

The rule ADJECTIVE in the packet PARSE-ADJ has now matched, and will attach the word "shy" to the NP being built.

    Packet: PARSE-ADJ
    Rule about to run: ADJ-DONE              pattern: []
    Active Node Stack:
    2: S                                     [SS-START,CPOOL]
    1: <open> NP det-the adj-shy             [PARSE-ADJ,NPOOL]
    B1: boy
    B2: has

The adjective has been attached and the default rule in PARSE-ADJ will deactivate the packet PARSE-ADJ and activate the packet for parsing nouns. If there were more adjectives, they would have matched the above state since the rule was still active.

    Packet: PARSE-NOUN
    Rule about to run: NOUN                  pattern: [noun]
    Active Node Stack:
    2: S                                     [SS-START,CPOOL]
    1: <open> NP det-the adj-shy             [PARSE-NOUN,NPOOL]
    B1: boy
    B2: has

The rule NOUN in PARSE-NOUN has now matched the [noun] feature of "boy" and it will be attached to the NP. If the noun could be any other parts of speech, they would no longer be carried forward.

    Packet: NP-COMPLETE
    Rule about to run: NP-DONE               pattern: []
    Active Node Stack:
    2: S                                     [SS-START,CPOOL]
    1: NP det-the adj-shy noun-boy           [NP-COMPLETE,NPOOL]
    B1: has
    B2: kissed

At this point the NP is completed, and the packet NP-COMPLETE is active. This contains rules to find any post-modifiers for the NP such as a PP or a relative clause. Since there are no modifiers, the packet is deactivated by the rule NP-DONE and the finished NP is dropped into the first buffer.

    Packet: SS-START
    Rule about to run: MAJOR-DECL-SENTENCE   pattern: [np][verb]
    Active Node Stack:
    1: <open> S                              [SS-START,CPOOL]
    B1: <open> NP det-the adj-shy noun-boy
    B2: has

The S node is now the Current Active Node, with the NP in the first buffer. The next rule to match has the pattern [np][verb].
The parser can now decide that this is a declarative sentence. It adds the features "decl-s, major" to the S node, deactivates the SS-START packet and activates the packet to parse the subject.

    Packet: PARSE-SUBJ
    Rule about to run: UNMARKED-ORDER        pattern: [np][verb]
    Active Node Stack:
    1: <open> S                              [PARSE-SUBJ,CPOOL]
    B1: <open> NP det-the adj-shy noun-boy
    B2: has

The packet PARSE-SUBJ will now attach the NP as the subject of the current S node and activate the packet for parsing the auxiliary.

    Packet: PARSE-AUX
    Rule about to run: START-AUX             pattern: [verb]
    Active Node Stack:
    1: <open> S                              [PARSE-AUX,CPOOL]
         NP det-the adj-shy noun-boy
    B1: has
    B2: kissed

The rule START-AUX in PARSE-AUX will match with the feature [verb], and start a new AUX node. This also sets the tense of the sentence and will automatically transfer the tense and number from the auxiliary verb to the AUX node. The S node is no longer the Current Active Node, so the rules in the packets associated with it cannot match.

    Packet: BUILD-AUX
    Rule about to run: PERFECTIVE            pattern: [have][en]
    Active Node Stack:
    2: S                                     [PARSE-AUX,CPOOL]
         NP det-the adj-shy noun-boy
    1: AUX                                   [BUILD-AUX]
    B1: has
    B2: kissed

The rule PERFECTIVE now runs. This has the pattern [have][en], which matches the current state of the two buffers. It will then attach "has" to the AUX node.

    Packet: BUILD-AUX
    Rule about to run: AUX-COMPLETE          pattern: []
    Active Node Stack:
    2: S                                     [PARSE-AUX,CPOOL]
         NP det-the adj-shy noun-boy
    1: <open> AUX has                        [BUILD-AUX]
    B1: kissed
    B2: Mary

The AUX node is now completed, and no other rule in BUILD-AUX will match, so the default rule AUX-COMPLETE matches. This rule will "move" the AUX node from the bottom of
the Active Node Stack to the first buffer, signifying that it is finished.

    Packet: PARSE-AUX
    Rule about to run: AUX-ATTACH            pattern: [aux]
    Active Node Stack:
    1: <open> S                              [PARSE-AUX,CPOOL]
         NP det-the adj-shy noun-boy
    B1: <open> AUX has
    B2: kissed

One can see that the buffers have automatically shifted so that they are the first two items after the bottom of the Active Node Stack. The rule AUX-ATTACH will now attach the AUX node to the S node which is active. Note that the AUX node was closed when it was attached. The rule also activates the packet to start the verb phrase.

    Packet: PARSE-VP
    Rule about to run: MAIN-VERB             pattern: [verb]
    Active Node Stack:
    1: S                                     [PARSE-VP,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
    B1: kissed
    B2: Mary

The rule MAIN-VERB will now match the pattern [verb] in the first buffer. This will start a new VP node, activate the appropriate packets and attach the verb as the main verb. The packets activated are for finding various complements such as a toVP or an S-.

    Packet: CPOOL
    Rule about to run: PROPNAME              pattern: [name]
    Active Node Stack:
    2: <open> S                              [SS-FINAL,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
    1: <open> VP verb-kissed                 [SS-VP,CPOOL]
    B1: Mary
    B2: .

At this time, the feature [name] will match the rule PROPNAME in the packet CPOOL. This packet has been active all the time and handles clause level items. It is very important to have several packets active at once in this way. This rule will make "Mary" into a NP in the first buffer.

    Packet: SS-VP
    Rule about to run: OBJECTS               pattern: [np]
    Active Node Stack:
    2: <open> S                              [SS-FINAL,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
    1: <open> VP verb-kissed                 [SS-VP,CPOOL]
    B1: <open> NP name-Mary
    B2: .

The rule OBJECTS will now match the NP in B1 and attach it as the object of the VP.
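The step just described (a rule's pattern is tested against the buffers, and the first rule that matches acts on the Active Node Stack) can be sketched in Python. This is only a minimal illustration under stated assumptions, not ROBIE's implementation: the names `Node`, `Rule`, `matches` and `step`, and the feature strings, are hypothetical. The stack is held as a list whose last element is the Current Active Node, and packet activation is reduced to a fixed rule list with the default rule placed last.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Node:
    label: str                                  # e.g. "S", "NP", or a word
    features: Set[str] = field(default_factory=set)
    children: List["Node"] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    pattern: List[str]                          # one feature per buffer cell
    action: Callable[[List[Node], List[Node]], None]


def matches(rule: Rule, buffers: List[Node]) -> bool:
    """A rule matches when each pattern feature is found in the
    corresponding cell of the two buffer lookahead window."""
    window = buffers[:2]
    if len(rule.pattern) > len(window):
        return False
    return all(feat in window[i].features
               for i, feat in enumerate(rule.pattern))


def step(stack: List[Node], buffers: List[Node],
         rules: List[Rule]) -> Optional[str]:
    """Run the first matching rule; return its name, or None."""
    for rule in rules:
        if matches(rule, buffers):
            rule.action(stack, buffers)
            return rule.name
    return None


def attach_first_buffer(stack: List[Node], buffers: List[Node]) -> None:
    # Attach the node in B1 to the Current Active Node.
    stack[-1].children.append(buffers.pop(0))


def drop_current_node(stack: List[Node], buffers: List[Node]) -> None:
    # Drop the finished Current Active Node into the first buffer.
    buffers.insert(0, stack.pop())


# Hypothetical stand-in for the SS-VP packet: OBJECTS, then the default rule.
SS_VP = [
    Rule("OBJECTS", ["np"], attach_first_buffer),
    Rule("VP-DONE", [], drop_current_node),     # empty pattern always matches
]


def demo():
    s = Node("S", {"s"})
    vp = Node("VP", {"vp"}, [Node("kissed", {"verb"})])
    stack = [s, vp]
    buffers = [Node("NP", {"np", "name"}, [Node("Mary", {"name"})]),
               Node(".", {"finalpunc"})]
    fired = [step(stack, buffers, SS_VP), step(stack, buffers, SS_VP)]
    return fired, stack, buffers
```

Calling `demo()` starts from the state in the snapshot above. The first step attaches the NP to the VP, and on the second step only the default rule's empty pattern matches the final punctuation, so the VP is dropped into the first buffer.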
    Packet: SS-VP
    Rule about to run: VP-DONE               pattern: []
    Active Node Stack:
    2: S                                     [SS-FINAL,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
    1: VP verb-kissed                        [SS-VP,CPOOL]
         NP name-Mary
    B1: .
    B2:

No rule in SS-VP or CPOOL matches on the final punctuation, so the default rule VP-DONE will run and drop the VP into the first buffer.

    Packet: CPOOL
    Rule about to run: VP-ATTACH             pattern: [vp]
    Active Node Stack:
    1: <open> S                              [SS-FINAL,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
    B1: <open> VP verb-kissed
         NP name-Mary
    B2: .

The rule VP-ATTACH will attach the VP to the S node, which is again the active node, and the parser will be in the state below.

    Packet: SS-FINAL
    Rule about to run: S-DONE                pattern: [finalpunc]
    Active Node Stack:
    1: <open> S                              [SS-FINAL,CPOOL]
         NP det-the adj-shy noun-boy
         AUX-has
         VP verb-kissed
            NP name-Mary
    B1: .
    B2:

The rule S-DONE in SS-FINAL now matches the final punctuation and terminates the parse. The parse tree is then printed:

    S-1 [s,major,decl]
      NP-1 [np,def,ns,n3p]
        DET THE [det,def,ns,n3p]
        ADJ SHY [adj]
        NOUN BOY [noun,ns,n3p]
      AUX-1 [aux,past,v3s]
        AUXVERB HAS [auxverb,past,v3s,verb]
      VP-1
        VERB KISSED [verb,en,past,vspl]
        NP-2 [np,name,ns,n3p]
          NAME MARY [name,propnoun,ns,n3p]

Appendix B: Some Example Sentences

The following examples show some of the linguistic coverage of the parser. All of these can be successfully parsed by the parser. In addition, the sentences in Section 4.8 illustrate further examples that can be successfully parsed by the parser. The movement stack, as described in Section 8.5.3, is used with each sentence that involves auxiliary inversion or relative clauses with moved objects. The sentences with conjunction are the only sentences which require three buffers to process. Sentences which require these extensions are given in the second half of each section. This division is based on the actual execution of the full parser.
All other sentences can be parsed using only the two buffers and no additional mechanisms.

Examples by Milne

    Tom found her.
    Tom found her dog.
    The block will block.
    What block hit her.
    What hit her.
    Which boy kissed her.
    Which men kissed her?
    The boy which kissed her kissed Mary.
    In the park, tom hit mary.
    I believe you want to kiss me.
    I want you.
    I want you to leave.
    I want you to kiss me.
    I want for you to leave.
    I saw tom.
    I saw tom hit mary.
    The trash can be taken out.
    The trash can was taken out.
    The paper will be destroyed.
    The paper will was destroyed.
    Jack is 4 times heavier than Mary. If they gain 20 pounds Jack will be 3 times heavier than Mary.

Examples by Milne with Movement

    The boys will be leaving next week.
    Was the boy's mother's block broken by Tom?
    Have the boys taken the exam?
    Is the last boy running down the street?
    The boy which I saw kissed her.
    who kissed Rob?
    Tom hit the boy and the girl.
    Tom hit Mary and kissed Sue.
    Rob hit the girl on the hill and in the park.
    Jack is 10 years old. Jill is 2 years younger than Jack. How old is Jill?
    Jack is 4 times older than Mary. In 5 years Jack will be 3 times older than Mary. How old is Jack?

PP ATTACHMENT using Semantic Markers

    A mass is connected to a spring with a constant of 5 lbs/ft.
    A mass is connected to a spring with a light string.
    The particle hit the ball with velocity 5 ft/sec.
    The particle hit the wall with velocity 5 ft/sec.
    The boy in the park on the hill saw Mary.
    The mass of the particle of mass 3 lbs is falling.

13.1 Examples from Marcus

The following examples are from [Marcus 1980]. These are the examples his parser could do which ROBIE can as well.

    I told that boy that boys should do it.
    The pencil seems broken.
    There seems to be a pencil broken.
    I wanted John to do it.
    I want to do it.
    I persuaded John to do it.
    Schedule a meeting for friday.
    A meeting seems to have been scheduled for friday.
    I told the boy that I saw sue.
    I told sue you would schedule a meeting.
    I told the girl that you would schedule the meeting.
    The boy who wanted to meet you scheduled the meeting.
    The boy who met you scheduled a meeting.
    I promised John to do it.
    You promised to give the book to John.

Examples from Marcus involving Movement

    The boy who you met scheduled the meeting.
    Who did John see?
    Who broke the pencil?
    What did Rob give Sue?
    Who did Rob give the book?
    What did Rob give Sue?
    What did Rob give to Sue?
    There seems to have been a meeting scheduled for friday.
    Is there a meeting scheduled for friday?
    Does there seem to be a meeting scheduled for friday?
    Who did you say that Tom told?
    What did you give Sue yesterday?
    Who did you give the book yesterday?

13.2 Examples from Church

These examples are from [Church 1980] and show which of the examples his parser can parse which ROBIE can parse as well.

    It seems likely that John would be sitting.
    There seems to be a block in the table.
    That I might take a ball seems likely.
    For me to take a ball seems nice.
    To take a ball seems nice.
    I wonder what to do?
    I wonder what I should do?
    I wonder what should have been done?
    I know a man that was nice.
    I know that was nice.
    I know that that was nice.
    I know that boys are nice.
    I know that boy is nice.
    I know that he is nice.
    That he is nice is a fact.
    That that boy is nice is a fact.
    That that is nice is a fact.
    That that b[...]

[...]ject require the "movement stack". In addition, many of these sentences require grammar rules that were not discussed in the thesis. These additional rules did not require any additional mechanisms beyond the disambiguating heuristics described in Section 8.6 and the PP attachment tests described in Section 10.1.

A hammer of mass 2 kg travelling at 15 ms-1 is brought to rest when it strikes a nail. What impulse acts on the hammer? (from Bostock and Chandler 1975)

A small object
of weight 10 N rests in equilibrium on a rough plane inclined at 30 degrees to the horizontal. Calculate the magnitude of the frictional force. (from Bostock and Chandler 1975)

A stone is dropped from a cliff 100 m above the sea. Find the speed with which it hits the sea. (from Bostock and Chandler 1975)

A ball is thrown vertically upward to a height of 10 m. Find the time taken to reach this height and the initial speed of the ball. (from Bostock and Chandler 1975)

A stone is projected vertically upward with a speed of 21 ms-1. Find the distance travelled by the stone in the first 3 s of its motion. (from Bostock and Chandler 1975)

A ball is thrown vertically upward with a speed of 15 ms-1 from a point which is 1 m above ground level. Find the speed with which the ball hits the ground. (from Bostock and Chandler 1975)

A stone is dropped from the top of a tower. In the last second of its motion it falls through a distance which is 1/5 of the height of the tower. Find the height of the tower. (from Bostock and Chandler 1975)

A stone is dropped from the top of a building aJ m high. A second stone is dropped from a point half-way up the same building. Find the time that should elapse between the release of the two stones if they are to reach the ground at the same time. (from Bostock and Chandler 1975)

A particle which is moving in a straight line with constant acceleration takes 3 s and 5 s to cover two successive distances of 1 m. Find the acceleration. (from Bostock and Chandler 1975)

A lever 10 ft long is pinned at its left end. The lever is supported by a spring with a constant of 40 lb/ft. The spring is attached 6 ft from the left end of the lever. A weight of 20 lb is attached at the other end of the lever. The weight of the lever is 8 lb. How much is the spring stretched?

Where must a weight be hung on a pole, of negligible weight, so that the boy at one end supports 1/3 as much as the man at the other end?
(from Novak 1976)

A scaffold 10 ft long is supported by ropes attached at each end. The scaffold weighs 100 lb. One painter weighing 150 lb stands on the scaffold 4 ft from one end, while a second painter weighing 175 lb stands on the scaffold 2 ft from the other end. What is the tension on each of the ropes supporting the scaffold? (from Novak 1976)

A horizontal uniform bar 10 m long is supported by two ropes attached at its ends. The rope on the left end makes an angle of 45 degrees with the horizontal, while the rope on the right end makes an angle of 60 degrees with the horizontal. A weight of 100 nt is attached 2 m from the right end of the bar. What is the weight of the bar?

A uniform scaffold 12 ft long and weighing 100 lb is supported horizontally by two vertical ropes hung from its ends. Find the tension in each rope when a 180 lb painter stands 4 ft from one end. (from Novak 1976)

A uniform bar B-C is 100 cm long and weighs 50 lb. The bar is to be supported at ends B and C. An upward force of 40 lb is applied [...] cm from B. Compute the forces on the supports. (from Novak 1976)

A uniform pole 20 ft long and weighing 30 lb is supported by a boy 3 ft from one end and a man 6 ft from the other end. At what point must a 150 lb weight be attached so that the man supports twice as much as the boy? (from Novak 1976)

The hinges of a door weighing 20 lb are 12 ft apart, and the door is 3 ft wide. The weight of the door is supported by the upper hinge. Determine the forces exerted on the door at the hinges. (from Novak 1976)

A bridge is 80 ft long. What force must the pier at each end of the bridge exert to support an automobile weighing 2 tons which is 30 ft from one end of the bridge? (from Novak 1976)

A gun has a maximum range of 200 m on the horizontal. Find the velocity of a shell as it leaves the muzzle of the gun. (from Bostock and Chandler 1975)

The greatest range of a particle, with a given velocity of projection, on a horizontal plane is 30[...] metres.
Find the greatest range up a plane inclined at 30 degrees to the horizontal. (adapted from Humphrey 1930)

Two particles of mass B and C are connected by a light string passing over a smooth pulley. Find the acceleration of the particle of mass B.

A particle of mass 4 kg rests on a smooth horizontal table. It is connected by a light inextensible string passing over a smooth pulley at the edge of the table to a particle of mass 2 kg, which is hanging freely. Find the acceleration of the system and the tension in the string. (from Bostock and Chandler 1975)

A particle of mass 5 kg rests on a rough horizontal table. It is connected by a light inextensible string passing over a smooth pulley at the edge of the table to a particle of mass 6 kg, which is hanging freely. The coefficient of friction between the 5 kg mass and the table is 1/3. Find the acceleration of the system and the tension in the string. (from Bostock and Chandler 1975)

Two particles of mass 3 kg and 4 kg are connected by a light inextensible string passing over a smooth fixed pulley. The system is released from rest with the string taut and both particles at a height of 2 m above the ground. Find the velocity of the 3 kg mass when the 4 kg mass reaches the ground. (from Bostock and Chandler 1975)

Two particles of mass 3 kg and 5 kg are connected by a light inextensible string passing over a smooth pulley which is fixed to the ceiling of a lift. Find the tension in the string when the system is moving freely, and the lift has a downward acceleration G ms-2. (from Bostock and Chandler 1975)

The driver of a car travelling due East on a straight road at 40 kmh-1 is watching a train moving due North at 75 kmh-1. What is the apparent speed and direction of motion of the train? (from Bostock and Chandler 1975)

A particle of mass M1 is suspended from the end of a spring of length L1 and elasticity E. A second spring with length L2 and elasticity 2E is attached to the first particle,
and another particle of mass M2 is suspended from the second spring. Find the extension of each spring.

A light elastic string of unstretched length A and modulus of elasticity W is fixed at one end to a point on the ceiling of a room. To the other end of the string is attached a particle of weight W. A horizontal force P is applied to the particle and in equilibrium it is found that the string is stretched to three times its natural length. Calculate the angle the string makes with the horizontal, and the value of P in terms of W. (A-level exam (part): U of L)

A block of mass 500 kg is raised a height of 10 m by a crane. Find the work done by the crane against gravity. (from Bos[...]

[...] from [Martin, Church, and Patil 1981]. This analysis showed no violations of importance to these techniques. As the grammar and dictionaries are not large enough to handle all the sentences found in these texts, I have hand-simulated the rules. This hand simulation is very accurate, as careful thought goes into all marginal cases.

For each word discussed in Chapter 5, I will list the number of occurrences in the texts and any problems. This includes "to", "for", "have", "what", "which", and "that".

These texts are all fairly large. There are 276 sentences in the MECHO problems with an average number of words per sentence of 14. This makes about 4000 words total. The ASHOK corpus is larger than this. The first TIME article was about 5000 words and the second TIME article contained 4000 words.

14.1 HAVE

The five sources were checked for violations of our rule for "have" as a yes-no-question versus an imperative. There are no sentence initial examples of "have" in the MECHO sentences. There are three sentence initial examples of "have" in the ASHOK corpus. All three are yes-no-questions. In the Scotsman and both TIME articles, there were no sentence initial usages of "have". One can see that the use of "have" for yes-no-questions is rare and the use of
it as an imperative is extremely rare. It seems that the problem we have concerned ourselves with does not occur in free text.

14.2 TO

The five sources were checked for conflicts between "to" as an auxverb and "to" as a preposition. In over 300 occurrences of the word "to", there were no violations. The following table shows the occurrences of the word "to".

Source          Number of Occurrences   Errors
Scotsman        96                      none
ASHOK           34                      none
TIME [81]       55                      none
TIME2 [81b]     102                     none
MECHO           80                      none

14.3 FOR

The sources were checked for conflicts between "for" as a preposition and "for" as a complementiser. All occurrences of "for" in the MECHO corpus were as a preposition. The Scotsman had 16 occurrences of "for". The one possible exception is: "An act was hurried through Parliament suspending a 10 day rule for cases to be brought to trial." But this is a nominal modifier problem.

There were a surprising 224 occurrences of "for" in the ASHOK corpus. All were used as a preposition. The one interesting one is: "List actual and budgeted unit [...] for product 1 for 63 to 73." A rule that said [for PP][to] -> S- would be wrong here.

TIME contained [...] occurrences of "for"; all were disambiguated correctly. In TIME2, there were 33 occurrences of "for". Several of these had the pattern "for" ... "to", but "for" was not a complementiser. This includes the following examples:

- the Spacelab, a self-contained scientific compartment for up to four ...
- placed atop a modified Boeing 747 for a slow return ... to Cape Canaveral ...
- So far the space agency has been unable to scratch up the money for the opportunity to intercept this visitor from the deep space.
- is a bit too risky for most corporate chiefs to contemplate.

Like the "have" problem, the issue we have discussed rarely arises in free text. There were more occurrences of the pattern [for][np][to] where it was not to be an embedded sentence than where it was supposed to be.
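As a quick check on the figures in section 14.2, the per-source counts of "to" can be totalled. The fragment below is only an illustrative sketch: the name `to_counts` and the dictionary layout are assumptions made for this example, and the numbers are simply those transcribed from the table above.

```python
# Occurrences of "to" per source, transcribed from the table in section 14.2.
# The name `to_counts` is illustrative only and is not part of ROBIE.
to_counts = {"Scotsman": 96, "ASHOK": 34, "TIME": 55, "TIME2": 102, "MECHO": 80}

total = sum(to_counts.values())
print(total)  # 367
```

A total of 367 is consistent with the statement that over 300 occurrences were examined.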
14.4 WHICH

The sources were checked for conflicts between "which" as a determiner and as a relative pronoun. In the Scotsman, there were two occurrences of "which", both of which were used to start relative clauses and hence disambiguated correctly. All occurrences of "which" in the MECHO sentences were as determiners. In the ASHOK corpus, there were 11 occurrences of "which". Eight of these were sentence initial. There were no errors among all of these, including: "At plant 2, which product accounted for the lowest percentage of total sales in dollars?" and "In 1972, which product or products has largest variances?" In TIME and TIME2 there were no violations.

14.5 WHAT

The sources were checked for "what" as a determiner versus a relative pronoun. There were no occurrences of "what" in the Scotsman. In the MECHO sentences there were no violations. All occurrences were of the form: what force ... or what is ... All occurrences in the ASHOK corpus were correct, as were all occurrences in TIME and TIME2.

14.6 THAT

The word "that" can be used as four parts of speech: determiner, complementiser, pronoun and relative pronoun. All the sources were checked for mistakes in handling this. In the ASHOK corpus there were surprisingly only 3 occurrences. None of these led to an error.

In the MECHO sentences, there were 16 occurrences of "that". It was used as a complementiser with "so that" 9 times and as a complementiser 5 other times. The most difficult sentence:

Find the time that should elapse between the release of the two strings if they are to hit the ground at the same time.

In TIME, there were 28 occurrences of "that". Three of these were as determiners and 1 as a pronoun. These were handled correctly. There were 13 uses as a complementiser and 11 uses as a relative pronoun. In all the relative pronoun cases, the subject was missing from the relative clause.

In TIME2 there were 33 occurrences.
There was only one sentence initial example, and it was used as a pronoun and correctly handled by the rules. The word "that" was used as a determiner 4 times, and as a pronoun 4 times. These uses were resolved correctly. It was used as a complementiser 10 times. None of these uses were after a headnoun, so no errors resulted. Finally, "that" was used as a relative pronoun 15 times. All these uses came after a headnoun.

It is very significant that each relative clause had the subject missing. There were no occurrences with the object of the relative clause missing. This means that the hard problem of telling a complementiser from a relative pronoun did not happen at all!

14.7 Noun/Verb Ambiguity

To test the coverage of the rules presented in this thesis for disambiguating words which can be both a noun and a verb, I checked [TIME 1978]. In the lead article, there were 315 sentences, containing 213 examples of this ambiguity. There was only one error on these examples: "Kids his friend". The use of noun/verb words as verbs was almost always determined by the morphology. The next most common use of noun/verb words was as nouns, and many of these were done by word order. This leaves only a small number of cases where the word is a verb with no morphological changes, but these were also handled correctly.

Appendix D: The Annotated Grammar

The following is the grammar for ROBIE. The details of PROLOG have been left out and instead, for each rule, the pattern which must be present before the rule can run and the state of the parser after the rule has run will be given. The reader may find it useful to compare the rules presented here with those used in Appendix A. This appendix reflects the grammar as of May 1981.

A few notes on the notation used in this appendix. In the rule patterns, a '&' means "and", a '#' means "or". [X] means a buffer containing X. [X-Y] means Y is attached to X.
This corresponds to the results of the command "attach Y to X." [X(a,b)] means a buffer with the features a and b. This corresponds to the command "add the features 'a' and 'b' to X." "next" means the buffers are shifted to the next word. Remember the buffers are always the next two items after the Active Node Stack.

We have seen the rule DETERMINER in Chapter 8. Before we see the grammar as a whole, the correspondence between the rule and the diagrams used in this appendix will be explained.

Rule DETERMINER in packet PARSE_DET:
To analyse a det, if you have the feature "det" in the first buffer then:
1) attach the first Buffer to the Bottom of the Active Node Stack as a determiner.
2) tell the semantic processor you have a determiner.
3) deactivate the packet containing this rule, PARSE_DET.
4) activate the packet PARSE_QP-2.
Recursively call the rule matcher.

This rule corresponds to the following diagram:

Rule: determiner    Priority: 10
Active Node Stack: [NP] {PARSE_DET,X} => [NP-det] {PARSE_QP-2,X}
Buffers: [det] => next

The first buffer contains the feature "det", indicated by the symbol: [det]. C, the bottom of the Active Node Stack, is an NP node. The result of the attach function is indicated by the symbols: [NP-det], meaning the determiner has been attached to the NP. The packet PARSE_DET has been replaced with the packet PARSE_QP-2. The call to the semantic processor will not be illustrated in these diagrams but happens in every rule.

PACKET: SS_START

This packet is active at the very start of the sentence and deactivated once the subject has been located. This packet finds sentence initial modifiers and determines the type of the sentence. This packet assumes that C (The Current Active Node) is a S.
BEFORE: AFTER:

Rule: if_what    Priority: 5
Active Node Stack: [S] {SS_START,X} => [S] {SS_START,X} [S1-binder] {CPOOL,SS_START}
Buffers: [binder] => next
This rule is used to parse sentences of the form: If the mass is 3 lbs, what is the acceleration?

Rule: wh_quest    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(major,wh_quest)-wh&(np#pp#ap)] {PARSE_SUBJ}
Buffers: [wh&(np#pp#ap)] => next
This rule handles wh questions and may insert a trace if B2 contains a verb.

Rule: major_decl_s    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(decl,major)] {PARSE_SUBJ}
Buffers: [np][verb] => [np][verb]

Rule: adverb    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S-adverb] {SS_START,X}
Buffers: [adverb][ngstart] => [ngstart]

Rule: aux_invert    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(ynquest,major)] {PARSE_SUBJ}
Buffers: [auxverb][ngstart] => [ngstart] next
This rule replaces the old and incorrect yes-no-question rule. The auxverb is pushed onto the movement stack and will be placed at the start of the auxiliary. The movement stack is explained in Section 8.5.3.

Rule: np_pp_default    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S] {SS_START,X}
Buffers: [np][pp] => [np-pp] next
This rule handles clause initial PPs by attaching them to an initial NP.

Rule: np_utterance    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(utterance,major)-np-finalpunc]
Buffers: [np][finalpunc] => the parse is finished.
Rule: pp_utterance    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(utterance,major)-pp-finalpunc]
Buffers: [pp][fpunc] => the parse is finished

Rule: imperative    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S(imperative,major)] {PARSE_SUBJ}
Buffers: [tenseless] => [you][tenseless]

Rule: fronted_pp    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S-pp] {SS_START,X}
Buffers: [pp] => next

Rule: wh_np    Priority: 10
Active Node Stack: [S] {SS_START,X} => [S] {SS_START,X}
Buffers: [wh] => [np-wh]
This rule ensures that wh question words are dominated by a NP node.

PACKET: CPOOL

This packet is active whenever C is an S node. This packet contains the rules to start constituents such as NPs and PPs and handles clause level modifiers.

BEFORE: AFTER:

Rule: X-and-X    Priority: 5
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [X][conj][X] => [X-X-conj-X]
This rule builds conjoined items into a complex node. The pattern also contains a check to ensure that the Xs are syntactically and sometimes semantically the same.

Rule: poss_det    Priority: 5
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [poss_np] agree(det) => [det-poss_np]
This rule converts possessive NPs to determiners.

Rule: so_that    Priority: [...]
Active Node Stack: [...] => [S] {CPOOL,X}
Buffers: [...] => [comp-compadv-that][,]
This rule is to handle phrases such as "so that" and "such that". The comma is picked up by a rule in SS_FINAL.

Rule: propname    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X} [NP(name)] {BUILD_NAME}
Buffers: [... not(np)] => next

Rule: propnoun    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [propnoun] => [NP-propnoun]

Rule: pp    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X} [PP] {PARSE_PP,CPOOL}
Buffers: [prep][ngstart] => [prep][ngstart]

Rule: marked_start_np    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X} [NP] {PARSE_DET,NPOOL}
Buffers: [det] agree(det) => [det]

Rule: start_np    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X} [NP] {PARSE_QP-1,NPOOL}
Buffers: [ngstart&not(pronoun#det)] => [ngstart]
This rule starts NPs which do not have a determiner. This pattern is needed for historical reasons.

Rule: comparative    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [than_comp][np] => [than_comp-np]
This rule handles phrases such as "3 ft longer than the rod".

Rule: and    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X} [conj(andc)] {CPOOL,PARSE_VP,PARSE_CONJ}
Buffers: [conj] => next

Rule: comp_to_np    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [comp_s] => [NP-comp_s]

Rule: np_pp    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S-pp] {CPOOL,X}
Buffers: [pp] sem chk(pp) => next

Rule: pronoun    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [pronoun] => [NP-pronoun]

Rule: vp_attach    Priority: 10
Active Node Stack: [S] {CPOOL,X} => [S-vp] {CPOOL,X}
Buffers: [vp] => next

Rule: poss_np    Priority: 15
Active Node Stack: [S] {CPOOL,X} => [S] {CPOOL,X}
Buffers: [... possessive] => [... possessive]

PACKET: NPOOL

This packet is active whenever C is a NP and contains rules to handle number phrases and other pre-nominal modifiers.
BEFORE: AFTER:

Rule: qp_and_quant    Priority: 5
Active Node Stack: [NP] {NPOOL,X} => [NP] {NPOOL,X} [QP-qp-conj]
Buffers: [qp][conj][quant] => [quant]
This is a very old rule and handles phrases such as "3 lbs and 5 lbs" in a not very d[...]t way.

Rule: longer_than    Priority: 10
Active Node Stack: [NP] {NPOOL,X} => [NP(than_comp)-than-name] {NPOOL,X}
Buffers: [than][name] => next
For phrases such as "3 years older than Tom".

Rule: 3_ft/sec    Priority: 10
Active Node Stack: [NP] {NPOOL,X} => [NP] {NPOOL,X}
Buffers: [qp][units] => [qp-units]

Rule: ft_long    Priority: 10
Active Node Stack: [NP] {NPOOL,X} => [NP] {NPOOL,X}
Buffers: [qp][adj] => [ap-qp-adj]

Rule: noun_qp    Priority: 10
Active Node Stack: [NP] {NPOOL,X} => [NP] {NPOOL,X}
Buffers: [quant] => [qp-quant]

Rule: ap_attach    Priority: 10
Active Node Stack: [NP] {NPOOL,X} => [NP-ap] {NPOOL,X}
Buffers: [ap] => next

Rule: qp_attach    Priority:
Active Node Stack: [NP] {NPOOL,X} => [NP-qp] {NPOOL,X}
Buffers: [qp] => next

PACKET: PARSE_DET

This packet is active at the start of a NP when a determiner is present (see rule "marked_start_np" in CPOOL).

BEFORE: AFTER:

Rule: determiner    Priority: 10
Active Node Stack: [NP] {PARSE_DET,X} => [NP-det] {PARSE_QP-2,X}
Buffers: [det] => next

PACKET: PARSE_QP-1

This packet is active when a NP is started which does not have a determiner (see rule start_np in CPOOL).

BEFORE:
AFTER:

Rule: how_many    Priority: 10
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP] {PARSE_QP-1,X}
Buffers: [how][adj] => [ap(how,relpron)-how-adj]

Rule: quant    Priority: 10
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP] {PARSE_QP-1,X}
Buffers: [quant][adj#num] => [qp-quant][adj#num]

Rule: next_week    Priority: 10
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP] {PARSE_NOUN,X}
Buffers: [ord][noun&time] => [ap-ord][noun&time]

Rule: all_the    Priority: 10
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP] {PARSE_QP-1,X}
Buffers: [all][det&def] => [all][of]

Rule: quantifier    Priority: 10
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP-quantifier] {PARSE_QP-1}
Buffers: [quantifier] => next

Rule: quant_done    Priority: 15
Active Node Stack: [NP] {PARSE_QP-1,X} => [NP] {PARSE_ADJ,X}
Buffers: [t] => [t]

PACKET: PARSE_QP-2

This packet is used when C is a NP and it has a determiner attached.

BEFORE: AFTER:

Rule: det_quant    Priority: 10
Active Node Stack: [NP] {PARSE_QP-2,X} => [NP] {PARSE_QP-2,X}
Buffers: [quant][adj#noun] => [qp-quant][adj#noun]

Rule: ordinal    Priority: 10
Active Node Stack: [NP] {PARSE_QP-2,X} => [NP] {PARSE_ADJ,X}
Buffers: [ord] => [ap-ord]

Rule: det_quant_done    Priority: 15
Active Node Stack: [NP] {PARSE_QP-2,X} => [NP] {PARSE_ADJ,X}
Buffers: [t] => [t]

PACKET: PARSE_ADJ

This packet is active when C is a NP and after the PARSE_QP_x packets, i.e. the adjective is now expected.

BEFORE: AFTER:

Rule: adj_group    Priority: 10
Active Node Stack: [NP] {PARSE_ADJ,X} => [NP-adj] {PARSE_ADJ,X}
Buffers: [adj][adj#noun#dim] => [adj#noun#dim]

Rule: adj_np    Priority: 10
Active Node Stack: [NP] {PARSE_ADJ,X} => [NP(ap)-adj] {NP_COMPLETE}
Buffers: [adj] => next, run np_done, next
This handles NPs which are really only an adjective, for example as in: The ball is red.

Rule: adj    Priority: 15
Active Node Stack: [NP] {PARSE_ADJ,X} => [NP] {PARSE_NOUN,X}
Buffers: [t] => [t]

PACKET: PARSE_NOUN

This is active when C is an NP and the headnoun is now expected (i.e. after all the adjectives have been found).

BEFORE: AFTER:

Rule: complex_noun    Priority: 10
Active Node Stack: [NP] {PARSE_NOUN,X} => [NP-noun] {PARSE_NOUN,X}
Buffers: [noun][noun] agree(complex_noun) => [noun]
This agreement check worries about noun/modal ambiguity as well as semantic considerations.

Rule: nouns    Priority: 10
Active Node Stack: [NP] {PARSE_NOUN,X} => [NP-noun] {NP_COMPLETE,X}
Buffers: [noun,npl] sem chk(nouns) => next
This rule handles the case of "building blocks" and the plural garden paths.

Rule: noun    Priority: 10
Active Node Stack: [NP] {PARSE_NOUN,X} => [NP-noun] {NP_COMPLETE,X}
Buffers: [noun] => next

Rule: np_built    Priority: 15
Active Node Stack: [NP] {PARSE_NOUN,X} => [NP] {NP_COMPLETE,X}
Buffers: [t] => [t]

PACKET: NP_COMPLETE

This packet is active after the NP has a headnoun, and is to find various NP modifiers.

BEFORE: AFTER:

Rule: qp_pp    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X} [PP-qp] {PARSE_PP,CPOOL}
Buffers: [qp][prep] => [prep]
This rule handles phrases such as "3 ft above the sea" by building a compound preposition.

Rule: prep_start    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X} [PP] {PARSE_PP,CPOOL}
Buffers: [prep][ngstart] => [prep][ngstart]

Rule: reduced_relative    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X}
Buffers: [verb,ing] or [verb,ed] sem chk(red_rel) => [wh][verb]
This rule is for reduced relative clauses and garden paths. The "ing" case is handled slightly differently to match the MECHO world.

Rule: rel_attach    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP-relative] {NP_COMPLETE}
Buffers: [relative] => next
Attaches relative clauses to NPs.

Rule: relpron_np    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X}
Buffers: [relpron] => [np(relpron_np)-relpron]

Rule: wh_relative_clause    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X} [S(sec,relative)-relpron_np] {CPOOL,PARSE_SUBJ}
Buffers: [relpron_np] => next or [trace]
May insert a trace if B2 is a verb.

Rule: np_pp    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP-pp] {NP_COMPLETE,X}
Buffers: [pp] sem chk(pp) => next

Rule: and    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X} [conj(andc)] {CPOOL,PARSE_VP,PARSE_CONJ}
Buffers: [conj] => next

Rule: comma    Priority: 10
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X}
Buffers: [comma] => next, run np_done, next

Rule: insert_WH    Priority: 15
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X}
Buffers: [det#ngstart] => [wh][det#ngstart]
This rule is to handle phrases such as "the boy the man knows". It has not been carefully thought out.

Rule: of_pp    Priority: 15
Active Node Stack: [NP] {NP_COMPLETE,X} => [NP] {NP_COMPLETE,X}
Buffers: [of][noun] => [of][noun(ngstart)]

Rule: np_done    Priority: 15
Active Node Stack: [NEXT] {X} [NP] {NP_COMPLETE,Y} => [NEXT] {X}
Buffers: [t] => [NP]

PACKET: PARSE_PP

This packet is to parse PPs and is used whenever C is a PP node.

BEFORE: AFTER:

Rule: attach_prep    Priority: 10
Active Node Stack: [PP] {PARSE_PP,X} => [PP-prep] {PARSE_PP,X}
Buffers: [prep] => next

Rule: attach_np    Priority: 10
Active Node Stack: [NEXT] {X} [PP] {PARSE_PP,Y} => [NEXT] {X}
Buffers: [np] => [PP-np]

Rule: with_which    Priority: 10
Active Node Stack: [PP] {PARSE_PP,X} => [PP(relpron_np)] {PARSE_PP,X}
Buffers: [wh] => [np-wh]

PACKET: PARSE_SUBJ

This packet is used to attach the subject of a sentence. C is an S expecting the subject.

BEFORE: AFTER:

Rule: unmarked_order    Priority: 10
Active Node Stack: [S] {PARSE_SUBJ,X} => [S-np] {PARSE_AUX,X}
Buffers: [np][verb] agree(subj) => [verb]

Rule: aux_inversion    Priority: 10
Active Node Stack: [S] {PARSE_SUBJ,X} => [S] {PARSE_SUBJ,X}
Buffers: [auxverb][np#ngstart] => [np#ngstart]
The auxverb is moved by techniques not explained in the thesis.

PACKET: BUILD_AUX

This packet builds the auxiliary. It assumes C is the AUX node.

BEFORE: AFTER:

Rule: modal    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-modal] {BUILD_AUX,X}
Buffers: [modal][tenseless] => [tenseless]

Rule: perfective    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-have] {BUILD_AUX,X}
Buffers: [have][en] => [en]

Rule: passive_aux    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-be] {BUILD_AUX,X}
Buffers: [be][en] => [en]

Rule: progressive    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-be] {BUILD_AUX,X}
Buffers: [be][ing] => [ing]

Rule: do_support    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-do] {BUILD_AUX,X}
Buffers: [do][tenseless] => [tenseless]

Rule: be_pred    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-be] {BUILD_AUX,X}
Buffers: [be][prep#adj] => [prep#adj]

Rule: negative    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-neg] {BUILD_AUX,X}
Buffers: [neg] => next

Rule: aux_adverb    Priority: 10
Active Node Stack: [AUX] {BUILD_AUX,X} => [AUX-adverb] {BUILD_AUX,X}
Buffers: [adverb] => next

Rule: aux_complete    Priority: 15
Active Node Stack: [NEXT] {X} [AUX] {BUILD_AUX,Y} => [NEXT] {X}
Buffers: [t] => [AUX]

PACKET: PARSE_AUX

This packet is active after the subject has been attached and when the AUX is expected next. It assumes that C is the S node. It both starts and attaches the AUX node.

BEFORE: AFTER:

Rule: to_infinitive    Priority: 10
Active Node Stack: [S] {PARSE_AUX,X} => [S] {PARSE_AUX,X} [AUX-to] {BUILD_AUX}
Buffers: [to][tenseless] => [tenseless]

Rule: start_aux    Priority: 10
Active Node Stack: [S] {PARSE_AUX,X} => [S] {PARSE_AUX,X} [AUX] {BUILD_AUX}
Buffers: [verb] => [verb]

Rule: attach_aux    Priority: 10
Active Node Stack: [S] {PARSE_AUX,X} => [S-aux] {X,PARSE_VP}
Buffers: [aux] => next

PACKET: PARSE_VP

This packet is active after the AUX has been attached and the MAIN verb is expected. C is the S node.

BEFORE: AFTER:

Rule: predp    Priority: 10
Active Node Stack: [S] {PARSE_VP,X} => [S-pp(predp)] {SS_FINAL,X}
Buffers: [pp#ap] => next

Rule: main_verb    Priority: 10
Active Node Stack: [S] {PARSE_VP,X} => [S] {SS_FINAL} [VP-verb] {CPOOL, others depending on the Verb}
Buffers: [verb] agree(subject) => next

PACKET: PASSIVE

C is the VP, and the verb is passive. As per Marcus's rule, it inserts a trace into the first buffer.

BEFORE: AFTER:

Rule: passive    Priority: 5
Active Node Stack: [VP] {PASSIVE,X} => [VP] {X}
Buffers: [t] => [trace][t]

PACKET: SS_VP

This is the main packet to parse a VP. This packet collects the various VP modifiers, etc.

BEFORE: AFTER:

Rule: particle    Priority: 5
Active Node Stack: [VP] {SS_VP,X} => [VP-particle] {SS_VP,X}
Buffers: [particle] sem chk(particle) => next

Rule: adverb_group    Priority: 10
Active Node Stack: [VP] {SS_VP,X} => [VP] {SS_VP,X}
Buffers: [adverb][adverb] => [Compound-adverb]

Rule: adverb    Priority: 10
Active Node Stack: [VP] {SS_VP,X} => [VP-adverb] {SS_VP,X}
Buffers: [adverb] => next

Rule: pp_under_vp_1    Priority: 10
Active Node Stack: [VP] {SS_VP,X} => [VP-pp] {SS_VP,X}
Buffers: [pp] => next

Rule: particle_2    Priority: 15
Active Node Stack: [VP] {SS_VP,X} => [VP-particle] {SS_VP,X}
Buffers: [particle] => next

Rule: vp_done    Priority: 15
Active Node Stack: [NEXT] {X} [VP] {SS_VP,Y} => [NEXT]
Buffers: [t] => [VP][t]

PACKET: OBJECT

Obviously, this packet finds the object. C is the VP, needing an object.

BEFORE: AFTER:

Rule: object    Priority: 10
Active Node Stack: [VP] {OBJECT,X} => [VP-np] {X}
Buffers: [np] => next

PACKET: TWO_OBJ

For verbs with two objects (such as give), this packet finds the first object and then activates the packet for the other object. C is a VP needing two objects.

BEFORE: AFTER:

Rule: first_object    Priority: 10
Active Node Stack: [VP] {TWO_OBJ,X} => [VP-np] {X}
Buffers: [np] => next

PACKET: NO_SUBJ

C is a VP and the verb is "want".
BEFORE:                                 AFTER:

Rule: create_delta_subj                 Priority: 10
Active Node Stack:                      Active Node Stack:
[VP] {NO_SUBJ,X}                 =>     [VP] {X}
Buffers:                                Buffers:
[to][tenseless]                         [trace][to]

PACKET: THAT_COMP
C is a VP and the verb takes a "that complement".

BEFORE:                                 AFTER:

Rule: that_s_start                      Priority: 5
Active Node Stack:                      Active Node Stack:
[VP] {THAT_COMP,X}               =>     [VP] {THAT_COMP,X}
                                        [S(sec, comp_s)-np] {CPOOL,PARSE_AUX}
Buffers:                                Buffers:
[np][verb]                              [verb]
For the unmarked case.

Rule: that_s                            Priority: 10
Active Node Stack:                      Active Node Stack:
[VP] {THAT_COMP,X}               =>     [VP] {THAT_COMP,X}
                                        [S(sec, comp_s)-that] {CPOOL,PARSE_SUBJ}
Buffers:                                Buffers:
[that][ngstart]                         [ngstart]

PACKET: INF_COMP
C is a VP and the verb takes an infinitive complement.

BEFORE:                                 AFTER:

Rule: [illegible]                       Priority: 5
Active Node Stack:                      Active Node Stack:
[VP] {INF_COMP,X}                =>     [VP] {INF_COMP,X}
                                        [S(sec, comp_s)-np] {CPOOL,PARSE_AUX}
Buffers:                                Buffers:
[np][to][tnsless]                       [to][tnsless]
This rule has been re-formulated with two buffers, but retains the three buffers for compatibility with MECHO semantics.

PACKET: TO_LESS_INF_COMP
C is a VP and the verb is "see" or "saw".

BEFORE:                                 AFTER:

Rule: unmarked_s                        Priority: 10
Active Node Stack:                      Active Node Stack:
[VP] {TO_LESS_INF_COMP,X}        =>     [VP] {TO_LESS_INF_COMP,X}
                                        [S(sec, comp_s)-np] {CPOOL,PARSE_AUX}
Buffers:                                Buffers:
[np][tnsless]                           [tnsless]

PACKET: TO_BE_LESS_INF_COMP
C is a VP and the verb is seem. This changes "you seem happy" to "you seem to be happy".

BEFORE:                                 AFTER:

Rule: [illegible]                       Priority: 10
Active Node Stack:                      Active Node Stack:
[VP] {TO_BE_LESS_INF_COMP,X}     =>     [VP] {TO_BE_LESS_INF_COMP,X}
Buffers:                                Buffers:
[en or adj]                             [to][be]

PACKET: EMBEDDED_S_FINAL
C is an Embedded S that has a VP attached to it.

BEFORE:                                 AFTER:

Rule: pp_under_s                        Priority: 10
Active Node Stack:                      Active Node Stack:
[S-] {EMBEDDED_S_FINAL,X}        =>     [S-pp] {EMBEDDED_S_FINAL,X}
Buffers:                                Buffers:
[pp]                                    next

Rule: s_done                            Priority:
15
Active Node Stack:                      Active Node Stack:
[REST] {X}                              [REST] {X}
[S-] {EMBEDDED_S_FINAL,Y}        =>
Buffers:                                Buffers:
[t]                                     [S-][t]

PACKET: BUILD_NAME
C is a NP that will be a name.

BEFORE:                                 AFTER:

Rule: name                              Priority: 10
Active Node Stack:                      Active Node Stack:
[NP] {BUILD_NAME,X}              =>     [NP-name] {BUILD_NAME,X}
Buffers:                                Buffers:
[name]                                  next

Rule: name_done                         Priority: 15
Active Node Stack:                      Active Node Stack:
[NP] {BUILD_NAME,X}              =>     [NP] {NP_COMPLETE,X}
Buffers:                                Buffers:
[t]                                     [t]
                                        run the rule np_done

PACKET: PARSE_CONJ
C is the word "and" and a conjunction is being processed.

BEFORE:                                 AFTER:

Rule: drop_and                          Priority: 5
Active Node Stack:                      Active Node Stack:
[REST] {X}                              [REST] {X}
[and] {PARSE_CONJ,Y}             =>
Buffers:                                Buffers:
[vp]                                    [and][vp]

Rule: drop_and                          Priority: 15
Active Node Stack:                      Active Node Stack:
[REST] {X}                              [REST] {X}
[and] {PARSE_CONJ,Y}             =>
Buffers:                                Buffers:
[t]                                     [and][t]

PACKET: SS_FINAL
C is the major S with everything attached to it.

BEFORE:                                 AFTER:

Rule: pp_under_s                        Priority: 10
Active Node Stack:                      Active Node Stack:
[S] {SS_FINAL,X}                 =>     [S-pp] {SS_FINAL,X}
Buffers:                                Buffers:
[pp]                                    next

Rule: s_done                            Priority: 10
Active Node Stack:                      Active Node Stack:
[S] {SS_FINAL,X}                 =>     [S-finalpunc]
Buffers:                                Buffers:
[finalpunc]                             The sentence is finished

Rule: init_s_[illegible]                Priority: 10
Active Node Stack:                      Active Node Stack:
[S] {SS_FINAL,X}                 =>     [S1(major)] {CPOOL,PARSE_SUBJ}
Buffers:                                Buffers:
[sent_subj]                             [S(comp_s)][sent_subj]
Downgrades sentential subjects. This is for sentences such as: "What little fish eat is worms".

Rule: conjoined_s                       Priority: 10
Active Node Stack:                      Active Node Stack:
[S] {SS_FINAL,X}                 =>     [S1-S-conj] {SS_FINAL,X}
                                        [S2] {CPOOL,SS_START}
Buffers:                                Buffers:
[comma][conj# binder]                   next
This rule handles conjoined sentences in a simple, but effective way. It is not adequate in general.
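The listings in this appendix share one shape: a packet groups rules, and each rule carries a priority and a pattern over the leading buffer cells. As a minimal sketch of how such a packet might be consulted, the Python fragment below picks, among the rules whose patterns match the buffers, the one with the smallest priority number. The names Rule and select_rule, the representation of a buffer cell as a set of feature strings, and the assumption that a smaller number wins (suggested by the catch-all [t] rules sitting at priority 15) are illustrative only and are not taken from the parser's implementation. Only four of the SS_VP rules are modelled, and tests such as agreement or semantic checks are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set

Features = Set[str]  # hypothetical: one buffer cell, e.g. {"adverb"}


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for one rule of a packet."""
    name: str
    priority: int            # assumption: a smaller number is tried first
    pattern: Sequence[str]   # one required feature per cell; "t" matches anything

    def matches(self, buffers: Sequence[Features]) -> bool:
        # A rule cannot match if it inspects more cells than are filled.
        if len(buffers) < len(self.pattern):
            return False
        return all(want == "t" or want in cell
                   for want, cell in zip(self.pattern, buffers))


# Four rules of the SS_VP packet, patterns simplified to single features.
SS_VP: List[Rule] = [
    Rule("particle", 5, ("particle",)),
    Rule("adverb", 10, ("adverb",)),
    Rule("pp_under_vp", 10, ("pp",)),
    Rule("vp_done", 15, ("t",)),
]


def select_rule(packet: Sequence[Rule],
                buffers: Sequence[Features]) -> Optional[Rule]:
    """Return the matching rule with the smallest priority number, if any."""
    candidates = [rule for rule in packet if rule.matches(buffers)]
    return min(candidates, key=lambda rule: rule.priority, default=None)


if __name__ == "__main__":
    print(select_rule(SS_VP, [{"adverb"}]).name)  # adverb
    print(select_rule(SS_VP, [{"noun"}]).name)    # vp_done
```

Under these assumptions the catch-all rule only fires when nothing more specific matches, which is the behaviour the priority 15 "done" rules in the listings appear to rely on.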

Rule: hypo_s                            Priority: 10
Active Node Stack:                      Active Node Stack:
[S1]                                    [S1-S2]
[S2] {SS_FINAL,X}                =>     [S3] {CPOOL,SS_START}
Buffers:                                Buffers:
[comma]                                 next
For sentences of the form: If sentence, wh question?

BIBLIOGRAPHY

Aho, A. and Ullman, J. [1972] The Theory of Parsing, Translation and Compiling, vol. 1, Prentice-Hall, Englewood Cliffs, N.J.
Akmajian, A. and F. Heny [1975] An Introduction to the Principles of Transformational Syntax, MIT Press, Cambridge, Mass.
Ades, A. and Steedman, M. [1980] "On Word-Order", unpublished paper, Dept of Psychology, University of Warwick.
Bach, E. and Horn, G. [1976] "Remarks on 'Conditions on Transformations'", in Linguistic Inquiry, 7, p. 256-299.
Bach, E. [1977] "Comments on a paper by Chomsky", in [P. Culicover, T. Wasow, A. Akmajian 1977].
Berwick, R. [1979] "Learning Structural Descriptions of Grammar Rules From Examples", IJCAI-79 Conference Proceedings, also MIT AI-TR 578, 1980.
Berwick, R. [1979] "Computational Analogues of Constraints on Grammars", Proceedings of the 18th Annual Meeting of the Association for Computational Linguistics.
Berwick, R. [1981] "A Model of Syntactic Acquisition", in the University of Massachusetts Working Papers in Linguistics, volume 6.
Bever, T. [1970] "The Cognitive Basis for Linguistic Structures", in Cognition and the Development of Language, Hayes, J. (ed), John Wiley and Sons, New York. 1963 p. 135-183.
Bever, T., Garret, M., Hurtig, R. [1978] "The Interaction of Perceptual Process and Ambiguous Sentences", in Memory and Cognition, vol. 1 no. 3.
Boguraev, B. [1979] "Automatic Resolution of Linguistic Ambiguities", Technical Report no. 11 (Thesis), Computer Laboratory, University of Cambridge.
Bolc, L. [1978] Speech Communication with Computer, Springer-Verlag, Berlin.
Bolc, L. [1978b] Natural Language Communication with Computers, Springer-Verlag, Berlin.
Bostock, L. and Chandler, S. [1975] Applied Mechanics, Stanley Thornes (Publisher), Ltd, Bath.
Bresnan, J.
[1973] "Syntax of the Comparative Clause Construction in English", Linguistic Inquiry 4: 275.
Bresnan, J. [1976] "Evidence for a Theory of Unbounded Transformations", in Linguistic Analysis 2: 353.
Bresnan, J. [1978] "A Realistic Transformational Grammar", in [Halle, Bresnan and Miller (eds) 1978].
Bundy, A., Luger, G., Stone, M., and Welham, B. [1976] "Mecho: Year One", AISB-76 Conference Proceedings.
Bundy, A., Luger, G., Mellish, C., and Palmer, M. [1977] "Solving Mechanics Problems: Interim Report to the SRC", Department of Artificial Intelligence, Edinburgh University Working Paper 21.
Bundy, A., Luger, G., Mellish, C., and Palmer, M. [1978] "Knowledge about Knowledge: Making Decisions in Mechanics Problem Solving", AISB-78 Conference Proceedings.
Bundy, A., Byrd, L., Luger, G., Mellish, C., Milne, R., Palmer, M. [1979] "MECHO: A Program To Solve Mechanics Problems", Department of Artificial Intelligence, Edinburgh University, Working Paper no. 50.
Bundy, A., Byrd, L., Luger, G., Mellish, C., Palmer, M. [1979b] "Solving Mechanics Problems Using Meta-Level Inference", IJCAI-79 Conference Proceedings.
Bundy, A. and Welham, B. [1981] "Using Meta-Level Descriptions for Selective Applications of Multiple Re-Write Rules in Algebraic Manipulation", Artificial Intelligence 16, p. 100-212.
Burton, R. [1976] "Semantic Grammar: an Engineering Technique for Constructing Natural Language Understanding Systems", BBN Report no. 3433.
Cairns, H. and Kamerman, J. [1975] "Lexical Information Processing During Semantic Comprehension", in The Journal of Verbal Learning and Verbal Behavior, 14.
Cairns, H. and H~, J. [1980] "Effects of Prior Context on Lexical A
al Learning and Verbal Behavior, 14, p. 265-274.
Humphrey, D. [1930] Intermediate Mechanics, Longmans Green and Co., London.
Jackendoff, R. S. [1972] Semantic Interpretation in Generative Grammar, MIT Press, Cambridge, Mass.
Jackendoff, R.
[1977] "X̄ Syntax: A Study of Phrase Structure", in Linguistic Inquiry Monograph Two.
Just, M. and Clark, H. H. [1973] "Drawing inferences from presupposition and Implications of Affirmative and Negative sentences", in The Journal of Verbal Learning and Verbal Behavior, 12, p. 21-32.
Katz, J. and Fodor, J. [1964] "The Structure of a Semantic Theory", in Fodor and Katz eds., The Structure of Language, Englewood Cliffs, New Jersey, Prentice-Hall.
Kay, M. [1973] "The MIND System", in [Rustin 1973].
Kimball, J. [1973] "Seven Principles of Surface Structure Parsing in Natural Language", Cognition, 2.
Kaplan, R. [1972] "Augmented Transition Networks as Psychological Models of Sentence Comprehension", Artificial Intelligence, 3.
Kuno, S. [1965] "The Predictive Analyzer and a Path Elimination Technique", Communications of the ACM, vol. 8, no. 7.
Marcus, M. [1975] "Diagnosis as a Notion of Grammar", in Theoretical Issues in Natural Language Processing, R. Schank and B. Nash-Webber, eds., Cambridge, Mass.
Marcus, M. and Shipman, D. [1979] "Towards Minimal Data Structures for Deterministic Parsing", IJCAI-79 Conference Proceedings.
Marcus, M. [1980] A Theory of Syntactic Recognition for Natural Language, MIT Press.
Marslen-Wilson, W. [1973] "Linguistic Structure and Speech Shadowing at very short Latencies", Nature, 244, p. 522-523.
Marslen-Wilson, W. [1975] "Sentence Perception as an Interactive Parallel Process", Science, 189, p. 226-228.
Marslen-Wilson, W. and Tyler, L. [1975] "Processing Structure of Sentence Perception", Nature, 257, p. 784-786.
Marslen-Wilson, W. and Welsh, A. [1978] "Processing Interactions and Lexical Access During Word Recognition in Continuous Speech", Cognitive Psychology, no. 10.
Marslen-Wilson, W. and Tyler, L. [1980] "The Temporal Structure of Spoken Language Understanding", Cognition, 8.
Marslen-Wilson, W. [1980b] "Speech Understanding as a Psychological Process", in J.C.
Simon (ed), Spoken Language Generation and Understanding, Dordrecht, Reidel.
Martin, W., Church, K., Patil, R. [1981] "Preliminary Analysis of a Breadth-First Parsing Algorithm: Theoretical and Experimental Results", MIT AI Lab, presented at "M
etanding Natural Language, Edinburgh University Press.
Woods, W. A. [1970] "Transition Network Grammars for Natural Language Analysis", Communications of the ACM, 13:591.
Woods, W. A. [1972] "The Lunar Sciences Natural Language Information System", BBN Report no. 2378, Cambridge, Mass.
Woods, W. A. [1973] "An Experimental Parsing System for Transition Network Grammars", in [Rustin 73].
Young, R. [1979] "Production Systems for modelling Human Cognition", MRC Applied Psychology Unit, Cambridge, in [Michie 79].

Abbreviations Used:

IJCAI - Conference Proceedings of the International Joint Conference on Artificial Intelligence.
COLING - Conference Proceedings of the Conference on Computational Linguistics.
AISB - Conference Proceedings of the Association for Artificial Intelligence and Simulation of Behavior.
MIT - The Massachusetts Institute of Technology, Cambridge, Mass.
BBN - Bolt, Beranek and Newman, Cambridge, Mass.