Section 27: Tokens of grammar

The complete list of grammar tokens is as follows:

"<word>"             that literal word only
noun                 any object in scope
held                 object held by the player
multi                one or more objects in scope
multiheld            one or more held objects
multiexcept          one or more in scope, except the other
multiinside          one or more in scope, inside the other
<attribute>          any object in scope which has the attribute
creature             an object in scope which is animate
noun = <Routine>     any object in scope passing the given test
scope = <Routine>    an object in this definition of scope
number               a number only
<Routine>            any text accepted by the given routine
topic                any text at all
special              any single word or number

These tokens are all described in this section except for scope = <Routine>, which is postponed to the next.

"<word>"
This matches only the literal word given, normally a preposition such as "into". Whereas most tokens produce a "parameter" (an object or group of objects, or a number), this token doesn't. There can therefore be as many or as few of them on a grammar line as desired.

It often happens that several prepositions really mean the same thing for a given verb: "in", "into" and "inside" are often equally sensible. As a convenient shorthand you can write a series of prepositions with slash marks / in between, to mean "one of these words". For example:

     * noun "in"/"into"/"inside" noun      -> Insert

(Note that / can only be used with prepositions.)

△ Prepositions like this are unfortunately sometimes called 'adjectives' inside the parser source code, and in Infocom hackers' documents: the usage is traditional but has been avoided in this manual.

noun
The definition of "in scope" will be given in the next section. Roughly, it means "visible to the player at the moment".

held
Convenient for two reasons. Firstly, many actions only sensibly apply to things being held (such as Eat or Wear), and using this token in the grammar you can make sure that the action is never generated by the parser unless the object is being held. That saves on always having to write "You can't eat what you're not holding" code. Secondly, suppose we have grammar

Verb "eat"
                * held                           -> Eat;

and the player types "eat the banana" while the banana is, say, in plain view on a shelf. It would be petty of the game to refuse on the grounds that the banana is not being held. So the parser will generate a Take action for the banana and then, if the Take action succeeds, an Eat action. Notice that the parser does not just pick up the object, but issues an action in the proper way -- so if the banana had rules making it too slippery to pick up, it won't be picked up. This is called "implicit taking".

The multi- tokens indicate that a list of one or more objects can go here. The parser works out all the things the player has asked for, sorting out plural nouns and words like "except" by itself, and then generates actions for each one. A single grammar line can only contain one multi- token: so "hit everything with everything" can't be parsed (straightforwardly, that is: you can parse anything with a little more effort). The reason not all nouns can be multiple is that too helpful a parser makes too easy a game. You probably don't want to allow "unlock the mystery door with all the keys" -- you want the player to suffer having to try them one at a time, or else to be thinking.

multiexcept
Provided to make commands like "put everything in the rucksack" parsable: the "everything" is matched by all of the player's possessions except the rucksack. This stops the parser from generating an action to put the rucksack inside itself.

multiinside
Similarly, this matches anything inside the other parameter on the line, and is good for parsing commands like "remove everything from the cupboard".
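
As a minimal sketch of how these two tokens might appear in grammar, the following defines two verbs, "stash" and "unload". Both verb words are hypothetical, invented here for illustration; they are mapped onto the library's Insert and Remove actions, and the double-quoted style follows the other examples in this section.

```inform6
! "stash" and "unload" are hypothetical verb words used only for illustration.
Verb "stash"
                * multiexcept "in"/"into"/"inside" noun   -> Insert;
Verb "unload"
                * multiinside "from" noun                 -> Remove;
```

In each line the "other" object is the one matched by the plain noun token: "stash all in rucksack" leaves the rucksack out of "all", and "unload all from cupboard" considers only what is inside the cupboard.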

<attribute>
This allows you to sort out objects according to attributes that they have:

Verb "use" "employ" "utilise"
                * edible                    -> Eat
                * clothing                  -> Wear
      ...and so on...
                * enterable                 -> Enter;

though the library grammar does not contain such an appallingly convenient verb! Since you can define your own attributes, it's easy to make a token matching only your own class of object.
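
For instance, a game might declare an attribute of its own and use it as a token. This is only a sketch: the attribute flammable, the verb word "ignite" and the action Ignite are all hypothetical names, not part of the library.

```inform6
Attribute flammable;            ! hypothetical attribute declared by the game

[ IgniteSub;                    ! hypothetical action routine
    print_ret (The) noun, " catches fire.";
];

Verb "ignite"
                * flammable                 -> Ignite;
```

Only objects in scope which have been given flammable will match, so no test for the attribute should be needed inside IgniteSub itself.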

creature
Same as animate (a hangover from older editions of Inform).

noun = <Routine>
The last and most powerful of the "a nearby object satisfying some condition" tokens. When determining whether an object passes this test, the parser sets the variable noun to the object in question and calls the routine. If it returns true, the parser accepts the object, and otherwise it rejects it. For example, the following should only apply to animals kept in a cage:

[ CagedCreature;
    if (noun in wicker_cage) rtrue; rfalse;
];
Verb "free" "release"
                * noun=CagedCreature        -> FreeAnimal;

Only nouns which pass the CagedCreature test are then allowed. The CagedCreature routine can appear anywhere in the code, though it's tidier to keep it nearby.

scope = <Routine>
An even more powerful token, which means "an object in scope" where scope is redefined specially. See the next section.

number
Matches any decimal number from 0 upwards (though it rounds off large numbers to 10000), and also matches the numbers "one" to "twenty" written in English. For example:

Verb "type"
                * number                    -> TypeNum;

causes actions like TypeNum 504 when the player types "type 504". Note that noun is set to 504, not to an object.
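
A matching action routine might look like the sketch below. The messages and the special treatment of 504 are invented for illustration; the only point being made is that noun is used as a plain number.

```inform6
[ TypeNumSub;
    ! With a number token, noun holds the value typed, not an object
    if (noun == 504) "The keypad beeps approvingly.";
    print_ret "You type ", noun, ", but nothing happens.";
];
```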

EXERCISE 68:

(A beautiful feature stolen from David M. Baggett's game 'The Legend Lives', which uses it to great effect.) Some games produce footnotes every now and then. Arrange matters so that these are numbered [1], [2] and so on in order of appearance, to be read by the player when "footnote 1" is typed.

△ The entry point ParseNumber allows you to provide your own number-parsing routine, which opens up many sneaky possibilities -- Roman numerals, coordinates like "J4", very long telephone numbers and so on. This takes the form

[ ParseNumber buffer length;
  ...returning 0 if no match is made, or the number otherwise...
];

and examines the supposed 'number' held at the byte address buffer, a row of characters of the given length. If you provide a ParseNumber routine but return 0 from it, then the parser falls back on its usual number-parsing mechanism to see if that does any better.

△△ Note that ParseNumber can't return 0 to mean the number zero. Probably "zero" won't be needed too often, but if it is you can always return some value like 1000 and code the verb in question to understand this as 0. (Sorry: this was a poor design decision made too long ago to change now.)
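
As a rough sketch of the idea, the ParseNumber routine below accepts numbers written with a leading hash sign, such as "#12". The "#" convention is an assumption made up for this example, and the locals are named addr and len (rather than buffer and length) simply to keep them distinct from the parser's own buffer array.

```inform6
[ ParseNumber addr len  i c n;
    ! Only claim words of the form # followed by one to four digits
    if (len < 2 || len > 5 || addr->0 ~= '#') return 0;
    for (i=1 : i<len : i++) {
        c = addr->i;
        if (c < '0' || c > '9') return 0;   ! not a digit: decline the word
        n = n*10 + (c - '0');
    }
    return n;    ! "#0" gives 0, which the parser reads as "no match"
];
```

Any word the routine declines is passed on to the parser's usual number-parsing, so ordinary input like "504" or "twelve" should be unaffected.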

<Routine>
The most flexible token is simply the name of a "general parsing routine". This looks at the word stream using NextWord and wn (see Section 24) and should return:

-1 if the text isn't understood,
0 if it's understood but no parameter results,
1 if a number results, or
n if the object n results.

In the case of a number, the actual value should be put into the variable parsed_number. On an unsuccessful match (returning -1) it doesn't matter what the final value of wn is. On a successful match it should be left pointing to the next thing after what the routine understood. Since NextWord moves wn on by one each time it is called, this happens automatically unless the routine has read too far. For example:

[ OnAtorIn w;
  w=NextWord(); if (w=='on' or 'at' or 'in') return 0;
  return -1;
];

makes a token which accepts any of the words "on", "at" or "in" as prepositions (not translating into objects or numbers). Similarly,

[ Anything w;  while (w~=-1) w=NextWordStopped(); return 0; ];

accepts the entire rest of the line (ignoring it). NextWordStopped is a form of NextWord which returns -1 once the original word stream has run out.
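
Neither routine above produces a parameter. To illustrate the "number results" case, here is a sketch of a token translating colour words into numbers. The routine ColourName, the verb word "tint", the action Tint and the numbering of the colours are all hypothetical.

```inform6
[ ColourName w;
    w = NextWord();
    ! Return 1 to say "a number results", leaving its value in parsed_number
    if (w == 'red')   { parsed_number = 1; return 1; }
    if (w == 'green') { parsed_number = 2; return 1; }
    if (w == 'blue')  { parsed_number = 3; return 1; }
    return -1;                  ! text not understood
];

[ TintSub;                      ! noun holds the colour number, 1 to 3
    print_ret "You choose colour number ", noun, ".";
];

Verb "tint"
                * ColourName                -> Tint;
```

Since NextWord has been called exactly once, wn is already left pointing just after the colour word on a successful match.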

topic
This token matches as much text as possible. It should either be at the end of its grammar line, or be followed by a preposition. (The only way it can fail to match is if it finds no text at all.) The library's grammar uses this token for topics of conversation and topics looked up in books (see Sections 15 and 16), hence the name. The parser ignores the text for now (your own code will have to think about it later), and simply sets the variables consult_from to the number of the first word of the matched text and consult_words to the number of words.
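
An action routine can then look at the matched text by setting wn back to consult_from and reading words with NextWord. The sketch below assumes a hypothetical verb "chant" with a hypothetical action Chant; the magic word and the replies are invented.

```inform6
[ ChantSub w;
    wn = consult_from;          ! go back to the first word the topic matched
    w = NextWord();
    if (consult_words == 1 && w == 'abracadabra')
        "The air shimmers briefly.";
    "Your chant echoes away unanswered.";
];

Verb "chant"
                * topic                     -> Chant;
```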

special
Obsolete and best avoided.

EXERCISE 69:

Write a token to detect low numbers in French, "un" to "cinq".

△ EXERCISE 70:

Write a token to detect floating-point numbers like "21", "5.4623", "two point oh eight" or "0.01", rounding off to two decimal places.

△ EXERCISE 71:

Write a token to match a phone number, of any length from 1 to 30 digits, possibly broken up with spaces or hyphens (such as "01245 666 737" or "123-4567").

△△ EXERCISE 72:

(Adapted from code in Andrew Clover's 'timewait.h' library extension.) Write a token to match any description of a time of day, such as "quarter past five", "12:13 pm", "14:03", "six fifteen" or "seven o'clock".

△ EXERCISE 73:

Code a spaceship control panel with five sliding controls, each set to a numerical value, so that the game looks like:

>look
Machine Room
There is a control panel here, with five slides, each of which can be
set to a numerical value.
>push slide one to 5
You set slide one to the value 5.
>examine the first slide
Slide one currently stands at 5.
>set four to six
You set slide four to the value 6.

△△ General parsing routines sometimes need to get at the raw text originally typed by the player. Usually WordAddress and WordLength (see Section 24) are adequate. If not, it's helpful to know that the parser keeps a string array called buffer holding:

buffer->0 = <maximum number of characters which can fit in buffer>
buffer->1 = <the number n of characters typed>
buffer->2...buffer->(n+1) = <the text typed>

and, in parallel with this, another one called parse holding:

parse->0 = <maximum number of words which can fit in parse>
parse->1 = <the number m of words typed>
parse->2... = <a four-byte block for each word, as follows>
block-->0 = <the dictionary entry if word is known, 0 otherwise>
block->2 = <number of letters in the word>
block->3 = <index to first character in the buffer>

(However, for version 3 games the format is slightly different: in buffer the text begins at byte 1, not at byte 2, and its end is indicated with a zero terminator byte.) Note that the raw text is reduced to lower case automatically, even if within quotation marks. Using these buffers directly is perfectly safe but not recommended unless there's no other way, as it tends to make code rather illegible.
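
As a small illustration of the buffer layout just described, the sketch below prints back everything the player typed, one character at a time. PrintRawInput is a hypothetical helper, not a library routine, and it assumes the layout given above for games later than version 3.

```inform6
[ PrintRawInput i;
    ! buffer->1 holds the number of characters typed; the text starts at buffer->2
    for (i=0 : i<buffer->1 : i++)
        print (char) buffer->(2+i);
];
```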

△△ EXERCISE 74:

Try to implement the parser's routines NextWord, WordAddress and WordLength.

△△ EXERCISE 75:

(Difficult.) Write a general parsing routine accepting any amount of text (including spaces, full stops and commas) between double-quotes as a single token.

EXERCISE 76:

How would you code a general parsing routine which never matches anything?

△△ EXERCISE 77:

Why would you code a general parsing routine which never matches anything?

△ EXERCISE 78:

An apparent restriction of the parser is that it only allows two parameters (noun and second). Write a general parsing routine to accept a third. (This final exercise with general parsing routines is easier than it looks: see the specification of the NounDomain library routine in Appendix A9.)
