Basic parsing techniques
A transition network is a finite state automaton
that is used to represent a part of a grammar.
A transition network parser uses a number of these
transition networks to represent its entire grammar.
Each network represents one nonterminal symbol in
the grammar. Hence, in the grammar for the English language, we would have one
transition network for Sentence, one for Noun Phrase, one for Verb Phrase, one
for Verb, and so on.
Fig 10.2 shows the transition network equivalents
for three production rules.
In each transition network, S1 is the start state,
and the accepting state, or final state, is denoted by a heavy border. When a
phrase is applied to a transition network, the first word is compared against
one of the arcs leading from the first state.
If this word matches one of those arcs, the network
moves into the state to which that arc points. Hence, the first network shown
in Fig 10.2, when presented with a Noun Phrase, will move from state S1 to
phrase is presented to a transition network and no match is found from the
current state, then that network cannot be used and another network must be
tried. Hence, when starting with the phrase the cat sat on the mat, none of the
networks shown in Fig 10.2 will be used because they all have only nonterminal
symbols, whereas all the symbols in the cat sat on the mat are terminal.Hence,
we need further networks, such as the ones shown in Figure 10.2, which deal
with terminal symbols.
networks can be used to determine whether a sentence is grammatically correct,
at least according to the rules of the grammar the networks represent.
using transition networks involves exploring a search space of possible parses
in a depth-first fashion.
examine the parse of the following simple sentence:
in state S1 in the Sentence transition network. To proceed, we must follow the
arc that is labeled NounPhrase. We thus move out of the Sentence network and
into the NounPhrase network.
arc of the NounPhrase network is labeled Noun. We thus move into the Noun
network. We now follow each of the arcs in the Noun network and discover that
our first word, A, does not match any of them. Hence, we backtrack to the next
arc in the NounPhrase network.
is labeled Article, so we move on to the Article transition network. Here, on
examining the second label, we find that the first word is matched by the
terminal symbol on this arc.
therefore consume the word, A, and move on to state S2 in the Article network.
Because this is a success node, we are able to return to the NounPhrase network
and move on to state S2 in this network. We now have an arc labeled Noun.
before, we move into the Noun network and find that our next word, cat,
matches. We thus move to state S4 in the NounPhrase network. This is a success
node, and so we move back to the Sentence network and repeat the process for
the VerbPhrase arc.
possible for a system to use transition networks to generate a derivation tree
for a sentence, so that as well as determining whether the sentence is
grammatically valid, it parses it fully to obtain further information by
semantic analysis from the sentence.
be done by simply having the system build up the tree by noting which arcs it
successfully followed. When, for example, it successfully follows the
NounPhrase arc in the Sentence network, the system generates a root node
labeled Sentence and an arc leading from that node to a new node labeled
NounPhrase.When the system follows the NounPhrase network and
an article and a noun, these are similarly added to the tree.