A non-deterministic finite automaton (or NDFA) can be formally defined as a 5-tuple $(Q, \Sigma, \delta, q_0, F)$, where
$Q$ is a non-empty finite set of states,
$\Sigma$ is the alphabet (defining what set of input strings the automaton operates on),
$\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to \mathcal{P}(Q)$ is a function called the transition function,
$q_0 \in Q$ is the starting state, and
$F \subseteq Q$ is a non-empty set of final (or accepting) states.
Note how this definition differs from that of a deterministic finite automaton (DFA) only in the definition of the transition function $\delta$. Some authors also relax the fourth condition by permitting multiple starting states. Operation of the NDFA begins at the starting state $q_0$, and movement from state to state is governed by the transition function $\delta$.
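As an illustration of the definition, the five components could be held in a small Python record. This is only a sketch: the class name `NDFA`, its field names, the `EPSILON` marker and the dictionary encoding of the transition function are assumptions made for the example, not notation from the text.

```python
from dataclasses import dataclass

EPSILON = ""  # assumed marker for a transition that consumes no input


@dataclass
class NDFA:
    states: frozenset    # non-empty finite set of states
    alphabet: frozenset  # input symbols
    delta: dict          # (state, symbol or EPSILON) -> set of next states
    start: object        # starting state
    finals: frozenset    # final (accepting) states

    def check(self):
        """Return True if the components fit the definition, else raise ValueError."""
        if not self.states or self.start not in self.states:
            raise ValueError("start state must belong to a non-empty state set")
        if not self.finals or not self.finals <= self.states:
            raise ValueError("finals must be a non-empty subset of the states")
        for (state, symbol), targets in self.delta.items():
            if state not in self.states or not set(targets) <= self.states:
                raise ValueError("transition uses an unknown state")
            if symbol != EPSILON and symbol not in self.alphabet:
                raise ValueError("transition uses an unknown symbol")
        return True

    def is_deterministic(self):
        """True when there are no empty-string moves and at most one target
        per (state, symbol) pair, i.e. the table describes a (partial) DFA."""
        return all(symbol != EPSILON and len(targets) <= 1
                   for (_, symbol), targets in self.delta.items())


# Hypothetical two-state automaton: reading "a" in state 0 may stay in 0 or move to 1.
example = NDFA(
    states=frozenset({0, 1}),
    alphabet=frozenset({"a"}),
    delta={(0, "a"): {0, 1}},
    start=0,
    finals=frozenset({1}),
)
```

Here `example.check()` succeeds, while `example.is_deterministic()` is false because the pair `(0, "a")` has two possible targets. That is exactly the point at which the transition function differs from that of a DFA.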
The transition function takes the first symbol of the (remaining) input string and the current state as its input, and after the transition this first symbol is removed only if the transition is defined for a symbol in the alphabet $\Sigma$ rather than for the empty string $\varepsilon$. Conceptually, all possible transitions from a current state are followed simultaneously (hence the non-determinism). Once every possible transition has been executed, the NDFA is halted. If any of the states reached upon halting is in the set of final states $F$ for some input string, and the entire input string is consumed to reach that state, then the NDFA accepts that string.
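One common way to realise "all possible transitions are followed simultaneously" is to track the set of states the automaton could be in after each input symbol. The sketch below assumes the transition function is stored as a dictionary from `(state, symbol)` to a set of states, with the empty string standing in for a move that consumes no input. The function names and the sample automaton are hypothetical.

```python
EPSILON = ""  # assumed marker for a transition that consumes no input


def epsilon_closure(states, delta):
    """All states reachable from `states` using only empty-string moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def accepts(delta, start, finals, text):
    """True if some sequence of transitions consumes all of `text`
    and ends in a final state."""
    current = epsilon_closure({start}, delta)
    for symbol in text:
        moved = set()
        for state in current:
            moved |= set(delta.get((state, symbol), ()))
        current = epsilon_closure(moved, delta)
    return bool(current & set(finals))


# Hypothetical automaton over {a, b} accepting strings that end in "ab".
ends_in_ab = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
print(accepts(ends_in_ab, 0, {2}, "aab"))  # True
print(accepts(ends_in_ab, 0, {2}, "aba"))  # False
```

Branches that cannot continue simply drop out of the current set, so the final check only sees states reached after the whole input has been consumed.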
An NDFA can be represented visually as a directed graph. Circular vertices denote states, and the set of directed edges, labelled by symbols in $\Sigma \cup \{\varepsilon\}$, denotes the transition function $\delta$. The starting state is usually denoted by an arrow pointing to it that points from no other vertex. Final states (those in $F$) are usually denoted by double circles.
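NDFAs represent regular languages, and an NDFA can be used to test whether any string in $\Sigma^*$ is in the language it represents. Consider the following regular language over the alphabet $\Sigma = \{a, b\}$ (represented by the regular expression aa*b):
$L = \{ab, aab, aaab, \ldots\}$
This language can be represented by the following NDFA:
(Figure: directed graph with vertices 0, 1, 2 and 3, and edges 0 → 1 labelled a, 1 → 1 labelled a, 1 → 2 labelled $\varepsilon$, and 2 → 3 labelled b.)
The vertex 0 is the initial state $q_0$, and the vertex 3 is the only final state (the only state in $F$).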
If given the string aaab as input, operation of the NDFA is as follows. Let $C$ indicate the set of “current” states and the remaining input associated with them. Initially $C = \{(0, aaab)\}$. For state 0 with a leading a as its input, the only possible transition to follow is to 1 (which consumes the a). This transforms $C$ to $\{(1, aab)\}$. Now there are two possible transitions to follow for state 1 with a leading a. One transition is back to 1, consuming the a, while the other is to 2, leaving the a. Thus $C$ is then $\{(1, ab), (2, aab)\}$. Again, the same transitions are possible for state 1, while no transition at all is available for state 2 with a leading a, so $C$ is then $\{(1, b), (2, ab), (2, aab)\}$. At this point, there is still no possible transition from 2, and the only possible transition from 1 is to 2 (leaving the input string as it is). This then gives $C = \{(2, b), (2, ab), (2, aab)\}$. Only state 2 with remaining input of b has a transition leading from it, giving $C = \{(3, \varepsilon), (2, ab), (2, aab)\}$. At this point no further transitions are possible, and so the NDFA is halted. Since 3 is a final state, and the input string has been reduced to the empty string $\varepsilon$ by the time 3 is reached, the NDFA accepts aaab.
If the input string were instead aaaba, processing would occur as before until the set of current states and remaining inputs is $\{(3, a), (2, aba), (2, aaba)\}$, at which point the NDFA halts. Although 3 is a final state, it is not possible to reduce the input string completely before reaching 3. Therefore aaaba is not accepted by this NDFA.
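The walk-through above can be replayed mechanically. The following sketch keeps a set of (state, remaining input) pairs and, like the prose, leaves a pair in the set once it has no transition to follow. The transition table `DELTA` is inferred from the transitions the walk-through mentions. The helper names are hypothetical, and the loop assumes the automaton has no cycle made only of empty-string moves.

```python
EPSILON = ""  # assumed marker for a transition that consumes no input

# Transitions inferred from the walk-through of the aa*b automaton.
DELTA = {(0, "a"): {1}, (1, "a"): {1}, (1, EPSILON): {2}, (2, "b"): {3}}
START, FINALS = 0, {3}


def successors(state, remaining):
    """Every (state, remaining input) pair reachable in one transition."""
    result = {(target, remaining) for target in DELTA.get((state, EPSILON), ())}
    if remaining:
        for target in DELTA.get((state, remaining[0]), ()):
            result.add((target, remaining[1:]))
    return result


def run(text):
    """Return the successive sets of pairs until no pair can move any more."""
    configs = {(START, text)}
    history = [configs]
    while True:
        following = set()
        for config in configs:
            moved = successors(*config)
            following |= moved if moved else {config}  # stuck pairs stay put
        if following == configs:
            return history
        configs = following
        history.append(configs)


def accepted(text):
    """Accept if a final state is reached with the input fully consumed."""
    return any(state in FINALS and remaining == ""
               for state, remaining in run(text)[-1])


for step in run("aaab"):
    print(sorted(step))
print(accepted("aaab"), accepted("aaaba"))  # True False
```

For aaab the printed sets correspond to the six sets in the walk-through, ending with state 3 and no remaining input. For aaaba the last set contains state 3 with a left over, so the string is rejected.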
Any regular grammar can be represented by an NDFA. Any string accepted by the NDFA is in the language represented by that NDFA. Furthermore, it is a straightforward process to generate an NDFA for any regular grammar. Naively following every branch of an NDFA separately can take time exponential in the length of the input, but there is a simple process (the subset construction) to transform any NDFA into an equivalent DFA, which handles each input symbol with a single transition. In the worst case the resulting DFA can have exponentially more states than the NDFA. Many regular expression matchers operate in this manner.
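The transformation into a DFA is usually done with the subset construction, in which each DFA state is a set of NDFA states. The sketch below is one minimal way to write it, assuming the same dictionary encoding of transitions as a stand-in for the transition function and the empty string as the marker for moves that consume no input. The function names are hypothetical, and the sample table is the aa*b automaton discussed earlier.

```python
EPSILON = ""  # assumed marker for a transition that consumes no input


def epsilon_closure(states, delta):
    """All states reachable from `states` using only empty-string moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in delta.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def to_dfa(alphabet, delta, start, finals):
    """Subset construction: returns (start state, transition table, final states),
    where every DFA state is a frozenset of NDFA states."""
    start_set = frozenset(epsilon_closure({start}, delta))
    dfa_delta = {}
    seen = {start_set}
    pending = [start_set]
    while pending:
        subset = pending.pop()
        for symbol in alphabet:
            moved = set()
            for state in subset:
                moved |= set(delta.get((state, symbol), ()))
            target = frozenset(epsilon_closure(moved, delta))
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                pending.append(target)
    dfa_finals = {subset for subset in seen if subset & set(finals)}
    return start_set, dfa_delta, dfa_finals


def dfa_accepts(start_set, dfa_delta, dfa_finals, text):
    """Run the DFA: exactly one table lookup per input symbol.
    Symbols outside the alphabet raise KeyError in this sketch."""
    state = start_set
    for symbol in text:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals


# The aa*b automaton: 0 -a-> 1, 1 -a-> 1, 1 -empty-> 2, 2 -b-> 3.
nfa_delta = {(0, "a"): {1}, (1, "a"): {1}, (1, EPSILON): {2}, (2, "b"): {3}}
dfa = to_dfa({"a", "b"}, nfa_delta, 0, {3})
print(dfa_accepts(*dfa, "aaab"), dfa_accepts(*dfa, "aaaba"))  # True False
```

For this automaton the construction yields four DFA states, one of which is the empty set acting as a dead state. In general the number of subsets can grow exponentially with the number of NDFA states.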