|
Just like every context-free grammar can be ``normalized'' or ``standardized'' in one of the so-called normal forms, so can a context-sensitive grammar, and in fact arbitrary grammar, be normalized. The particular normalization discussed in this entry is what is known as the Kuroda normal form.
A formal grammar $G=(\Sigma,N,P,\sigma)$ is in Kuroda normal form if its productions have one of the following forms:
- $A\to a$ ,
- $A\to BC$ ,
- $AB\to CD$ .
where $a\in \Sigma-N$ and $A,B,C,D\in N$ .
Note: Sometimes the form $A\to B$ , where $B\in N$ is added to the list above. However, we may remove a production of this form by replacing all occurrences of $B$ by $A$ in every production of $G$ .
The usefulness of the Kuroda normal forms is captured in the following result:
Note that the third production form $AB\to CD$ may be replaced by the following productions
- $AB\to AX$
- $AX\to YX$
- $YX\to YD$
- $YD\to CD$
where $X,Y$ are new non-terminals introduced to $G$ . Note also that, among the new forms, 1 and 3 are right context-sensitive, while 2 and 4 are left context-sensitive. Thus, a grammar in Kuroda normal form is equivalent to a grammar with productions having one of the following forms:
- $A\to a$ ,
- $A\to BC$ ,
- $AB\to AC$ ,
- $AB\to CB$ .
It can be shown that
Theorem 2 A grammar in Kuroda normal form if it is equivalent to a grammar whose productions are in one of forms $1,2$ , or $3$ .
By symmetry, a grammar in Kuroda normal form is equivalent to a grammar whose productions are in one of forms $1,2$ , or $4$ . A grammar whose productions are in one of forms $1,2$ , or $3$ is said to be in one-sided normal form.
As a corollary, every $\lambda$ -free context-sensitive language (not containing the empty word $\lambda$ ) can be generated by a grammar in one-sided normal form.
What if we throw in production of the form $A\to \lambda$ in the above list? Then certainly every context-sensitive language has a grammar in this ``extended'' normal form. In fact, we have
Theorem 3 Every type-0 language can be generated by a grammar whose productions are in one of the following forms:
- $A\to a$ ,
- $A\to BC$ ,
- $AB\to AC$ ,
- $A\to \lambda$ .
- 1
- G. E. Révész, Introduction to Formal Languages, Dover Publications (1991).
- 2
- A. Mateescu, A. Salomaa, Chapter 4 - Aspects of Classical Language Theory, Handbook of Formal Languages: Volume 1. Word, Language, Grammar, Springer, (1997).
|