# ambiguous grammar

Let $G=(\Sigma,N,P,\sigma)$ be a context-free grammar, and $L(G)$ the language generated by it. Since every word in $L(G)$ has at least one leftmost derivation, let us only consider leftmost derivations of words.

Definition. $G$ is said to be unambiguous if every word in $L(G)$ has exactly one leftmost derivation. Otherwise, $G$ is said to be ambiguous.

For an example of an ambiguous grammar, let $G$ be the grammar consisting of $a,b$ as terminal symbols, $\sigma,X$ as non-terminal symbols, and $\sigma\to a$, $\sigma\to ab\sigma b$, $\sigma\to aXb$, and $X\to b\sigma$ as productions. By definition, $G$ is context-free. Then the word $abab$ has

• $\sigma\to ab\sigma b\to abab$, and

• $\sigma\to aXb\to ab\sigma b\to abab$,

two leftmost derivations (corresponding to the following derivation trees).

Hence $G$ is ambiguous.

An example of an unambiguous grammar can be found when we convert the language formation rules of the classical propositional logic to a formal grammar. This grammar is context-free. By unique readability of well-formed formulas (words in this language), we conclude that the grammar is unambiguous.

Remarks.

• A grammar $G$ is unambiguous iff every word in $L(G)$ corresponds to a uniuqe derivation tree, since every derivation tree corresponds to a unique leftmost derivation.

• Deciding whether a context-free grammar is ambiguous is undecidable unless it has only one terminal symbol.

The concept of ambiguity can be carried over to context-free languages. Since every context-free language can be generated by many context-free grammars, some of which may be ambiguous, while others may not be, there are potentially three classes of context-free languages:

1. 1.

those that can only be generated by unambiguous grammars,

2. 2.

those that can be generated by ambiguous, as well as unambiguous grammars,

3. 3.

those that can only be generated by ambiguous grammars.

However, the first class is an empty set: every context-free language can be generated by an ambiguous grammar. Suppose $G$ is a context-free grammar generating the language $L$. If $G$ contains the production $\sigma\to\sigma$, then $G$ is ambiguous, for any leftmost derivation $\sigma\lx@stackrel{{\scriptstyle*}}{{\to}}w$ of a word $w$ can be lengthened to a leftmost derivation $\sigma\to\sigma\lx@stackrel{{\scriptstyle*}}{{\to}}w$. If $G$ does not contain $\sigma\to\sigma$, the grammar $G^{\prime}$ obtained by adding the production $\sigma\to\sigma$ to $G$ generates $L$ as well, and is ambiguous as we have just shown.

The other two classes are formally defined as follows:

Definition. A context-free language is unambiguous if it can be generated by an unambiguous grammar. Otherwise, it is said to be inherently ambiguous.

It can be shown that any regular language is unambiguous, and so is any deterministic context-free language (a language generated by a deterministic pushdown automaton). In addition, the intersection as well as the difference of a unambiguous context-free language with a regular langauge is unambiguous and context-free.

Nevertheless, inherently ambiguous languages do exist. Several explicit examples can be found in Ginsburg, one of which is the union of two context-free languages $\{a^{i}b^{i}c^{j}\mid i,j\geq 1\}$ and $\{a^{i}b^{j}c^{j}\mid i,j\geq 1\}$.

Remark. It is undecidable whether a context-free language over at least two symbols is inherently ambiguous.

## References

• 1 S. Ginsburg, The Mathematical Theory of Context-Free Languages, McGraw-Hill, New York (1966).
• 2 D. C. Kozen, Automata and Computability, Springer, New York (1997).
Title ambiguous grammar AmbiguousGrammar 2013-03-22 18:54:57 2013-03-22 18:54:57 CWoo (3771) CWoo (3771) 14 CWoo (3771) Definition msc 68Q42 msc 68Q45 DeterministicPushdownAutomaton ambiguous inherently ambiguous unambiguous