<?xml version="1.0" encoding="UTF-8"?>

<record version="11" id="11765">
 <title>ambiguous grammar</title>
 <name>AmbiguousGrammar</name>
 <created>2009-05-07 18:00:09</created>
 <modified>2009-08-25 19:24:55</modified>
 <type>Definition</type>
<parent id="11763">leftmost derivation</parent>
 <creator id="3771" name="CWoo"/>
 <author id="3771" name="CWoo"/>
 <classification>
	<category scheme="msc" code="68Q45"/>
	<category scheme="msc" code="68Q42"/>
 </classification>
 <defines>
	<concept>ambiguous</concept>
	<concept>inherently ambiguous</concept>
	<concept>unambiguous</concept>
 </defines>
 <related>
	<object name="DeterministicPushdownAutomaton"/>
 </related>
 <preamble>\usepackage{amssymb,amscd}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{mathrsfs}

% used for TeXing text within eps files
%\usepackage{psfrag}
% need this for including graphics (\includegraphics)
\usepackage{graphicx}
% for neatly defining theorems and propositions
\usepackage{amsthm}
% making logically defined graphics
\usepackage{xypic}
\usepackage{pst-plot}

% define commands here
\newcommand*{\abs}[1]{\left\lvert #1\right\rvert}
\newtheorem{prop}{Proposition}
\newtheorem{thm}{Theorem}
\newtheorem{ex}{Example}
\newcommand{\real}{\mathbb{R}}
\newcommand{\pdiff}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\mpdiff}[3]{\frac{\partial^#1 #2}{\partial #3^#1}}</preamble>
 <content>Let $G=(\Sigma, N, P, \sigma)$ be a context-free grammar, and $L(G)$ the language generated by it.  Since every word in $L(G)$ has at least one leftmost derivation, let us only consider leftmost derivations of words.

\textbf{Definition}.  $G$ is said to be \emph{unambiguous} if every word in $L(G)$ has exactly one leftmost derivation.  Otherwise, $G$ is said to be \emph{ambiguous}.

For an example of an ambiguous grammar, let $G$ be the grammar with terminal symbols $a,b$, non-terminal symbols $\sigma,X$, and productions $\sigma\to a$, $\sigma \to ab\sigma b$, $\sigma \to aXb$, and $X\to b\sigma$.  Every production has a single non-terminal on its left-hand side, so $G$ is context-free.  The word $abab$ has two leftmost derivations:
\begin{itemize}
\item $\sigma\to ab\sigma b \to abab$, and
\item $\sigma\to aXb \to ab\sigma b \to abab$.
\end{itemize}
These correspond to the following derivation trees.

\begin{figure}[htp]
\centering
\includegraphics[scale=1]{tree.eps}
\end{figure}

Hence $G$ is ambiguous.

An example of an unambiguous grammar can be found when we convert the language formation rules of the classical propositional logic to a formal grammar.  This grammar is context-free.  By unique readability of well-formed formulas (words in this language), we conclude that the grammar is unambiguous.
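
As a rough sketch, assume the formation rules use full parenthesization with $\neg$ and $\vee$ as the only connectives and, for simplicity, only two propositional variables $p$ and $q$ (these choices are assumptions made for illustration).  The corresponding grammar has terminals $p, q, (, ), \neg, \vee$, non-terminals $\sigma, V$, and productions
\[ \sigma \to V, \qquad \sigma \to (\neg \sigma), \qquad \sigma \to (\sigma \vee \sigma), \qquad V \to p, \qquad V \to q. \]
The parentheses matter: if $\sigma \to (\sigma\vee\sigma)$ were replaced by $\sigma \to \sigma \vee \sigma$, the word $p\vee p\vee p$ would have the two leftmost derivations
\begin{itemize}
\item $\sigma \to \sigma\vee\sigma \to V \vee \sigma \to p\vee\sigma \to p\vee\sigma\vee\sigma \to p \vee V \vee \sigma \to p\vee p \vee \sigma \to p\vee p\vee V \to p\vee p\vee p$, and
\item $\sigma \to \sigma\vee\sigma \to \sigma\vee\sigma\vee\sigma \to V\vee\sigma\vee\sigma \to p\vee\sigma\vee\sigma \to p\vee V\vee\sigma \to p\vee p\vee\sigma \to p\vee p\vee V \to p\vee p\vee p$,
\end{itemize}
so the modified grammar would be ambiguous.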

\textbf{Remarks}.  
\begin{itemize}
\item A grammar $G$ is unambiguous iff every word in $L(G)$ corresponds to a unique derivation tree, since every derivation tree corresponds to a unique leftmost derivation.
\item It is undecidable whether a given context-free grammar is ambiguous, unless the grammar has only one terminal symbol.
\end{itemize}
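
To illustrate why leftmost derivations (equivalently, derivation trees) are counted rather than arbitrary derivations, consider the hypothetical grammar, not taken from the references, with productions $\sigma\to XY$, $X\to a$, and $Y\to b$.  The word $ab$ has two derivations,
\[ \sigma\to XY\to aY\to ab \qquad\mbox{and}\qquad \sigma\to XY\to Xb\to ab, \]
but only the first is leftmost, and both correspond to the same derivation tree.  The grammar is therefore unambiguous, even though $ab$ has more than one derivation.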

The concept of ambiguity can be carried over to context-free languages.  Since every context-free language can be generated by many context-free grammars, some of which may be ambiguous, while others may not be, there are \emph{potentially} three classes of context-free languages:
\begin{enumerate}
\item those that can only be generated by unambiguous grammars,
\item those that can be generated by ambiguous, as well as unambiguous grammars,
\item those that can only be generated by ambiguous grammars.
\end{enumerate}

However, the first class contains only the empty language, whose grammars are vacuously unambiguous: every nonempty context-free language can be generated by an ambiguous grammar.  Suppose $G$ is a context-free grammar generating a nonempty language $L$.  If $G$ contains the production $\sigma\to \sigma$, then $G$ is ambiguous, for any leftmost derivation $\sigma \stackrel{*}{\to} w$ of a word $w\in L$ can be lengthened to a different leftmost derivation $\sigma \to \sigma \stackrel{*}{\to} w$.  If $G$ does not contain $\sigma\to \sigma$, the grammar $G'$ obtained by adding the production $\sigma \to \sigma$ to $G$ generates $L$ as well, and is ambiguous as we have just shown.
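
For instance, take the (illustrative) grammar with the single production $\sigma\to a$, which generates $\lbrace a\rbrace$ and is unambiguous.  Adding $\sigma\to\sigma$ leaves the language unchanged, but the word $a$ now has the leftmost derivations
\[ \sigma\to a, \qquad \sigma\to\sigma\to a, \qquad \sigma\to\sigma\to\sigma\to a, \qquad \ldots \]
so it has infinitely many of them.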

The other two classes are formally defined as follows:

\textbf{Definition}.  A context-free language is \emph{unambiguous} if it can be generated by an unambiguous grammar.  Otherwise, it is said to be \emph{inherently ambiguous}.
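
As a worked illustration, let $G$ be the ambiguous grammar with productions $\sigma\to a$, $\sigma\to ab\sigma b$, $\sigma\to aXb$, and $X\to b\sigma$ from the example above.  Since $aXb$ can only be rewritten to $ab\sigma b$, the production $\sigma\to aXb$ contributes no new words, and
\[ L(G) = \lbrace (ab)^n a b^n \mid n\ge 0\rbrace. \]
Let $H$ be the grammar, introduced here for illustration, with only the productions $\sigma\to a$ and $\sigma\to ab\sigma b$.  It generates the same language, and each word $(ab)^n a b^n$ has exactly one derivation in $H$, namely
\[ \sigma \to ab\sigma b \to \cdots \to (ab)^n\sigma b^n \to (ab)^n a b^n, \]
because the number $n$ of applications of $\sigma\to ab\sigma b$ is determined by the length $3n+1$ of the word.  Hence $H$ is unambiguous, so $L(G)$ is an unambiguous language even though $G$ is an ambiguous grammar.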

It can be shown that any regular language is unambiguous, and so is any deterministic context-free language (a language accepted by a deterministic pushdown automaton).  In addition, the intersection as well as the difference of an unambiguous context-free language with a regular language is unambiguous and context-free.

Nevertheless, inherently ambiguous languages do exist.  Several explicit examples can be found in Ginsburg \cite{sg}, one of which is the union of the two context-free languages $\lbrace a^ib^ic^j\mid i,j\ge 1\rbrace$ and $\lbrace a^ib^jc^j\mid i,j\ge 1\rbrace$.
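
To see where the ambiguity comes from, consider one natural grammar for this union (the non-terminals $X, C, A, Y$ are names chosen here for illustration), with productions
\[ \sigma\to XC,\quad X\to aXb,\quad X\to ab,\quad C\to cC,\quad C\to c, \]
\[ \sigma\to AY,\quad A\to aA,\quad A\to a,\quad Y\to bYc,\quad Y\to bc. \]
The word $abc$ lies in both languages and has the two leftmost derivations
\begin{itemize}
\item $\sigma\to XC\to abC\to abc$, and
\item $\sigma\to AY\to aY\to abc$.
\end{itemize}
More generally, every word $a^nb^nc^n$ with $n\ge 1$ has two leftmost derivations in this grammar.  This shows only that this particular grammar is ambiguous; that every grammar generating the union is ambiguous requires a separate argument, for which see Ginsburg's book in the references.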

\textbf{Remark}.  It is undecidable whether a context-free language over at least two symbols is inherently ambiguous.

\begin{thebibliography}{9}
\bibitem{sg} S. Ginsburg, {\em The Mathematical Theory of Context-Free Languages}, McGraw-Hill, New York (1966).
\bibitem{dk} D. C. Kozen, {\em Automata and Computability}, Springer, New York (1997).
\end{thebibliography}</content>
</record>
