PlanetMath (more info)
 Math for the people, by the people. Sponsor PlanetMath
Encyclopedia | Requests | Forums | Docs | Wiki | Random | RSS  
Login
create new user
name:
pass:
forget your password?
Main Menu
Owner confidence rating: Very high Entry average rating: No information on entry rating
[parent] Parikh's theorem (Theorem)

Let $\Sigma$ be an alphabet. Fix an order on elements of $\Sigma$ , so that they are $a_1,\ldots, a_n$ . For each word $w$ over $\Sigma$ , we can form an $n$ -tuple $(w_1,\ldots, w_n)$ so that each $w_i$ is the number of occurrences of $a_i$ in $w$ . The function $\Psi: \Sigma^* \to \mathbb{N}^n$ such that $$\Psi(w)=(w_1,\ldots, w_n)$$ is called the Parikh mapping (over the alphabet $\Sigma$ ). Here, $\mathbb{N}$ is the set of non-negative integers.

Two languages $L_1$ and $L_2$ over some $\Sigma$ are called letter-equivalent if $\Psi(L_1)=\Psi(L_2)$ .

For example, $L_1:=\lbrace a^nb^n \mid n\ge 0\rbrace$ is letter-equivalent to the $L_2:=\lbrace (ab)^n \mid n\ge 0 \rbrace$ , as both are mapped to the set $\lbrace (n,n)\mid n\ge 0\rbrace$ by $\Psi$ . Note that $L_1$ is context-free, while $L_2$ is regular. In fact, we have the following:

Theorem 1 (Parikh)   Every context-free language is letter-equivalent to a regular language.

Like the pumping lemma, Parikh's theorem can be used to show that certain languages are not context-free. But in order to apply Parikh's theorem above, one needs to know more about the structure of the set $\Psi(L)$ for a given context-free language $L$ .

First, note that $\mathbb{N}^n$ is a commutative monoid under addition. For any element $x$ and subset $S$ of $\mathbb{N}^n$ , define $x+S$ in the obvious manner. Next, let $x_1,\ldots, x_m$ be elements of $\mathbb{N}^n$ . Denote $$\langle x_1,\ldots, x_m \rangle$$ the submonoid generated by $x_1,\ldots, x_m$ .

A subset $S$ of $\mathbb{N}^n$ is said to be linear if it has the form $$x_0 + \langle x_1,\ldots, x_m\rangle,$$ for some $x_0,\ldots, x_m \in \mathbb{N}^n$ . It is not hard to see that every linear set is the image of some regular language under $\Psi$ . For example, $$(1,0,0)+\langle (2,1,0),(0,3,2)\rangle = \Psi(\lbrace a(a^2b)^m (b^3c^2)^n \mid m,n\ge 0\rbrace).$$ The example can be easily generalized.

A subset of $\mathbb{N}^n$ is said to be semilinear if it is the union of finitely many linear sets. Since regular languages are closed under union, every semilinear set is the image of a regular language under $\Psi$ . As a result, Parikh's theorem can be restated as

Theorem 2 (Theorem 1 restated)   If $L$ is context-free, then $\Psi(L)$ is semilinear.

Using this theorem, one easily sees that the language $\lbrace a^p \mid p { is a prime number } \rbrace$ is not context-free, and neither is the language $$\lbrace a^m b^n \mid \mbox{ either }m\mbox{ is prime, and }n\ge m,\mbox{ or }n\mbox{ is prime, and }m\ge n\rbrace,$$ which can not be proved by the pumping lemma in a straightforward manner.

Remarks.

  • The converse of Theorem 2 above is false. For example, let $L=\lbrace a^nb^nc^n \mid n\ge 0\rbrace$ . Then $L$ is not context-free by the pumping lemma. However, $\Psi(L)=\langle (1,1,1)\rangle$ , which is clearly linear.
  • In the literature, a lanugage $L$ such that $\Psi(L)$ is semilinear is often called a slip-language. It can be shown that for a slip-language $L$ over an alphabet consisting of two symbols, the language $\Psi^{-1}\circ \Psi(L)$ is context-free.

Bibliography

1
H.R. Lewis, C.H. Papadimitriou, Elements of the Theory of Computation. Prentice-Hall, Englewood Cliffs, New Jersey (1981).
2
D. C. Kozen, Automata and Computability, Springer, New York (1997).
3
R.J. Parikh, On Context-Free Languages, Journal of the Association for Computing Machinery, 4 (1966), 570-581.
4
M. Latteux, Cônes rationnels commutatifs, J. Comput. Systems Sci. 18 (1979), 307-333.




"Parikh's theorem" is owned by CWoo.
(view preamble | get metadata)

View style:

Also defines:  Parikh mapping, linear set, semilinear set, letter-equivalent, slip-language

This object's parent.
Log in to rate this entry.
(view current ratings)

Cross-references: converse, theorem, closed under, union, image, generated by, submonoid, obvious, subset, addition, commutative monoid, structure, pumping lemma, regular language, context-free language, regular, context-free, languages, integers, function, occurrences, number, elements, order, fix, alphabet
There are 2 references to this entry.

This is version 6 of Parikh's theorem, born on 2009-05-14, modified 2009-05-19.
Object id is 11783, canonical name is ParikhsTheorem.
Accessed 988 times total.

Classification:
AMS MSC68Q45 (Computer science :: Theory of computing :: Formal languages and automata)
 68Q42 (Computer science :: Theory of computing :: Grammars and rewriting systems)

Pending Errata and Addenda
None.
Discussion
Style: Expand: Order:
forum policy

No messages.

Interact
post | correct | update request | prove | add result | add corollary | add example | add (any)