Parikh’s theorem


Let Σ be an alphabet. Fix an order on elements of Σ, so that they are a1,,an. For each word w over Σ, we can form an n-tuple (w1,,wn) so that each wi is the number of occurrences of ai in w. The function Ψ:Σ*n such that

Ψ(w)=(w1,,wn)

is called the Parikh mapping (over the alphabet Σ). Here, is the set of non-negative integers.

Two languagesPlanetmathPlanetmath L1 and L2 over some Σ are called letter-equivalent if Ψ(L1)=Ψ(L2).

For example, L1:={anbnn0} is letter-equivalent to the L2:={(ab)nn0}, as both are mapped to the set {(n,n)n0} by Ψ. Note that L1 is context-free, while L2 is regularPlanetmathPlanetmathPlanetmath. In fact, we have the following:

Theorem 1 (Parikh).

Every context-free language is letter-equivalent to a regular language.

Like the pumping lemmaPlanetmathPlanetmath, Parikh’s theorem can be used to show that certain languages are not context-free. But in order to apply Parikh’s theorem above, one needs to know more about the structureMathworldPlanetmath of the set Ψ(L) for a given context-free language L.

First, note that n is a commutative monoid under addition. For any element x and subset S of n, define x+S in the obvious manner. Next, let x1,,xm be elements of n. Denote

x1,,xm

the submonoid generated by x1,,xm.

A subset S of n is said to be linear if it has the form

x0+x1,,xm,

for some x0,,xmn. It is not hard to see that every linear set is the image of some regular language under Ψ. For example,

(1,0,0)+(2,1,0),(0,3,2)=Ψ({a(a2b)m(b3c2)nm,n0}).

The example can be easily generalized.

A subset of n is said to be semilinear if it is the union of finitely many linear sets. Since regular languages are closed under union, every semilinear set is the image of a regular language under Ψ. As a result, Parikh’s theorem can be restated as

Theorem 2 (Theorem 1 restated).

If L is context-free, then Ψ(L) is semilinear.

Using this theorem, one easily sees that the language {app is a prime number } is not context-free, and neither is the language

{ambn either m is prime, and nm, or n is prime, and mn},

which can not be proved by the pumping lemma in a straightforward manner.

Remarks.

  • The converseMathworldPlanetmath of Theorem 2 above is false. For example, let L={anbncnn0}. Then L is not context-free by the pumping lemma. However, Ψ(L)=(1,1,1), which is clearly linear.

  • In the literature, a lanugage L such that Ψ(L) is semilinear is often called a slip-language. It can be shown that for a slip-language L over an alphabet consisting of two symbols, the language Ψ-1Ψ(L) is context-free.

References

  • 1 H.R. Lewis, C.H. Papadimitriou, Elements of the Theory of Computation. Prentice-Hall, Englewood Cliffs, New Jersey (1981).
  • 2 D. C. Kozen, Automata and Computability, Springer, New York (1997).
  • 3 R.J. Parikh, On Context-Free Languages, Journal of the Association for Computing Machinery, 4 (1966), 570-581.
  • 4 M. Latteux, Cônes rationnels commutatifs, J. Comput. Systems Sci. 18 (1979), 307-333.
Title Parikh’s theorem
Canonical name ParikhsTheorem
Date of creation 2013-03-22 18:55:49
Last modified on 2013-03-22 18:55:49
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 9
Author CWoo (3771)
Entry type Theorem
Classification msc 68Q42
Classification msc 68Q45
Defines Parikh mapping
Defines linear set
Defines semilinear set
Defines letter-equivalent
Defines slip-language