Parikh’s theorem
Let $\mathrm{\Sigma}$ be an alphabet. Fix an order on elements of $\mathrm{\Sigma}$, so that they are ${a}_{1},\mathrm{\dots},{a}_{n}$. For each word $w$ over $\mathrm{\Sigma}$, we can form an $n$tuple $({w}_{1},\mathrm{\dots},{w}_{n})$ so that each ${w}_{i}$ is the number of occurrences of ${a}_{i}$ in $w$. The function $\mathrm{\Psi}:{\mathrm{\Sigma}}^{*}\to {\mathbb{N}}^{n}$ such that
$$\mathrm{\Psi}(w)=({w}_{1},\mathrm{\dots},{w}_{n})$$ 
is called the Parikh mapping (over the alphabet $\mathrm{\Sigma}$). Here, $\mathbb{N}$ is the set of nonnegative integers.
Two languages^{} ${L}_{1}$ and ${L}_{2}$ over some $\mathrm{\Sigma}$ are called letterequivalent if $\mathrm{\Psi}({L}_{1})=\mathrm{\Psi}({L}_{2})$.
For example, ${L}_{1}:=\{{a}^{n}{b}^{n}\mid n\ge 0\}$ is letterequivalent to the ${L}_{2}:=\{{(ab)}^{n}\mid n\ge 0\}$, as both are mapped to the set $\{(n,n)\mid n\ge 0\}$ by $\mathrm{\Psi}$. Note that ${L}_{1}$ is contextfree, while ${L}_{2}$ is regular^{}. In fact, we have the following:
Theorem 1 (Parikh).
Every contextfree language is letterequivalent to a regular language.
Like the pumping lemma^{}, Parikh’s theorem can be used to show that certain languages are not contextfree. But in order to apply Parikh’s theorem above, one needs to know more about the structure^{} of the set $\mathrm{\Psi}(L)$ for a given contextfree language $L$.
First, note that ${\mathbb{N}}^{n}$ is a commutative monoid under addition. For any element $x$ and subset $S$ of ${\mathbb{N}}^{n}$, define $x+S$ in the obvious manner. Next, let ${x}_{1},\mathrm{\dots},{x}_{m}$ be elements of ${\mathbb{N}}^{n}$. Denote
$$\u27e8{x}_{1},\mathrm{\dots},{x}_{m}\u27e9$$ 
the submonoid generated by ${x}_{1},\mathrm{\dots},{x}_{m}$.
A subset $S$ of ${\mathbb{N}}^{n}$ is said to be linear if it has the form
$${x}_{0}+\u27e8{x}_{1},\mathrm{\dots},{x}_{m}\u27e9,$$ 
for some ${x}_{0},\mathrm{\dots},{x}_{m}\in {\mathbb{N}}^{n}$. It is not hard to see that every linear set is the image of some regular language under $\mathrm{\Psi}$. For example,
$$(1,0,0)+\u27e8(2,1,0),(0,3,2)\u27e9=\mathrm{\Psi}(\{a{({a}^{2}b)}^{m}{({b}^{3}{c}^{2})}^{n}\mid m,n\ge 0\}).$$ 
The example can be easily generalized.
A subset of ${\mathbb{N}}^{n}$ is said to be semilinear if it is the union of finitely many linear sets. Since regular languages are closed under union, every semilinear set is the image of a regular language under $\mathrm{\Psi}$. As a result, Parikh’s theorem can be restated as
Theorem 2 (Theorem 1 restated).
If $L$ is contextfree, then $\mathrm{\Psi}\mathit{}\mathrm{(}L\mathrm{)}$ is semilinear.
Using this theorem, one easily sees that the language $\{{a}^{p}\mid p\text{is a prime number}\}$ is not contextfree, and neither is the language
$$\{{a}^{m}{b}^{n}\mid \text{either}m\text{is prime, and}n\ge m,\text{or}n\text{is prime, and}m\ge n\},$$ 
which can not be proved by the pumping lemma in a straightforward manner.
Remarks.

•
The converse^{} of Theorem 2 above is false. For example, let $L=\{{a}^{n}{b}^{n}{c}^{n}\mid n\ge 0\}$. Then $L$ is not contextfree by the pumping lemma. However, $\mathrm{\Psi}(L)=\u27e8(1,1,1)\u27e9$, which is clearly linear.

•
In the literature, a lanugage $L$ such that $\mathrm{\Psi}(L)$ is semilinear is often called a sliplanguage. It can be shown that for a sliplanguage $L$ over an alphabet consisting of two symbols, the language ${\mathrm{\Psi}}^{1}\circ \mathrm{\Psi}(L)$ is contextfree.
References
 1 H.R. Lewis, C.H. Papadimitriou, Elements of the Theory of Computation. PrenticeHall, Englewood Cliffs, New Jersey (1981).
 2 D. C. Kozen, Automata and Computability, Springer, New York (1997).
 3 R.J. Parikh, On ContextFree Languages, Journal of the Association for Computing Machinery, 4 (1966), 570581.
 4 M. Latteux, Cônes rationnels commutatifs, J. Comput. Systems Sci. 18 (1979), 307333.
Title  Parikh’s theorem 
Canonical name  ParikhsTheorem 
Date of creation  20130322 18:55:49 
Last modified on  20130322 18:55:49 
Owner  CWoo (3771) 
Last modified by  CWoo (3771) 
Numerical id  9 
Author  CWoo (3771) 
Entry type  Theorem 
Classification  msc 68Q42 
Classification  msc 68Q45 
Defines  Parikh mapping 
Defines  linear set 
Defines  semilinear set 
Defines  letterequivalent 
Defines  sliplanguage 