# Kleene’s theorem

###### Theorem 1 (Kleene).

A language over an alphabet is regular iff it can be accepted by a finite automaton.

The best way to prove this theorem is to visualize a finite automaton as a directed graph (its state diagram). We prove sufficiency and necessity separately.

###### Lemma 1.

Every regular language can be accepted by a finite automaton.

###### Proof.

This is done inductively. We begin with atomic languages, and work our way up. First, note that $\varnothing$, $\{\lambda\}$, $\{a\}$ where $a\in\Sigma$ are accepted by the following finite automata respectively:

Next, suppose that $L_{1}$ and $L_{2}$ are regular languages, accepted by $A_{1}$ and $A_{2}$ respectively. Then

• $L_{1}L_{2}$ is accepted by the finite automaton $A_{1}A_{2}$, the juxtaposition of automata $A_{1}$ and $A_{2}$;

• $L_{1}^{*}$ is accepted by the finite automaton $A_{1}^{*}$, the Kleene star of automaton (http://planetmath.org/KleeneStarOfAnAutomaton) $A_{1}$; and

• $L_{1}\cup L_{2}$ is accepted by $A_{1}\cup A_{2}$, the union (http://planetmath.org/UnionOfAutomata) of $A_{1}$ and $A_{2}$.

Since a regular language can only be built up this way, the proof is complete. ∎

###### Lemma 2.

Every language accepted by a finite automaton is regular.

Note that, using a state diagram of a finite automaton $A$, a word accepted by $A$ is the label of a path that begins from one of the starting nodes and ends in one of the final nodes. In the proof, we build a sequence of languages

 $L_{1}\subseteq L_{2}\subseteq\cdots\subseteq L_{n}$

such that $L_{1}$ is regular, and each $L_{i+1}$ is constructed from $L_{i}$ by one of the regular operations (union, concatenation, and Kleene star), so that in the end, $L_{n}$ is also regular. Finally, we show that $L(A)$ is a finite union of such $L(n)$’s, and therefore $L(A)$ is regular too.

###### Proof.

Index the states of a finite automaton $A$ by positive integers: $q_{1},\ldots,q_{m}$. Suppose $1\leq i,j,k\leq m$. Define the language $L(i,j,k)$ as follows: a word is in $L(i,j,k)$ iff it is the label of a path from $q_{i}$ to $q_{j}$ such that each of the intermediate nodes on the path has an index at most $k$. Now, fix $i,j$, and do induction on $k$.

1. 1.

$L(i,j,0)$ is regular for any $i,j$. This is so because a word in $L(i,j,0)$ (if it is non-empty) has length at most 1, so that it is either in $\Sigma$ or $\lambda$. In any case, $L(i,j,0)$ is regular.

2. 2.

If $L(i,j,k)$ is regular, so is $L(i,j,k+1)$. Let $p$ be a path that begins at $q_{i}$ and ends at $q_{j}$ such that all of its intermediate nodes have indices no more than $k+1$. Then either none of the nodes have indices more than $k$, or $q_{k+1}$ is an intermediate node. In the later case, $p$ begins at $q_{i}$, reaches $q_{k+1}$ for the first time, leaves $q_{k+1}$, may or may not return to $q_{k+1}$, until its last return (if it does return), and ends at $q_{j}$. What we have just described can be put into an identity:

 $L(i,j,k+1)=L(i,j,k)\cup L(i,k+1,k)L(k+1,k+1,k)^{*}L(k+1,j,k).$

Since each of the languages on the right hand side is regular (as the last component is $k$ in each case), the language on the left hand side is reqular (as it is obtained by performing regular operations on regular languages).

In particular, $L(i,j,m)$ is regular. Since $L(A)=\bigcup\{L(i,j,m)\mid q_{i}\in I,q_{j}\in F\}$, a finite union of regular languages, $L(A)$ is regular as well. ∎

Remark. Kleene’s theorem can be generalized in at least two ways. By realizing that an automaton is just a directed graph whose edges are labeled by elements of $\Sigma^{*}$, a finitely generated free monoid, one may define a finite automaton over an arbitrary monoid. For more details, see this entry (http://planetmath.org/AutomatonOverAMonoid). The other is to generalize the notion of a word from finite to infinite, and modify the notion of acceptance so infinite words may be accepted.

Title Kleene’s theorem KleenesTheorem 2013-03-22 19:01:37 2013-03-22 19:01:37 CWoo (3771) CWoo (3771) 7 CWoo (3771) Theorem msc 03D10 msc 68Q05 msc 68Q42 RegularLanguage