star-free


Let ℛ and 𝒢⁢ℛ be the families of languagesPlanetmathPlanetmath represented by regular expressionsMathworldPlanetmath and generalized regular expressions respectively. It is well-known that

ℛ=𝒢⁢ℛ.

This means that the additional symbols ∩ and ¬ in a generalized regular expressions are extraneous: removing them will not affect the representational power of the expressions with respect to regular languages.

What if we remove *, the Kleene star symbol, instead? With regard to regular expressions, removing * severely limits the power of the expressions. By inductionMathworldPlanetmath, the represented languages are all finite. In this entry, we briefly discuss what happens when * is removed from the generalized regular expressions.

Definition. Let Σ be an alphabet. A language L over Σ is said to be star-free if it can be expressed by a generalized regular expression without *. In other words, a star-free language is a language that can be obtained by applying the operationsMathworldPlanetmath of union, concatenationMathworldPlanetmath, and complementation to ∅ and atomic languages (singleton subsets of Σ) a finite number of times.

If we denote 𝒮⁢ℱ the family of star-free languages (over some alphabet Σ), then 𝒮⁢ℱ is the smallest set of languages over Σ such that

  • •

    ∅∈𝒮⁢ℱ,

  • •

    {a}∈𝒮⁢ℱ for any a∈Σ,

  • •

    if L,M∈𝒮⁢ℱ, then L∪M,L⁢M,Lc∈𝒮⁢ℱ.

A shorter characterizationMathworldPlanetmath of a star-free language is a language with star height 0 with respect to representations by generalized regular expressions.

In relationsMathworldPlanetmathPlanetmath to finite and regular languages, we have the following:

ℱ⊆𝒮⁢ℱ⊆ℛ, (1)

where ℱ denotes the family of finite languages over Σ.

Furthermore, it is easy to see that 𝒮⁢ℱ is closed under Boolean operations, so that 𝒮⁢ℱ contains infiniteMathworldPlanetmath languages, for example ¬⁢∅ represents Σ*. As a result, the first inclusion must be strict. This example also shows that languages representable by expressions including the Kleene star may still be star-free. Here’s another example: {a⁢b}* over the alphabet {a,b}. This language can be represented as

λ∪(a⁢b⁢Σ*∩Σ*⁢a⁢b∩¬⁡(Σ*⁢a2⁢Σ*)∩¬⁡(Σ*⁢b2⁢Σ*))

The expression above, of course, is not star-free, and includes the symbol λ representing the empty wordPlanetmathPlanetmathPlanetmath. However, Σ* is just ¬⁢∅, and λ is just Σ*∩¬⁡(a⁢Σ*)∩¬⁡(b⁢Σ*). Some substitutions show that {a⁢b}* is indeed star-free.

Is the second inclusion strict? Are there regular languages such that representations by expressions including the Kleene star is inevitable? The following propositionPlanetmathPlanetmathPlanetmath answers the question:

Proposition 1.

A language L is star-free iff there exists a non-negative integer n such that, for any words u,v,w over Σ, u⁢vn⁢w∈L iff u⁢vn+1⁢w∈L.

A language satisfying the second statement in the proposition is known as noncounting, so the proposition can be restated as: a language is star-free iff it is noncounting.

As a result of this fact, we see that languages such as {(a⁢b)2}* is not star-free, although it is regularPlanetmathPlanetmath. Indeed, if we pick u=v=w=a⁢b as in the statement of the proposition above, we see that u⁢vn⁢w is in the language iff u⁢vn+1⁢w is not in the language, for any n≥0. Therefore, the second inclusion in chain (1) above is also strict.

The above proposition also strengthens chain (1): denote by 𝒯⁢(∞) the family of locally testable languages, then

𝒯⁢(∞)⊂𝒮⁢ℱ⊂ℛ. (2)

The first inclusion is due to the fact that, for any k-testable language L (over some Σ), we have

swk⁡(u⁢vk⁢w)=swk⁡(u⁢vk+1⁢w)

(the definition of swk⁡(u) is found in the entry on locally testable languages). Note the first inclusion is also strict. For example, the language represented by (a⁢b)*∪(b⁢a)* is star-free but not locally testable.

References

  • 1 A. Ginzburg, Algebraic Theory of Automata, Academic Press (1968).
  • 2 A. Salomaa, Formal LanguagesMathworldPlanetmath, Academic Press, New York (1973).
Title star-free
Canonical name Starfree
Date of creation 2013-03-22 18:59:05
Last modified on 2013-03-22 18:59:05
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 16
Author CWoo (3771)
Entry type Definition
Classification msc 68Q42
Classification msc 68Q45
Synonym star free
Synonym non-counting
Synonym noncounting
Related topic StarHeight