Let Σ1,Σ2 be alphabets. A substitution, or string substitution, is a function s:Σ1*P(Σ2*) such that

  • s preserves the empty wordPlanetmathPlanetmathPlanetmath: s(λ)={λ}, and

  • s preserves concatenationMathworldPlanetmath: s(αβ)=s(α)s(β).

In other words, for every word α over Σ1, s(α) is a languagePlanetmathPlanetmath over Σ2. In the second condition above, s(α)s(β) is the concatenation of languages: {uvus(α),vs(β)}.

For example, suppose Σ={a,b}. The map s taking u to {u}, where u is obtained from u by replacing every occurrence of a by b is a substitution.

One easy way to obtain more examples of substitutions is to start with some function


and extend it to all of Σ1* by language concatenation: if u=a1an, with aiΣ1, defining


gives us a substitution s. It is easy to see that the extensionPlanetmathPlanetmathPlanetmath is unique (if s1 and s2 both extend f, then s1=s2).

In fact, every substitution is obtained this way: every substitution s:Σ1*P(Σ2*) is the extension of its restrictionPlanetmathPlanetmathPlanetmath to Σ1. This can be verified directly, but is the result of a more general fact: any function f:AB, where B is a semigroupPlanetmathPlanetmath, extends uniquely to a semigroup homomorphism f*:A*B where A* is the semigroup freely generated by A.

In the previous example, s is the extension of the function that takes a to {b} and b to {b}.

Closure under Substitution

For any language LΣ1* and a substitution s:Σ1*P(Σ2*), define


A family of languages is said to be closed underPlanetmathPlanetmath substitutions if, given any substitution s, with L and s(w) for each wL, we have s(L). The following families are closed under substitutions:

As a corollary, the families of regularPlanetmathPlanetmath, context-free, and type-0 languages are closed under homomorphismsPlanetmathPlanetmathPlanetmathPlanetmathPlanetmathPlanetmathPlanetmathPlanetmathPlanetmath, since every homomorphism of languages is really just a special case of substitution, such that every symbol of the domain alphabet is mapped to a singleton consisting of a word over the range alphabet.

The family of context-sensitive languages is not closed under general substitutions. Instead, it is closed under λ-free substitutions (see remark below).


  • The notion of string substitution generally corresponds to our intuitive notion of how a substitution should behave:

    given words u,v,w, then Substitute(u,v,w) is a word that is obtained from u by replacing every occurrence of v in u by w.

    However, this is not always the case. For example, let Σ={a,b}, and s be the map that takes u to {u}, where u is obtained from u by replacing all occurrences of aa, if any, by b. Then it is easy to see that s is not a substitution, for




    Nevertheless, s is “intuitively” a “substitution”.

  • A substitution s is said to have property 𝒫 if for each aΣ, the set s(a) has property 𝒫. Thus, for example, a substitution s is finite if s(a) is a finite setMathworldPlanetmath, regular if s(a) is a regular language, and λ-free if each s(a) is λ-free, etc…


  • 1 S. Ginsburg, The Mathematical Theory of Context-Free Languages, McGraw-Hill, New York (1966).
  • 2 D. C. Kozen, Automata and Computability, Springer, New York (1997).
Title substitution
Canonical name Substitution
Date of creation 2013-03-22 18:55:12
Last modified on 2013-03-22 18:55:12
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 20
Author CWoo (3771)
Entry type Definition
Classification msc 68Q45
Synonym string substitution
Related topic HomomorphismOfLanguages
Related topic SubstitutionsInLogic
Defines λ-free substitution
Defines regular substitution