derivation language
Background
Let $G=(\mathrm{\Sigma},N,P,\sigma )$ be a formal grammar. A pair $({W}_{1},{W}_{2})$ of words over $\mathrm{\Sigma}$ is said to correspond to the production $p\to q$ if
$${W}_{1}={u}_{1}p{v}_{1}\mathit{\hspace{1em}\hspace{1em}}\text{and}\mathit{\hspace{1em}\hspace{1em}}{W}_{2}={u}_{2}q{v}_{2}$$ 
for some words ${u}_{i},{v}_{j}$ over $\mathrm{\Sigma}$. We also say that $({W}_{1},{W}_{2})$ is a derivation step, and write ${W}_{1}\to {W}_{2}$.
Recall that a derivation in $G$ is a finite sequence^{} of words
$${W}_{1},{W}_{2},\mathrm{\dots},{W}_{n}$$ 
over $\mathrm{\Sigma}$ such that ${W}_{i}\Rightarrow {W}_{i+1}$, for $i=1,\mathrm{\dots},n1$. The derivation is also written
$${W}_{1}\Rightarrow {W}_{2}\Rightarrow \mathrm{\cdots}\Rightarrow {W}_{n}.$$ 
The derivation above has $n1$ steps. Zerostep derivations are also permitted. These are just words over $\mathrm{\Sigma}$.
The reflexive transitive closure of $\Rightarrow $ is ${\Rightarrow}^{*}$. Thus, $V{\Rightarrow}^{*}W$ means that there is a derivation starting with $V$ and ending with $W$. There may be more than one derivation from $V$ to $W$.
If we consider each production as a “letter” in the alphabet $P$, then the above derivation can be represented by a “word” over $P$ in the following manner: the “word” is formed by taking concatenations^{} of the “letters”, where concatenations correspond to successive applications of productions in $P$:
$$[{p}_{1}\to {q}_{1}][{p}_{2}\to {q}_{2}]\mathrm{\cdots}[{p}_{n1}\to {q}_{n1}].$$ 
Derivation language is thus a certain collection^{} of derivation words, formally defined below.
Definitions
Treating productions as “letters” is really nothing more than labeling each production with some symbol. Formally, call a labeling of an alphabet $P$ a surjection $f:F\to P$, where $F$ is some alphabet. For any $p\in P$, a label for $p$ is an element $x\in F$ such that $f(x)=p$. We will only be interested in injective^{} labeling (hence a bijection) from now on.
Definition. Suppose we are given a labeling $f$ of the set $P$ of productions in the grammar $G$. Given a derivation $D:{W}_{1}\Rightarrow {W}_{2}\Rightarrow \mathrm{\dots}\Rightarrow {W}_{n}$, a derivation word $U$ for $D$ is defined inductively as follows:

1.
if $n=1$, then the empty word^{} $U:=\lambda $,

2.
if $n=2$, then $U:=x\in F$ is a label for a production that ${W}_{1}\Rightarrow {W}_{2}$ corresponds to,

3.
if $n>2$, then $U:={U}_{1}{U}_{2}$, where

–
${U}_{1}$ is a derivation word for the derivation ${W}_{1}\Rightarrow \mathrm{\cdots}\Rightarrow {W}_{i}$,

–
${U}_{2}$ is a derivation word for the derivation ${W}_{i}\Rightarrow \mathrm{\cdots}\Rightarrow {W}_{n}$.

–
If $U$ is a derivation word for derivation $D$, let us abbreviate this by writing $f[U]=D$. Note that we are not applying the labeling $f$ to $U$, it is merely a notational convenience.
A derivation word is sometimes called a control word, for it defines or controls whether and how a word may be derived from another word. Note that any ${W}_{1}{\Rightarrow}^{*}{W}_{2}$ may correspond to several distinct derivations, and hence several distinct derivation words. Also, the same derivation word may correspond to distinct derivations.
For example, let $G$ be a grammar over two symbols $($ and $)$ with productions $\sigma \to \lambda $, $\sigma \to (\sigma )$, and $\sigma \to \sigma \sigma $ ($G$ generates the parenthesis language ${\mathrm{\mathbf{P}\mathbf{a}\mathbf{r}\mathbf{e}\mathbf{n}}}_{\mathrm{\U0001d7cf}}$) Label the productions as $a,b,c$ respectively. Then the derivation $\sigma {\Rightarrow}^{*}(()())$ correspond to derivation words $bcbbaa$ and $bcbaba$. Notice that $\sigma \sigma \Rightarrow (\sigma )\sigma $ and $\sigma \sigma \Rightarrow \sigma (\sigma )$ both correspond to the derivation word $b$.
Definition. The derivation language of a grammar $G=(\mathrm{\Sigma},N,P,\sigma )$ is the set of all derivation words for derivations on words generated by $G$. In other words, consider the labeling $f:F\to P$. The derivation language of $G$ is the set
$$\{U\in {F}^{*}\mid f[U]\text{is a derivation of the form}\sigma {\Rightarrow}^{*}u\text{for some}u\in {N}^{*}\}.$$ 
The derivation language of $G$ is also called the Szilard language of $G$, named after its inventor, and is denoted by $\mathrm{Sz}(G)$.
For example, let $G$ be the grammar over a the letter $a$, with productions given by $\sigma \to \sigma $, $\sigma \to a$. If the productions are labeled $b,c$, then $\mathrm{Sz}(G)=\{{b}^{n}c\mid n\ge 0\}$. Note that $L(G)=\{a\}$. Likewise, if ${G}^{\prime}$ is the grammar over $a$, with productions $\sigma \to A\sigma $, $A\to \lambda $, and $\sigma \to a$, labeled $p,q,r$ respectively, then $L({G}^{\prime})=\{a\}$. However, $\mathrm{Sz}({G}^{\prime})$ is quite different from $\mathrm{Sz}(G)$:
$\mathrm{Sz}({G}^{\prime})$  $=\{u\in {F}^{*}\mid $  $u=vrw\text{, where}v\in {\{p,q\}}^{*},w\in {\{q\}}^{*},$  
${\mathrm{\#}}_{u}(p)={\mathrm{\#}}_{u}(q)\text{and}{\mathrm{\#}}_{x}(p)\ge {\mathrm{\#}}_{x}(q)\text{for all}x\le u\}$ 
where

•
$F=\{p,q,r\}$,

•
${\mathrm{\#}}_{u}(r)$ means the number of occurrences of $r$ in word $u$,

•
$v\le u$ means that $v$ is a prefix of $u$.
Remarks. Let $G$ be a formal grammar.

•
Some properties of $\mathrm{Sz}(G)$:

(a)
$\mathrm{Sz}(G)$ is always contextsensitive.

(b)
If $G$ is regular^{}, so is $\mathrm{Sz}(G)$.

(c)
if $G$ is contextfree, $\mathrm{Sz}(G)$ may not be; in fact, for any contextfree language $L$, there is a contextfree grammar $G$ such that $L=L(G)$ but $\mathrm{Sz}(G)$ is not contextfree.

(d)
There exists a contextfree language $L$ such that $\mathrm{Sz}(G)$ is not contextfree for any grammar $G$ generating $L$.

(a)

•
However, if one considers the subset ${\mathrm{Sz}}_{L}(G)$ of $\mathrm{Sz}(G)$ consisting of all derivation words corresponding to leftmost derivations, then ${\mathrm{Sz}}_{L}(G)$ is contextfree.

•
It is an open question that, given any (contextsensitive) language^{} $L$, whether there is a grammar $G$ such that $L=\mathrm{Sz}(G)$.
References
 1 A. Salomaa, Formal Languages, Academic Press, New York (1973).
 2 G. E. Révész, Introduction to Formal Languages, Dover Publications (1991).
Title  derivation language 

Canonical name  DerivationLanguage 
Date of creation  20130322 18:58:15 
Last modified on  20130322 18:58:15 
Owner  CWoo (3771) 
Last modified by  CWoo (3771) 
Numerical id  13 
Author  CWoo (3771) 
Entry type  Definition 
Classification  msc 68Q42 
Classification  msc 68Q45 
Synonym  Szilard language 