# context-free language

Formally, a context-free grammar is a formal grammar $G=(\Sigma,N,P,\sigma)$ whose productions in $P$ have the form

 $X\to w,$

where $X\in N$ and $w\in\Sigma^{*}$. A context-free language is the formal language $L(G)$ generated by a context-free grammar $G$. In other words,

 $L(G):=\{v\in T^{*}\mid\sigma\lx@stackrel{{\scriptstyle*}}{{\to}}v\},$

where $T:=\Sigma-N$ is the set of terminals, and $\sigma\lx@stackrel{{\scriptstyle*}}{{\to}}v$ simply means that the productions in $P$ are applied a finite number of times to the starting symbol $\sigma$ so that $v$ is reached in the end.

This formal definition is not initially very enlightening. As an example, consider typical infix arithmetic notation. An expression typically consists of one or more terms, joined together by operators with varying precedence. Parentheses can be used to override operator precedence. This syntax is a context-free language. Let us describe such expressions with the operators + and - with lowest precedence, and * and / with highest precedence, where all operators are left-associative and the only other allowed symbols are integer literals  and parentheses.

 $\displaystyle T$ $\displaystyle:=$ $\displaystyle\left\{\verb.0.,\verb.1.,\verb.2.,\verb.3.,\verb.4.,\verb.5.,% \verb.6.,\verb.7.,\verb.8.,\verb.9.,\verb.(.,\verb.).,\verb.+.,\verb.-.,\verb.% *.,\verb./.\right\}$ $\displaystyle N$ $\displaystyle:=$ $\displaystyle\left\{S,A,B,C,D\right\}$ $\displaystyle P$ $\displaystyle:=$ $\displaystyle\{(S,A),(S,S\verb.+.A),(S,S\verb.-.A),$ $\displaystyle(A,B),(A,A\verb.*.B),(A,A\verb./.B),$ $\displaystyle(B,C),(B,\verb.(.S\verb.).),(C,D),(C,CD),$ $\displaystyle(D,\verb.0.),(D,\verb.1.),(D,\verb.2.),(D,\verb.3.),(D,\verb.4.),% (D,\verb.5.),(D,\verb.6.),(D,\verb.7.),(D,\verb.8.),(D,\verb.9.)\}$ $\displaystyle\sigma$ $\displaystyle:=$ $\displaystyle S$

Context-free grammars are also known as Type-2 grammars in the Chomsky hierarchy. The Chomsky hierarchy specifies Type-2 grammars as consisting only of production rules of the form $A\to\gamma$, where $A$ is a non-terminal and $\gamma$ is a string of terminals and non-terminals.

Remarks.

1. 1.
2. 2.

The syntaxes of most programming languages are context-free grammars (or very close to it). A very useful subset of context-free languages are regular languages.

3. 3.

A context-free grammar $G$ is context-sensitive if none of its productions has the form $X\to\lambda$, where $\lambda$ is the empty word   .

4. 4.

## References

• 1 S. Ginsburg The Mathematical Theory of Context-Free Languages. McGraw-Hill, New York (1966).
 Title context-free language Canonical name ContextfreeLanguage Date of creation 2013-03-22 12:26:28 Last modified on 2013-03-22 12:26:28 Owner CWoo (3771) Last modified by CWoo (3771) Numerical id 33 Author CWoo (3771) Entry type Definition Classification msc 68Q45 Classification msc 68Q42 Synonym type-2 language Synonym type-2 grammar Related topic Language Related topic RegularLanguage Related topic DeterministicFiniteAutomaton Related topic NonDeterministicFiniteAutomaton Related topic PushdownAutomaton Related topic ContextSensitiveLanguage Related topic DeterministicPushdownAutomaton Defines context-free Defines context-free grammar