|
|
|
|
context-sensitive language
|
(Definition)
|
|
|
A context-sensitive language is a language over some alphabet generated by generated by some grammar known as a context-sensitive grammar.
Formally, a context-sensitive language is a formal grammar $G=(\Sigma, N, P, \sigma)$ , such that given any production in $P$ , it
- either has the form $$uXv\to uwv,$$ where $X\in N$ , $u,v,w\in \Sigma^*$ , and $w\ne \lambda$ , the empty word,
- or is $\sigma\to \lambda$ , provided that the start symbol $\sigma$ does not occur on the right side of any productions in $P$ .
In other words, if $G$ does not contain the production $\sigma\to \lambda$ , then any production will have the form in condition 1. On the other hand, if $G$ does contain $\sigma\to \lambda$ , then for any production $uXv\to uwv$ of $G$ , $\sigma$ does not occur in $uwv$ .
The reason for including the second condition is to ensure the possibility that $\lambda$ may be generated by the grammar.
One may interpret the first condition as follows: the non-terminal symbol $X$ can only be transformed into the word $w$ if it is surrounded by $u$ and $v$ , or that it is in the ``context'' of $uXv$ . If in each production $uXv\to uwv$ of $G$ , $u=v=\lambda$ , (so that $X$ no longer has a ``context''), then $G$ is a context-free grammar.
Given a context-sensitive grammar $G$ , the context-sensitive language generated by $G$ is $L(G)$ . In other words, $$L(G):=\lbrace v\in T^*\mid \sigma\stackrel{*}{\to} v\rbrace,$$ where $T=\Sigma-N$ is the set of terminals, and $\sigma\stackrel{*}{\to} v$ means that the word $v$ over $\Sigma$ is produced after a finite number of applications of the productions in $P$ to $\sigma$ .
With condition 2, we see that a context-sensitive language contains $\lambda$ iff it is generated by a context-sensitive grammar containing $\sigma\to \lambda$ . With condition 2, every context-free language is context-sensitive. Without condition 2, every $\lambda$ -free context-free language is $\lambda$ -free context-sensitive.
$\lbrace a^nb^nc^n \mid n\ge 0\rbrace$ and $\lbrace a^{2^n} \mid n\ge 0 \rbrace$ are examples of context-sensitive languages that are not context-free, the second of which is $\lambda$ -free.
Remarks.
- A formal grammar $G$ is said to be length-increasing if for every production $U\to V$ of $G$ , the length of $U$ is at most the length of $V$ : $|U|\le |V|$ . Every context-sensitive grammar not containing $\sigma\to \lambda$ (condition 2) is length-increasing. Conversely, any language generated by a length-increasing grammar is context-sensitive.
- The minimal automaton required for processing a context-sensitive languages is a bounded non-deterministic Turing machine (bounded linear automaton).
- The family of context-sensitive languages has the following closure properties: union, intersection, concatenation, Kleene star, reversal, and complementation (proved in 1988). It is not closed under homomorphism.
- In the Chomsky hierarchy, context-sensitive grammars are at Level 1. In fact, every context-sensitive language is recursive. The converse is not true, however.
- Given a context-sensitive language $L$ and a word $w$ , it is decidable whether $w\in L$ .
- 1
- A. Salomaa, Formal Languages, Academic Press, New York (1973).
- 2
- J.E. Hopcroft, J.D. Ullman, Formal Languages and Their Relation to Automata, Addison-Wesley, (1969).
|
"context-sensitive language" is owned by CWoo. [ full author list (3) | owner history (2) ]
|
|
(view preamble | get metadata)
Cross-references: converse, recursive, level, Chomsky hierarchy, homomorphism, closed under, reversal, Kleene star, concatenation, intersection, union, closure properties, automaton, non-deterministic Turing machine, bounded, minimal automaton, conversely, length, context-free, context-free language, iff, applications, number, finite, terminals, context-free grammar, occur in, contain, side, right, empty word, production, formal grammar, generated by, alphabet, language
There are 18 references to this entry.
This is version 20 of context-sensitive language, born on 2006-12-19, modified 2009-07-02.
Object id is 8644, canonical name is ContextSensitiveLanguage.
Accessed 3895 times total.
Classification:
| AMS MSC: | 68Q45 (Computer science :: Theory of computing :: Formal languages and automata) | | | 68Q42 (Computer science :: Theory of computing :: Grammars and rewriting systems) |
|
|
|
|
|
|
Pending Errata and Addenda
|
|
|
|
|
|
|
|
|
|
|