|
|
|
|
formal grammar
|
(Definition)
|
|
|
A grammar, loosely speaking, is a set of rules that can be applied to words to generate sentences in a language. For example, with the grammar of the English language, one can form syntactically correct sentences such as “The elephant drove his bicycle to the moon,” regardless whether the sentence is meaningful or not.
The mathematical abstraction of a grammar is known as a formal grammar. Instead of generating sentences from words, a formal grammar generates words from symbols and other words. The following basic ingredients are necessary in a formal grammar:
- a collection of symbols called an alphabet,
- a collection of rules, called rewriting rules, specifying how one can generate new words from existing ones, and
- a collection of initial words that serve to initialize the generation of new words.
To see how these rewriting rules work, let us look at an example. Let
be the alphabet as well as the set of initial words. With the rewriting rules given by: from a word we can form the word , as well as the word , we would be able to generate words like
However, words such as
can not be produced.
Note that by adding an extra symbol to the above alphabet, and two additional rewriting rules given by “from form ” and “from form ”, it is not hard to see that any word that can be generated by the first grammar can be generated by this new grammar.
Formalizing what we have discussed above, we say that a formal grammar is a quadruple
, where
-
is a rewriting system;
is a subset of whose elements are called non-terminals, and
the set of terminals;
- an element
called the starting symbol.
Instead of writing
, the quadruple
is another way of representing (as long as the conditions
and
are satisfied).
A formal grammar is variously known as a phrase-structure grammar, an unrestricted grammar, or simply a grammar. A formal grammar is sometimes also called a rewriting system in the literature, although the two notions are distinct on PlanetMath.
Given a formal grammar , the formal language generated by is the set
where
is the derivability relation in the rewriting system
. A formal language is also known as a phrase-structure language.
A language over an alphabet is said to be generable by a formal grammar if there is a formal grammar such that
.
Example. Continuing from the example in the previous section, we can put
and
. For the set of productions, we have four
-

-

-

-

Then
is a formal grammar. It is not hard to see that
, as
. In fact, consists of all words such that occurs at least once and occurs at most once.
Remarks.
- Not every language can be generated by a formal grammar. Given a finite alphabet
, is countably infinite, and therefore there are uncountably many languages over . However, there are only a countably infinitely many languages that can be generated by formal grammars.
- Every language generated by a formal grammar is recursively enumerable.
- Every context-sensitive grammar is equivalent to a formal grammar, and under the Chomsky hierarchy, the class of formal languages is of class 0.
- 1
- H.R. Lewis, C.H. Papadimitriou Elements of the Theory of Computation. Prentice-Hall, Englewood Cliffs, New Jersey (1981).
|
"formal grammar" is owned by CWoo. [ full author list (2) | owner history (2) ]
|
|
(view preamble)
See Also: semi-Thue system, language, Post system
| Other names: |
phrase-structure grammar, unrestricted grammar, grammar, phrase-structure language, terminal symbol, non-terminal symbol, initial-symbol, initial symbol, start symbol |
| Also defines: |
formal language, terminal, non-terminal, starting symbol, production, generable by a formal grammar |
|
|
Cross-references: class, Chomsky hierarchy, context-sensitive grammar, recursively enumerable, countably infinite, finite, section, relation, PlanetMath, subset, rewriting system, generated by, alphabet, collection, necessary, generating, language, sentences, generate
There are 36 references to this entry.
This is version 27 of formal grammar, born on 2006-12-09, modified 2008-07-18.
Object id is 8609, canonical name is FormalGrammar.
Accessed 4510 times total.
Classification:
| AMS MSC: | 03D05 (Mathematical logic and foundations :: Computability and recursion theory :: Automata and formal grammars in connection with logical questions) | | | 68Q45 (Computer science :: Theory of computing :: Formal languages and automata) | | | 68Q42 (Computer science :: Theory of computing :: Grammars and rewriting systems) | | | 91F20 (Game theory, economics, social and behavioral sciences :: Other social and behavioral sciences :: Linguistics) |
|
|
|
|
|
|
Pending Errata and Addenda
|
|
|
|
|
|
|
|
|
|
|