LR(k)
Given a word u and a context-free grammar G, how do we determine if u∈L(G)?
One way is to look for any subword v of u such that there is a production A→v. If this is successful, we may replace v with A in u to obtain a word w, so that w⇒u. We may then repeat the process on w to obtain another word x such that x⇒w (if successful). In the end, if everything works successfully, we arrive at the starting non-terminal symbol σ, and get a derivation σ⇒*u as a result, so that u∈L(G). This procedure is known as the bottom-up parsing of the word u.
In general, unless one is very lucky, successfully finding a derivation σ⇒*u requires many trials and errors, since at each stage, for a given word u, there may be several words w such that w⇒u.
Nevertheless, there is a particular family of context-free grammars, called the LR(k) grammars, which make the bottom-up parsing described above straightforward in the sense that, given a word u, once a word w is found such that w⇒Ru, any other word w′ such that w′⇒Ru forces w′=w. Here, ⇒R is known as the rightmost derivation (meaning that u is obtained from w by replacing the rightmost non-terminal in w). The L in LR(k) means scanning the symbols of u from left to right, R stands for finding a rightmost derivation for u, and k means having the allowance to look at up to k symbols ahead while scanning.
The details are as follows:
Definition. Let G=(Σ,N,P,σ) be a context-free grammar such that σ→σ is not a production of G, and k≥0 an integer. Suppose U is any sentential form over Σ with the following setup: U=U1U2U3 where
-
•
U3 is a terminal word,
-
•
X→U2 a production, and
-
•
σ⇒*RU1XU3⇒RU.
Let n=|U1U2|+k, and Z the prefix of U of length n (if |U|<n, then set Z=U).
Then G is said to be LR(k) if W is another sentential form having Z as a prefix, with the following setup: W=W1W2W3, where
-
•
W3 is a terminal,
-
•
Y→W2 is a production, and
-
•
σ⇒*RW1YW3⇒RW,
implies that
W1=U1,Y=X,and |
Simply put, if and are the rightmost derivations of and respectively, and if the prefix of obtained by including symbols beyond the last replacement in is also a prefix of , then the prefix of obtained by including symbols beyond the last replacement in is also a prefix of , where and are words at the next to the last step in and respectively. In particular, if , then . This implies that any derivable in an grammar has a unique rightmost derivation, hence
Proposition 1.
Any grammar is unambiguous.
Examples.
-
•
Let be the grammar consisting of one non-terminal symbol (which is also the final non-terminal symbol), two terminal symbols , with productions
Then is not for any . For instance, look at the following two derivations of :
Here, , . Let . Then the criteria in the definition are satisfied. Yet, . Therefore, is not .
-
•
Note that the grammar above generates the language
, which can also be generated by the grammar with three non-terminal symbols , with the final non-terminal symbol, where the productions are given by
However, this grammar is .
Determining whether a context-free grammar is is a non-trivial problem. Nevertheless, an algorithm exists for determining, given a context-free grammar and a non-negative integer , whether is . On the other hand, without specifying in advance, no algorithms exist that determine if is for some .
Definition. A language is said to be if it can be generated by an grammar.
Theorem 1.
Every language is deterministic context-free. Every deterministic context-free language is .
Hence, deterministic context-free languages are the same as languages.
References
- 1 A. Salomaa, Formal Languages, Academic Press, New York (1973).
-
2
J.E. Hopcroft, J.D. Ullman, Formal Languages and Their Relation
to Automata, Addison-Wesley, (1969).
Title | LR(k) |
---|---|
Canonical name | LRk |
Date of creation | 2013-03-22 19:00:31 |
Last modified on | 2013-03-22 19:00:31 |
Owner | CWoo (3771) |
Last modified by | CWoo (3771) |
Numerical id | 11 |
Author | CWoo (3771) |
Entry type | Definition |
Classification | msc 03D10 |
Classification | msc 68Q42 |
Classification | msc 68Q05 |
Synonym | |
Related topic | LLk |