CpS 450 Lecture Notes

Grammar Analysis

For a recursive descent parser to proceed when there are multiple alternatives for a given LHS, it must be able to use the current input token to predict which production to use to expand a nonterminal. Making that prediction requires the First and Follow sets of the grammar.

First(V) is the set of tokens that can begin a sentence derivable from V.
- If V is a terminal or ε, the set is { V }
- If V is a nonterminal, the set must be computed using an algorithm discussed below
Follow(V) is the set of tokens that can follow a nonterminal V in a sentential form.

Example from this grammar:

First(stmt) = {read, write, id}
Follow(stmt) = { ; }

First(V_n) - the set of terminals which can begin a string derivable from a nonterminal V_n

Two cases to consider:
- The grammar contains no nullable productions
- The grammar contains nullable productions

Note that, in general, it's impossible to compute the First set for an isolated nonterminal; we usually must compute the First set for all nonterminals.

Here's a description of the more complicated case, to compute First(A), where A is a nonterminal:

For each production A ::= V₁ V₂ V₃ ... V_m
- One at a time, from left to right, add the First set of each vocabulary symbol on the RHS, minus ε, to First(A). Stop after the first nonnullable vocabulary symbol (one that doesn't have ε in its First set).
- If all the vocabulary symbols on the RHS have ε in their First sets, or the RHS is empty (ε), put ε in First(A).
Iterate over this algorithm until no First sets change.
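
The iteration above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the names EPS, GRAMMAR, and compute_first are made up for illustration, and the grammar encoded here is the A to E exercise grammar that appears later in these notes, with each ε-production written as an empty list.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

# Each nonterminal maps to a list of right-hand sides; [] is an ε-production.
GRAMMAR = {
    "A": [["B", "C", "D"]],
    "B": [["c", "E", "b"], []],
    "C": [["a", "E", "D"], ["B"]],
    "D": [["a", "D"], []],
    "E": [["B", "b", "D"]],
}


def compute_first(grammar):
    """Return {nonterminal: First set}, iterating until no set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[lhs])
                all_nullable = True
                for sym in rhs:
                    # The First set of a terminal is the terminal itself.
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[lhs] |= sym_first - {EPS}
                    if EPS not in sym_first:
                        all_nullable = False
                        break  # nonnullable symbol: stop scanning this RHS
                if all_nullable:
                    # Empty RHS, or every symbol on the RHS is nullable.
                    first[lhs].add(EPS)
                if len(first[lhs]) != before:
                    changed = True
    return first
```

Because the sets only ever grow, comparing sizes before and after each production is enough to detect a change. Calling compute_first(GRAMMAR) takes three passes: the second pass fills in First(A) once First(B), First(C), and First(D) are known, and the third confirms that nothing changes.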

Example Exercise: Compute First(this grammar)

Solution

	First
start	read, write, id
stmt_list	read, write, id
stmt_tail	read, write, id, ε
stmt	read, write, id
expr	intlit, id

Example Exercise: Compute First of the following grammar:

1.	A	::= B C D
2.	B	::= c E b
3.	B	::= ε
4.	C	::= a E D
5.	C	::= B
6.	D	::= a D
7.	D	::= ε
8.	E	::= B b D

Answer

	First
A	c, a, ε
B	c, ε
C	a, c, ε
D	a, ε
E	c, b

First(XYZ...)

When constructing an LL(1) parse table (LLPT), we must compute the set of symbols that can begin a string derivable from the right-hand side of each production. For arbitrary LL(1) grammars, this can be done only after the First sets are computed for all of the nonterminals.

See p. 113 (Step 5) for the technique.
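
One possible way to compute the First set of a whole right-hand side is sketched below in Python. It is an illustration only and may differ in detail from the textbook's Step 5. The names EPS, FIRST, and first_of_sequence are hypothetical; the FIRST table is copied from the answer for the A to E exercise grammar in these notes.

```python
EPS = "ε"  # marker for the empty string (an assumption of this sketch)

# First sets of the nonterminals, copied from the exercise answer.
FIRST = {
    "A": {"c", "a", EPS},
    "B": {"c", EPS},
    "C": {"a", "c", EPS},
    "D": {"a", EPS},
    "E": {"c", "b"},
}


def first_of_sequence(symbols, first):
    """First set of a string of vocabulary symbols X Y Z ..."""
    result = set()
    for sym in symbols:
        # A symbol with no entry in `first` is treated as a terminal.
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # a nonnullable symbol ends the scan
    result.add(EPS)  # every symbol was nullable, or the string was empty
    return result
```

For the production E ::= B b D, first_of_sequence(["B", "b", "D"], FIRST) gives {c, b}: B contributes c, B is nullable so the scan continues, and the terminal b ends it. For an empty right-hand side the result is {ε}.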

Follow(V_n) - the set of terminal symbols which can follow a nonterminal V_n in a sentential form

We need to know which terminal symbols can follow a nonterminal V_n in some sentential form.
Note that ε is never in Follow(V_n); the end of input is represented by $ (EOF), which is in Follow(Start).
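To compute the Follow sets for all nonterminals in a grammar:
- Follow[Start] = {$}
- While there are changes, for each production A ::= α B β, where B is a nonterminal:
- Add First(β), minus ε, to Follow[B]
- If β is empty or nullable, also add Follow[A] to Follow[B]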
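
The Follow computation can be sketched in Python along the same lines. This is a hedged sketch of the standard fixed-point formulation, not necessarily the exact procedure used in class. The names GRAMMAR, FIRST, first_of_sequence, and compute_follow are hypothetical; the grammar and its First sets are taken from the A to E exercise in these notes.

```python
EPS, EOF = "ε", "$"  # markers assumed by this sketch

# The A..E exercise grammar; [] is an ε-production.
GRAMMAR = {
    "A": [["B", "C", "D"]],
    "B": [["c", "E", "b"], []],
    "C": [["a", "E", "D"], ["B"]],
    "D": [["a", "D"], []],
    "E": [["B", "b", "D"]],
}

# First sets copied from the exercise answer.
FIRST = {
    "A": {"c", "a", EPS},
    "B": {"c", EPS},
    "C": {"a", "c", EPS},
    "D": {"a", EPS},
    "E": {"c", "b"},
}


def first_of_sequence(symbols):
    """First set of a string of symbols; {ε} if the string is empty."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # terminals are their own First
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    """Return {nonterminal: Follow set}, iterating until no set changes."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # Follow is only kept for nonterminals
                    rest_first = first_of_sequence(rhs[i + 1:])
                    new = rest_first - {EPS}
                    if EPS in rest_first:
                        # Everything after sym can vanish, so whatever
                        # follows the LHS can also follow sym.
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Calling compute_follow(GRAMMAR, "A") treats A as the start symbol, so $ enters Follow(A) first and then propagates to B, C, and D through production A ::= B C D, because C D and D are both nullable.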

Example Exercise: Compute Follow(this grammar)

Solution

	Follow
start	$
stmt_list	$
stmt_tail	$
stmt	;
expr	) ;

Example Exercise: Compute Follow(this grammar)

	Follow
A	$
B	a, b, c, $
C	a, $
D	a, b, $
E	a, b, $

Disappearing Nonterminals

It is important to know which nonterminals are nullable (can derive ε).
Theorem: A is nullable iff ε ∈ First(A)
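
Nullability can also be computed directly, without First sets, which gives a way to check the theorem on a concrete grammar. The Python sketch below is illustrative only; the names GRAMMAR and nullable_nonterminals are hypothetical, and the grammar is the A to E exercise grammar from these notes.

```python
# The A..E exercise grammar; [] is an ε-production.
GRAMMAR = {
    "A": [["B", "C", "D"]],
    "B": [["c", "E", "b"], []],
    "C": [["a", "E", "D"], ["B"]],
    "D": [["a", "D"], []],
    "E": [["B", "b", "D"]],
}


def nullable_nonterminals(grammar):
    """Return the set of nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # A nonterminal is nullable if some RHS consists only of
            # nullable symbols. all() over an empty RHS is True, which
            # covers A ::= ε. Terminals are never in `nullable`.
            if any(all(s in nullable for s in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable
```

For this grammar the result is {A, B, C, D}, which is exactly the set of nonterminals whose First sets contain ε in the exercise answer. E is not nullable because every string it derives contains the terminal b.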
