Introduction to Lexical Analysis

Outline

Lexical Analysis

Tokens

What are Tokens used for?

Designing a Lexical Analyzer: Step 1

Designing a Lexical Analyzer: Step 2

Lexical Analyzer: Implementation

Example

Why do Lexical Analysis?

Difficulties

Review

Next

Regular Languages

Languages

Examples of Languages

Regular Expressions

Fundamental Regular Expressions

\(A\)             \(L(A)\)            Notes
\(a\)             \(\{a\}\)           singleton set, for each symbol ‘a’ in the alphabet \(\Sigma\)
\(\epsilon\)      \(\{\epsilon\}\)    empty string
\(\varnothing\)   \(\{\,\}\)          empty language
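
As a rough illustration of the three base cases, a language can be modelled as a Python set of strings. This modelling choice and the names `symbol`, `EPSILON`, and `EMPTY` are assumptions of this sketch, not part of the slides. The point it makes is that \(\{\epsilon\}\) contains one string (the empty one), while the empty language contains none.

```python
# Sketch: languages modelled as Python sets of strings (an assumption of this example).

def symbol(a):
    """L(a) = {a}: the singleton language for one alphabet symbol."""
    return {a}

# L(epsilon) = {epsilon}: one member, the empty string.
EPSILON = {""}

# L(empty set) = {}: no members at all.
EMPTY = set()

print(symbol("a"))   # the language holding only "a"
print(len(EPSILON))  # number of strings in L(epsilon)
print(len(EMPTY))    # number of strings in the empty language
```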

Operations on Regular Expressions

\(A\)        \(L(A)\)              Notes
\(rs\)       \(L(r) L(s)\)         concatenation – \(r\) followed by \(s\)
\(r | s\)    \(L(r) \cup L(s)\)    combination (union) – \(r\) or \(s\)
\(r^*\)      \(L(r)^*\)            zero or more occurrences of \(r\) (Kleene closure)
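
The three operations in the table can be sketched on finite sets of strings. The helper names `concat`, `union`, and `star` are hypothetical, and because \(L(r)^*\) is usually infinite, `star` takes an assumed `max_len` cutoff and returns only the strings up to that length.

```python
# Sketch: regular-expression operations on languages modelled as sets of strings.
# concat, union, star and the max_len cutoff are assumptions of this example.

def concat(A, B):
    """L(r)L(s): every string of A followed by every string of B."""
    return {x + y for x in A for y in B}

def union(A, B):
    """L(r) ∪ L(s): strings in either language."""
    return A | B

def star(A, max_len):
    """Kleene closure of A, truncated to strings of length <= max_len."""
    result = {""}      # zero occurrences always yields the empty string
    frontier = {""}
    while frontier:
        # Append one more string from A to each newly found string.
        frontier = {
            x + y for x in frontier for y in A if len(x + y) <= max_len
        } - result
        result |= frontier
    return result

a, b = {"a"}, {"b"}
print(concat(a, b))                 # L(ab)
print(union(a, b))                  # L(a | b)
print(sorted(star(a, 3), key=len))  # L(a*) up to length 3
```

Note that `star(set(), n)` yields `{""}`: zero occurrences of anything is still the empty string, so the closure of the empty language is \(\{\epsilon\}\).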

Examples

Abbreviations

Abbreviation    Meaning                        Notes
\(r^+\)         \((rr^*)\)                     one or more occurrences
\(r?\)          \((r | \epsilon)\)             zero or one occurrence
\([a-z]\)       \((a | b | \ldots | z)\)       one character in given range
\([abxyz]\)     \((a | b | x | y | z)\)        one of the given characters
[^abc]          \(\overline{[abc]}\)           any character except the given characters
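
One way to check these abbreviations informally is to compare each short form with its expansion using Python's standard `re` module. This is only a spot check on a handful of sample strings, not a proof of equivalence; the names `PAIRS`, `SAMPLES`, and `same_language` are hypothetical, and Python's regex syntax stands in for the slides' notation.

```python
import re
import string

# Each pair is (abbreviation, expansion), written in Python's regex syntax.
# An empty alternative, as in (a|), plays the role of epsilon.
PAIRS = [
    (r"a+", r"aa*"),
    (r"a?", r"(a|)"),
    (r"[a-z]", "(" + "|".join(string.ascii_lowercase) + ")"),
    (r"[abxyz]", r"(a|b|x|y|z)"),
]

SAMPLES = ["", "a", "aa", "aaa", "b", "m", "x", "z", "A", "ab", "7"]

def same_language(short, long, samples):
    """True if both patterns accept exactly the same strings among samples."""
    return all(
        bool(re.fullmatch(short, s)) == bool(re.fullmatch(long, s))
        for s in samples
    )

for short, long in PAIRS:
    print(short, same_language(short, long, SAMPLES))

# [^abc] accepts any single character other than a, b, or c.
print(bool(re.fullmatch(r"[^abc]", "d")))
print(bool(re.fullmatch(r"[^abc]", "a")))
```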