I want to determine the cardinal direction I am facing
I can model this problem with English
I can model this problem with a subset of English
Subset:{Turn Left 90 degrees, Turn Right 90 degrees}
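As a rough illustration of how small this language is, the two commands can be modeled directly in OCaml. This is only a sketch: the names `direction`, `command`, `turn`, and `face` are made up for this example and do not come from the lecture.

```ocaml
(* Hypothetical names, introduced only for this sketch. *)
type direction = North | East | South | West
type command = Left | Right

(* Apply a single 90-degree turn to the current direction. *)
let turn dir cmd =
  match dir, cmd with
  | North, Right -> East
  | East, Right -> South
  | South, Right -> West
  | West, Right -> North
  | North, Left -> West
  | West, Left -> South
  | South, Left -> East
  | East, Left -> North

(* Run a whole sequence of commands from a starting direction. *)
let face start commands = List.fold_left turn start commands
```

For example, `face North [Right; Right; Left]` evaluates to `East`. Only four states are ever needed, which is part of why this counts as a very simple problem.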
There is a minimum language needed to compute a problem
Different classes of languages exist
Regular languages can compute very simple problems
A regular language is any language that can be defined by a regular expression
Regular Expression: A pattern that describes a set of strings
Regular Expressions are used to describe regular languages (future lecture)
For now: a tool used to search for text
A pattern that describes a set of strings
How to define the set?
We write a pattern or a regular expression
Our first pattern
"a"
Describes the set {"a"}
Our second pattern
"hello"
Describes the set {"hello"}
Boring
"hello|hi"
Describes the set {"hello", "hi"}
Boolean Or
"this|that"
Describes the set {"this", "that"}
The or operator's scope extends to the start or end of the pattern, or until another |
"this|that|the other thing"
Describes the set {"this", "that", "the other thing"}
Precedence
"cliff|clyff"
{"cliff","clyff"}
A lot of shared characters
"cl(i|y)ff"
Describes the same set
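One way to convince yourself that two patterns describe the same set is to try both on a few candidate strings. The sketch below assumes the OCaml `re` library that these notes use later. `accepts` is a helper name invented here, and `Re.whole_string` is used so that the entire string has to match rather than just part of it.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* accepts is a hypothetical helper: true when the entire string s
   belongs to the set described by the POSIX pattern. *)
let accepts pattern s =
  let re = Re.compile (Re.whole_string (Re.Posix.re pattern)) in
  Re.execp re s

let candidates = ["cliff"; "clyff"; "clff"; "cliyff"; "cliffs"]

(* True when both patterns give the same answer on every candidate. *)
let same_answers =
  List.for_all
    (fun s -> accepts "cliff|clyff" s = accepts "cl(i|y)ff" s)
    candidates
```

A handful of test strings is evidence, not proof, that the two patterns are equivalent.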
Quantification
"0|1|2|3|4|5|6|7|8|9"
{"0", "1", "2", "3", "4", "5", "6", "7", "8", "9"}
What about two digit strings?
"(0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)"
Cringe
"(0|1|2|3|4|5|6|7|8|9){2}"
What about infinite repetition?
Kleene Operator
"(ha)*"
{"", "ha", "haha", "hahaha",...}
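The two quantifiers above can be tried out with a small membership check. This is a sketch assuming the OCaml `re` library used later in these notes. `accepts` is a helper invented for this example, and it wraps the pattern in `Re.whole_string` so the entire string must match.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* accepts is a hypothetical helper: true when the entire string s
   belongs to the set described by the POSIX pattern. *)
let accepts pattern s =
  let re = Re.compile (Re.whole_string (Re.Posix.re pattern)) in
  Re.execp re s

let two_digits = "(0|1|2|3|4|5|6|7|8|9){2}"
let laughs = "(ha)*"

let in_two_digits = accepts two_digits "42"     (* exactly two digits *)
let one_digit = accepts two_digits "7"          (* too short *)
let empty_laugh = accepts laughs ""             (* zero repetitions *)
let long_laugh = accepts laughs "hahaha"        (* three repetitions *)
let broken_laugh = accepts laughs "hah"         (* trailing "h" is not a full "ha" *)
```

Here `in_two_digits`, `empty_laugh`, and `long_laugh` are `true`, while `one_digit` and `broken_laugh` are `false`.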
Bracket Expressions
"0|1|2|3|4|5|6|7|8|9"
"[0-9]"
"[a-z]"
Any ASCII range works; ranges can also be combined (or)
"[a-zA-Z]"
Any lowercase or uppercase letter
"[aeiou]"
{"a", "e", "i", "o", "u"}
Can also negate single characters
"[^aeiou]"
Anything except a,e,i,o,u
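Bracket expressions can be checked the same way. The sketch below assumes the OCaml `re` library used later in these notes. `accepts` is a helper invented for this example and requires the entire string to match.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* accepts is a hypothetical helper: true when the entire string s
   belongs to the set described by the POSIX pattern. *)
let accepts pattern s =
  let re = Re.compile (Re.whole_string (Re.Posix.re pattern)) in
  Re.execp re s

let digit_ok = accepts "[0-9]" "5"          (* a single digit *)
let letter_ok = accepts "[a-zA-Z]" "Q"      (* uppercase is in the second range *)
let letter_no = accepts "[a-zA-Z]" "5"      (* a digit is in neither range *)
let vowel_ok = accepts "[aeiou]" "e"
let not_vowel_ok = accepts "[^aeiou]" "x"   (* any one character that is not a vowel *)
let not_vowel_no = accepts "[^aeiou]" "e"
```

Note that `[^aeiou]` is not "any consonant": it accepts any single character outside the listed ones, so `accepts "[^aeiou]" "5"` is also `true`.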
Other helpful symbols
Note: these symbols need to be escaped with a backslash to be matched literally
"1 \+ 2"
Needs the re library
#require "re";; (* utop only *)
Create a re:
let my_re = Re.compile (Re.Posix.re "[0-9]+\\.[0-9]+")
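A backslash in a pattern has to survive OCaml's own string escaping before the regex parser ever sees it. The sketch below shows two ways to write the escaped pattern `1 \+ 2` from the earlier slide, plus what happens when the `+` is left unescaped. The variable names are made up for this example.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* Quoted string literal: the backslash is passed to Re unchanged. *)
let literal_plus = Re.compile (Re.Posix.re {|1 \+ 2|})

(* Ordinary string literal: the backslash itself must be doubled. *)
let doubled = Re.compile (Re.Posix.re "1 \\+ 2")

(* Unescaped: here + means "one or more of the preceding space". *)
let unescaped = Re.compile (Re.Posix.re "1 + 2")

let a = Re.execp literal_plus "1 + 2"   (* matches the literal plus sign *)
let b = Re.execp doubled "1 + 2"        (* same pattern, same result *)
let c = Re.execp unescaped "1 + 2"      (* no match: expects only spaces between 1 and 2 *)
let d = Re.execp unescaped "1  2"       (* matches: two spaces between 1 and 2 *)
```

Here `a`, `b`, and `d` are `true` and `c` is `false`.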
Matching
Check if a string contains a match for the re
let my_re = Re.compile (Re.Posix.re "[0-9]+\\.[0-9]+") in
let did_match = Re.execp my_re "26.19" in
if did_match then
  print_string "successfully matched"
else
  print_string "unsuccessfully matched"
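`Re.execp` searches for a match anywhere inside the string, so it answers "does this text contain a match?" rather than "is this string in the set?". A minimal sketch of the difference, using `Re.whole_string` to require that the entire string matches (variable names are made up for this example):

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* Quoted string literal so the backslash reaches Re unchanged. *)
let decimal = Re.Posix.re {|[0-9]+\.[0-9]+|}

let anywhere = Re.compile decimal                  (* search semantics *)
let entire = Re.compile (Re.whole_string decimal)  (* whole string must match *)

let found_inside = Re.execp anywhere "price: 26.19 USD"
let whole_ok = Re.execp entire "26.19"
let whole_rejects = Re.execp entire "price: 26.19 USD"
```

Here `found_inside` and `whole_ok` are `true`, while `whole_rejects` is `false`.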
Grouping
Searching is great, parsing is better
let my_re = Re.compile (Re.Posix.re "([0-9]+)\\.[0-9]+") in
let matched = Re.exec my_re "26.19" in
print_string (Re.Group.get matched 1)
Parentheses show precedence AND capture
let my_re = Re.compile(Re.Posix.re "(([0-9]){3})-([0-9]{3}-[0-9]{4})") in
let m = Re.exec my_re "123-456-7890" in
let ac = Re.Group.get m 1 in
let rest = Re.Group.get m 3 in
let _ = print_string ("the area code is " ^ ac ) in
print_string ("\nthe rest is " ^ rest )
Group number is determined by the position of the open paren, counting from the left
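`Re.exec` raises `Not_found` when the string contains no match, so parsing code usually has to handle that case. A sketch using `Re.exec_opt`, which returns an option instead. `split_phone` is a helper name made up for this example.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

let phone = Re.compile (Re.Posix.re "(([0-9]){3})-([0-9]{3}-[0-9]{4})")

(* split_phone is a hypothetical helper: returns the area code and the
   rest of the number, or None when the string has no match. *)
let split_phone s =
  match Re.exec_opt phone s with
  | Some m -> Some (Re.Group.get m 1, Re.Group.get m 3)
  | None -> None
```

With this, `split_phone "123-456-7890"` gives `Some ("123", "456-7890")` and `split_phone "hello"` gives `None`. Group 2 is the inner `([0-9])`, which is why the rest of the number is group 3.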
Other ways to make Regex Type
Re.Posix.re "I am ([0-9]+) years old"
Re.seq [Re.str "I am "; Re.group (Re.rep1 Re.digit); Re.str " years old"]
Useful to procedurally build regex
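As a sketch of what building a regex procedurally can look like, the combinators can be driven by ordinary OCaml data such as a list of words. The helper names `one_of_words`, `age_pattern`, and `age` are made up for this example.

```ocaml
(* Assumes the re library is available (#require "re";; in utop). *)

(* one_of_words is a hypothetical helper: an alternation of literal words. *)
let one_of_words words = Re.alt (List.map Re.str words)

(* Build "I am <digits> <unit> old" for whichever units are passed in. *)
let age_pattern units =
  Re.seq
    [ Re.str "I am ";
      Re.group (Re.rep1 Re.digit);
      Re.str " ";
      one_of_words units;
      Re.str " old" ]

let age_re = Re.compile (age_pattern ["years"; "months"; "days"])

(* Return the captured number as a string, if the text contains a match. *)
let age s =
  match Re.exec_opt age_re s with
  | Some m -> Some (Re.Group.get m 1)
  | None -> None
```

For example, `age "I am 7 months old"` gives `Some "7"`, and `age "I am old"` gives `None`. Because the words are passed through `Re.str`, they are matched literally and need no escaping.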