Appendix A Syntax Table Notation
A syntax table consists of rules, each of which has a name and a body.
The name and the body of a rule are separated by the → symbol.
The body of a rule can contain terminal symbols, language keywords, references to other rules, character classes in brackets, and
expressions that combine these primitives and other expressions with the operators '|', catenation, '*', '+', and '?'.
-
Terminal symbols are in monospace font. A terminal symbol matches a literal input character.
-
Language keywords are in bold font. A keyword matches a literal string of input characters.
-
The names of the rules are in italic font.
-
Symbols in brackets denote a character class. A character class matches a single input character belonging to the class.
A character class can contain single terminal symbols represented in monospace font and ranges of characters.
For example, the character class [0-9]
matches a single decimal digit.
If the class starts with a caret character, '^', the class is inverted.
For example, the character class [^0-9]
matches a single input character that is not a decimal digit.
-
Expression e0 | e1 matches either e0 or e1.
-
Expression e0 e1 matches a sequence of e0 and e1.
-
Expression e* represents Kleene closure: e matches zero or more times.
-
Expression e+ represents positive closure: e matches one or more times.
-
Expression e? represents optionality: e matches zero or one time.
-
The operator '|' has the lowest precedence, followed by catenation. The operators '*', '+', and '?' have the highest precedence.
Precedence can be overridden or clarified using parentheses.
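
The operators and precedence rules described above coincide, for this subset, with those of common regular expression syntax. As an illustration only, the following sketch uses Python's standard re module to try out character classes, closures, optionality, and precedence. The helper name matches and the sample expressions are hypothetical and do not come from any syntax table in this document; the use of Python regular expressions as a stand-in for the notation is likewise an assumption.

```python
import re


def matches(expr: str, text: str) -> bool:
    """Return True if expr matches the whole of text (hypothetical helper)."""
    return re.fullmatch(expr, text) is not None


# Each entry: (expression, input text, whether the whole input should match).
examples = [
    ("[0-9]", "7", True),        # character class: one decimal digit
    ("[0-9]", "42", False),      # a class matches a single character only
    ("[^0-9]", "x", True),       # inverted class: any non-digit
    ("[^0-9]", "7", False),
    ("a|b", "b", True),          # alternative
    ("ab", "ab", True),          # catenation
    ("a*", "", True),            # Kleene closure: zero or more
    ("a+", "", False),           # positive closure: one or more
    ("a?", "aa", False),         # optionality: zero or one
    # Precedence: a|bc* groups as a | (b (c*)).
    ("a|bc*", "bccc", True),
    ("a|bc*", "ac", False),
    # Parentheses override the default grouping.
    ("(a|b)c*", "ac", True),
    ("a|(bc)*", "bcbc", True),
]

for expr, text, expected in examples:
    assert matches(expr, text) == expected, (expr, text)
```

Comparing "a|bc*" with "(a|b)c*" on the input "ac" shows the effect of precedence: without parentheses, the closure applies only to c and the alternative separates a from the sequence bc*.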