Appendix B Common Syntax Elements

Identifiers

identifier Unicode identifier
Unicode character sequences that match the identifier rule above and are not reserved keywords can be used as identifiers in a .lexer, .parser or .spg file. The lexer‑file‑keyword rule defines reserved keywords in a .lexer file, the parser‑file‑keyword rule defines reserved keywords in a .parser file, and the container‑file‑keyword rule defines reserved keywords in a .spg file.

Literals

We try to be compatible with C++ literal syntax, and to accept a superset of valid C++ literals.

A string literal consists of a possibly empty sequence of string characters enclosed in double quotes:

string‑literal encoding‑prefix? " schar* "
schar [ ^ " \\ \n \r ] | escape
encoding‑prefix u8 | u | U | L

A string character may be an ordinary character that is not a double quote, a backslash, a newline or a carriage return, or it may be an escape sequence. A string literal may start with an optional encoding prefix.
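As a rough illustration, the string-literal rule can be approximated with a regular expression. This is a sketch, not the SoulNG implementation: the names STRING_LITERAL and is_string_literal are hypothetical, and the escape rule is reduced here to a backslash followed by any single character other than a newline.

```python
import re

# Simplified stand-in for the escape rule (assumption): backslash plus one character.
ESCAPE = r'\\.'
# schar: any character except a double quote, backslash, newline or carriage return,
# or an escape sequence.
SCHAR = r'(?:[^"\\\n\r]|' + ESCAPE + r')'
# string-literal: optional encoding prefix, then schar* in double quotes.
STRING_LITERAL = re.compile(r'(?:u8|u|U|L)?"' + SCHAR + r'*"')


def is_string_literal(text: str) -> bool:
    """Return True if the whole text matches the sketched string-literal rule."""
    return STRING_LITERAL.fullmatch(text) is not None
```

Under these assumptions, `""` and `u8"a\"b"` are accepted, while an unterminated literal or one containing a raw newline is rejected.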

A character literal may be a narrow character literal, a universal character literal or a wide character literal.

character‑literal narrow‑char‑literal | universal‑char‑literal | wide‑char‑literal
narrow‑char‑literal ' cchar+ '
universal‑char‑literal (u | U) ' cchar+ '
wide‑char‑literal L ' cchar+ '
cchar [ ^ ' \\ \n \r ] | escape

A narrow character literal consists of a non-empty sequence of symbols matching the cchar rule enclosed in single quotes. A cchar is either an ordinary character that is not a single quote, a backslash, a newline or a carriage return, or it may be an escape sequence. A universal character literal has a 'u' or 'U' prefix. A wide character literal has an 'L' prefix.
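The three kinds of character literal differ only in their prefix, which the following sketch makes explicit. The function name classify_char_literal is hypothetical, and the escape alternative of cchar is again simplified to a backslash followed by one non-newline character.

```python
import re

# cchar: any character except a single quote, backslash, newline or carriage return,
# or a (simplified, assumed) escape: backslash plus one character.
CCHAR = r"(?:[^'\\\n\r]|\\.)"
# Optional prefix, then cchar+ in single quotes.
CHAR_LITERAL = re.compile(r"(?P<prefix>[uUL])?'" + CCHAR + r"+'")


def classify_char_literal(text: str):
    """Return 'narrow', 'universal' or 'wide', or None if text is not a character literal."""
    match = CHAR_LITERAL.fullmatch(text)
    if match is None:
        return None
    prefix = match.group('prefix')
    if prefix is None:
        return 'narrow'
    return 'wide' if prefix == 'L' else 'universal'
```

Because the rule uses cchar+, an empty literal such as `''` is rejected, whereas a multi-character literal such as `'ab'` is accepted as a narrow character literal.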

An escape sequence starts with a backslash character followed by a hexadecimal escape, a decimal escape, an octal escape, a UTF-16 escape, a UTF-32 escape, a C-escape, or any other single character.

escape \ ( (x | X) hex‑digit+ | (d | D) dec‑digit+ | octal‑digit+ | u hex4 | U hex8 | [ abfnrtv ] | 'any other character' )
dec‑digit [ 0 - 9 ]
hex‑digit [ 0 - 9 a - f A - F ]
octal‑digit [ 0 - 7 ]
hex4 hex‑digit hex‑digit hex‑digit hex‑digit
hex8 hex4 hex4

We have tried to define the escape rule so that it accepts a superset of valid C++ escape sequences and can therefore be used in all contexts in SoulNG syntax files.
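The alternatives of the escape rule can be tried in the order they are listed, as in the sketch below. The group names and the escape_kind function are hypothetical, and treating 'any other character' as any single character except a newline is an assumption.

```python
import re

# One named group per alternative of the escape rule, in the order listed.
ESCAPE = re.compile(
    r'\\(?:'
    r'(?P<hex>[xX][0-9a-fA-F]+)'      # (x | X) hex-digit+
    r'|(?P<dec>[dD][0-9]+)'           # (d | D) dec-digit+
    r'|(?P<octal>[0-7]+)'             # octal-digit+
    r'|(?P<utf16>u[0-9a-fA-F]{4})'    # u hex4
    r'|(?P<utf32>U[0-9a-fA-F]{8})'    # U hex8
    r'|(?P<c>[abfnrtv])'              # C-escape
    r'|(?P<other>.)'                  # any other single character (assumed: not a newline)
    r')'
)


def escape_kind(text: str):
    """Return the name of the alternative that matches the whole text, or None."""
    match = ESCAPE.fullmatch(text)
    return match.lastgroup if match else None
```

For example, `\x41` is classified as a hexadecimal escape, `\d65` as a decimal escape (a form C++ itself does not have), and `\"` falls through to the last alternative.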

An integer literal can be an octal literal, a decimal literal, or a hexadecimal literal. It may have an optional integer suffix:

integer‑literal (octal‑literal | decimal‑literal | hex‑literal) integer‑suffix?
octal‑literal 0 octal‑digit*
decimal‑literal [ 1 - 9 ] dec‑digit*
hex‑literal (0x | 0X) hex‑digit+
integer‑suffix unsigned‑suffix | long‑long‑suffix | long‑suffix
unsigned‑suffix u | U
long‑long‑suffix ll | LL
long‑suffix l | L
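A minimal sketch of the integer-literal rule, with a hypothetical parse_integer_literal helper that also converts the digits to a value. It follows the rule as printed, so at most one suffix is accepted; combined suffixes such as `ul` are rejected by this sketch.

```python
import re

INTEGER_LITERAL = re.compile(
    r'(?:(?P<hex>0[xX][0-9a-fA-F]+)'     # hex-literal
    r'|(?P<octal>0[0-7]*)'               # octal-literal (includes plain 0)
    r'|(?P<decimal>[1-9][0-9]*))'        # decimal-literal
    r'(?P<suffix>[uU]|ll|LL|[lL])?'      # optional integer-suffix
)


def parse_integer_literal(text: str):
    """Return the integer value of the literal, or None if text does not match."""
    match = INTEGER_LITERAL.fullmatch(text)
    if match is None:
        return None
    if match.group('hex'):
        return int(match.group('hex'), 16)
    if match.group('octal'):
        return int(match.group('octal'), 8)
    return int(match.group('decimal'))
```

Note that `0` on its own is an octal literal under this grammar, and that `08` matches none of the three forms.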

A floating-point literal may be a fractional floating-point literal or an exponential floating-point literal. It may have an optional floating-point suffix:

floating‑literal (fraction exponent? | dec‑digit+ exponent) floating‑suffix?
fraction dec‑digit* . dec‑digit+ | dec‑digit+ .
exponent (e | E) sign? dec‑digit+
sign + | -
floating‑suffix f | F | l | L
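The two forms of floating-point literal can be sketched in the same way. The names below are hypothetical; the conversion relies on Python's float accepting forms such as `1.` and `.5`, and the suffix is recognized but otherwise ignored.

```python
import re

FRACTION = r'(?:[0-9]*\.[0-9]+|[0-9]+\.)'   # dec-digit* . dec-digit+ | dec-digit+ .
EXPONENT = r'(?:[eE][+-]?[0-9]+)'           # (e | E) sign? dec-digit+
FLOATING_LITERAL = re.compile(
    r'(?P<number>' + FRACTION + EXPONENT + r'?'   # fraction exponent?
    r'|[0-9]+' + EXPONENT + r')'                  # dec-digit+ exponent
    r'(?P<suffix>[fFlL])?'                        # optional floating-suffix
)


def parse_floating_literal(text: str):
    """Return the literal's value as a float, or None if text does not match."""
    match = FLOATING_LITERAL.fullmatch(text)
    return float(match.group('number')) if match else None
```

A digit sequence with neither a decimal point nor an exponent, such as `42`, is not a floating-point literal under this rule.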

File Paths

A file path consists of a non-empty sequence of characters enclosed in angle brackets. A character inside the angle brackets may be any character other than a newline or a right angle bracket.

file‑path < [^\n>]+ >
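The file-path rule translates almost directly into a regular expression. In this sketch, parse_file_path is a hypothetical helper that returns the text between the angle brackets.

```python
import re

# file-path: '<', one or more characters other than newline or '>', then '>'.
FILE_PATH = re.compile(r'<(?P<path>[^\n>]+)>')


def parse_file_path(text: str):
    """Return the path between the angle brackets, or None if text does not match."""
    match = FILE_PATH.fullmatch(text)
    return match.group('path') if match else None
```

An empty pair of angle brackets is rejected because the rule requires at least one character.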

API Specifier

An API specifier defines a macro symbol used for exporting generated classes and functions from a Windows DLL. See the cmajor example in the examples/cmajor directory for an example of its use.

api api ( identifier )