Language Reference

1 Lexical Structure
    1.1 White Space and Comments
          1.1.1 Syntax
    1.2 Keywords
          1.2.1 Syntax
    1.3 Identifiers
          1.3.1 Syntax
    1.4 Literals
          1.4.1 Syntax
          1.4.2 Boolean Literals
          1.4.3 Floating Literals
          1.4.4 Integer Literals
          1.4.5 Character Literals
          1.4.6 String Literals
          1.4.7 Null Literal
          1.4.8 Array Literal
          1.4.9 Class Literal
    1.5 Operators
2 Basic Types
    2.1 Syntax
    2.2 Boolean Type
    2.3 Basic Integer Types
    2.4 Basic Floating-Point Types
    2.5 Basic Character Types
    2.6 Void
3 Expressions
    3.1 Syntax
    3.2 Axiom Expressions
    3.3 Boolean Expressions
    3.4 Bitwise Expressions
    3.5 Equality Expressions
    3.6 Relational Expressions
    3.7 Is and As Expressions
    3.8 Shift Expressions
    3.9 Additive Expressions
    3.10 Multiplicative Expressions
    3.11 Prefix Expressions
            3.11.1 Prefix Increment and Decrement Expressions
            3.11.2 Unary Plus and Unary Minus Expressions
            3.11.3 Logical Not Expression
            3.11.4 Bitwise Complement Expression
            3.11.5 Pointer Dereference Expression
            3.11.6 Address-of Expression
    3.12 Postfix Expressions
            3.12.1 Postfix Increment and Decrement Expressions
            3.12.2 Member Access Expression
            3.12.3 Pointer Member Access Expression
            3.12.4 Subscript Expression
            3.12.5 Invocation Expression
    3.13 Primary Expressions
            3.13.1 Grouping Expression
            3.13.2 Literals in Expressions
            3.13.3 Basic Types in Expressions
            3.13.4 Template Identifier in Expressions
            3.13.5 Identifiers in Expressions
            3.13.6 This and Base Access
            3.13.7 Size-of Expression
            3.13.8 Typename Expression
            3.13.9 Cast Expression
            3.13.10 Construction Expression
            3.13.11 New Expression
    3.14 Constant Expression
4 Statements
    4.1 Syntax
    4.2 Labeled Statement
    4.3 Control Statements
           4.3.1 Compound Statement
           4.3.2 Return Statement
           4.3.3 If Statement
           4.3.4 While Statement
           4.3.5 Do Statement
           4.3.6 For Statement
           4.3.7 Range-for Statement
           4.3.8 Break Statement
           4.3.9 Continue Statement
           4.3.10 Goto Statement
           4.3.11 Switch Statement
           4.3.12 Case Statement
           4.3.13 Default Statement
           4.3.14 Goto Case Statement
           4.3.15 Goto Default Statement
    4.4 Expression Statement
    4.5 Empty Statement
    4.6 Assignment Statement
    4.7 Construction Statement
           4.7.1 Declaring a Construction Statement with a Placeholder Type
    4.8 Delete and Destroy Statements
    4.9 Assert Statement
    4.10 Conditional Compilation Statement
5 Access Specifiers
    5.1 Syntax
6 Type Expressions
    6.1 Syntax
    6.2 Primary Type Expressions
    6.3 Const Qualifier
    6.4 Type Member Access
    6.5 Pointer Types
    6.6 Reference Types
           6.6.1 Lvalue Reference Type
           6.6.2 Rvalue Reference Type
    6.7 Array Types
7 Constants
    7.1 Syntax
    7.2 Constant Values
    7.3 Examples
8 Global Variables
    8.1 Syntax
    8.2 Access
    8.3 Initialization
    8.4 Example
9 Enumerations
    9.1 Syntax
    9.2 Underlying Type
    9.3 Enumeration Constants
    9.4 Examples
10 Functions
    10.1 Syntax
    10.2 Regular Function
    10.3 Constexpr Specifier
    10.4 Inline Specifier
    10.5 CDecl Specifier
    10.6 Extern Specifier
    10.7 Function Templates
    10.8 Intrinsic Functions
    10.9 System-default Functions
    10.10 Function Overload Resolution
11 Type Aliases
    11.1 Syntax
    11.2 Examples
12 Classes
    12.1 Syntax
    12.2 Regular Classes
    12.3 Abstract Classes
    12.4 Static Classes
    12.5 Class Members
            12.5.1 Static Constructor
            12.5.2 Constructors
            12.5.3 Destructor
            12.5.4 Member Functions
            12.5.5 Member Variables
    12.6 Literal Classes
    12.7 Class Hierarchies
    12.8 Class Templates
    12.9 Full Instantiation Requests
13 Interfaces
    13.1 Syntax
14 Delegates
    14.1 Syntax
    14.2 Conversions
15 Class Delegates
    15.1 Syntax
16 Concepts
    16.1 Syntax
    16.2 Constraints
            16.2.1 Typename Constraint
            16.2.2 Signature Constraints
            16.2.3 Embedded Constraint
            16.2.4 Where-Constraint
    16.3 Constraint Expressions
            16.3.1 Disjunctive Constraint Expression
            16.3.2 Conjunctive Constraint Expression
            16.3.3 Primary Constraint Expression
            16.3.4 Atomic Constraint Expression
            16.3.5 Predicate Constraint Expression
            16.3.6 Is-Constraint Expression
            16.3.7 Multiparam-Constraint Expression
    16.4 Axioms
17 Attributes
    17.1 Syntax
    17.2 Usage
18 Compile Units
    18.1 Syntax
    18.2 Using Directives
            18.2.1 Using-Alias Directives
            18.2.2 Using-Namespace Directives
    18.3 Definitions
            18.3.1 Namespaces
            18.3.2 Namespace-level Definitions
            18.3.3 Unnamed Namespaces
19 Projects
    19.1 Syntax
    19.2 Project Types
    19.3 File Types
    19.4 Example
    19.5 Programs
20 Solutions
    20.1 Syntax
    20.2 Example
Appendix A Terms
Appendix B Syntax Notation

1 Lexical Structure

Cmajor source files are ordinary UTF-8-encoded plain text files. Lexically, each source file consists of keywords, identifiers, literals and operators that can be separated by comments and white space characters such as spaces, tabs and newline characters.

1.1 White Space and Comments

Other lexical elements such as keywords, identifiers, literals and operators may be separated by any number of white space characters and comments that match the syntax rule white‑space‑and‑comments.

1.1.1 Syntax

The syntax notation used in this text is explained in Appendix B.

white‑space‑and‑comments	→	(white‑space‑char | comment)*
white‑space‑char	→	'any Unicode character having property WSpace=Y'
comment	→	line‑comment | block‑comment
line‑comment	→	// [^\r\n]* newline
newline	→	\r\n | \n | \r
block‑comment	→	/* (any‑char − */)* */
any‑char	→	'any Unicode character'

See the Wikipedia article on Unicode whitespace characters.

1.2 Keywords

Keywords have a context-dependent meaning in programs. They cannot be used as identifiers.

1.2.1 Syntax

keyword

→

1.3 Identifiers

Identifiers are used to name variables, parameters, types, constants, namespaces, type aliases and functions.

A qualified‑id can be used to refer to an entity when its simple identifier would be ambiguous in context. It consists of the names of the namespaces containing the entity (if any), followed by the names of the types containing the entity (if any), followed by the identifier of the entity itself, all separated by periods. An enumeration constant must always be referred to by prefixing its name with the name of the enumerated type that contains it and a period. A local variable cannot be referred to by a fully qualified identifier. The identifier of a local variable always refers to the local variable in the innermost scope that contains it.

1.3.1 Syntax

identifier	→	id‑char‑sequence − keyword
id‑char‑sequence	→	idstart idcont*
idstart	→	'any Unicode character having property ID_Start'
idcont	→	'any Unicode character having property ID_Continue'
qualified‑id	→	identifier (. identifier)*

Information about Unicode identifier syntax can be found in the article Unicode Identifier and Pattern Syntax (Unicode Standard Annex #31).

1.4 Literals

Literals are used to enter values in a program. Every literal has a value and a type.

1.4.1 Syntax

literal	→	boolean‑literal | floating‑literal | integer‑literal | char‑literal | string‑literal | null‑literal | array‑literal | class‑literal
boolean‑literal	→	true | false
floating‑literal	→	(fractional‑floating‑literal | exponent‑floating‑literal) [fF]?
fractional‑floating‑literal	→	dec‑digit‑sequence? . dec‑digit‑sequence exponent‑part? | dec‑digit‑sequence .
dec‑digit‑sequence	→	[0-9]+
exponent‑floating‑literal	→	dec‑digit‑sequence exponent‑part
exponent‑part	→	[eE] sign? dec‑digit‑sequence
sign	→	+ | -
integer‑literal	→	(hex‑integer‑literal | dec‑integer‑literal) [uU]?
hex‑integer‑literal	→	(0x | 0X) hex‑digit‑sequence
hex‑digit‑sequence	→	[0-9a-fA-F]+
dec‑integer‑literal	→	dec‑digit‑sequence
char‑literal	→	(w | u)? ' ([^'\\\r\n]+ | char‑escape) '
char‑escape	→	\ ([xX] hex‑digit‑sequence | [dD] dec‑digit‑sequence | octal‑digit‑sequence | u hex‑digit‑4 | U hex‑digit‑8 | [abfnrtv] | any‑char)
octal‑digit‑sequence	→	[0-7]+
hex‑digit‑4	→	hex‑digit hex‑digit hex‑digit hex‑digit
hex‑digit‑8	→	hex‑digit hex‑digit hex‑digit hex‑digit hex‑digit hex‑digit hex‑digit hex‑digit
hex‑digit	→	[0-9a-fA-F]
string‑literal	→	raw‑string‑literal | regular‑string‑literal
raw‑string‑literal	→	(w | u)? @ " [^"]* "
regular‑string‑literal	→	(w | u)? " ([^"\\\r\n]+ | char‑escape)* "
null‑literal	→	null
array‑literal	→	[ (constant‑expression (, constant‑expression)* )? ]
class‑literal	→	{ (constant‑expression (, constant‑expression)* )? }

1.4.2 Boolean Literals

A Boolean literal can have value true or false. The type of Boolean literals is bool.

1.4.3 Floating Literals

A floating literal represents a fractional or exponential floating-point number. If it has 'f' or 'F' suffix its type is float, otherwise its type is double.

1.4.4 Integer Literals

An integer literal represents a hexadecimal or decimal signed or unsigned integer. If it has u or U suffix, it represents an unsigned integer literal, and its type is the smallest of the following types that can contain its value: byte, ushort, uint, ulong (see basic integer types). Otherwise it represents a signed integer literal, and its type is the smallest of the following types that can contain its value: sbyte, short, int, long (see basic integer types). If the literal has prefix 0x or 0X, it represents a hexadecimal, or base 16, value, otherwise it represents a decimal, or base 10, value.
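The type-selection rule above can be illustrated with a small sketch. It is written in C++ rather than Cmajor and only models the rule as stated. The names IntType and literalType are hypothetical, and the OutOfRange result for a signed literal too large for long is an assumption, because the text does not say what happens in that case.

```cpp
#include <cstdint>

// Hypothetical names; this models the rule in the text, not the Cmajor compiler.
enum class IntType { SByte, Short, Int, Long, Byte, UShort, UInt, ULong, OutOfRange };

// value: the magnitude written in the source (the grammar gives integer literals no sign).
// hasUnsignedSuffix: true when the literal ends in u or U.
constexpr IntType literalType(std::uint64_t value, bool hasUnsignedSuffix)
{
    if (hasUnsignedSuffix)
    {
        // Smallest of byte, ushort, uint, ulong that can contain the value.
        if (value <= 0xFFu) return IntType::Byte;
        if (value <= 0xFFFFu) return IntType::UShort;
        if (value <= 0xFFFFFFFFu) return IntType::UInt;
        return IntType::ULong;
    }
    // Smallest of sbyte, short, int, long that can contain the value.
    if (value <= 0x7Fu) return IntType::SByte;
    if (value <= 0x7FFFu) return IntType::Short;
    if (value <= 0x7FFFFFFFu) return IntType::Int;
    if (value <= 0x7FFFFFFFFFFFFFFFull) return IntType::Long;
    return IntType::OutOfRange; // assumption: this case is not covered by the text
}
```

Under this sketch, literalType(200, false) gives IntType::Short, while literalType(200, true) gives IntType::Byte.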

1.4.5 Character Literals

A character literal represents an ASCII or a Unicode character value. Graphical character values can be entered by enclosing the character in single quotes. An escape mechanism is provided for entering character values that do not have a graphical representation. If the character literal has a w prefix, its type is wchar; if it has a u prefix, its type is uchar; otherwise its type is char.

By prefixing the character value with the backslash \ character, the ASCII or Unicode code point of the character can be given in hexadecimal (x, X, u or U prefix), decimal (d or D prefix), or octal notation (no prefix). Some special control characters can also be entered using the character combinations \a, \b, \f, \n, \r, \t and \v. Their meanings can be found in the Wikipedia article on ASCII.

1.4.6 String Literals

A string literal represents an ASCII, a Unicode UTF-8, a Unicode UTF-16, or a Unicode UTF-32 encoded string. The string is entered by enclosing its value in double quotes. If the string literal is prefixed with the @ character, it is a raw string literal: its content has no escapes, so the backslash character \ has its literal meaning. Otherwise the backslash character provides an escape mechanism for entering non-graphical character values, as described in the section on character literals.

If the string literal has a w prefix, its type is const wchar*, and it represents a Unicode UTF-16 encoded string. If the string literal has a u prefix, its type is const uchar*, and it represents a Unicode UTF-32 encoded string. If the string literal has no w or u prefix, its type is const char*, and it represents an ASCII or a Unicode UTF-8 encoded string. Note: by convention Cmajor source files have UTF-8 encoding, so that string literals are always entered using UTF-8 encoding, but internal representation of a string in a program can be ASCII, UTF-8, UTF-16 or UTF-32 encoded string.

1.4.7 Null Literal

The null literal represents a special value of a pointer that does not point to any memory location. Its type is a special @nullptr_type that is implicitly convertible to any other pointer type.

1.4.8 Array Literal

An array literal represents a value of a constant array. Elements of a constant array must be constant expressions that evaluate to literals, constants or enumeration constants.

1.4.9 Class Literal

A class literal represents a value of a literal class. Members of a literal class must be constant expressions that evaluate to literals, constants or enumeration constants.

1.5 Operators

Operators allow expressions to be written with a notation close to mathematical notation.

operator	→	. | [ | ] | < | > | , | = | <=> | => | || | && | | | ^ | & | == | != | <= | >= | < | > | << | >> | + | − | * | / | % | ++ | −− | ! | ~ | −> | ( | )

2 Basic Types

Programming language constructs such as variables, parameters, constants and literal values have a type. A type provides an interpretation of the contents of such a construct and specifies its possible values. A language has a small number of predefined built-in types, also called basic types or primitive types. The Cmajor language defines the following basic types for operating with truth values, numbers and characters.

2.1 Syntax

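basic‑type	→	bool | sbyte | byte | short | ushort | int | uint | long | ulong | float | double | char | wchar | uchar | void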

2.2 Boolean Type

The bool type represents a truth value. It has values true and false.

2.3 Basic Integer Types

The sbyte type is a signed 8-bit integer type. It has values -128...127.

The byte type is an unsigned 8-bit integer type. It has values 0u...255u.

The short type is a signed 16-bit integer type. It has values -32768...32767.

The ushort type is an unsigned 16-bit integer type. It has values 0u...65535u.

The int type is a signed 32-bit integer type. It has values -2147483648...2147483647.

The uint type is an unsigned 32-bit integer type. It has values 0u...4294967295u.

The long type is a signed 64-bit integer type. It has values -9223372036854775808...9223372036854775807.

The ulong type is an unsigned 64-bit integer type. It has values 0u...18446744073709551615u.

2.4 Basic Floating-Point Types

The float type is a 32-bit single precision floating-point number type.

The double type is a 64-bit double precision floating-point number type.

2.5 Basic Character Types

The char type is an unsigned 8-bit character type. It can have an ASCII code value.

The wchar type is an unsigned 16-bit character type. It can have a Unicode UTF-16 code point value.

The uchar type is an unsigned 32-bit character type. It can have a Unicode UTF-32 code point value.

2.6 Void

The void keyword represents lack of value.

3 Expressions

Most expressions consist of operators and operands. Some expressions can also contain keywords. Operands can be names of constants, variables, parameters, types, namespaces and functions. They can also be literal values or subexpressions. Expressions can be evaluated, or have their value computed. ¹

Many operators can be overloaded. An overloaded operator has the same name as some built-in operator, but takes at least one parameter that is of a user-defined type.

Expressions can be classified as infix expressions, where the operator is between the operands; prefix expressions, where the operator comes before the operand; and postfix expressions, where the operator comes after the operand.

3.1 Syntax

expression	→	equivalence
equivalence	→	implication (<=> implication)*
implication	→	disjunction (=> disjunction)?
disjunction	→	conjunction (|| conjunction)*
conjunction	→	bit‑or (&& bit‑or)*
bit‑or	→	bit‑xor (| bit‑xor)*
bit‑xor	→	bit‑and (^ bit‑and)*
bit‑and	→	equality (& equality)*
equality	→	relational ((== | !=) relational)*
relational	→	shift ((<= | >= | < | >) shift)* | shift is type-expr | shift as type-expr
shift	→	additive ((<< | >>) additive)*
additive	→	multiplicative ((+ | −) multiplicative)*
multiplicative	→	prefix ((* | / | %) prefix)*
prefix	→	(++ | −− | + | − | ! | ~ | * | &) prefix | postfix
postfix	→	primary (++ | −− | . identifier | −> identifier | [ expression ] | ( argument‑list ))*
primary	→	( expression ) | literal | basic‑type | template‑id | identifier | this | base | size‑of‑expr | type‑name‑expr | cast‑expr | construct‑expr | new‑expr
size‑of‑expr	→	sizeof ( expression )
type‑name‑expr	→	typename ( expression )
cast‑expr	→	cast < type-expr > ( expression )
construct‑expr	→	construct < type-expr > ( expression‑list )
new‑expr	→	new type-expr ( argument‑list )
argument‑list	→	expression‑list?
expression‑list	→	expression (, expression)*
constant‑expression	→	expression

3.2 Axiom Expressions

An expression that is a true equivalence expression (and not just a disjunction, for example) can only be used in axioms. The same applies to an implication expression. An expression used in an axiom, for example a != b <=> !(a == b), is not evaluated at all; it has purely informative value.

3.3 Boolean Expressions

A disjunction expression takes bool type operands and yields a bool type result. A disjunction, for example a || b, is true if either a or b, or both, evaluate to true. It is false otherwise. Disjunctive expressions are evaluated using so-called short-circuit evaluation: if the left operand is true, the right operand is not evaluated, because the result is already known to be true.

A conjunction expression takes bool type operands and yields a bool type result. A conjunction, for example a && b, is true, if both a and b evaluate to true. It is false otherwise. Conjunctive expressions are also evaluated using short-circuit evaluation: if the left operand is false, the right operand is not evaluated, because the result is already known to be false.

The || and && operators cannot be overloaded.
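Short-circuit evaluation can be observed by counting how often the right operand is evaluated. The sketch below uses C++, whose built-in || and && operators on bool short-circuit in the same way; it is an analogue, not Cmajor code. Probe and the two functions are hypothetical names introduced for illustration.

```cpp
// Records how many times it has been asked for a value.
struct Probe
{
    int calls = 0;
    constexpr bool hit(bool result) { ++calls; return result; }
};

// Number of times the right operand of left || right is evaluated.
constexpr int rightEvaluationsOfOr(bool left, bool right)
{
    Probe probe{};
    bool value = left || probe.hit(right);
    (void)value; // only the evaluation count matters here
    return probe.calls;
}

// Number of times the right operand of left && right is evaluated.
constexpr int rightEvaluationsOfAnd(bool left, bool right)
{
    Probe probe{};
    bool value = left && probe.hit(right);
    (void)value;
    return probe.calls;
}
```

With a true left operand the right operand of || is never evaluated, and with a false left operand the right operand of && is never evaluated; in the other cases it is evaluated exactly once.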

3.4 Bitwise Expressions

Bitwise expressions, bit-or, bit-xor and bit-and expressions, take integer type operands, and yield integer type result.

A bit-or expression, for example a | b, is evaluated as follows: for each bit x_i of a and corresponding bit y_i of b, if both bits are 0, the corresponding bit z_i of the result is 0. Otherwise, if either x_i, or y_i is 1, the result bit z_i is 1.

A bit-xor expression, for example a ^ b, is evaluated as follows: for each bit x_i of a and corresponding bit y_i of b, if either x_i or y_i, but not both, is 1, the corresponding bit z_i of the result is 1. Otherwise, if x_i and y_i are both 0 or both 1, the result bit z_i is 0.

A bit-and expression, for example a & b, is evaluated as follows: for each bit x_i of a and corresponding bit y_i of b, if both bits are 1, the corresponding bit z_i of the result is 1. Otherwise, if either x_i, or y_i is 0, the result bit z_i is 0.

Bitwise operators |, ^, and & can be overloaded, in which case they take operands of user-defined types and yield a result of some user-defined or built-in type.
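The per-bit rules above can be checked with a short sketch that builds each result one bit at a time. It is written in C++ as an analogue, limited to 8-bit unsigned values for brevity; the function name bitwise and its op parameter are hypothetical.

```cpp
#include <cstdint>

// Builds the result bit by bit, following the rules in the text.
// op selects the operation: '|', '^' or '&'.
constexpr std::uint8_t bitwise(char op, std::uint8_t a, std::uint8_t b)
{
    std::uint8_t result = 0;
    for (int i = 0; i < 8; ++i)
    {
        int x = (a >> i) & 1; // bit x_i of a
        int y = (b >> i) & 1; // bit y_i of b
        int z = 0;            // bit z_i of the result
        if (op == '|') z = (x == 0 && y == 0) ? 0 : 1;
        else if (op == '^') z = (x != y) ? 1 : 0;
        else if (op == '&') z = (x == 1 && y == 1) ? 1 : 0;
        result = static_cast<std::uint8_t>(result | (z << i));
    }
    return result;
}
```

For a = 12 (binary 1100) and b = 10 (binary 1010), the sketch gives 14 for bit-or, 6 for bit-xor and 8 for bit-and, which agrees with the built-in C++ operators.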

3.5 Equality Expressions

An equality expression takes basic type (other than void) or pointer type operands and yields a bool type result.

Expression a == b evaluates to true if the value of a is equal to the value of b, and false otherwise.

Expression a != b evaluates to true if the value of a is not equal to the value of b, and false otherwise.

Equality operator == can be overloaded, in which case it takes operands of user-defined types and yields a result of some user-defined or built-in type. Inequality operator != cannot be overloaded. Instead, if equality operator is overloaded for some type T, and a and b are expressions of type T, expression a != b is equivalent to an expression !(a == b).

3.6 Relational Expressions

A relational expression takes basic type (other than bool or void) or pointer type operands and yields a bool type result.

Expression a < b evaluates to true if value of a is less than value of b, and false otherwise.

Expression a > b evaluates to true if value of a is greater than value of b, and false otherwise.

Expression a <= b evaluates to true if value of a is less than or equal to value of b, and false otherwise.

Expression a >= b evaluates to true if value of a is greater than or equal to value of b, and false otherwise.

If operands of <, >, <= or >= operators are of character types char, wchar or uchar, the comparison operators compare codepoint values, that is, numeric character code values of the operands.

If operands of <, >, <= or >= operators are pointers, the comparison operators compare the memory address values of the operands.

Less-than operator < can be overloaded, in which case it takes operands of user-defined types and yields a result of some user-defined or built-in type. Other relational operators >, <= and >= cannot be overloaded. Instead, if less-than operator is overloaded for some type T, and a and b are expressions of type T, expression a > b is equivalent to an expression b < a, expression a <= b is equivalent to an expression !(b < a), and expression a >= b is equivalent to an expression !(a < b).
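The equivalences above, together with the one given for != in the section on equality expressions, can be spelled out explicitly. The sketch below is a C++ analogue: C++ requires the derived operators to be written by hand, which makes the rewriting visible. Version and its members are hypothetical names.

```cpp
// Hypothetical value type that defines only == and < itself.
struct Version
{
    int majorPart;
    int minorPart;
};

constexpr bool operator==(const Version& a, const Version& b)
{
    return a.majorPart == b.majorPart && a.minorPart == b.minorPart;
}

constexpr bool operator<(const Version& a, const Version& b)
{
    return a.majorPart < b.majorPart ||
           (a.majorPart == b.majorPart && a.minorPart < b.minorPart);
}

// The remaining operators, written exactly as the text derives them.
constexpr bool operator!=(const Version& a, const Version& b) { return !(a == b); }
constexpr bool operator>(const Version& a, const Version& b) { return b < a; }
constexpr bool operator<=(const Version& a, const Version& b) { return !(b < a); }
constexpr bool operator>=(const Version& a, const Version& b) { return !(a < b); }
```

Only == and < carry real comparison logic here; every other operator is a one-line rewrite in terms of those two.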

3.7 Is and As Expressions

If p is a pointer to an object of some polymorphic class type, and T is some polymorphic class type, expression

p is T*

tests whether pointer p actually points to an object of type T or of type U that derives from type T. The test yields a bool result.

If p is a pointer to object of some polymorphic class type, and T is some polymorphic class type, expression

p as T*

tests whether pointer p actually points to an object of type T or of type U that derives from type T. If the test is successful, the result is a pointer to class T, otherwise the result is null.

The implementation of this feature is described here.

The is and as operators cannot be overloaded.
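A rough C++ counterpart of the is and as tests is dynamic_cast applied to a pointer. The sketch below is an analogy only and says nothing about how Cmajor implements the feature; the class names and helper functions are hypothetical.

```cpp
#include <cassert>

// Hypothetical polymorphic hierarchy used only for illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Disk : Circle {};
struct Square : Shape {};

// Counterpart of the test "p is Circle*".
bool isCircle(Shape* p) { return dynamic_cast<Circle*>(p) != nullptr; }

// Counterpart of "p as Circle*": a Circle pointer on success, a null pointer otherwise.
Circle* asCircle(Shape* p) { return dynamic_cast<Circle*>(p); }

// Exercises both helpers; every assertion holds.
void demoIsAs()
{
    Circle circle;
    Disk disk;     // derives from Circle, so it also passes the test
    Square square; // unrelated to Circle
    assert(isCircle(&circle));
    assert(isCircle(&disk));
    assert(!isCircle(&square));
    assert(asCircle(&circle) == &circle);
    assert(asCircle(&square) == nullptr);
}
```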

3.8 Shift Expressions

A shift expression takes integer type operands and yields an integer type result.

Expression a << b returns operand a shifted left by b bit positions. The vacant rightmost bit positions of the result are filled with zero bits.

Expression a >> b returns operand a shifted right by b bit positions. The vacant leftmost bit positions of the result are filled with bits that depend on the common type of a and b. If the common type is an unsigned type (byte, ushort, uint or ulong), the leftmost bit positions of the result are zero-filled. Otherwise, if the common type is a signed type (sbyte, short, int or long), the leftmost bit positions of the result are filled with a bit equal to the leftmost bit of a.

Shift operators << and >> can be overloaded, in which case they take operands of user-defined types and yield a result of some user-defined or built-in type. For example, the system library overloads the << operator.
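The fill rules for the built-in shifts can be illustrated with 32-bit values. This is a C++ analogue with hypothetical function names. Note that sign-filling on a signed right shift is guaranteed by C++ only from C++20 on; earlier standards leave it implementation-defined, although mainstream compilers behave as shown.

```cpp
#include <cstdint>

// Left shift: vacated low bits are zero-filled.
constexpr std::uint32_t shiftLeft(std::uint32_t a, int b) { return a << b; }

// Right shift of an unsigned value: vacated high bits are zero-filled.
constexpr std::uint32_t shiftRightUnsigned(std::uint32_t a, int b) { return a >> b; }

// Right shift of a signed value: vacated high bits copy the leftmost (sign) bit.
constexpr std::int32_t shiftRightSigned(std::int32_t a, int b) { return a >> b; }
```

For example, shifting the unsigned value 0x80000000 right by 31 gives 1, whereas shifting the signed value -8 right by 1 gives -4, because the sign bit is copied into the vacated position.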

3.9 Additive Expressions

An additive expression takes integer, floating-point, or pointer type operands. The result is of the common type of the operand types.

When a and b are of integer or floating-point types, the expression a + b returns the sum of a and b, and the expression a − b returns the difference of a and b. If at least one operand is of a floating-point type, the result is of a floating-point type; otherwise it is of an integer type.

When p is of a pointer type and i is of an integer type, the value of expression p + i is (informally) a pointer pointing i objects "after" p. The same applies to the expression i + p. The value of expression p − i is (informally) a pointer pointing i objects "before" p. When p and q are of pointer types, the value of expression p − q is (informally) the "number of objects" between p and q.

Additive operators + and − can be overloaded, in which case they take operands of user-defined types and yield a result of some user-defined or built-in type. For example, the system library overloads the + operator.
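The informal description of pointer arithmetic can be made concrete with pointers into an array. The sketch below is a C++ analogue; the array values and the three helper functions are hypothetical names.

```cpp
#include <cstddef>

constexpr int values[5] = {10, 20, 30, 40, 50};

// p + i: the object i positions after p.
constexpr int elementAfter(const int* p, std::ptrdiff_t i) { return *(p + i); }

// p - i: the object i positions before p.
constexpr int elementBefore(const int* p, std::ptrdiff_t i) { return *(p - i); }

// p - q: the number of objects between q and p.
constexpr std::ptrdiff_t objectsBetween(const int* p, const int* q) { return p - q; }
```

With p pointing at the first element, elementAfter(values, 2) reads 30. With p pointing at the last element, elementBefore(values + 4, 1) reads 40, and objectsBetween(values + 4, values + 1) is 3.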

3.10 Multiplicative Expressions

A multiplicative expression takes integer or floating-point type operands. The result is of the common type of the operand types.

Expression a * b returns the product of a multiplied by b.

Expression a / b returns the quotient of a divided by b. If a and b are of integer types, the result is of an integer type. It is truncated to the nearest whole integer towards zero.

Expression a % b returns the remainder of the integer division of a by b. The remainder operation is defined only for integer type operands. If a is negative, the result is negative or zero; otherwise the result is nonnegative.

Multiplicative operators *, / and % can be overloaded, in which case they take operands of user-defined types and yield a result of some user-defined or built-in type.
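The truncation and sign rules for integer division and remainder can be checked with a few values. The sketch is a C++ analogue (C++ integer division also truncates towards zero); quotientOf and remainderOf are hypothetical names.

```cpp
// Integer division: the quotient is truncated towards zero.
constexpr int quotientOf(int a, int b) { return a / b; }

// Remainder: the sign of a nonzero result follows the left operand a.
constexpr int remainderOf(int a, int b) { return a % b; }

// The two operations fit together: (a / b) * b + a % b gives back a.
constexpr bool reconstructs(int a, int b)
{
    return quotientOf(a, b) * b + remainderOf(a, b) == a;
}
```

For a = -7 and b = 2 the quotient is -3 (not -4) and the remainder is -1, so that -3 * 2 + -1 gives back -7.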

3.11 Prefix Expressions

Prefix expressions have many forms. There are expressions for incrementing or decrementing a variable, returning operand negated, returning logical not of the operand, returning a bitwise complement of the operand, dereferencing a pointer, and taking address of a variable.

All prefix operators ++, −−, +, −, !, ~, * and & can be overloaded, in which case they take an operand of some user-defined type and yield a result of some user-defined or built-in type.

3.11.1 Prefix Increment and Decrement Expressions

Prefix increment and decrement expressions take integer or pointer type variable operands and yield result of the same type.

If a is an integer variable, expression ++a increments a by one. The value of the expression is the value of a after incrementing it.

If a is an integer variable, expression −−a decrements a by one. The value of the expression is the value of a after decrementing it.

If p is a pointer to an object of type T, expression ++p increments p, so that p will point to the next object of type T in memory. The value of the expression is the value of p after incrementing it.

If p is a pointer to an object of type T, expression −−p decrements p, so that p will point to the previous object of type T in memory. The value of the expression is the value of p after decrementing it.

3.11.2 Unary Plus and Unary Minus Expressions

Unary plus and unary minus expressions take integer or floating-point type operands.

Expression +a will yield value of a and expression −a will yield value of a negated.

3.11.3 Logical Not Expression

Logical not expression takes a bool type operand and yields a bool type result.

If value of a is true, value of !a is false, and if value of a is false, value of !a is true.

3.11.4 Bitwise Complement Expression

Bitwise complement expression takes an integer operand and yields an integer type result.

For each bit x_i of an integer value a, the value of ~a is computed as follows: If x_i is 0, the corresponding bit y_i of the result will be 1, and if x_i is 1, the corresponding bit y_i of the result will be 0.

3.11.5 Pointer Dereference Expression

A pointer dereference expression takes a pointer type operand and yields a value of the pointed-to type.

If p is a pointer to an object of type T, *p yields the value of that object.

Expression *p can occur also as the left side of an assignment statement, in which case the pointed-to object is assigned a new value.

When pointer dereference expression is overloaded, the return value of the operator function is often of a reference type, so the expression can be used as the left side of an assignment statement.

3.11.6 Address-of Expression

Address-of expression takes a variable operand and returns a pointer that contains the memory address of that variable.

If a is an object of type T, expression &a yields a pointer to type T that contains the address of a.

3.12 Postfix Expressions

A postfix expression can be a postfix increment or decrement expression, a member access expression, a pointer member access expression, a subscript expression, or an invocation expression.

3.12.1 Postfix Increment and Decrement Expressions

Postfix increment and decrement expressions take integer or pointer type variable operands and yield result of the same type.

If a is an integer variable, expression a++ increments a by one. The value of the expression is the value of a before incrementing it.

If a is an integer variable, expression a−− decrements a by one. The value of the expression is the value of a before decrementing it.

If p is a pointer to an object of type T, expression p++ increments p, so that p will point to the next object of type T in memory. The value of the expression is the value of p before incrementing it.

If p is a pointer to an object of type T, expression p−− decrements p, so that p will point to the previous object of type T in memory. The value of the expression is the value of p before decrementing it.

The postfix increment and decrement operators ++ and −− cannot be overloaded. They are implemented by the compiler if the prefix forms of the operators are overloaded.
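The difference between the prefix and postfix forms, and one plausible way to build the postfix forms from the prefix ones, can be sketched as follows. This is a C++ analogue in which the postfix operators must be written out by hand; the text does not say how the Cmajor compiler generates them, so the copy-then-step pattern is an assumption. Counter and the helper functions are hypothetical names.

```cpp
// Hypothetical counter type whose real logic lives in the prefix operators.
struct Counter
{
    int value;

    constexpr Counter& operator++() { ++value; return *this; } // prefix ++
    constexpr Counter& operator--() { --value; return *this; } // prefix --

    // Postfix forms built from the prefix forms: copy, step, return the copy.
    constexpr Counter operator++(int) { Counter old = *this; ++*this; return old; }
    constexpr Counter operator--(int) { Counter old = *this; --*this; return old; }
};

// Value of the expression ++c: the value after incrementing.
constexpr int valueOfPrefixIncrement(int start) { Counter c{start}; return (++c).value; }

// Value of the expression c++: the value before incrementing.
constexpr int valueOfPostfixIncrement(int start) { Counter c{start}; return (c++).value; }

// Value of c itself once c++ has run.
constexpr int valueAfterPostfixIncrement(int start) { Counter c{start}; c++; return c.value; }
```

Starting from 5, the prefix expression yields 6 and the postfix expression yields 5, while the counter itself holds 6 in both cases.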

3.12.2 Member Access Expression

Expression a.b accesses member b of namespace, type or class object a. Then b can be, depending on a, a name of a namespace, enumeration constant, constant, type, member variable, typedef, or function. If b has a type, then the type of the expression will be the type of b and the value of the expression will be the value of b.

Member access operator . cannot be overloaded.

3.12.3 Pointer Member Access Expression

Expression a−>b accesses member b of a class object through a pointer (case 1), or through an object of a class that overloads the −> operator (case 2).

The member b accessed can be a member variable or a member function. If b is a member variable, the type of the expression will be the type of b, and the value of expression will be the value of b. If b is a member function, the type of the expression will be the type returned by the member function b and the value of the expression will be the value returned by the member function b (if any).

Case 1

If a is a pointer to class object T, then b must be a member of the class T, or member of the base class or ancestor class of T.

Case 2

If a is an object of a class that overloads the −> operator, then that operator function can return either a pointer to a class object (case 1), or return an object of another class that in turn overloads the −> operator (case 2), thus providing another level of indirection.

This indirection mechanism, along with overloading the pointer dereference operator, makes it possible to implement pointer-like classes, such as "smart pointers" and iterators. For example, the system library contains a UniquePtr class that overloads both operators.
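The two cases can be sketched with a pair of small pointer-like classes. This is a C++ analogue, where the −> operator chains in the same way; it is not the system library's UniquePtr, and the classes below do not manage object lifetime. Widget, Handle, Proxy and demoPointerMemberAccess are hypothetical names.

```cpp
#include <cassert>

struct Widget
{
    int size;
    int area() const { return size * size; }
};

// Case 1 in miniature: operator-> returns a raw pointer to a class object.
class Handle
{
public:
    explicit Handle(Widget* target) : target_(target) {}
    Widget* operator->() const { return target_; }
    Widget& operator*() const { return *target_; }
private:
    Widget* target_;
};

// Case 2: operator-> returns an object of another class that overloads ->,
// so -> is applied again until a raw pointer is reached.
class Proxy
{
public:
    explicit Proxy(Handle inner) : inner_(inner) {}
    Handle operator->() const { return inner_; }
private:
    Handle inner_;
};

// Exercises both levels of indirection and returns the final area.
int demoPointerMemberAccess()
{
    Widget widget{3};
    Handle handle(&widget);
    Proxy proxy(handle);
    assert(handle->size == 3);  // member variable through one ->
    assert(proxy->area() == 9); // member function through two levels of ->
    (*handle).size = 4;         // dereference yields a reference, so it can be assigned to
    return proxy->area();
}
```

Because the dereference operator returns a reference, the assignment through *handle changes the underlying object, and the later call through proxy sees the new size.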

3.12.4 Subscript Expression

A subscript expression provides access to individual elements of an array or other sequence of elements.

If a is an array, expression a[i] provides access to i'th array element. By convention the first element has index 0. The type of index i is long.

If p is a pointer to an object of type T, expression p[i] is equivalent to expression *(p+i), thus providing access to the i'th object of type T in the sequence of elements pointed to by p. Again, indexing starts from 0.

A subscript expression can also occur in the left side of an assignment statement. In that case the accessed element is assigned a new value.

The subscript operator [] can be overloaded, in which case it takes an operand of a user-defined type and yields a result of some user-defined or built-in type.

3.12.5 Invocation Expression

An invocation expression takes the form x(a₀, a₁, ...), where x can be, for example, an identifier or a qualified id naming a function, delegate, class delegate, typedef, type or object of a class type, and a₀, a₁, ... is a possibly empty list of argument expressions separated by commas and enclosed in parentheses.

Function Call

When x is the name of a function, overload resolution selects the best-matching function overload to be called with the specified arguments. The number of arguments must match the number of parameters of the function exactly, but if the argument types do not exactly match the signature of the selected function overload, conversions take place. If no single best-matching function can be found, or several overloads are found to be equally good matches, the compiler issues an error.

Delegate and Class Delegate Calls

When x is the name of a delegate or a class delegate, the number of arguments must again exactly match the number of parameters of the delegate or class delegate, and conversions may take place. Delegates and class delegates cannot be overloaded.

Construction of a Temporary

When x is the name of a typedef or a type, the expression yields a call of a constructor and the creation of a temporary variable of that type. The constructor to be called is selected by overload resolution.

A constructed temporary can be bound to an rvalue reference without the need to call the Rvalue function for it.

Call of a Function Call Operator

When x denotes an object of a class type (a variable or a temporary, for instance), and that class type overloads the function call operator (), the expression yields a call to that operator function. The matching function call operator is selected using overload resolution.
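
The sketch below illustrates three of the invocation forms described above: a call resolved among overloads, construction of a temporary from a type name, and a call of an overloaded function call operator. The names Twice and Scale are hypothetical and chosen only for illustration.

        using System;

        public int Twice(int x)
        {
                return 2 * x;
        }

        public double Twice(double x)
        {
                return 2.0 * x;
        }

        public class Scale
        {
                public Scale(int factor_) : factor(factor_)
                {
                }
                public int operator()(int x) const
                {
                        return factor * x;
                }
                private int factor;
        }

        void main()
        {
                int a = Twice(21);              // overload resolution selects Twice(int)
                double b = Twice(1.5);          // overload resolution selects Twice(double)
                Scale triple = Scale(3);        // a type name with arguments constructs a temporary
                int c = triple(14);             // calls the function call operator of Scale
                Console.WriteLine(a + c);
        }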

3.13 Primary Expressions

A primary expression can be a grouping expression, a literal, a basic type, a template identifier, an identifier, this or base access, a size-of expression, a typename expression, a cast expression, a construction expression, or a new expression.

3.13.1 Grouping Expression

Parentheses can be used to group subexpressions. The precedence of multiplicative operators is greater than the precedence of additive operators, so a + b * c means a + (b * c). If the addition is meant to be performed first, this can be accomplished by using parentheses: (a + b) * c.

3.13.2 Literals in Expressions

Literals can be used, for example, as operands of arithmetic expressions or as arguments of function calls.

3.13.3 Basic Types in Expressions

A basic type can be used, for example, in the construction of a temporary or in a size-of expression.

3.13.4 Template Identifier in Expressions

A template identifier is the name of a class template or a function template along with a template argument list. It can be used to refer to a class template specialization or to call a function template specialization, for instance.

3.13.5 Identifiers in Expressions

Identifiers are used in expressions to refer to a variable, parameter, type, constant, namespace, typedef or function.

3.13.6 This and Base Access

In a member function context, the keyword this refers to the current class object and the keyword base refers to the base class object. They can be used, for example, to call a function of the same class (this) or of the base class (base), or to refer to a member variable of the same class (this) or of the base class (base).

3.13.7 Size-of Expression

A size-of expression yields the size of the specified object or type in bytes. The type of the result is long. The size of a class object may be bigger than the sum of the sizes of its members, because of alignment.

3.13.8 Typename Expression

A typename expression yields the dynamic type name of the specified object or type. The type of the result is const char*. If p is a pointer to an object of class T, the static type of p is T*, but the actual type of the object p points to can be U, a type derived from T. In this case U is the dynamic type of the expression *p, and the expression typename(*p) returns the fully qualified name of type U.

3.13.9 Cast Expression

A cast expression performs explicit type conversion. It takes a target type enclosed in angle brackets and a source expression to be converted enclosed in parentheses. Any basic type excluding void can be explicitly converted to any other basic type excluding void. Any pointer type can be explicitly converted to any other pointer type. You can also cast away constness of an operand as follows. If p is of type const T*, cast<T*>(p) yields a plain pointer. Similarly, if c is of type const T&, cast<T&>(c) yields a plain reference.
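
A brief sketch of the conversions listed above, using illustrative variable names. It also relies on the implicit conversion from a pointer type to void*, which is described in the section on the generic pointer type.

        void main()
        {
                double d = 3.75;
                int i = cast<int>(d);           // basic type to basic type
                const int* cp = &i;
                int* p = cast<int*>(cp);        // casts away constness
                void* raw = p;                  // implicit conversion to the generic pointer type
                int* q = cast<int*>(raw);       // explicit conversion back to int*
                *q = 5;                         // modifies i through q
        }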

3.13.10 Construction Expression

A construction expression constructs an object "in place" into a memory location. It takes the type of the object to construct enclosed in angle brackets, and then a list of arguments p, a₀, a₁, ... enclosed in parentheses, where p is a pointer to the memory location into which the object is constructed and a₀, a₁, ... is a possibly empty list of constructor arguments separated by commas. If T is the type of the object to construct, the type of p can be either void* or T*. The expression yields a T* result.

3.13.11 New Expression

A new expression creates an object of the specified type. If T is the type of the object to create, sizeof(T) bytes of memory are allocated from the free store and then the object is constructed into that memory. The new expression takes as operands the type of the object to construct and a possibly empty list of constructor arguments enclosed in parentheses. It yields a pointer to the newly created object as a result.

3.14 Constant Expression

A constant expression is an expression that is sufficiently simple so that it can be evaluated at compile time. It can contain literals, constants, enumeration constants, operators that do not involve taking an address of an object and invocations of constexpr functions.

4 Statements

Statements are used in programs to define flow of control, to evaluate expressions for side-effects, to assign values to variables, to construct local variables, to explicitly release allocated resources, to test assertions and to compile statements conditionally.

4.1 Syntax

statement	→	labeled‑statement | control‑statement | expression‑statement | empty‑statement | assignment‑statement | construction‑statement | delete‑statement | destroy‑statement | assert‑statement | conditional‑compilation‑statement
labeled‑statement	→	identifier : statement
control‑statement	→	compound‑statement | return‑statement | if‑statement | while‑statement | do‑statement | for‑statement | range‑for‑statement | break‑statement | continue‑statement | goto‑statement | switch‑statement | goto‑case‑statement | goto‑default‑statement
compound‑statement	→	{ statement* }
return‑statement	→	return expression? ;
if‑statement	→	if ( expression ) statement (else statement)?
while‑statement	→	while ( expression ) statement
do‑statement	→	do statement while (expression);
for‑statement	→	for ( for‑init‑statement expression? ; for‑loop‑expression ) statement
for‑init‑statement	→	assignment‑statement | construction‑statement | empty‑statement
for‑loop‑expression	→	assignment | expression
range‑for‑statement	→	for ( type‑expr identifier : container ) statement
container	→	expression
break‑statement	→	break ;
continue‑statement	→	continue ;
goto‑statement	→	goto identifier ;
switch‑statement	→	switch ( expression ) { (case‑statement | default‑statement)* }
case‑statement	→	(case constant‑expression :)+ statement*
default‑statement	→	default : statement*
goto‑case‑statement	→	goto case constant‑expression ;
goto‑default‑statement	→	goto default ;
expression‑statement	→	expression ;
empty‑statement	→	;
assignment‑statement	→	assignment ;
assignment	→	expression = expression
construction‑statement	→	type‑expr identifier ( = expression | ( argument‑list ) )? ;
delete‑statement	→	delete expression ;
destroy‑statement	→	destroy expression ;
assert‑statement	→	# assert ( expression ) ;
conditional‑compilation‑statement	→	# if ( conditional‑compilation‑expression ) statement* ( # elif ( conditional‑compilation‑expression ) statement* )* ( # else statement* )? # endif
conditional‑compilation‑expression	→	conditional‑compilation‑disjunction
conditional‑compilation‑disjunction	→	conditional‑compilation‑conjunction (|| conditional‑compilation‑conjunction)*
conditional‑compilation‑conjunction	→	conditional‑compilation‑prefix (&& conditional‑compilation‑prefix)*
conditional‑compilation‑prefix	→	! conditional‑compilation‑prefix | conditional‑compilation‑primary
conditional‑compilation‑primary	→	conditional‑compilation‑symbol | ( conditional‑compilation‑expression )
conditional‑compilation‑symbol	→	identifier

4.2 Labeled Statement

A labeled statement consists of an identifier that acts as a target label for a goto statement, a colon, and a statement.

4.3 Control Statements

The control statements provide basic means for defining the flow of control of a function: sequence, selection and repetition.

4.3.1 Compound Statement

A compound statement executes a sequence of statements in order. A compound statement is itself a statement, so it can be used wherever the syntax allows a statement to occur.

4.3.2 Return Statement

A return statement returns control from the currently executing function to the caller of that function. If the function has a return type that is not void, the return statement must have an expression, a return value, that is evaluated and returned to the caller. Otherwise, if the function is a constructor, a destructor, or a member function or nonmember function that has a void return type, the return statement is not allowed to contain a return expression.

4.3.3 If Statement

An if statement executes a statement conditionally. The if statement contains a bool-valued expression, a condition, that is evaluated. If the condition evaluates to true, the statement following the condition is executed, and then control is transferred to the statement that comes after the if statement. Otherwise, if the condition evaluates to false and the if statement has an else part, the statement following the else keyword is executed. If the condition evaluates to false and the if statement has no else part, control is transferred directly to the statement that comes after the if statement.

4.3.4 While Statement

A while statement executes a statement repeatedly as long as a bool-valued expression, the condition of the while statement, evaluates to true. If the condition evaluates to true, control is transferred to the statement following the condition of the while statement. Then control is transferred back to evaluating the condition, and so on, until the condition evaluates to false. Then control is transferred to the statement that comes after the while statement. The statement contained by the while statement is executed zero or more times.

4.3.5 Do Statement

A do statement executes a statement repeatedly until a bool-valued expression, the condition of the do statement, evaluates to false. First the statement following the do keyword is executed. Then the condition of the do statement is evaluated. If the condition evaluates to true, control is transferred back to the statement following the do keyword, and so on, until the condition evaluates to false. Then control is transferred to the statement that comes after the do statement. The statement contained by the do statement is executed at least once.
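
The following sketch contrasts the three statements described so far. The variable n is illustrative. The while loop counts n up from 0 to 3, the do loop counts it back down to 0, and the if statement then takes its first branch.

        using System;

        void main()
        {
                int n = 0;
                while (n < 3)           // condition is tested first: the body runs zero or more times
                {
                        ++n;
                }
                do                      // body runs first: it is executed at least once
                {
                        --n;
                }
                while (n > 0);
                if (n == 0)
                {
                        Console.WriteLine("n is zero");
                }
                else
                {
                        Console.WriteLine("n is not zero");
                }
        }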

4.3.6 For Statement

A for statement consists of a for-init-statement, a bool-valued expression, a for-loop-expression and a statement.

The for statement is commonly used as follows: the for-init-statement constructs a loop variable, and the expression evaluates some condition that depends on the loop variable. As long as the condition evaluates to true, the statement contained by the for statement is executed, the for-loop-expression that manipulates the loop variable is executed, and control is transferred back to testing the condition.

For example, here's a for loop that prints the integers 0 through 9 to the standard output stream:

        for (int i = 0; i < 10; ++i)
        {
                Console.WriteLine(i);
        }

4.3.7 Range-for Statement

A range-for statement executes a statement for each element of a container. The range-for statement consists of a type and name of a local variable that is bound to each element of a sequence of elements in turn. When the element is bound to the local variable, the statement contained by the range-for statement is executed.

Here's an example program with various range-for statements.

The container of the range-for statement must refer to an object of a class that contains types or type aliases named Iterator and ConstIterator and member functions Begin(), End(), CBegin() and CEnd() that return those iterator types: Begin() and End() must return an Iterator, and CBegin() and CEnd() must return a ConstIterator. The iterator types, Iterator and ConstIterator, must support two operations: a dereference operator overload for accessing an element of a sequence, and an increment operator overload for incrementing the iterator. These requirements are fulfilled by arrays and by all container and string classes in the system library. When a user-defined class fulfills these requirements, it also supports the range-for statement.

If the container refers to a const object, the range-for statement

        for (T x : c) stmt;

is lowered by the compiler to the following sequence of statements:

        ConstIterator e = c.CEnd();
        for (ConstIterator i = c.CBegin(); i != e; ++i)
        {
                T x = *i;
                stmt;
        }

If the container refers to a non-const object, the range-for statement

        for (T x : c) stmt;

is lowered by the compiler to the following sequence of statements:

        Iterator e = c.End();
        for (Iterator i = c.Begin(); i != e; ++i)
        {
                T x = *i;
                stmt;
        }
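
A short sketch of range-for over a system library container, assuming the List class of System.Collections that also appears in the placeholder type example of this reference. The variable names are illustrative. Because the lowering initializes the loop variable with T x = *i, declaring it as int copies each element, while declaring it as int& binds it to the element itself.

        using System;
        using System.Collections;

        void main()
        {
                List<int> numbers;
                numbers.Add(1);
                numbers.Add(2);
                numbers.Add(3);
                int sum = 0;
                for (int x : numbers)           // each element is copied into x
                {
                        sum = sum + x;
                }
                for (int& x : numbers)          // x refers to each element in turn
                {
                        x = 2 * x;              // doubles the element stored in the list
                }
                Console.WriteLine(sum);         // sum of the original elements
        }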

4.3.8 Break Statement

A break statement terminates a while, do or for loop by transferring control to the statement after the looping statement. It is also used to terminate a case or default statement.

4.3.9 Continue Statement

A continue statement transfers control from inside of a while or do statement to the condition of the statement, and from inside of a for statement to the for-loop-expression of the for statement.
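
As an illustration, the loop below skips even values with continue and leaves the loop with break once the loop variable exceeds 6, so only the values 1, 3 and 5 reach the output statement. The loop bounds are arbitrary.

        using System;

        void main()
        {
                for (int i = 0; i < 10; ++i)
                {
                        if (i % 2 == 0)
                        {
                                continue;       // transfers control to the for-loop-expression ++i
                        }
                        if (i > 6)
                        {
                                break;          // transfers control to the statement after the loop
                        }
                        Console.WriteLine(i);   // reached for i = 1, 3 and 5
                }
        }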

4.3.10 Goto Statement

A goto statement transfers control to a labeled statement.

4.3.11 Switch Statement

A switch statement consists of a condition expression that evaluates to an integral value, zero or more case statements and possibly a default statement. The condition of the switch statement is evaluated and control is transferred to the case statement with a matching value. If none of the case values match and the switch statement contains a default statement, control is transferred to the default statement. Otherwise, if none of the case values match and the switch statement does not contain a default statement, control is transferred to the statement coming after the switch statement.

4.3.12 Case Statement

A case statement has one or more case values, which must be integral compile-time constants, and a possibly empty sequence of statements that are executed if the condition of the switch statement matches one of the case values. The case statement must be terminated by a break, goto case, goto default, or return statement.

4.3.13 Default Statement

A default statement has a possibly empty sequence of statements that are executed if the condition of the switch statement does not match any of the case values. The default statement must be terminated by a break, goto case, or return statement.

4.3.14 Goto Case Statement

A goto case statement is used to transfer control from a case or default statement to a case statement.

4.3.15 Goto Default Statement

A goto default statement is used to transfer control from a case statement to a default statement.
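
The sketch below combines the switch-related statements in one hypothetical function, Describe. Every case statement and the default statement ends with one of the terminating statements listed above.

        using System;

        void Describe(int n)
        {
                switch (n)
                {
                        case 0:
                                Console.WriteLine("zero");
                                break;
                        case 1: case 2:                 // two case values share the same statements
                                Console.WriteLine("one or two");
                                goto case 0;            // continues with the statements of case 0
                        case 3:
                                goto default;           // continues with the default statement
                        default:
                                Console.WriteLine("something else");
                                break;
                }
        }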

4.4 Expression Statement

An expression statement evaluates an expression but the possible result of the evaluation is not used. Instead the expression is evaluated for its side-effects. The evaluated expression can for example increment or decrement a variable, or call a function, delegate or class delegate.

4.5 Empty Statement

An empty statement does nothing. It can be used when the syntax requires a statement but there's nothing to be done.

4.6 Assignment Statement

An assignment statement evaluates the expression that is on the right-hand side of the assignment operator and assigns it to an lvalue expression that is on the left-hand side of the assignment operator. An lvalue expression can be for instance a variable, a dereferenced pointer or iterator, or a reference-valued result of a function call.

4.7 Construction Statement

A construction statement declares a local variable into its scope, allocates memory for it from the stack frame of the current function² and sets an initial value for it. The local variable has the given type and name. It is initialized to the value given on the right side of the assignment operator symbol, or to the result of a call to a constructor with the arguments enclosed in parentheses. If no initializer is given, the variable is default initialized.

4.7.1 Declaring a Construction Statement with a Placeholder Type

The local variable can also be declared with a placeholder type auto in which case the actual type is deduced from the type of the initializer. An auto variable must always have an initializer.

For example, in the following code that contains a range-for statement

        List<string> cars;
        cars.Add("Honda");
        cars.Add("Jaguar");
        cars.Add("Porsche");
        for (const auto& car : cars)
        {
                Console.WriteLine(car);
        }

the type of local variable car is deduced to be

const System.Collections.List<System.String<char>>&

4.8 Delete and Destroy Statements

If p is a pointer to type T object that is created using the new expression, statement delete p calls the destructor of T (if T is a class type having one), and then frees the memory allocated for the object back to the free store. It is not an error to call delete for a null pointer.

If p is a pointer to type T object, statement destroy p calls the destructor of T, but does not free any memory. It is not an error to call destroy for a type T that has no destructor. In that case the statement has no effect.
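
The following sketch shows how delete and destroy relate to the new and construction expressions. Point is a hypothetical class without a destructor, so destroy has no effect on it here. With a class that has a destructor, destroy would run it and leave the memory allocated for reuse.

        public class Point
        {
                public Point(int x_, int y_) : x(x_), y(y_)
                {
                }
                public int x;
                public int y;
        }

        void main()
        {
                Point* p = new Point(1, 2);     // allocates from the free store and constructs
                destroy p;                      // no destructor to call; the memory stays allocated
                construct<Point>(p, 3, 4);      // constructs a new Point in place into the same memory
                delete p;                       // frees the memory back to the free store
        }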

4.9 Assert Statement

An assert statement can be used to test bool-valued conditions that should always hold in a valid program. If an assertion expression evaluates to false, the error message "assertion failed", together with the function name, source file name, line number and a stack trace, is printed to the standard error stream of the process, and the execution of the process is ended with exit code 254. Assert statements have no effect in a program compiled using the release configuration.
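
For instance, a hypothetical function Divide could state its precondition with an assert statement:

        int Divide(int a, int b)
        {
                #assert(b != 0);        // has no effect when compiled using the release configuration
                return a / b;
        }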

4.10 Conditional Compilation Statement

A conditional compilation statement includes statements to be compiled conditionally depending on symbols supplied from the command line or from the IDE.

The #if part contains a conditional compilation expression that is evaluated. If the expression evaluates to true, the statements following the expression are included in the compilation. Otherwise the expressions of the #elif parts are evaluated in order. As soon as one of them evaluates to true, the statements following that expression are included in the compilation. If neither the #if expression nor any #elif expression evaluates to true, and the statement has an #else part, the statements in the #else part are included in the compilation.

A conditional compilation expression can be a

disjunction, which evaluates to true if its left expression or its right expression evaluates to true,
conjunction, which evaluates to true if its left expression and its right expression both evaluate to true,
logical not expression, which evaluates to true if its operand expression evaluates to false, or a
primary expression, which evaluates to true if the conditional compilation symbol operand is defined and to false otherwise.
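
A small sketch of a conditional compilation statement follows. VERBOSE and QUIET are hypothetical symbols that would be supplied from the command line or from the IDE. Exactly one of the three branches is included in the compilation.

        using System;

        void main()
        {
                #if (VERBOSE && !QUIET)
                        Console.WriteLine("verbose output");
                #elif (QUIET)
                        // no statements: nothing is compiled for this branch
                #else
                        Console.WriteLine("normal output");
                #endif
        }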

5 Access Specifiers

Access specifiers are used to grant or reject access to a program entity from another program entity.

5.1 Syntax

access‑specifier	→	public | protected | private | internal

public access grants access from everywhere.
protected access grants access for an entity defined within the same or derived class and rejects it for other entities.
private access grants access for an entity defined within the same class and rejects it for other entities.
internal access grants access for an entity defined within the same program or library and rejects it for other entities.

If no access specifier is given, the default access for a namespace-level entity is internal access, and for a class-level entity it is private access.
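
The hypothetical class below sketches each access level on a class member. The member variable balance has no access specifier and therefore gets the default class-level access, which is private.

        public class Account
        {
                public Account(long balance_) : balance(balance_)
                {
                }
                public long Balance() const             // accessible from everywhere
                {
                        return balance;
                }
                protected void Adjust(long amount)      // accessible from Account and classes derived from it
                {
                        balance = balance + amount;
                }
                internal void Audit()                   // accessible within the same program or library
                {
                }
                long balance;                           // no specifier: private
        }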

6 Type Expressions

Type expressions are used for declaring the type of a variable, parameter or constant, or the return type of a function.

6.1 Syntax

type‑expr	→	prefix‑type‑expr
prefix‑type‑expr	→	const postfix‑type‑expr | postfix‑type‑expr
postfix‑type‑expr	→	primary‑type‑expr ( member | pointer | rvalue‑ref | lvalue‑ref | array )*
member	→	. identifier
pointer	→	*
rvalue‑ref	→	&&
lvalue‑ref	→	&
array	→	[ constant‑expression? ]
primary‑type‑expr	→	basic‑type | template‑id | identifier | auto
template‑id	→	qualified‑id < type‑expr ( , type‑expr )* >

6.2 Primary Type Expressions

A primary type expression is a name of a basic or user-defined type, or a name of a class template with type arguments.

If the type expression occurs in the context of a construction statement, the primary type expression can also be the placeholder type auto. See Declaring a Construction Statement with a Placeholder Type (section 4.7.1).

6.3 Const Qualifier

A const qualifier indicates an intent that an object is not meant to be changed. For example, the type expression const T& declares a reference to a constant object of type T, and const T* declares a pointer to a constant object of type T.

6.4 Type Member Access

A member access in a type expression is used for accessing a member of a namespace or a class type. For example, if N is a namespace that contains type T, the type expression N.T * declares a pointer to the type T of namespace N.

6.5 Pointer Types

A pointer qualifier is used for declaring pointer types. The value of a pointer object is the memory address of the object it points to or it can be a special null value that means it does not point to any object. The default value of a pointer type object is null.

A simple pointed-to object is accessed using the dereference expression. A member of a class object is accessed using the pointer member access expression.

6.5.1 Generic Pointer Type

The void* type represents a generic pointer type. Its value is a memory address or the null value, just like the value of other pointer types, but it carries no information about the type of the object it points to: it is a bare memory address. Other pointer types have an implicit conversion to the generic pointer type, and the generic pointer type can be explicitly cast to any other pointer type. The default value of a generic pointer type object is null.

To support systems programming, a void* value can be explicitly converted to a ulong value, and a ulong value can be explicitly converted to a void* value. Also a void* value can be explicitly converted to any delegate type value, and any delegate type value can be explicitly converted to a void* value.

6.6 Reference Types

The lvalue and rvalue reference qualifiers are used for declaring reference types. A reference always refers to some other object: it has no default value and it cannot be null. A reference is bound to the object it refers to when it is created, and it refers to that same object for its whole lifetime. When a reference is used as the left operand of an assignment, a new value is assigned to the object the reference refers to. When it is used as an operand in an expression, the value of the object the reference refers to is retrieved. In that sense it can be thought of as being an "automatically dereferenced pointer".

6.6.1 Lvalue Reference Type

Lvalue reference types are typically used for passing parameters by reference or by reference-to-const, or for returning a reference to a member variable. Only a variable, a dereferenced pointer or an array element can be bound to a nonconstant reference parameter. A literal can also be bound to a parameter that is a reference-to-const. In that case the literal value is assigned to a temporary variable that is passed as a parameter.
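
The binding rules above can be sketched as follows. Increment and Twice are hypothetical functions. After the two calls to Increment the variable value holds 3, and r holds 42.

        void Increment(int& n)          // nonconstant lvalue reference parameter
        {
                n = n + 1;              // assigns to the object n refers to
        }

        int Twice(const int& n)         // reference-to-const parameter
        {
                return 2 * n;
        }

        void main()
        {
                int value = 1;
                int* p = &value;
                Increment(value);       // a variable is bound to int&
                Increment(*p);          // a dereferenced pointer is bound to int&
                int r = Twice(21);      // a literal is bound to const int& through a temporary
        }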

6.6.2 Rvalue Reference Type

Rvalue reference types are used for implementing move semantics.

6.7 Array Types

An array qualifier is used for declaring an array type. An array is a sequence of objects of the same type. If that type is a pointer type the objects pointed to can actually be of different types though. An expression inside the square brackets is the length of the array. It is optional for constant arrays for which the compiler can calculate the length of the array from its initializing expression. If the length of the constant array is specified it must match the length of the array initializer.

If a denotes an array object, the length of it can be obtained using a.Length() notation. The type of the array length is long.

Multidimensional arrays can be declared as arrays of arrays. For example, int[3][3] x defines x as a two-dimensional array of nine integers.

Elements of an array are accessed using the subscript expression.
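
A minimal sketch of declaring and using the two-dimensional array mentioned above. It assumes that Length() can be applied to an inner array in the same way as to the outer one.

        void main()
        {
                int[3][3] x;                    // two-dimensional array of nine integers
                x[1][2] = 5;                    // one subscript per dimension
                long rows = x.Length();         // length of the outer array
                long cols = x[0].Length();      // length of one inner array
        }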

7 Constants

A constant represents either a simple value of a basic or enumerated type, or an array or structure of values that the compiler must be able to evaluate at compile time.

7.1 Syntax

constant	→	access‑specifier? const constant‑type constant‑name = constant‑value ;
constant‑type	→	type‑expr
constant‑name	→	identifier
constant‑value	→	constant‑expression

7.2 Constant Values

The value of a constant must be a constant expression, which means that it must be sufficiently simple so that the compiler can evaluate it at compile time. It cannot use the free store, so it cannot contain a new expression, for example. However, it can contain invocations of constexpr functions with constant arguments. Some algorithms in the system library are declared constexpr.

7.3 Examples

        public class LiteralClass
        {
                public constexpr LiteralClass(int x_, double y_) : x(x_), y(y_)
                {
                }
                public constexpr int X() const
                {
                        return x;
                }
                public constexpr double Y() const
                {
                        return y;
                }
                private int x;
                private double y;
        }

        public const int meaningOfLife = 42;
        public const double pi = 3.1415926;
        public const int[] numberArray = [1, 2, 3];
        public const long lengthOfNumberArray = numberArray.Length();
        public const LiteralClass literalClass = {1 + 2 * 3, Min(pi, 4.13)};
        public const double y = literalClass.Y();

8 Global Variables

A global variable represents a namespace-level variable that has a type and a name and possibly an access specifier and an initializer. A global variable has a modifiable value of its type.

8.1 Syntax

global‑variable	→	access‑specifier? global‑variable‑type global‑variable‑name (= global‑variable‑initializer)? ;
global‑variable‑type	→	type‑expr
global‑variable‑name	→	identifier
global‑variable‑initializer	→	constant‑expression

8.2 Access

A global variable may have an access specifier. If it has no access specifier, the default access is private. Private access means that the variable is accessible only from within the compile unit in which it is defined. There can be many privately accessible global variables with the same name defined in different compile units. If the access is public or internal, the variable with the given name must be unique within the program.

8.3 Initialization

A global variable may have an initializer. The initializer must be a constant expression. If a global variable has no initializer, it is default-initialized (that is, zero-initialized). Global variables support only static (compile-time) initialization. If you need a variable with dynamic initialization, consider using a class with a static constructor that initializes a static member variable of that class. Static constructors provide dynamic thread-safe one-time initialization.
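
As a brief sketch of the syntax and access rules, with hypothetical names:

        int callCount;                  // no access specifier: private to this compile unit, zero-initialized
        public int maxRetries = 3;      // public: the name must be unique within the program

        public void Call()
        {
                callCount = callCount + 1;      // a global variable has a modifiable value
        }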

8.4 Example

For an example of global variables, see globalvar.cm.

9 Enumerations

An enumerated type defines a user-defined type that contains a list of enumeration constants.

9.1 Syntax

enumerated‑type	→	access‑specifier? enum enumerated‑type‑name underlying‑type? { enumeration‑constants }
enumerated‑type‑name	→	identifier
underlying‑type	→	: type‑expr
enumeration‑constants	→	enumeration‑constant (, enumeration‑constant)*
enumeration‑constant	→	enumeration‑constant‑name ( = enumeration‑constant‑value )?
enumeration‑constant‑name	→	identifier
enumeration‑constant‑value	→	constant‑expression

9.2 Underlying Type

An enumerated type has an underlying type that, if specified, must be a basic integer type. If the underlying type is not specified, int is used as the underlying type.

There's an implicit conversion from the enumerated type to the underlying type, and an explicit conversion (cast) from the underlying type to the enumerated type. For an example, see NextMonth function and setting permissions in the examples.

9.3 Enumeration Constants

The value of an enumeration constant, if specified, must be a constant expression that evaluates to a value that is convertible to the underlying type of the enumerated type. If the value of the first enumeration constant is not specified, it has value zero. If the value of any other enumeration constant is not specified, it has the value of the preceding enumeration constant plus one.

Enumeration constants are accessed with EnumType.enumConstant syntax.

9.4 Examples

        public enum TrafficLight
        {
                green, yellow, red
        }

        public enum Month : sbyte
        {
                january = 1, february, march, april, may, june, july, august, september, october, november, december
        }

        public enum Permission : byte
        {
                read = 1u << 0u,
                write = 1u << 1u,
                execute = 1u << 2u
        }

        public Month NextMonth(Month month)
        {
                return cast<Month>(month % 12 + 1);
        }

        void main()
        {
                TrafficLight light = TrafficLight.green;
                Month month = Month.january;
                Month next = NextMonth(month);
                Permission permissions = cast<Permission>(Permission.read | Permission.write);
        }

10 Functions

A function represents computation that can be invoked at run time and in simple cases also at compile time. The computation is defined by the body of the function. The body is a compound statement that contains statements that define the control flow of the function. Invocation of a function may yield a value, or the function can be a void function that is invoked for its side-effects.

10.1 Syntax

function	→	attributes? function‑specifiers return‑type function‑group‑id template‑parameter‑list? parameters where‑constraint? (compound‑statement | ;)
function‑specifiers	→	(access‑specifier | constexpr | inline | cdecl | extern)*
return‑type	→	type‑expr
function‑group‑id	→	identifier | operator‑function‑group‑id
operator‑function‑group‑id	→	operator (<< | >> | == | = | < | −> | ++ | −− | + | − | * | / | % | & | | | ^ | ! | ~ | [] | ())
template‑parameter‑list	→	< template‑parameter ( , template‑parameter )* >
template‑parameter	→	identifier (= default‑value )?
default‑value	→	type‑expr
parameters	→	( parameter‑list? )
parameter‑list	→	parameter (, parameter)*
parameter	→	parameter‑type parameter‑name?
parameter‑type	→	type‑expr
parameter‑name	→	identifier

10.2 Regular Function

A regular function has no specifiers other than a possible access specifier.

For an example of a regular function, see strlen.cm.

10.3 Constexpr Specifier

The constexpr specifier enables compile-time evaluation of a function when the function arguments are constant expressions. If the evaluation succeeds, the function call is replaced by the compile-time constant that is the result of the evaluation. A constexpr function can also be used when the arguments are not constant expressions. In those cases the function is called just like a regular function. This saves you from having to write two versions of the same function, a constexpr and a non-constexpr version.

A constexpr function cannot have the following kinds of statements: goto, range-for, delete, destroy, switch, case, default, goto case, goto default, or a conditional compilation statement.

The constexpr specifier is often combined with inline, because constexpr functions are typically short.

For an example of a constexpr function, see align.cm.
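
The two uses described above might look roughly like the following sketch. It is not the align.cm example itself: the names RoundUp, blockSize and RoundUpToBlock are hypothetical, and the function assumes that the alignment is a power of two.

```cmajor
// Hypothetical example; RoundUp, blockSize and RoundUpToBlock are illustrative names.
// Assumes that alignment is a power of two.
public constexpr inline long RoundUp(long n, long alignment)
{
    return (n + alignment - 1) & -alignment;
}

// Both arguments are constant expressions, so the call can be evaluated at compile time.
// (13 + 8 - 1) & -8 is 16.
public const long blockSize = RoundUp(13, 8);

public long RoundUpToBlock(long n)
{
    // n is not a constant expression, so this is an ordinary run-time call.
    return RoundUp(n, blockSize);
}
```

The same function body serves both the constant and the run-time call, so no separate non-constexpr version is needed.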

10.4 Inline Specifier

The inline specifier is a hint to the compiler that the function should be inlined when the program is compiled using the release configuration. When a function is inlined, the function call is replaced by the body of the function, so inlining is best used for short functions. Function inlining avoids function call overhead and increases the opportunities for other optimizations, so that the generated code can be adapted to the arguments of the function call. In this way function inlining helps the compiler generate efficient code.

10.5 CDecl Specifier

The cdecl specifier disables name mangling. Name mangling makes it possible to have many functions with the same group name but different parameter types. When name mangling is disabled there can be only a single function with the given group name in the whole program and all libraries it uses. The cdecl specifier can ease interoperability with libraries written in other languages.

10.6 Extern Specifier

The extern specifier makes it possible to call a function written in another language from a Cmajor program. It is almost always used with the cdecl specifier to prevent the compiler from mangling the name of the function. An extern function cannot have a body.
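
As a minimal sketch of the combination described above, the declaration below makes the C standard library function puts callable from a Cmajor program. It assumes that the C runtime providing puts is available at link time, which depends on the platform and compiler back end in use.

```cmajor
// Declaration only: an extern function cannot have a body.
// cdecl keeps the name unmangled so that it matches the C symbol puts.
public extern cdecl int puts(const char* s);

void main()
{
    puts("called through an extern cdecl declaration");
}
```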

10.7 Function Templates

When the function definition has a template parameter list and possibly a constraint, it defines a function template. The compiler instantiates the function template, that is, creates a specialized version of it, for each different set of argument types. When a function is instantiated, the template parameters of the function template are substituted with concrete types and the function is compiled using those substituted types. The concrete types can be specified in the function call by using a template-id, or if not specified, they can be automatically deduced by the compiler from the argument types of the function call.
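
The following sketch shows a constrained function template and the two ways of supplying its type argument. The name Smaller is hypothetical, and the sketch assumes that the system library provides a LessThanComparable concept in the System.Concepts namespace.

```cmajor
using System.Concepts;

// Hypothetical function template; Smaller is an illustrative name.
public T Smaller<T>(T left, T right) where T is LessThanComparable
{
    if (right < left) return right;
    return left;
}

void main()
{
    int a = 3;
    int b = 5;
    int m = Smaller(a, b);          // T is deduced as int from the argument types
    long n = Smaller<long>(a, b);   // T is specified explicitly with a template-id
}
```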

Function templates participate in overload resolution. When two or more function templates have the same group name and number of parameters but different constraints, they are overloaded based on those constraints. The compiler checks the constraints at the function call site by substituting the type parameters in the constraints with the concrete types of the function call, and accepts or rejects overloads according to whether the constraints are satisfied. If two or more function templates satisfy their constraints, the compiler selects the one that has the strictest constraints.

For example, see these two Next function templates in the system library. If a Next function is called with a type that satisfies both the ForwardIterator concept and the RandomAccessIterator concept, the random access overload is called, because the RandomAccessIterator concept refines the BidirectionalIterator concept, which in turn refines the ForwardIterator concept.
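
A pair of overloads of this kind might be sketched as follows. This is not the system library source: the name Advance is hypothetical, and the sketch assumes that the ForwardIterator and RandomAccessIterator concepts come from the System.Concepts namespace and that a random access iterator supports adding a long offset.

```cmajor
using System.Concepts;

// Hypothetical sketch; Advance is an illustrative name, not the library's Next.
// Works for any forward iterator by stepping one element at a time.
public I Advance<I>(I it, long n) where I is ForwardIterator
{
    while (n > 0)
    {
        ++it;
        --n;
    }
    return it;
}

// Stricter constraint: selected when the iterator type is also a random access iterator.
public I Advance<I>(I it, long n) where I is RandomAccessIterator
{
    return it + n;
}
```

Both constraints are satisfied for a random access iterator, so the second overload would be chosen for it because its constraint is the stricter one.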

For an example of a function template, see min.cm.

10.8 Intrinsic Functions

System.Meta is a special namespace that contains intrinsic functions. Intrinsic functions can be called just like regular functions, but they are not compiled from source code as regular functions are; instead they are implemented inside the compiler. The intrinsic functions in the System.Meta namespace take a type parameter and return information about the supplied type. They can be used in predicate constraint expressions.

10.9 System-default Functions

A function marked with the system_default attribute is called a system-default function. Function overload resolution, as described in the next section, is done in two phases. In the first phase the compiler ignores functions marked with the system_default attribute. If the set of viable functions is not empty, a best-matching function is chosen as described in the next section. Only if the set of viable functions is empty does the compiler proceed to the second phase. In the second phase the compiler also includes functions marked with the system_default attribute as viable functions, and selects the best-matching function as described in the next section. This feature is used in the system library to provide default implementations of output operators for container types. Those default implementations can be overridden by user-defined output operators.

10.10 Function Overload Resolution

Overload resolution is done in three stages:

First the compiler searches the symbol table for viable functions. A viable function is a function that has the given group name and arity.
Then the set of viable functions is filtered to form a set of overload candidates.

If the viable function is not a function template, it becomes an overload candidate if there exists a valid sequence of conversions from each argument type to the corresponding parameter type of the viable function. The sequence of needed conversions is recorded for the final stage.

If the viable function is a function template, the argument types are bound to the template parameters of the function template. In this case the viable function can be rejected because arguments cannot be bound or there is no valid conversion sequence for a bound parameter.

If the viable function is a constrained function template, after successfully binding the argument types, the constraint is checked by substituting parameter types in the constraint with bound argument types. In this case the viable function can be rejected because the constraint is not satisfied. Otherwise the viable function survives to be an overload candidate and its constraint is recorded for the final stage.
Finally, if no overload candidates have survived to this stage, the compiler issues an "overload not found" error. Otherwise, the set of overload candidates is sorted according to ordering rules that take into account the number, kind and distance of the needed conversions and, for those overload candidates that are function templates, a possible constraint. If a single best overload is found according to the ordering rules, the function call resolves to it; otherwise the compiler issues an "ambiguous overload resolution" error and lists the equally good overload candidates.

In fact, overload resolution may be performed up to three times with different argument combinations to resolve a single function call, because member functions have an implicit this parameter that can be bound to the receiver, or to the current this parameter if the call is issued from inside a member function.

The overload candidates are sorted according to the ordering rules that are informally described as follows: When comparing two overload candidates a and b:

First compare the argument conversions to a and b: For all arguments, count the number of better argument conversions to a compared to b. If the number of better argument conversions to a is greater than to b, a is better than b. A better argument conversion means that no conversion is better than a conversion, and when both need a conversion, a smaller conversion distance is better than a greater one.

Then, if a is not a function template and b is a function template, a is better than b. This is to ensure we don't have to instantiate the same function template for the same argument types many times.

Then, if a is not a function template specialization and b is a function template specialization, a is better than b.

Then, if a is a constrained function template and b is an unconstrained function template, a is better than b.

Finally, if a and b are both constrained function templates and the constraint of a is stricter than the constraint of b, a is better than b.
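
The first two ordering rules can be illustrated with a small sketch. The function group name Show is hypothetical, and the comments state the resolution that would be expected from the rules as described above rather than a verified compiler result.

```cmajor
using System;

void Show(int x)
{
    Console.WriteLine("Show(int)");
}

void Show(long x)
{
    Console.WriteLine("Show(long)");
}

void Show<T>(T x)
{
    Console.WriteLine("Show<T>(T)");
}

void main()
{
    int i = 1;
    double d = 2.0;
    // Show(int) and Show<T> need no conversion, Show(long) needs one.
    // Between Show(int) and Show<T>, the nontemplate function is better.
    Show(i);
    // Only Show<T> with T bound to double matches without any conversion.
    Show(d);
}
```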

11 Type Aliases

A typedef or using alias introduces an alias name for a type expression.

11.1 Syntax

typedef	→	access‑specifier? typedef type‑expr identifier ;
using alias	→	access‑specifier? using identifier = type‑expr ;

11.2 Examples

        using string = String<char>;
        public typedef String<wchar> wstring;
        public typedef String<uchar> ustring;

12 Classes

A class definition defines a user-defined type. It may contain member functions, member variables, types, type aliases and constants. It may have a base class and it can implement any number of interfaces.

12.1 Syntax

class	→	attributes? class‑specifiers class class‑name template‑parameter‑list? inheritance‑and‑implemented‑interfaces? where‑constraint? { class‑content }
class‑specifiers	→	(access‑specifier \| abstract \| static)*
class‑name	→	identifier
inheritance‑and‑implemented‑interfaces	→	: base‑class‑or‑interface (, base‑class‑or‑interface)*
base‑class‑or‑interface	→	template‑id \| qualified‑id
class‑content	→	class‑member*
class‑member	→	static‑constructor \| constructor \| destructor \| member‑function \| conversion‑function \| member‑variable \| typedef \| class \| constant \| enumerated‑type \| delegate \| class‑delegate
static‑constructor	→	attributes? static class‑name () initializer‑list? where‑constraint? (compound‑statement \| ;)
initializer‑list	→	: initializer (, initializer)*
initializer	→	this‑initializer \| base‑initializer \| member‑variable‑initializer
this‑initializer	→	this ( argument‑list )
base‑initializer	→	base ( argument‑list )
member‑variable‑initializer	→	member‑variable‑name ( argument‑list )
member‑variable‑name	→	identifier
constructor	→	attributes? constructor‑specifiers class‑name parameters initializer‑list? where‑constraint? (compound‑statement \| ;)
constructor‑specifiers	→	(access‑specifier \| constexpr \| explicit \| inline \| default \| suppress)*
destructor	→	attributes? destructor‑specifiers ~ class‑name () where‑constraint? (compound‑statement \| ;)
destructor‑specifiers	→	(access‑specifier \| virtual \| override \| default)*
member‑function	→	attributes? member‑function‑specifiers return‑type function‑group‑id parameters const? where‑constraint? (compound‑statement \| ;)
member‑function‑specifiers	→	(access‑specifier \| constexpr \| static \| abstract \| virtual \| override \| inline \| default \| suppress \| new)*
conversion‑function	→	attributes? conversion‑function‑specifiers return‑type () const? where‑constraint? (compound‑statement \| ;)
conversion‑function‑specifiers	→	(access‑specifier \| constexpr \| inline)*
member‑variable	→	attributes? member‑variable‑specifiers member‑variable‑type member‑variable‑name ;
member‑variable‑specifiers	→	(access‑specifier \| static)*
member‑variable‑type	→	type‑expr

12.2 Regular Classes

A regular class may have any kind of class member. It may have a base class and can implement interfaces.

12.3 Abstract Classes

An abstract class is a class that cannot be instantiated. Typically it is the base class of a class hierarchy, or an intermediate class that derives from a base class but is not a concrete class because it has or inherits an abstract member function. An abstract class may have any kind of class member. In particular, it may have abstract member functions, but it does not have to. If a class has or inherits an abstract member function, it must be explicitly declared abstract.
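
A minimal sketch of an abstract base class and a concrete class derived from it is shown below. The class names Shape and Rectangle are hypothetical.

```cmajor
// Hypothetical classes; Shape and Rectangle are illustrative names.
public abstract class Shape
{
    public default virtual ~Shape();
    // An abstract member function has no body.
    public abstract double Area() const;
}

public class Rectangle : Shape
{
    public Rectangle(double width_, double height_) : base(), width(width_), height(height_)
    {
    }
    // Overriding the inherited abstract member function makes Rectangle concrete.
    public override double Area() const
    {
        return width * height;
    }
    private double width;
    private double height;
}
```

With these definitions a Rectangle object can be created, whereas an attempt to create a Shape object would be rejected because Shape is abstract.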

12.4 Static Classes

A static class can have only

a static constructor,
static member variables,
static member functions,
type aliases,
constants,
enumerated types,
delegate types and
class delegate types.

For an example of a static class, see console.cm.
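
As a small sketch of the member kinds listed above, the following hypothetical static class (not the console.cm example) has a static constructor, a static member variable and a static member function.

```cmajor
// Hypothetical static class; IdGenerator is an illustrative name.
public static class IdGenerator
{
    // The static constructor initializes the static member variable.
    static IdGenerator() : next(0)
    {
    }
    public static int NextId()
    {
        ++next;
        return next;
    }
    private static int next;
}
```

A call would be written as IdGenerator.NextId(); no instance of the class is involved.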

12.5 Class Members

A class can contain a static constructor, constructors, a destructor, other member functions, member variables, type aliases, classes, constants, enumerated types, delegate types and class delegate types.

12.5.1 Static Constructor

A static constructor is a special member function that is used for initializing the static member variables of the class it belongs to. The name of the static constructor must be the name of the class that contains it.

A static constructor is typically used in classes that need one-time initialization. An example of such a class is a singleton. Each static constructor is guarded by a recursive mutex and a Boolean flag provided by the Cmajor runtime. Together they ensure that initialization is thread-safe and happens exactly once. A static constructor is executed before control reaches the body of a constructor or a static member function of the same class, or before a static member variable of the class is accessed from outside the class.

For an example of a singleton class that has a static constructor, take a look at phonebook.cm. The singleton is accessed using the static Instance() member function. Before Instance() returns a reference to a static instance member variable, the static constructor gets executed if it has not already been executed. The static constructor creates an instance of the PhoneBook class and assigns it to a static instance member variable. Creating an instance of the PhoneBook class involves calling the default constructor of the class, which in turn calls the static constructor. But this time the initialization flag provided by the runtime has already been set, so control returns from the static constructor right away.
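
The shape of such a singleton might be sketched as follows. This is an outline of the pattern only, not the contents of phonebook.cm, and it assumes that the system library's UniquePtr class template is used to hold the instance.

```cmajor
using System;

// Sketch of the singleton pattern; the real phonebook.cm may differ.
public class PhoneBook
{
    // Runs once, before Instance() is first entered.
    static PhoneBook() : instance(new PhoneBook())
    {
    }
    public static PhoneBook& Instance()
    {
        return *instance;
    }
    // Private, so that the only instance is the one created by the static constructor.
    private PhoneBook()
    {
    }
    private static UniquePtr<PhoneBook> instance;
}
```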

At the start of the static constructor the language implementation checks whether the initialization flag is set. If it is set, control returns and the body of the static constructor is not executed. Otherwise the language implementation locks the recursive mutex and checks again whether the initialization flag is set; if it is, the implementation unlocks the mutex and returns. This is called the double-checked locking pattern. Otherwise the initialization has not yet been done, so the language implementation sets the initialization flag to true, executes the body of the static constructor, unlocks the mutex and returns.

If the class having a static constructor has a member variable that has a nontrivial destructor, the class should implement a destructor. If the destructor does not have other things to do besides calling destructors of the member variables, this can be done by using a default destructor:

        public class MyClass
        {
            // ...
            public default ~MyClass();
        }

12.5.2 Constructors

A constructor is a special member function that creates an object of the class it belongs to. The task of a constructor is to initialize member variables and allocate resources. Memory is also considered as a kind of resource. The name of the constructor must be the name of the class that contains it.

Initializer List

Member variables and a possible base class object can be initialized by using an initializer list. An initializer list consists of initializers separated by commas. There are three kinds of initializers:

A this initializer delegates the initialization of some of the member variables to another constructor of the same class. The constructor called is resolved using overload resolution and takes the number and type of arguments of the this-initializer into account. There can be at most one this-initializer in an initializer list and if an initializer list has a this-initializer it cannot have a base-initializer.
A base initializer calls a constructor of the base class to initialize inherited member variables. The base class constructor called is resolved using overload resolution and takes the number and type of arguments of the base-initializer into account. There can be at most one base-initializer in an initializer list and if an initializer list has a base-initializer it cannot have a this-initializer.
A member variable initializer initializes a member variable of the given name with the supplied arguments. If the member variable is of a class type, a constructor of that class is called with the given arguments. If there is no member variable initializer for a specific member variable in the initializer list, that member variable will be default initialized, copied or moved, depending on whether the constructor is a default or other constructor, a copy constructor, or a move constructor, respectively.
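
The three kinds of initializers can be sketched with two small classes. The class names Person and Employee are hypothetical.

```cmajor
using System;

// Hypothetical classes; Person and Employee are illustrative names.
public class Person
{
    // this-initializer: delegates to the two-parameter constructor below.
    public Person() : this("unknown", 0)
    {
    }
    // Member variable initializers.
    public Person(const string& name_, int age_) : name(name_), age(age_)
    {
    }
    private string name;
    private int age;
}

public class Employee : Person
{
    // base-initializer followed by a member variable initializer.
    public Employee(const string& name_, int age_, int id_) : base(name_, age_), id(id_)
    {
    }
    private int id;
}
```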

There are three kinds of special constructors: a default constructor, a copy constructor and a move constructor.

Default Constructor

The default constructor has signature

ClassName()

The purpose of it is to default initialize an object of the class it belongs to. If a class has no user-defined default constructor, the compiler will generate one if it is needed. The generated default constructor will call the default constructor of the possible base class and it will default initialize all member variables of the class.

The user can request the automatic generation of the default constructor by using the default keyword:

public default ClassName();

A default constructor declared default cannot have a body.

The user can suppress the automatic generation of the default constructor by using the suppress keyword:

suppress ClassName();

A suppressed default constructor cannot have a body. The compiler generates an error on any attempt to call a suppressed default constructor.

A class may have at most one default constructor.

Copy Constructor

The copy constructor has signature

ClassName(const ClassName&)

It takes another object of the same class as an lvalue-reference-to-const parameter and typically copies members from that object by using an initializer list. If a class has no user-defined copy constructor, the compiler will generate one if it is needed. The generated copy constructor will copy the value of a possible base class object and the values of all member variables from another object of the same class.

The user can request the automatic generation of the copy constructor by using the default keyword:

public default ClassName(const ClassName&);

A copy constructor declared default cannot have a body.

The user can suppress the automatic generation of the copy constructor by using the suppress keyword:

suppress ClassName(const ClassName&);

A suppressed copy constructor cannot have a body. The compiler generates an error on any attempt to call a suppressed copy constructor.

A class may have at most one copy constructor.

Move Constructor

The move constructor has signature

ClassName(ClassName&&)

It takes another object of the same class as an rvalue-reference parameter and typically moves members from that object, either by using an initializer list or by implementing the move in the body of the move constructor.

The user can request the automatic generation of the move constructor by using the default keyword:

public default ClassName(ClassName&&);

A move constructor declared default cannot have a body.

The user can suppress the automatic generation of the move constructor by using the suppress keyword:

suppress ClassName(ClassName&&);

A suppressed move constructor cannot have a body. The compiler generates an error on any attempt to call a suppressed move constructor.

A class may have at most one move constructor.
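
The default and suppress keywords for the three special constructors might be combined as in the following sketch of a movable but non-copyable class. The class name Document is hypothetical, and the sketch assumes the system library's List class template and its Rvalue function for turning an object into an rvalue.

```cmajor
using System;
using System.Collections;

// Hypothetical class; Document is an illustrative name.
public class Document
{
    public default Document();              // compiler-generated default constructor
    suppress Document(const Document&);     // copying is not allowed
    public default Document(Document&&);    // compiler-generated move constructor
    private List<string> lines;
}

void main()
{
    Document a;                     // calls the default constructor
    // Document b(a);               // would be an error: the copy constructor is suppressed
    Document c(Rvalue(a));          // calls the move constructor
}
```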

Compiler-Generated Initializing Actions

Whether the constructor is user-defined or compiler-generated, the language implementation completes it with the following actions:

If the class has a static constructor, that will be called first.
If the constructor has an initializer list that contains a this-initializer, it is called.
Otherwise, if the constructor has an initializer list that contains a base-initializer, it is called. It will in turn perform these same actions for the base class object.
Otherwise, if the class has a base class, the base class constructor is called. It will in turn perform these same actions for the base class object.
If the class is polymorphic and the this-initializer has not been called, the language implementation will now set the VMT pointer of the current class being constructed to the VMT pointer field of the current class object. This ensures that if a user-defined constructor calls virtual member functions from its body, they are dispatched to member functions of the current class or its base class or ancestor class. If they were dispatched to the derived class member functions, those functions would have access to the not yet constructed member variables of the derived class.
For each member variable of the class in declaration order, if the member variable has an initializer, it is called; otherwise the member variable is default initialized, copied, or moved, depending on whether the constructor is a default or other constructor, a copy constructor, or a move constructor, respectively.
If the constructor is a user-defined one, the body of the constructor is executed.
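
The ordering of these actions, and the effect of setting the VMT pointer before the constructor body runs, can be sketched as follows. The class names are hypothetical, and the order given after the code is what the rules above imply rather than verified program output.

```cmajor
using System;

public class Base
{
    public Base()
    {
        Console.WriteLine("Base constructor body");
        // The VMT pointer refers to Base here, so this call is dispatched to Base.Describe.
        Describe();
    }
    public default virtual ~Base();
    public virtual void Describe()
    {
        Console.WriteLine("Base.Describe");
    }
}

public class Derived : Base
{
    // The base-initializer runs first, then label is initialized, then the body executes.
    public Derived() : base(), label("derived")
    {
        Console.WriteLine("Derived constructor body");
    }
    public override void Describe()
    {
        Console.WriteLine("Derived.Describe");
    }
    private string label;
}

void main()
{
    Derived d;
}
```

Following the list above, the lines would be expected in the order "Base constructor body", "Base.Describe", "Derived constructor body". Derived.Describe is not called during construction of the Base part, because the label member variable has not been constructed at that point.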

Constructor Specifiers

A constructor may have the following specifiers:

An access specifier.
constexpr. A constexpr constructor is used in literal classes.
explicit. A constructor of class A that is not a copy or move constructor but has a single parameter of type T is normally used in overload resolution to convert an argument of type T to an object of class A. By using the explicit keyword, the programmer can state that an argument of type T will not be automatically converted to class type A.
inline. This is a hint to the compiler that the constructor should be inlined if possible.
default. Can only be used for the default constructor, copy constructor and move constructor.
suppress. Can only be used for the default constructor, copy constructor and move constructor. The compiler generates an error on any attempt to call a suppressed constructor.
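
The effect of the explicit specifier can be sketched with a small wrapper class. The names Meters and Twice are hypothetical.

```cmajor
// Hypothetical class; Meters is an illustrative name.
public class Meters
{
    // explicit: a double argument is not converted to Meters automatically.
    public explicit Meters(double value_) : value(value_)
    {
    }
    public double Value() const
    {
        return value;
    }
    private double value;
}

double Twice(const Meters& m)
{
    return 2.0 * m.Value();
}

void main()
{
    double a = Twice(Meters(1.5));  // explicit construction is accepted
    // double b = Twice(1.5);       // would be rejected: no automatic conversion from double to Meters
}
```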

12.5.3 Destructor

The purpose of the destructor is to release allocated or otherwise obtained resources. Memory is also considered as a kind of resource. The name of the destructor must be the name of the class that contains it, preceded by a tilde (~).

If a class does not have a user-defined destructor and either the class is polymorphic or it has a member variable that has a nontrivial destructor, the compiler will generate a destructor for the class. If the class has a polymorphic base class, the generated destructor will be set as overridden, otherwise, if the class is polymorphic, it will be set virtual, otherwise it will remain as a regular destructor.

Whether the destructor is user-defined or compiler-generated, the language implementation completes it with the following actions:

If the class is polymorphic, the language implementation will first set the VMT pointer of the current class being destroyed to the VMT pointer field of the current class object. This ensures that if a user-defined destructor calls virtual member functions from its body, they are dispatched to member functions of the current class or its base class or ancestor class. If they were dispatched to the derived class member functions, those functions would have access to the already destroyed member variables of the derived class.
If the destructor is a user-defined one, the body of the destructor is executed.
For each member variable of the class in reverse declaration order, if the member variable is of a class that has a nontrivial destructor, the member variable destructor is called.
If the class has a base class that has a nontrivial destructor, the base class destructor is called, which in turn performs these same actions.

A destructor may have the following specifiers:

public. A destructor cannot be private, protected or internal.
virtual. A base class of a class hierarchy should have a virtual destructor.
override. A class that directly or indirectly derives from a class that has a virtual or overridden destructor may override the destructor.
default. If the destructor does not have other actions to do in addition to destroying member variables and a base class object, the destructor may be declared default. It may also be declared virtual or overridden in this case. A default destructor may not have a body.

A class may have at most one destructor.

12.5.4 Member Functions

A member function can use nonstatic and static member variables when it performs its job.

There are two kinds of special member functions left: a copy assignment operator and a move assignment operator.

Copy Assignment Operator

The copy assignment operator has signature

void operator=(const ClassName&)

It takes another object of the same class as an lvalue-reference-to-const parameter. If a class has no user-defined copy assignment operator, the compiler will generate one if it is needed. The generated copy assignment operator will assign the base class object and member variables from the passed argument.

The user can request the automatic generation of the copy assignment operator by using the default keyword:

public default void operator=(const ClassName&);

A copy assignment operator declared default cannot have a body.

The user can suppress the automatic generation of the copy assignment operator by using the suppress keyword:

suppress void operator=(const ClassName&);

A suppressed copy assignment operator cannot have a body. The compiler generates an error on any attempt to call a suppressed copy assignment operator.

A class may have at most one copy assignment operator.

Move Assignment Operator

The move assignment operator has signature

void operator=(ClassName&&)

It takes another object of the same class as an rvalue-reference parameter. If a class has no user-defined move assignment operator, the compiler will generate one if it is needed. The generated move assignment operator will call the move assignment operator of the base class with the base class object of the argument and it will swap all member variables of the current class object with the member variables of the argument.

The user can request the automatic generation of the move assignment operator by using the default keyword:

public default void operator=(ClassName&&);

A move assignment operator declared default cannot have a body.

The user can suppress the automatic generation of the move assignment operator by using the suppress keyword:

suppress void operator=(ClassName&&);

A suppressed move assignment operator cannot have a body. The compiler generates an error on any attempt to call a suppressed move assignment operator.

Member Function Specifiers

A member function can have the following specifiers:

An access specifier.
constexpr. A constexpr member function is used in literal classes.
static. A static member function can use static member variables when it performs its job.
abstract. An abstract member function must be overridden by member functions of the same name in derived classes. An abstract member function cannot have a body.
virtual. A virtual member function may be overridden by member functions of the same name in derived classes.
override. Each inherited abstract member function of an abstract base or ancestor class must be overridden in a derived concrete class. Each virtual and overridden member function of a base or ancestor class may be overridden in the derived class.
inline. This is a hint to the compiler that the member function should be inlined if possible.
default. A copy assignment or move assignment operator may be declared default. A default copy assignment or move assignment may not have a body.
suppress. A copy assignment or move assignment operator may be suppressed. A suppressed copy assignment or move assignment may not have a body. The compiler generates an error on any attempt to call a suppressed member function.
new. A member function that has the same name as some abstract, virtual or overridden member function of its base or ancestor class may be declared new. A member function declared new does not participate in virtual call dispatch. It may not be abstract, virtual or overridden at the same time.
const. A member function that does not change the member variables may be declared const.

12.5.5 Member Variables

A member variable has a specified type and name. A member variable may have an access specifier and it may be declared static. Static member variables can be initialized in a static constructor. Nonstatic member variables are also called instance variables. They can be initialized in a constructor.

12.6 Literal Classes

A class whose constructors and member functions are declared constexpr is called a literal class. When arguments to such functions are constant expressions, they can be evaluated at compile time. The constructors and member functions of a literal class can be used also when arguments are not constant expressions. In that case they behave the same way as regular constructors and member functions. For an example of a literal class, see point.cm.

12.7 Class Hierarchies

In Cmajor a class may have a single base class. If class B is the base class of class A, we also say that A derives from B, and that A inherits members from B. When A derives from B we may think that A is-a-kind-of B. The inheritance relationship allows us to build hierarchies of classes, some of which are base classes and others of which inherit from them. If class A derives from class B, A may override some or all of the abstract, virtual, or already overridden member functions of B.

For an example of a class hierarchy, see vehicles.cm. It contains a hierarchy of vehicle classes: bicycle and car are vehicles, and truck is a kind of car.

12.8 Class Templates

When the class definition has a template parameter list and possibly a constraint, it defines a class template. The compiler instantiates the class template, that is, creates a specialized version of it, for each different set of template argument types. An instantiated class template is called a class template specialization. When a class template is instantiated, the template parameters of the class template are substituted by the specified concrete types and the declarations of the class are type checked using those substituted types. Those concrete types can be specified by using a template-id. When a member function of a class template specialization is called, the compiler instantiates it using the specified types. The compiler automatically instantiates the following member functions for each class template specialization:

a possible destructor,
virtual and overridden member functions,
only those member functions that are called.

For an example of a class template, see unique_ptr.cm. It's a smart pointer that implements unique ownership.
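
A much smaller class template than a smart pointer is enough to sketch the instantiation rules above. The name Box is hypothetical.

```cmajor
// Hypothetical class template; Box is an illustrative name.
public class Box<T>
{
    public Box(const T& value_) : value(value_)
    {
    }
    public const T& Value() const
    {
        return value;
    }
    public void SetValue(const T& value_)
    {
        value = value_;
    }
    private T value;
}

void main()
{
    Box<int> box(42);       // the template-id Box<int> names a class template specialization
    int v = box.Value();    // Value is called, so it is instantiated for Box<int>
    // SetValue is never called here, so by the rules above it is not instantiated for Box<int>.
}
```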

12.9 Full Instantiation Requests

Sometimes the compiler fails to instantiate all needed member functions of a class template specialization. This happens especially when the class template inherits from an abstract base class and the derived class template overrides some member function(s) of the abstract base class. These instantiation failures manifest themselves as linker errors.

To get rid of the linker errors, the programmer may ask the compiler to instantiate all member functions of a class template by using a full instantiation request. The full instantiation request reuses the keywords new and class. It should be placed inside a namespace scope or in the global namespace. The syntax is as follows:

		new class template-id;

The primary class template and the template arguments may be either user-defined classes or classes defined in the system library.

For example, suppose the class template System.Counter has an overridden Dispose member function. If the compiler fails to instantiate the Dispose function for the specialization System.Counter<MyClass>, the programmer may place a full instantiation request inside some namespace:

		namespace MyNamespace
		{
			new class System.Counter<MyClass>;
		}

13 Interfaces

An interface is a list of member function signatures that describe some behaviour. A class can implement any number of interfaces. When a class implements an interface it must provide implementation for the member functions contained by the interface.

13.1 Syntax

interface	→	attributes? access‑specifier? interface interface‑name { interface‑content }
interface‑name	→	identifier
interface‑content	→	interface‑member‑function*
interface‑member‑function	→	return‑type interface‑member‑function‑name parameters ;
interface‑member‑function‑name	→	identifier

The runtime representation of an interface object is a couple of pointers. It is best passed around by value (not by reference).

See the components.cm example for one possible design that uses an interface.

The interface.cm example illustrates constructing an interface object from a class object and from a pointer to a class object.
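The two ways of constructing an interface object can be sketched as follows. This is not the interface.cm example itself: the names Drawable, Circle and Render are hypothetical, and the sketch assumes that an interface object can be initialized directly from a class object or from a pointer to a class object.

```cmajor
using System;

// Hypothetical interface with a single member function signature.
public interface Drawable
{
    void Draw();
}

// Circle implements Drawable by providing a matching Draw member function.
public class Circle : Drawable
{
    public void Draw()
    {
        Console.WriteLine("circle");
    }
}

// An interface object is a couple of pointers, so it is taken by value.
public void Render(Drawable d)
{
    d.Draw();
}

void main()
{
    Circle circle;
    Drawable fromObject = circle;    // constructed from a class object (assumed syntax)
    Drawable fromPointer = &circle;  // constructed from a pointer to a class object (assumed syntax)
    Render(fromObject);
    Render(fromPointer);
}
```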

14 Delegates

A delegate type represents a function signature type.

14.1 Syntax

delegate	→	delegate‑specifiers delegate return‑type delegate‑name parameters ;
delegate‑specifiers	→	(access‑specifier)*
delegate‑name	→	identifier

An object of a delegate type can be bound to a nonmember function or a static member function by assigning a name of that function to the delegate type object. If dlg is an object of a delegate type, the function that is currently bound to the dlg object can be called using syntax dlg(arg1, arg2, ...). For an example of a delegate, see delegate.cm.
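As a minimal sketch of binding and calling (not the delegate.cm example itself), the following declares a hypothetical delegate type BinaryOp and binds it to two hypothetical nonmember functions, Add and Multiply:

```cmajor
// Delegate type for functions that take two ints and return an int.
public delegate int BinaryOp(int left, int right);

public int Add(int left, int right)
{
    return left + right;
}

public int Multiply(int left, int right)
{
    return left * right;
}

int main()
{
    BinaryOp op = Add;          // bind the delegate object to a nonmember function
    int sum = op(2, 3);         // calls Add(2, 3)
    op = Multiply;              // rebind the same delegate object
    int product = op(2, 3);     // calls Multiply(2, 3)
    return 0;
}
```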

14.2 Conversions

To support systems programming, a delegate type value can be explicitly converted to a void* value, and a void* value can be explicitly converted to a delegate type value.

15 Class Delegates

A class delegate type represents a member function signature type.

15.1 Syntax

class‑delegate	→	class‑delegate‑specifiers class delegate return‑type class‑delegate‑name parameters ;
class‑delegate‑specifiers	→	(access‑specifier)*
class‑delegate‑name	→	identifier

An object of a class delegate type can be bound to a specific member function of a specific class object. If clsdlg is an object of a class delegate type, the member function that is currently bound to the clsdlg object can be called using syntax clsdlg(arg1, arg2, ...). For an example of a class delegate, see class_delegate.cm.

The runtime representation of a class delegate object is a couple of pointers. It is best passed around by value (not by reference).
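A minimal sketch of a class delegate follows. It is not the class_delegate.cm example itself: ValueHandler and Accumulator are hypothetical names, and the binding expression accumulator.Add (object, dot, member function name) is an assumption about the binding syntax.

```cmajor
// Class delegate type for member functions that take an int and return void.
public class delegate void ValueHandler(int value);

public class Accumulator
{
    public void Add(int value)
    {
        total = total + value;
    }
    public int total;
}

int main()
{
    Accumulator accumulator;
    // Bind to the Add member function of this specific object (assumed syntax).
    ValueHandler handler = accumulator.Add;
    handler(5);     // calls accumulator.Add(5)
    handler(7);     // calls accumulator.Add(7)
    return 0;
}
```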

16 Concepts

A concept is a named collection of requirements for a type or for a group of types. Those requirements are called constraints. Constraints can be thought of as Boolean expressions, or predicates, that operate on properties of types.

Concepts are checked as part of overload resolution. When a concept is checked, the type parameters it contains are substituted with the argument types of the function call, and then the constraint expressions in the body of the concept are evaluated using those substituted types. If all of these evaluations yield true, we say that the type or group of types satisfies the concept, or conforms to the concept. The overloads whose constraint expressions yield true form the set of overload candidates.

A concept may refine another concept. Overload resolution always selects the overload whose constraint expression is the most strict, that is, the one that contains the most refined concepts satisfied by the argument types of the function call.
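The effect of refinement on overload resolution can be sketched as follows. All names (Shape, Solid, Square, Cube, Measure) are hypothetical, and the sketch assumes that const member functions match member function constraints.

```cmajor
public concept Shape<T>
{
    double T.Area();
}

// Solid refines Shape, so a type that satisfies Solid also satisfies Shape.
public concept Solid<T> : Shape<T>
{
    double T.Volume();
}

public class Square
{
    public double Area() const { return 4.0; }
}

public class Cube
{
    public double Area() const { return 24.0; }
    public double Volume() const { return 8.0; }
}

// Candidate for any type that satisfies Shape.
public double Measure<T>(const T& x) where T is Shape
{
    return x.Area();
}

// Candidate for any type that satisfies the more refined concept Solid.
public double Measure<T>(const T& x) where T is Solid
{
    return x.Volume();
}

void main()
{
    Square square;
    Cube cube;
    double a = Measure(square); // only the Shape overload is a candidate
    double v = Measure(cube);   // both are candidates; the Solid overload is more refined
}
```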

A concept may also contain axioms. Axioms are not checked or evaluated in any way by the compiler, but they express semantic facts about properties of types that should always hold when the substituted type or types conform to the concept containing the axiom. Axioms are meant as information for the programmer.

16.1 Syntax

concept	→	access‑specifier? concept concept‑name < type‑parameter (, type‑parameter)* > refinement? where‑constraint? { concept‑body }
concept‑name	→	identifier
type‑parameter	→	identifier
refinement	→	: concept‑group‑id < type‑parameter (, type‑parameter)* >
concept‑group‑id	→	qualified‑id
concept‑body	→	(concept‑body‑constraint | axiom)*
concept‑body‑constraint	→	typename‑constraint | signature‑constraint | embedded‑constraint
typename‑constraint	→	typename type‑expr ;
signature‑constraint	→	constructor‑constraint | destructor‑constraint | member‑function‑constraint | function‑constraint
constructor‑constraint	→	explicit? class‑name parameters
destructor‑constraint	→	~ class‑name ();
member‑function‑constraint	→	return‑type type‑parameter . function‑group‑id parameters ;
function‑constraint	→	return‑type function‑group‑id parameters ;
embedded‑constraint	→	where‑constraint ;
where‑constraint	→	where constraint‑expression
constraint‑expression	→	disjunctive‑constraint‑expression
disjunctive‑constraint‑expression	→	conjunctive‑constraint‑expression (or conjunctive‑constraint‑expression)*
conjunctive‑constraint‑expression	→	primary‑constraint‑expression (and primary‑constraint‑expression)*
primary‑constraint‑expression	→	( constraint‑expression ) | atomic‑constraint‑expression
atomic‑constraint‑expression	→	predicate‑constraint‑expression | is‑constraint‑expression | multiparam‑constraint‑expression
predicate‑constraint‑expression	→	invoke‑expression
invoke‑expression	→	(template‑id | identifier) (. identifier)* ( argument‑list )
is‑constraint‑expression	→	type‑expr is concept‑or‑typename
concept‑or‑typename	→	type‑expr
multiparam‑constraint‑expression	→	concept‑group‑id < type‑expr (, type‑expr)* >
axiom	→	axiom identifier? parameters? { axiom‑body }
axiom‑body	→	axiom‑statement*
axiom‑statement	→	expression ;

16.2 Constraints

A constraint in the body of a concept can be a typename constraint, a signature constraint, or an embedded constraint.

16.2.1 Typename Constraint

Evaluation of a typename constraint yields true, if the substituted type contains a type alias or type whose name is equal to the identifier contained by the constraint, false otherwise.

For examples of typename constraints, see the Container concept in the system library. It contains three typename constraints: T.ValueType, T.Iterator and T.ConstIterator. This means that if a type T satisfies the Container concept, it must contain three type aliases named ValueType, Iterator and ConstIterator.
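A smaller sketch of a typename constraint follows. The concept HasValueType and the classes IntBox and Plain are hypothetical, and the member typedef syntax is an assumption about how a class declares a type alias.

```cmajor
public concept HasValueType<T>
{
    typename T.ValueType;
}

public class IntBox
{
    public typedef int ValueType;   // satisfies "typename T.ValueType" (assumed typedef syntax)
}

public class Plain
{
}

public void Inspect<T>(const T& x) where T is HasValueType
{
}

void main()
{
    IntBox box;
    Inspect(box);       // IntBox conforms to HasValueType
    // Plain plain;
    // Inspect(plain);  // expected to be rejected: Plain has no ValueType
}
```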

16.2.2 Signature Constraints

Evaluation of a signature constraint yields true, if there exists a constructor, destructor, member function or function whose signature matches the given constructor, destructor, member function or function signature respectively, and false otherwise. A destructor constraint is always satisfied, because a trivial destructor matches a destructor signature.

Constructor Constraint

Here's an example of a concept that contains a constructor constraint:

public concept MoveConstructible<T>
{
T(T&&);
}

Destructor Constraint

Here's an example of a concept that contains a destructor constraint:

public concept Destructible<T>
{
~T();
}

Member Function Constraint

Here's an example of a concept that contains a member function constraint:

public concept Container<T>
{
// ...
long T.Count();
}

Function Constraint

Here's an example of a concept that contains a function constraint:

public concept LessThanComparable<T>
{
bool operator<(T, T);
// ...
}

16.2.3 Embedded Constraint

An embedded constraint is a where constraint that is embedded in the concept body. An embedded constraint yields true, if the where-constraint yields true, and false otherwise.

16.2.4 Where-Constraint

A where constraint consists of the keyword where and a constraint expression. A where-constraint yields true, if the constraint expression yields true, and false otherwise.

16.3 Constraint Expressions

A constraint expression can be a disjunctive constraint expression, a conjunctive constraint expression, a primary constraint expression, an atomic constraint expression, a predicate constraint expression, an is-constraint expression, or a multiparam-constraint expression.

16.3.1 Disjunctive Constraint Expression

A disjunctive constraint expression is a sequence of conjunctive constraint expressions separated by the keyword or.

A disjunctive constraint expression

a or b

yields true, if either a or b, or both yield true, and false otherwise.

16.3.2 Conjunctive Constraint Expression

A conjunctive constraint expression is a sequence of primary constraint expressions separated by the keyword and.

A conjunctive constraint expression

a and b

yields true, if both a and b yield true, and false otherwise.

16.3.3 Primary Constraint Expression

A primary constraint expression is a parenthesized constraint expression, or an atomic constraint expression.

A primary constraint expression yields true if the parenthesized constraint expression or the atomic constraint expression, respectively, yields true, and false otherwise.

16.3.4 Atomic Constraint Expression

An atomic constraint expression is a predicate constraint expression, an is-constraint expression, or a multiparam-constraint expression. It yields true if the predicate constraint expression, is-constraint expression or multiparam-constraint expression, respectively, yields true, and false otherwise.

16.3.5 Predicate Constraint Expression

A predicate constraint expression is an invocation of a Boolean-valued constexpr or intrinsic function. It yields true if evaluation of the corresponding function yields true, and false otherwise.

16.3.6 Is-Constraint Expression

An is-constraint expression is either of the form

type is type

or of the form

type is concept

The first form yields true, if the type on the left-hand side is equal to the type on the right-hand side when possible reference and const qualifiers are removed, and false otherwise.

The second form yields true, if the type on the left-hand side conforms to the concept on the right-hand side, and false otherwise. The conformance of the concept is checked by substituting the type on the left side to the single type parameter of the concept on the right side and evaluating the constraint expressions in the body of the concept. If all the constraint expressions of the concept yield true, the type conforms to the concept, otherwise it does not conform to the concept.

16.3.7 Multiparam-Constraint Expression

A multiparam-constraint expression is of the form

concept<type1, type2, ..., type_n>

It yields true, if the types type1, type2, ..., type_n conform to the n-parametric concept concept, and false otherwise. The conformance of the concept is checked by substituting type parameters of the concept with types type1, type2, etc. respectively and evaluating the constraint expressions in the body of the concept. If all the constraint expressions of the concept yield true, the types conform to the concept, otherwise they do not conform to the concept.
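The constraint expression forms of this section can be sketched together as follows. All names (Printable, Savable, PrintSavePair, Report, Snapshot and the three function templates) are hypothetical; the sketch only combines the is-constraint, and, or and multiparam-constraint forms described above.

```cmajor
public concept Printable<T>
{
    void T.Print();
}

public concept Savable<T>
{
    void T.Save();
}

// Two-parameter concept built from embedded where-constraints.
public concept PrintSavePair<T, U>
{
    where T is Printable;
    where U is Savable;
}

public class Report
{
    public void Print() {}
    public void Save() {}
}

public class Snapshot
{
    public void Save() {}
}

// Conjunction: T must conform to both concepts.
public void PrintAndSave<T>(T& x) where T is Printable and T is Savable
{
    x.Print();
    x.Save();
}

// Disjunction: either concept suffices, so the body relies on neither.
public bool IsSupported<T>(T& x) where T is Printable or T is Savable
{
    return true;
}

// Multiparam-constraint: T and U together must conform to PrintSavePair.
public void Process<T, U>(T& a, U& b) where PrintSavePair<T, U>
{
    a.Print();
    b.Save();
}

void main()
{
    Report report;
    Snapshot snapshot;
    PrintAndSave(report);               // Report is Printable and Savable
    bool ok = IsSupported(snapshot);    // Snapshot is Savable only
    Process(report, snapshot);          // PrintSavePair<Report, Snapshot> holds
}
```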

16.4 Axioms

An axiom consists of a possible identifier, a possibly empty list of parameters and a body enclosed in braces. A body of an axiom is a possibly empty sequence of axiom statements. Each axiom statement is a Boolean-valued expression such as an equivalence, an implication, a disjunction, a conjunction, an equality or a relational expression. If a type conforms to a concept that contains these axioms, these axioms should always be true for such a type.

Here's an example of a concept that contains axioms:

public concept LessThanComparable<T>
{
        // ...
        axiom irreflexive(T a) { !(a < a); }
        axiom antisymmetric(T a, T b) { a < b => !(b < a); }
        axiom transitive(T a, T b, T c) { a < b && b < c => a < c; }
        axiom total(T a, T b) { a < b || a == b || a > b; }
        axiom greaterThan(T a, T b) { a > b <=> b < a; }
        axiom greaterThanOrEqualTo(T a, T b) { a >= b <=> !(a < b); }
        axiom lessThanOrEqualTo(T a, T b) { a <= b <=> !(b < a); }
}

17 Attributes

Functions, classes, member functions, member variables and interfaces can have attributes.

17.1 Syntax

Attributes are name-value pairs declared between brackets. The attribute declaration syntactically precedes the associated function, class, variable or interface.

attributes	→	[ ( attribute ( , attribute )* )? ]
attribute	→	attribute‑name ( = attribute‑value )?
attribute‑name	→	id‑char‑sequence
attribute‑value	→	" ( [^"\\\r\n] | char‑escape )* "

17.2 Usage

Attributes are name-value pairs that can be attached to programming constructs. An attribute name is an identifier recognized by the compiler or by a compiling tool, and an attribute value is a string. If the value of an attribute is not explicitly given, it implicitly has the value "true". Attributes can be used, for example, to generate serialization code or for a similar task.

Currently the Cmajor compiler recognizes three attributes:

When the return value of a function with a nodiscard attribute is discarded, the compiler issues a warning.
The xml attribute can have the value "true" or "false". It can be attached to classes and member variables. When an xml attribute with the value "true", or with no value, is attached to a class, the compiler derives the class from the abstract System.Xml.Serialization.XmlSerializable class and generates implementations of its abstract member functions. See the XML serialization document for usage examples.
The system_default attribute with the value "true", or without an explicit value, marks a function as a system-default function.
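A minimal sketch of the nodiscard attribute follows. The function names are hypothetical, and the placement of the attribute list before the access specifier is assumed by analogy with the interface syntax in section 13.1.

```cmajor
// Attribute without a value: implicitly "true".
[nodiscard]
public int ComputeChecksum()
{
    return 42;
}

// Equivalent form with the value written out explicitly.
[nodiscard="true"]
public int ComputeLength()
{
    return 7;
}

void main()
{
    ComputeChecksum();              // result discarded: a warning is expected here
    int length = ComputeLength();   // result used: no warning expected
}
```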

18 Compile Units

Cmajor projects consist of source files that are also called compile units. Each compile unit contains a possibly empty sequence of using directives followed by a possibly empty sequence of definitions.

18.1 Syntax

compile‑unit	→	namespace‑content
namespace‑content	→	using‑directives definitions
using‑directives	→	using‑directive*
using‑directive	→	using‑alias‑directive | using‑namespace‑directive
using‑alias‑directive	→	using identifier = qualified‑id ;
using‑namespace‑directive	→	using qualified‑id ;
definitions	→	definition*
definition	→	namespace‑definition | constant‑definition | enumerated‑type‑definition | function‑definition | typedef‑declaration | class‑definition | interface‑definition | delegate‑definition | class‑delegate‑definition | concept‑definition | global‑variable‑definition
namespace‑definition	→	namespace qualified‑id? { namespace‑content }
constant‑definition	→	constant
enumerated‑type‑definition	→	enumerated‑type
function‑definition	→	function
typedef‑declaration	→	typedef
class‑definition	→	class
interface‑definition	→	interface
delegate‑definition	→	delegate
class‑delegate‑definition	→	class‑delegate
concept‑definition	→	concept
global‑variable‑definition	→	global‑variable

18.2 Using Directives

There are two kinds of using directives: using-alias directives and using-namespace directives.

18.2.1 Using-Alias Directives

A using-alias directive introduces an alternate name, a simple identifier, for a namespace-level entity referred to by its fully qualified name.

For example, using-alias directive

using Console = System.Console;

makes it possible to refer to the System.Console class as bare Console instead of its fully qualified name System.Console.

18.2.2 Using-Namespace Directives

A using-namespace directive makes contents of a namespace available in a compile unit using simple identifiers of namespace-level entities.

For example, using-namespace directive

using System;

makes contents of System namespace available in the current compile unit. This means that one can refer to System.Console class as bare Console instead of its fully qualified name System.Console, for example.

However, if there are two entities of the same name, Foo, for example, in two namespaces, Alpha and Beta, and the contents of both the Alpha and Beta namespaces are made available with using-namespace directives:

using Alpha;
using Beta;

one must refer to Foo using its fully qualified name, Alpha.Foo, for example.
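As a sketch of this situation, assume that Alpha.Foo and Beta.Foo are classes defined in other compile units (both are hypothetical):

```cmajor
using Alpha;
using Beta;

void main()
{
    // Foo x;       // ambiguous: both Alpha.Foo and Beta.Foo are visible as Foo
    Alpha.Foo a;    // the fully qualified name selects the class in Alpha
    Beta.Foo b;     // the fully qualified name selects the class in Beta
}
```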

18.3 Definitions

Definitions can appear inside a namespace, or at the global namespace level, outside any other namespace.

18.3.1 Namespaces

A namespace definition consists of a keyword namespace followed by the name of the namespace followed by its contents. The name of a namespace can be a simple identifier, Alpha, for example, or a fully qualified identifier, Alpha.Beta.Gamma, for example. Then

namespace Alpha.Beta.Gamma { ... }

is a shorthand notation for

namespace Alpha { namespace Beta { namespace Gamma { ... } } }

Namespaces can be used to organize the entities of a library under a common name. They can also be used to prevent name clashes. Suppose a graphics library contains a function named draw that draws a shape, and a lottery library contains a function, also named draw, that shows a lottery draw. Both libraries can be used in the same program if the entities of the graphics library are defined in the namespace Graphics and the entities of the lottery library in the namespace Lottery:

namespace Graphics { void draw() { ... } }

namespace Lottery { void draw() { ... } }

void main()
{
Graphics.draw();
Lottery.draw();
}

Namespaces are open: many compile units can add definitions to the same namespace.

18.3.2 Namespace-level Definitions

A namespace, including the global namespace, can contain the following:

constant definitions,
enumerated type definitions,
function definitions,
type alias declarations,
class definitions,
interface definitions,
delegate definitions,
class delegate definitions, and
concept definitions.

18.3.3 Unnamed namespaces

A namespace is unnamed if it lacks an identifier. The contents of an unnamed namespace are available only in the source file in which the namespace appears. The names inside an unnamed namespace do not collide with identical names in other unnamed namespaces. The mangled name of an unnamed namespace is "unnamed_ns_UNIQUE_HEX_STRING", where UNIQUE_HEX_STRING is the SHA-1 hash of a random UUID. The mangled names of the entities belonging to an unnamed namespace are therefore unique.
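The following sketch shows two source files of the same project, each with its own unnamed namespace. The file names and function names are hypothetical; the two Helper functions do not collide because each is visible only in its own file.

```cmajor
// File alpha.cm
namespace
{
    int Helper() { return 1; }
}

public int AlphaValue()
{
    return Helper();    // refers to the Helper defined in alpha.cm
}

// File beta.cm
namespace
{
    int Helper() { return 2; }
}

public int BetaValue()
{
    return Helper();    // refers to the Helper defined in beta.cm
}
```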

19 Projects

There is one project for each Cmajor program and library. By convention each project should be in its own directory and have a project file that ends with .cmp extension. A project file contains the name and type of the project and lists the source files belonging to it. A project can reference another project. Reference dependencies must be acyclic.

19.1 Syntax

project‑file	→	project project‑name ; project‑declarations
project‑name	→	qualified‑id
project‑declarations	→	project‑declaration*
project‑declaration	→	reference‑declaration | source‑file‑declaration | resource‑file‑declaration | resource‑script‑declaration | text‑file‑declaration | target‑declaration
reference‑declaration	→	reference file‑path ;
source‑file‑declaration	→	source file‑path ;
resource‑file‑declaration	→	resource file‑path ;
resource‑script‑declaration	→	rc file‑path ;
text‑file‑declaration	→	text file‑path ;
target‑declaration	→	target = target ;
target	→	program | winguiapp | winapp | library | winlib | unitTest
file‑path	→	< [^>]+ >

19.2 Project Types

The target declaration defines the type of the project:

program is a console application.
winguiapp is a Windows desktop application that has a graphical user interface.
winapp is a console application that uses Windows API.
library is a library.
winlib is a library that uses Windows API.
unitTest is a unit test project.

The winguiapp, winapp and winlib project types are available only on Windows platform.

19.3 File Types

The project file can contain declarations for the following file types:

reference declaration names a relative or absolute path to a library project file that this project references. The referenced file name must have a .cmp extension.
source file declaration names a relative or absolute path to a Cmajor source file that is included in this project. The file name must have a .cm extension.
resource file declaration names a relative or absolute path to a Cmajor resource file that is included in this project. The file name must have an .xml extension.
resource script declaration names a relative or absolute path to a Windows resource script that is included in this project. The file name must have a .rc extension.
text file declaration names a relative or absolute path to a text file that is included in this project. The file name may have any extension.

19.4 Example

Here's a project file alpha.cmp for a project named Alpha that references two library projects Beta and Gamma:

project Alpha;
target=program;
reference <../beta/beta.cmp>;
reference <../gamma/gamma.cmp>;
source <main.cm>;

Here's the project file beta.cmp:

project Beta;
target=library;
source <beta.cm>;

and here's the project file gamma.cmp:

project Gamma;
target=library;
source <gamma.cm>;

The directory structure is as follows:

        |
        +--alpha
        |  |
        |  +--alpha.cmp
        |  |
        |  +--main.cm
        |
        +--beta
        |  |
        |  +--beta.cmp
        |  |
        |  +--beta.cm
        |
        +--gamma
           |
           +--gamma.cmp
           |
           +--gamma.cm

19.5 Programs

Each valid Cmajor program must have a main function where the execution of the program starts. The possible signatures of the main function are:

void main();
int main();
void main(int argc, const char** argv);
int main(int argc, const char** argv);

If the return type of the main function is void, exit code 0 is returned to the caller of the program at the end of a normal program execution.

If the return type of the main function is int, the main function must explicitly return a value that is then returned to the caller of the program at the end of a normal program execution.

In the last two signatures argc is the number of program arguments including the name of the program, and argv contains the name of the program and program arguments. By convention argv[0] is the name of the program and argv[1], argv[2], ..., argv[argc − 1] are program arguments.
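A minimal sketch of a main function that uses the last signature follows. Returning 1 when no arguments are given is a hypothetical convention chosen for this illustration.

```cmajor
int main(int argc, const char** argv)
{
    // argv[0] is the name of the program, so the program arguments start at index 1.
    int argumentCount = argc - 1;
    if (argumentCount == 0)
    {
        return 1;   // nonzero exit code: no program arguments were given
    }
    return 0;       // an int main function must return its exit code explicitly
}
```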

20 Solutions

A solution is a group of related projects that can be built as a unit. By convention a solution has a solution file that ends with the .cms extension. When deciding the build order of the projects in a solution, the Cmajor compiler does a topological sort of the projects using the reference relationship as the sorting criterion. If project A references project B, project B is built before project A.

20.1 Syntax

solution‑file	→	solution solution‑name ; solution‑declarations
solution‑name	→	qualified‑id
solution‑declarations	→	solution‑declaration*
solution‑declaration	→	solution‑project‑declaration | active‑project‑declaration | active‑backend‑declaration | active‑config‑declaration | active‑opt‑level‑declaration
solution‑project‑declaration	→	project file‑path ;
active‑project‑declaration	→	activeProject qualified‑id ;
active‑backend‑declaration	→	activeBackEnd identifier ;
active‑config‑declaration	→	activeConfig identifier ;
active‑opt‑level‑declaration	→	activeOptLevel integer‑literal ;

20.2 Example

Here's a solution file solution.cms that contains three projects Alpha, Beta and Gamma:

solution Solution;
project <alpha/alpha.cmp>;
project <beta/beta.cmp>;
project <gamma/gamma.cmp>;

The directory structure is as follows:

        +--solution
           |
           +--solution.cms
           |
           +--alpha
           |  |
           |  +--alpha.cmp
           |  |
           |  +--main.cm
           |
           +--beta
           |  |
           |  +--beta.cmp
           |  |
           |  +--beta.cm
           |
           +--gamma
              |
              +--gamma.cmp
              |
              +--gamma.cm

Appendix A Terms

ALIGNMENT

The size of a class c may be aligned to a, that is, rounded up to the smallest multiple of a that is equal to or greater than the size of c before alignment.

For example, given class Foo

class Foo
{
int x;
byte y;
}

sizeof(Foo) is 8, not 5, because the size is aligned to 4, which is sizeof(int).
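The rounding can be written out as a worked calculation. The function name aligned is introduced here only for illustration, and the unaligned size 5 assumes that an int occupies 4 bytes and a byte occupies 1 byte:

```latex
\[
\operatorname{aligned}(s, a) = \left\lceil \frac{s}{a} \right\rceil \cdot a,
\qquad
\operatorname{aligned}(5, 4) = \left\lceil \frac{5}{4} \right\rceil \cdot 4 = 2 \cdot 4 = 8
\]
```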

ARITY OF A FUNCTION

The number of parameters of a function.

COMMON TYPE

The common type for two basic types is the narrowest type that can contain a value of both types. The following table gives the common type for each pair of basic types:

	bool	sbyte	byte	short	ushort	int	uint	long	ulong	float	double	char	wchar	uchar
bool	bool
sbyte		sbyte	short	short	int	int	long	long		float	double
byte		short	byte	short	ushort	int	uint	long	ulong	float	double
short		short	short	short	int	int	long	long		float	double
ushort		int	ushort	int	ushort	int	uint	long	ulong	float	double
int		int	int	int	int	int	long	long		float	double
uint		long	uint	long	uint	long	uint	long	ulong	float	double
long		long	long	long	long	long	long	long		float	double
ulong			ulong		ulong		ulong		ulong	float	double
float		float	float	float	float	float	float	float	float	float	double
double		double	double	double	double	double	double	double	double	double	double
char												char	wchar	uchar
wchar												wchar	wchar	uchar
uchar												uchar	uchar	uchar

CONCRETE CLASS

Opposite of an abstract class. A concrete class may not have an abstract member function and it must override all abstract member functions it inherits.

CONVERSION DISTANCE

If A and B are class types, and A is the same class as B or A is derived directly or indirectly from B, their conversion distance is defined as follows: if A = B, distance(A, B) = 0, otherwise distance(A, B) = 1 + distance(baseClassOf(A), B).

If A and B are class types and either A and B have no inheritance relationship or B is derived from A, we define distance(A, B) = 255.

If A and B are pointer-to-class types, their distance is defined as the distance of the pointed-to types. If A or B or both are reference-to-class types, their distance is defined as the distance of types for which references are removed.
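The class-type rules above can be traced on a small hierarchy. The class names are hypothetical, and the distances in the comments are computed by hand from the definitions.

```cmajor
public class Animal {}
public class Dog : Animal {}
public class Puppy : Dog {}
public class Rock {}

// distance(Puppy, Puppy)    = 0
// distance(Dog, Animal)     = 1 + distance(Animal, Animal) = 1
// distance(Puppy, Animal)   = 1 + distance(Dog, Animal) = 2
// distance(Animal, Puppy)   = 255 (Puppy is derived from Animal, not the other way around)
// distance(Puppy, Rock)     = 255 (no inheritance relationship)
// distance(Puppy*, Animal*) = distance(Puppy, Animal) = 2
```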

If A and B are basic types, and A is implicitly convertible to B we define their conversion distance by associating integers to the types and computing the distance as the difference of those integers. For example, distance(byte, short) = 1, distance(byte, ushort) = 2, distance(byte, int) = 3, etc.

DEFAULT INITIALIZATION

In Cmajor, objects of the following kinds of types are default initialized as follows:³

Objects of the Boolean type are initialized to false.
Objects of basic integer types, basic floating-point types and enumerated types are initialized to zero.
Objects of basic character types are initialized to NUL character ('\0').
Pointer type objects are initialized to null.
Array type objects will contain their length number of default initialized objects.
Delegate type objects will contain null as their internal function pointer value.
Class delegate type objects will contain null as their internal object pointer and function pointer value.
Class type objects are default initialized by calling the default constructor of their class.

FREE STORE

A memory area managed by the runtime library and eventually by the operating system. A program can allocate memory from the free store and then release it back when it has finished using it.

FUNCTION GROUP NAME

The name of the function without the parameter list. For example, these functions all have the same group name foo:

        public void foo(int x) {}

        public void foo(double y) {}

        public class Alpha
        {
                public void foo() {}
        }

FUNCTION SIGNATURE

The name of the function, the types of the parameters of the function, and possibly the constraint of the function. For example, function

public void foo(int x) {}

has the signature foo(int).

IMT

Interface Method Table. Each polymorphic class has one IMT per implemented interface. The interface method table contains pointers to implemented interface member functions for the class. Each interface member function has an IMT index that is used for getting the member function pointer from the IMT when doing interface call dispatch.

INTEGRAL VALUE

An enumeration constant, an integer type value or a character type value.

LVALUE EXPRESSION

An expression that can appear on the left-hand side of an assignment. The lvalue expressions are:

name of a variable or parameter,
dereferenced pointer,
array element access,
invocation of a function, delegate or class delegate that returns a nonconst lvalue reference type.

NAMESPACE-LEVEL ENTITY

An entity defined at namespace level. Entities that can be defined at namespace level are:

constants,
enumerated types,
functions,
type aliases,
classes,
interfaces,
delegates,
class delegates and
concepts.

NONTRIVIAL DESTRUCTOR

A class has a nontrivial destructor if it has

a user-defined destructor that is declared default or
a user-defined destructor that has a body, or
a nonempty compiler-generated destructor.

The compiler generates a nontrivial destructor for a class if one or more of the following conditions are true:

the class is polymorphic,
the class has a base class and that base class has a nontrivial destructor,
the class has a member variable of a class type that has a nontrivial destructor.

POLYMORPHIC CLASS TYPE

A class is polymorphic, if one or more of the following conditions are true:

the class is abstract,
it has a base class that is polymorphic,
it has a virtual or overridden destructor,
it has a member function that is virtual, abstract or overridden,
it implements an interface.

A polymorphic class type has a VMT.

USER-DEFINED TYPE

User-defined types are enumerated types, class types, interface types, and delegate and class delegate types.

VMT

Virtual Method Table. There is one virtual method table per polymorphic class. A virtual method table contains:

A 16-byte type identifier (UUID).
Pointer to the table of IMT pointers. This table contains an entry for each interface the class implements.
Pointers to virtual and overridden member functions for this class.

Each virtual, abstract and overridden member function contains a VMT index that is used for getting the member function pointer from the VMT when doing virtual call dispatch.

Appendix B Syntax Notation

The syntax of the Cmajor programming language is described in this text using a kind of context-free grammar called Parsing Expression Grammar, or PEG. The notation used is not exactly the same as PEG notation, but a slightly modified version of it.

In this notation a grammar consists of rules that are of the form:

ruleName	→	ruleBody

A rule produces a set of strings that forms a language. For example, the rule compile-unit produces a set of strings that forms the language of syntactically valid Cmajor source files. The syntax rules alone are not enough to describe what a valid Cmajor program is, though, because a syntactically valid source file may include meaningless constructs. For example, the program

void main()
{
1 = a;
}

is syntactically valid but meaningless, because one cannot assign to a literal. It is the compiler's job to perform semantic analysis and detect this kind of error. In this case the Cmajor compiler produces the following error message:

            not an lvalue expression (file 'C:/Users/Seppo/cmajorw64/cmajor/test/foo/main.cm', line 5):
            1 = a;
            ^

The body of a rule consists of parsing expressions that are combined to produce a pattern that describes a set of strings that forms a language.

Keywords

One of the simplest kinds of parsing expression is a keyword. A keyword is represented using bold font. A keyword is a string that appears literally in the produced language. For example, the rule int‑rule produces a language that contains one string, "int":

int‑rule	→	int

Terminal Strings

Another simple kind of parsing expression is a terminal string. A terminal string is represented using monospace font. A terminal string also appears literally in the produced language. For example, the rule parentheses produces a language that contains a string consisting of a left and a right parenthesis:

parentheses	→	()

Character Classes

A character class is a parsing expression that produces one character that is in the character class. A character class is represented using serif font and is enclosed in square brackets. For example, the rule latin‑letter produces a language that contains strings that consist of a single Latin letter:

latin‑letter	→	[a-zA-Z]

A character class may have character ranges that consist of the starting character of the range, a hyphen, and the ending character of the range. Another example is the rule nondigit, which produces a language that contains strings that consist of a single character that is not a decimal digit:

nondigit	→	[^0-9]

If a character class starts with a hat character (^), it means all characters except the characters or ranges of characters that follow.

Nonterminals

A nonterminal is the name of a grammar rule. A nonterminal is represented using italic font. A nonterminal produces the language that consists of the strings that the rule it names produces. For example, the rule class‑name produces a language that consists of the strings that the rule identifier produces:

class‑name	→	identifier

Informal Expressions

Sometimes it is not possible to express the syntax using formal rules. In that case the produced strings are described in English. Such informal rules are enclosed in apostrophes. For example, the rule any‑char produces a language that consists of the strings that consist of a single Unicode character:

any‑char	→	'any Unicode character'

Grammar Operators

Parsing expressions can be combined using the following grammar operators: |, sequence, −, *, +, ?, and ().

Union

If e₁ and e₂ are parsing expressions, e₁ | e₂ produces the set of strings that is the union of the set of strings that e₁ produces and the set of strings that e₂ produces. Another way of thinking about it is that the bar operator (|) separates alternatives. Unlike in general context-free grammars, in Parsing Expression Grammars the first alternative that matches the input always wins, so Parsing Expression Grammars cannot be ambiguous in the way general context-free grammars can.

For example, the rule basic‑type produces a language that consists of the names of the Cmajor basic types:

basic‑type	→
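The "first alternative wins" behavior of the | operator can be sketched with two tiny parser combinators. This is a minimal Python illustration, not part of Cmajor or its parser generator; `lit` and `choice` are hypothetical helper names. Each parser takes the input text and a position, and returns the position after the match or `None` on failure.

```python
def lit(s):
    """Parser for the literal string s."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alternatives):
    """Ordered choice: try alternatives left to right, keep the first success."""
    def parse(text, pos):
        for alternative in alternatives:
            end = alternative(text, pos)
            if end is not None:
                return end  # later alternatives are never consulted
        return None
    return parse


short_first = choice(lit("a"), lit("ab"))
long_first = choice(lit("ab"), lit("a"))

print(short_first("ab", 0))  # 1: "a" matched first, so "ab" was never tried
print(long_first("ab", 0))   # 2: "ab" matched first
```

Because the order of the alternatives decides the result, a longer alternative that shares a prefix with a shorter one is normally listed first.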

Sequence

If e₁ and e₂ are parsing expressions, e₁e₂ produces the set of strings that consist of a string that e₁ produces concatenated with a string that e₂ produces. Another way of thinking about it is that first e₁ occurs and then e₂ occurs.

For example, the rule hex‑digit‑4 produces a language that consists of strings of four hexadecimal digits:

hex‑digit‑4	→	hex‑digit hex‑digit hex‑digit hex‑digit
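The sequence operation can be sketched in Python as a combinator that runs its parts one after another, feeding the end position of each part to the next. This is an illustration only: `seq`, `hex_digit` and `produces` are hypothetical names, and the sketch assumes that hex‑digit means one character from 0-9, a-f or A-F.

```python
import string


def hex_digit(text, pos):
    """Consume one hexadecimal digit; return the new position or None."""
    if pos < len(text) and text[pos] in string.hexdigits:
        return pos + 1
    return None


def seq(*parts):
    """Sequence: every part must match, each starting where the previous ended."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse


hex_digit_4 = seq(hex_digit, hex_digit, hex_digit, hex_digit)


def produces(rule, text):
    """Return True if the rule consumes the whole of text."""
    return rule(text, 0) == len(text)


print(produces(hex_digit_4, "00e9"))   # True
print(produces(hex_digit_4, "00g9"))   # False: 'g' is not a hexadecimal digit
print(produces(hex_digit_4, "00e"))    # False: only three digits
```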

Difference

If e₁ and e₂ are parsing expressions, e₁ − e₂ produces the set of strings that is the difference of the set of strings that e₁ produces and the set of strings that e₂ produces. Another way of thinking about it is that e₁ occurs but e₂ does not occur.

For example, the rule identifier produces the strings that belong to the set of strings that the rule id‑char‑sequence produces but do not belong to the set of strings that the rule keyword produces:

identifier	→	id‑char‑sequence − keyword
id‑char‑sequence	→	(letter | _) (letter | digit | _)*
letter	→	[a-zA-Z]
digit	→	[0-9]
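The difference in the identifier rule amounts to two checks: the text must be an id‑char‑sequence, and it must not be a keyword. A minimal Python sketch follows. The keyword set here is a small sample chosen for illustration, not the full Cmajor keyword list, and `is_identifier` is a hypothetical name.

```python
import re

# id-char-sequence: (letter | _) (letter | digit | _)*
ID_CHAR_SEQUENCE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

# Assumption: a sample of keywords only, not the complete list.
KEYWORDS = {"int", "void", "bool"}


def is_identifier(text):
    """identifier = id-char-sequence minus keyword."""
    return ID_CHAR_SEQUENCE.fullmatch(text) is not None and text not in KEYWORDS


print(is_identifier("foo_1"))    # True
print(is_identifier("int"))      # False: an id-char-sequence, but also a keyword
print(is_identifier("integer"))  # True: merely starts with a keyword
print(is_identifier("1abc"))     # False: cannot start with a digit
```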

Kleene Closure

If e is a parsing expression and λ denotes an empty string, e* produces strings that the following parsing expressions produce:

λ, e, ee, eee, ...

Another way of thinking about it is that e occurs zero or more times. The name of this operation is Kleene closure.

For example, rule digits produces strings that consist of zero or more decimal digits:

digits	→	[0-9]*

Positive

If e is a parsing expression, e+ produces strings that the following parsing expressions produce:

e, ee, eee, ...

Another way of thinking about it is that e occurs one or more times.

For example, rule dec‑digit‑sequence produces strings consisting of nonempty sequences of decimal digits:

dec‑digit‑sequence	→	[0-9]+
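The only difference between * and + is whether the empty string is produced. Python's `re` module uses the same two postfix operators, so the rules digits and dec‑digit‑sequence can be compared side by side. This is a sketch for illustration; `in_language` is a hypothetical helper.

```python
import re

DIGITS = re.compile(r"[0-9]*")              # digits: zero or more
DEC_DIGIT_SEQUENCE = re.compile(r"[0-9]+")  # dec-digit-sequence: one or more


def in_language(rule, text):
    """Return True if the whole of text is produced by the rule."""
    return rule.fullmatch(text) is not None


for text in ["", "7", "2024", "12a"]:
    print(repr(text),
          in_language(DIGITS, text),
          in_language(DEC_DIGIT_SEQUENCE, text))
```

Only the empty string separates the two rules: digits produces it and dec‑digit‑sequence does not. Both rules accept "7" and "2024" and reject "12a".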

Optional

If e is a parsing expression, e? produces the empty string and the strings that the parsing expression e produces. Another way of thinking about it is that e may occur but does not have to: it is optional.

For example, the rule integer produces strings that may begin with a sign, followed by a nonempty sequence of decimal digits:

integer	→	sign? dec‑digit‑sequence
sign	→	+ | -
dec‑digit‑sequence	→	[0-9]+
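The integer rule can be transcribed almost symbol for symbol into a Python regular expression, which makes it easy to check which strings it produces. This is a sketch for illustration; the names `SIGN`, `INTEGER` and `is_integer` are made up here.

```python
import re

SIGN = r"[+-]"                   # sign: + | -
DEC_DIGIT_SEQUENCE = r"[0-9]+"   # dec-digit-sequence: [0-9]+

# integer: sign? dec-digit-sequence
INTEGER = re.compile(f"(?:{SIGN})?{DEC_DIGIT_SEQUENCE}")


def is_integer(text):
    """Return True if the whole of text is produced by the integer rule."""
    return INTEGER.fullmatch(text) is not None


print(is_integer("42"))   # True: the sign is optional
print(is_integer("-7"))   # True
print(is_integer("+"))    # False: the digit sequence is not optional
print(is_integer("--1"))  # False: at most one sign
```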

Grouping

If e is a parsing expression, (e) produces the same strings that e produces. Parentheses can be used to group parsing expressions differently when the default precedence of the grammar operators would give the wrong result. The precedence of | is lowest, followed by the sequence operation, −, and then *, + and ?, which bind tightest. This means that, for example, the parsing expression

ab*

produces strings that consist of the character a followed by zero or more b's, because the precedence of the sequence operation is lower than the precedence of *, so the * operator binds tighter than the sequence operation.

If one wants to produce the following strings: λ, ab, abab, ababab, ..., this can be achieved by using parentheses:

(ab)*
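Python regular expressions give the sequence operation and the * operator the same relative precedence, so the two expressions can be compared directly. This is an illustration only; `(?:...)` is Python's non-capturing form of grouping parentheses, and `in_language` is a hypothetical helper.

```python
import re

WITHOUT_GROUPING = re.compile(r"ab*")    # a followed by zero or more b's
WITH_GROUPING = re.compile(r"(?:ab)*")   # zero or more repetitions of ab


def in_language(rule, text):
    """Return True if the whole of text is produced by the rule."""
    return rule.fullmatch(text) is not None


for text in ["", "a", "ab", "abbb", "abab"]:
    print(repr(text),
          in_language(WITHOUT_GROUPING, text),
          in_language(WITH_GROUPING, text))
```

Only "ab" is produced by both expressions. "a" and "abbb" are produced only by the ungrouped form, and the empty string and "abab" only by the grouped form.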

The parentheses used for grouping are slightly taller than parentheses that are terminal characters:

grouping‑parens	→	(())

The outer parentheses are grouping operators and the inner parentheses are terminal characters.

^{1. Invocation of a void function does not have a value, but it can have side effects, for example writing text to a stream.}
^{2. Actually, a local variable can reside solely in a processor register if its address is not needed.}
^{3. The default initialization rules of Cmajor differ from those of C++. In C++, primitive-type objects of automatic storage duration are default-initialized to an indeterminate value.}