Examples
Comparison of the New Language with C, Pascal, and Java
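Feature | C | Pascal | Java | New language |
---|---|---|---|---|
Memory access: pointers | Explicit | Explicit | Implicit | Implicit |
Memory access: arrays | Shallow assignment (pointer) | Deep copy | Shallow assignment (pointer) | Shallow assignment (pointer) |
Memory access: records | Deep copy | Deep copy | Shallow copy | Shallow copy |
Strings | Character array | Built-in | Built-in | Built-in |
Global variables | Allowed | Allowed | Class member variables | Not allowed |
Forward declarations | Required | Required | Not required | Not required |
Nested functions | Not allowed | Allowed | Inner classes allowed | Not allowed |
Variable declarations | At the start of any block | At the start of a function | Anywhere | At the start of a function |
Multidimensional arrays | Contiguous storage | Contiguous storage | Array of arrays | Array of arrays |
Comments | Non-nested | Nested | Non-nested | Nested |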
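Suggested improvements (not applicable this semester)
1. Allow global variables
2. Give functions, variables, and types separate namespaces
3. Require every variable to be initialized
4. Allow variables to be declared in the middle of a block
5. Remove break and continue
6. Remove the comma expression
by marong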
Lexical Aspects
A token can be a keyword, an identifier, an integer constant, a character constant, or a string constant. Tokens are separated by whitespace and comments.
An identifier is a sequence of letters, digits, or underscores, which begins with a letter and does not share its name with a keyword. Note that (i) identifiers cannot start with underscores, and (ii) case is significant in identifiers.
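The identifier rule can be sketched as a small check. This is a minimal illustration in Python, not part of the language itself: the function name `is_identifier` is hypothetical, letters are assumed to be ASCII letters, and the keyword set is copied from the reserved words table in this document.

```python
import re

# Keywords copied from the "Reserved words" table of this document.
KEYWORDS = {
    "native", "record", "new", "int", "string", "char", "null",
    "if", "else", "while", "for", "return", "break", "continue",
}

# A letter followed by letters, digits, or underscores (ASCII assumed).
IDENT_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if text is a valid identifier under the rules above."""
    return IDENT_RE.fullmatch(text) is not None and text not in KEYWORDS
```

Because case is significant, `While` passes this check while `while` does not.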
Line terminators are \n, \r, and \r\n.
Whitespace, including spaces, tabs, line terminators, and form feeds (\f), may appear between tokens.
Comments
There are two types of comments: line comments and block comments.
Line comments start with two slashes (//). Text after the two slashes is ignored up to the next line terminator.
Block comments start with /* and end with */. Text in between is ignored. Block comments may nest.
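Nesting means a lexer has to count comment depth rather than stop at the first closing delimiter. The following Python sketch shows one way to do that; the function name `strip_comments` is hypothetical. It assumes that each block comment is replaced by a single space (so the tokens on either side stay separated) and that comment markers inside string or character constants are not comments.

```python
def strip_comments(src: str) -> str:
    """Remove line comments and nestable block comments from src."""
    out = []
    i, n = 0, len(src)
    depth = 0  # current block-comment nesting depth
    while i < n:
        two = src[i:i + 2]
        if depth > 0:
            if two == "/*":
                depth += 1
                i += 2
            elif two == "*/":
                depth -= 1
                i += 2
                if depth == 0:
                    out.append(" ")  # assumption: a comment separates tokens
            else:
                i += 1
        elif two == "/*":
            depth = 1
            i += 2
        elif two == "//":
            # Skip to the line terminator but keep the terminator itself.
            while i < n and src[i] not in "\r\n":
                i += 1
        elif src[i] in "\"'":
            # Copy a string or character constant verbatim, stepping over
            # backslash escapes so an escaped quote does not end it.
            quote = src[i]
            j = i + 1
            while j < n and src[j] != quote:
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        else:
            out.append(src[i])
            i += 1
    if depth > 0:
        raise ValueError("unterminated block comment")
    return "".join(out)
```

With this sketch, the inner `*/` in `a /* x /* y */ z */ b` only closes the inner comment, so `z` is still ignored.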
Constants
An integer constant is a sequence of decimal digits (0 through 9). There are no negative integer constants; under the grammar, an expression such as -1 is the unary minus operator applied to the constant 1.
A character constant is a single printable character or space, or an escape sequence that represents one character, enclosed in a pair of single quotes (').
A string constant is a sequence of zero or more printable characters, spaces, or escape sequences, enclosed in a pair of double quotes (").
Escape sequences begin with a backslash (\) and represent special characters. The escape sequences are:
Escape sequence | Meaning |
---|---|
\n | Linefeed |
\r | Carriage return |
\t | Tab |
\\ | Backslash |
\ddd | The character with ASCII code ddd (three decimal digits) |
\" | Double quote (only allowed in a string constant) |
\' | Single quote (only allowed in a character constant) |
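The table above can be turned into a small decoder. The following Python sketch is illustrative only: `decode_escapes` and `SIMPLE_ESCAPES` are hypothetical names, the `quote` parameter selects which quote escape is legal, and no range check is applied to the three-digit code because the source does not state one.

```python
# Single-character escapes from the table (the character after the backslash).
SIMPLE_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\"}


def decode_escapes(body: str, quote: str) -> str:
    """Decode the text between the quotes of a constant.

    quote is the double quote for a string constant and the single
    quote for a character constant.
    """
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        nxt = body[i + 1:i + 2]     # character after the backslash
        digits = body[i + 1:i + 4]  # candidate three-digit code
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt == quote:
            # Only the quote matching the constant's delimiter is allowed.
            out.append(quote)
            i += 2
        elif len(digits) == 3 and all(c in "0123456789" for c in digits):
            out.append(chr(int(digits)))
            i += 4
        else:
            raise ValueError(f"invalid escape sequence at offset {i}")
    return "".join(out)
```

For example, the body `\097` of a character constant decodes to the letter a, matching the conversions shown later in this document where code 97 corresponds to 'a'.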
Reserved words
The reserved words and symbols are:
native | record | new | int | string | char | null |
if | else | while | for | return | break | continue |
; | [ | ] | { | } | ( | ) |
, | = | || | && | == | != | < |
<= | > | >= | + | - | * | / |
% | ! | . |
Input & Output
native int readInt();
native char readChar();
native int printInt(int i);
native int printChar(char c);
native int printString(string s);
native int printLine(string s);
The line break character is \n.
Type Conversions
int i; char c; string s;
Convert from int
i = 97;
c = chr(i); // c == 'a'
s = "" + i; // s == "97"
Convert from char
c = 'a';
i = ord(c); // i == 97
s = "" + c; // s == "a"
Convert from string
s = "97";
c = s[0]; // c == '9'
c = s[1]; // c == '7'
i = parseInt(s); // i == 97
Note that parseInt is a contributed function.
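Since parseInt is contributed rather than built in, it has to be written in terms of indexing and character codes. The Python sketch below shows the usual digit-accumulation loop under stated assumptions: the name `parse_int` is hypothetical, and raising an error on empty or non-digit input is a choice made here, since the source does not say what parseInt does in those cases.

```python
def parse_int(s: str) -> int:
    """Convert a string of decimal digits to an integer."""
    if not s:
        raise ValueError("empty string")  # assumption: not specified in the source
    value = 0
    for ch in s:
        digit = ord(ch) - ord("0")  # same idea as ord(c) in the new language
        if not 0 <= digit <= 9:
            raise ValueError(f"not a decimal digit: {ch!r}")
        value = value * 10 + digit
    return value
```

Applied to the string "97", the loop computes 9 and then 9 * 10 + 7, giving 97 as in the example above.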
String Operations
Creation
string a, b;
a = "hello";
b = a;
b should share the same storage as a.
Indexing
string s;
s = "hello";
s[0] == 'h';
s[1] == 'e';
Strings are read-only. Assignments to string elements (e.g., s[1] = 'a';) should cause errors in semantic analysis.
Length
string s;
s = "hello";
s.length == 5;
"hello".length == 5;
Comparison
Strings are compared by value, in alphabetical order.
string x, y;
x = "a";
y = "ab";
x == "a" && y == "ab";
x < y && y < "b";
Substring
string s;
s = "hello";
substring(s, 0, s.length) == "hello";
substring(s, 1, 2) == "el";
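The two examples above are consistent with reading the arguments as a start index followed by a length (a start/end reading would give "e" for the second call). The Python sketch below models that interpretation; the parameter names and the bounds check are assumptions, since the source gives only the two examples.

```python
def substring(s: str, start: int, length: int) -> str:
    """Return length characters of s beginning at index start."""
    # Assumption: out-of-range arguments are an error; the source does not say.
    if start < 0 or length < 0 or start + length > len(s):
        raise IndexError("substring range out of bounds")
    return s[start:start + length]
```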
Concatenation
string s;
s = "hello";
s = s + ", " + 2012;
s == "hello, 2012";
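The concatenation example, together with the "" + i and "" + c conversions shown earlier, suggests that + converts a non-string operand to text before joining. A rough Python model follows; `concat` is a hypothetical name, characters are modeled as one-character Python strings, and the behavior when neither operand is a string is not covered.

```python
def concat(left, right) -> str:
    """Model the + operator when at least one operand is a string."""
    def to_text(value) -> str:
        # Integers become their decimal representation; strings and
        # characters (modeled here as Python str) are used as-is.
        return str(value) if isinstance(value, int) else value

    return to_text(left) + to_text(right)
```

Because + is left-associative in the grammar, s + ", " + 2012 corresponds to concat(concat(s, ", "), 2012).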
BNF Grammar
%precedence: ELSE => right
%precedence: LBRACKET => left

translation_unit : external_decl
translation_unit : translation_unit external_decl

external_decl : prototype_decl
              | function_def
              | record_def

prototype_decl : NATIVE function_head SEMICOLON

function_def : function_head LBRACE variable_decl_list stmt_list RBRACE
             | function_head LBRACE stmt_list RBRACE

record_def : RECORD ID LBRACE variable_decl_list RBRACE

variable_decl_list : variable_decl
variable_decl_list : variable_decl_list variable_decl

function_head : type_specifier ID LPAREN parameter_list RPAREN
              | type_specifier ID LPAREN RPAREN

parameter_list : parameter_decl
parameter_list : parameter_list COMMA parameter_decl

parameter_decl : type_specifier ID

variable_decl : type_specifier id_list SEMICOLON

type_specifier : INT
               | STRING
               | CHAR
               | ID
type_specifier : type_specifier LRBRACKET

id_list : ID
id_list : id_list COMMA ID

stmt_list : stmt
stmt_list : stmt_list stmt

stmt : compound_stmt
     | expr_stmt
     | selection_stmt
     | iteration_stmt
     | jump_stmt

compound_stmt : LBRACE stmt_list RBRACE
              | LBRACE RBRACE

expr_stmt : expr SEMICOLON

selection_stmt : IF LPAREN expr RPAREN stmt
               | IF LPAREN expr RPAREN stmt ELSE stmt

iteration_stmt : WHILE LPAREN expr RPAREN stmt
               | FOR LPAREN expr_stmt expr_stmt expr RPAREN stmt
               | FOR LPAREN expr_stmt expr_stmt RPAREN stmt
               | FOR LPAREN expr_stmt SEMICOLON expr RPAREN stmt
               | FOR LPAREN expr_stmt SEMICOLON RPAREN stmt
               | FOR LPAREN SEMICOLON expr_stmt expr RPAREN stmt
               | FOR LPAREN SEMICOLON expr_stmt RPAREN stmt
               | FOR LPAREN SEMICOLON SEMICOLON expr RPAREN stmt
               | FOR LPAREN SEMICOLON SEMICOLON RPAREN stmt

jump_stmt : RETURN expr SEMICOLON
          | BREAK SEMICOLON
          | CONTINUE SEMICOLON

expr : assignment_expr
expr : expr COMMA assignment_expr

assignment_expr : logical_or_expr
assignment_expr : unary_expr ASSIGN assignment_expr

logical_or_expr : logical_and_expr
logical_or_expr : logical_or_expr OR logical_and_expr

logical_and_expr : equality_expr
logical_and_expr : logical_and_expr AND equality_expr

equality_expr : relational_expr
equality_expr : equality_expr EQ relational_expr
              | equality_expr NEQ relational_expr

relational_expr : additive_expr
relational_expr : relational_expr LESS additive_expr
                | relational_expr LESS_EQ additive_expr
                | relational_expr GREATER additive_expr
                | relational_expr GREATER_EQ additive_expr

additive_expr : mult_expr
additive_expr : additive_expr PLUS mult_expr
              | additive_expr MINUS mult_expr

mult_expr : unary_expr
mult_expr : mult_expr MULTIPLY unary_expr
          | mult_expr DIVIDE unary_expr
          | mult_expr MODULO unary_expr

unary_expr : postfix
unary_expr : PLUS unary_expr
           | MINUS unary_expr
           | NOT unary_expr

postfix : primary
postfix : postfix LBRACKET expr RBRACKET
        | postfix LPAREN expr RPAREN
        | postfix LPAREN RPAREN
        | postfix DOT ID

primary : ID
        | NULL
        | INTEGER
        | CHARACTER
        | STRING_LITERAL
        | LPAREN expr RPAREN
        | NEW type_specifier LBRACKET expr RBRACKET
        | NEW ID
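The expression rules encode operator precedence through their layering, from logical_or_expr (lowest) down to mult_expr and unary_expr, and every binary rule is left-recursive, hence left-associative. The Python sketch below illustrates the grouping this produces for a subset of the grammar: integer constants, parentheses, unary operators, and the binary operators. It is a hand-written recursive-descent illustration rather than the parser the grammar was written for; all names (`LEVELS`, `Parser`, `show_grouping`) are hypothetical, and assignment, comma, and postfix expressions are left out.

```python
import re

# Binary operator levels from lowest to highest precedence, mirroring
# logical_or_expr ... mult_expr. All levels are left-associative.
LEVELS = [
    {"||"}, {"&&"}, {"==", "!="}, {"<", "<=", ">", ">="},
    {"+", "-"}, {"*", "/", "%"},
]

TOKEN_RE = re.compile(r"\s*(\d+|\|\||&&|==|!=|<=|>=|[-+*/%!<>()])")


def tokenize(text):
    """Split text into integer, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def binary(self, level=0):
        # One loop per grammar level; looping (not recursing) on the
        # left operand gives left associativity.
        if level == len(LEVELS):
            return self.unary()
        left = self.binary(level + 1)
        while self.peek() in LEVELS[level]:
            op = self.next()
            right = self.binary(level + 1)
            left = f"({left} {op} {right})"
        return left

    def unary(self):
        # unary_expr : PLUS unary_expr | MINUS unary_expr | NOT unary_expr
        if self.peek() in ("+", "-", "!"):
            op = self.next()
            return f"({op}{self.unary()})"
        return self.primary()

    def primary(self):
        tok = self.next()
        if tok == "(":
            inner = self.binary()
            if self.next() != ")":
                raise SyntaxError("expected )")
            return inner
        if tok.isdigit():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


def show_grouping(text):
    """Return text with the grouping implied by the grammar made explicit."""
    parser = Parser(tokenize(text))
    result = parser.binary()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

For instance, `show_grouping("1 + 2 * 3")` returns `(1 + (2 * 3))`, and `show_grouping("-1 - 2 - 3")` returns `(((-1) - 2) - 3)`, which also shows the leading minus being handled as a unary operator rather than as part of the constant.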