Document unicode string literals

Alex Beregszaszi 2020-07-27 13:01:58 +01:00
parent 1f39640392
commit af22dfa5b4
2 changed files with 22 additions and 2 deletions


@@ -25,16 +25,24 @@ Changes to the Syntax
* In external function and contract creation calls, Ether and gas are now specified using a new syntax:
``x.f{gas: 10000, value: 2 ether}(arg1, arg2)``.
The old syntax -- ``x.f.gas(10000).value(2 ether)(arg1, arg2)`` -- will cause an error.
* The global variable ``now`` is deprecated; ``block.timestamp`` should be used instead.
The single identifier ``now`` is too generic for a global variable and could give the impression
that it changes during transaction processing, whereas ``block.timestamp`` correctly
reflects the fact that it is just a property of the block.
* NatSpec comments on variables are only allowed for public state variables and not
for local or internal variables.
* The token ``gwei`` is a keyword now (used to specify, e.g. ``2 gwei`` as a number)
and cannot be used as an identifier.
* String literals can now only contain printable ASCII characters, along with a variety of
escape sequences, such as hexadecimal (``\xff``) and Unicode escapes (``\u20ac``).
* Unicode string literals are supported now to accommodate valid UTF-8 sequences. They are identified
with the ``unicode`` prefix: ``unicode"Hello 😃"``.
* State Mutability: The state mutability of functions can now be restricted during inheritance:
Functions with default state mutability can be overridden by ``pure`` and ``view`` functions
while ``view`` functions can be overridden by ``pure`` functions.
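
A minimal sketch of several of the syntax changes listed above, assuming a compiler version of 0.7.0 or later. The interface ``IReceiver``, the contract ``SyntaxChanges`` and all function names are hypothetical and exist only for illustration.

::

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.7.0 <0.9.0;

    // Hypothetical interface, not part of any real library.
    interface IReceiver {
        function f(uint256 arg1, uint256 arg2) external payable;
    }

    contract SyntaxChanges {
        // Call options go in curly braces before the argument list,
        // replacing the old chained .gas(...).value(...) form.
        function forward(IReceiver x) external payable {
            x.f{gas: 10000, value: 2 gwei}(1, 2);
        }

        // ``block.timestamp`` is used in place of ``now``.
        function currentTime() external view returns (uint256) {
            return block.timestamp;
        }

        // A regular string literal: printable ASCII plus escape sequences.
        function euroEscaped() external pure returns (string memory) {
            return "\u20ac";
        }

        // A Unicode string literal: non-ASCII UTF-8 is allowed directly.
        function greeting() external pure returns (string memory) {
            return unicode"Hello 😃";
        }
    }
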


@@ -484,7 +484,9 @@ String literals are written with either double or single-quotes (``"foo"`` or ``
For example, with ``bytes32 samevar = "stringliteral"`` the string literal is interpreted in its raw byte form when assigned to a ``bytes32`` type.
String literals support the following escape characters:
String literals can only contain printable ASCII characters, that is, the characters in the range 0x20 .. 0x7E (inclusive).
Additionally, string literals support the following escape characters:
- ``\<newline>`` (escapes an actual newline)
- ``\\`` (backslash)
@@ -511,9 +513,19 @@ character sequence ``abcdef``.
"\n\"\'\\abc\
def"
Any unicode line terminator which is not a newline (i.e. LF, VF, FF, CR, NEL, LS, PS) is considered to
Any Unicode line terminator which is not a newline (i.e. LF, VT, FF, CR, NEL, LS, PS) is considered to
terminate the string literal. Newline only terminates the string literal if it is not preceded by a ``\``.
Unicode Literals
----------------
While regular string literals can only contain ASCII, Unicode literals prefixed with the keyword ``unicode`` can contain any valid UTF-8 sequence.
They also support the very same escape sequences as regular string literals.
::
string memory a = unicode"Hello 😃";
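
Because both kinds of literal support the same escape sequences, a Unicode literal and a regular literal using a ``\u`` escape should produce the same UTF-8 bytes. The following is a small sketch of that comparison, assuming the source file is saved as UTF-8; the contract and function names are hypothetical.

::

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.7.0 <0.9.0;

    contract UnicodeLiteralSketch {
        // Compares the euro sign written directly in a Unicode literal
        // with the same code point written as an escape in a regular literal.
        function sameBytes() external pure returns (bool) {
            string memory direct = unicode"€";
            string memory escaped = "\u20ac";
            // Strings cannot be compared directly, so compare their hashes.
            return keccak256(bytes(direct)) == keccak256(bytes(escaped));
        }
    }
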
.. index:: literal, bytes
Hexadecimal Literals