Merge pull request #11660 from ethereum/docs-fix-badly-indented-lists-and-blocks

[Docs] Fix badly indented lists and blocks
Kamil Śliwak 2021-07-21 18:35:19 +02:00 committed by GitHub
commit 6d6c9e6e4f
16 changed files with 574 additions and 562 deletions

@ -89,14 +89,14 @@ New Features
This section lists things that were not possible prior to Solidity 0.6.0 This section lists things that were not possible prior to Solidity 0.6.0
or were more difficult to achieve. or were more difficult to achieve.
* The :ref:`try/catch statement <try-catch>` allows you to react to failed external calls. * The :ref:`try/catch statement <try-catch>` allows you to react to failed external calls.
* ``struct`` and ``enum`` types can be declared at file level. * ``struct`` and ``enum`` types can be declared at file level.
* Array slices can be used for calldata arrays, for example ``abi.decode(msg.data[4:], (uint, uint))`` * Array slices can be used for calldata arrays, for example ``abi.decode(msg.data[4:], (uint, uint))``
is a low-level way to decode the function call payload. is a low-level way to decode the function call payload.
* Natspec supports multiple return parameters in developer documentation, enforcing the same naming check as ``@param``. * Natspec supports multiple return parameters in developer documentation, enforcing the same naming check as ``@param``.
* Yul and Inline Assembly have a new statement called ``leave`` that exits the current function. * Yul and Inline Assembly have a new statement called ``leave`` that exits the current function.
* Conversions from ``address`` to ``address payable`` are now possible via ``payable(x)``, where * Conversions from ``address`` to ``address payable`` are now possible via ``payable(x)``, where
``x`` must be of type ``address``. ``x`` must be of type ``address``.
Interface Changes Interface Changes

@ -77,8 +77,9 @@ The following (fixed-size) array type exists:
- ``<type>[M]``: a fixed-length array of ``M`` elements, ``M >= 0``, of the given type. - ``<type>[M]``: a fixed-length array of ``M`` elements, ``M >= 0``, of the given type.
.. note:: .. note::
While this ABI specification can express fixed-length arrays with zero elements, they're not supported by the compiler.
While this ABI specification can express fixed-length arrays with zero elements, they're not supported by the compiler.
The following non-fixed-size types exist: The following non-fixed-size types exist:
@ -124,13 +125,13 @@ Design Criteria for the Encoding
The encoding is designed to have the following properties, which are especially useful if some arguments are nested arrays: The encoding is designed to have the following properties, which are especially useful if some arguments are nested arrays:
1. The number of reads necessary to access a value is at most the depth of the value 1. The number of reads necessary to access a value is at most the depth of the value
inside the argument array structure, i.e. four reads are needed to retrieve ``a_i[k][l][r]``. In a inside the argument array structure, i.e. four reads are needed to retrieve ``a_i[k][l][r]``. In a
previous version of the ABI, the number of reads scaled linearly with the total number of dynamic previous version of the ABI, the number of reads scaled linearly with the total number of dynamic
parameters in the worst case. parameters in the worst case.
2. The data of a variable or array element is not interleaved with other data and it is 2. The data of a variable or array element is not interleaved with other data and it is
relocatable, i.e. it only uses relative "addresses". relocatable, i.e. it only uses relative "addresses".
Formal Specification of the Encoding Formal Specification of the Encoding
@ -312,21 +313,21 @@ these are directly the values we want to pass, whereas for the dynamic types ``u
we use the offset in bytes to the start of their data area, measured from the start of the value we use the offset in bytes to the start of their data area, measured from the start of the value
encoding (i.e. not counting the first four bytes containing the hash of the function signature). These are: encoding (i.e. not counting the first four bytes containing the hash of the function signature). These are:
- ``0x0000000000000000000000000000000000000000000000000000000000000123`` (``0x123`` padded to 32 bytes) - ``0x0000000000000000000000000000000000000000000000000000000000000123`` (``0x123`` padded to 32 bytes)
- ``0x0000000000000000000000000000000000000000000000000000000000000080`` (offset to start of data part of second parameter, 4*32 bytes, exactly the size of the head part) - ``0x0000000000000000000000000000000000000000000000000000000000000080`` (offset to start of data part of second parameter, 4*32 bytes, exactly the size of the head part)
- ``0x3132333435363738393000000000000000000000000000000000000000000000`` (``"1234567890"`` padded to 32 bytes on the right) - ``0x3132333435363738393000000000000000000000000000000000000000000000`` (``"1234567890"`` padded to 32 bytes on the right)
- ``0x00000000000000000000000000000000000000000000000000000000000000e0`` (offset to start of data part of fourth parameter = offset to start of data part of first dynamic parameter + size of data part of first dynamic parameter = 4\*32 + 3\*32 (see below)) - ``0x00000000000000000000000000000000000000000000000000000000000000e0`` (offset to start of data part of fourth parameter = offset to start of data part of first dynamic parameter + size of data part of first dynamic parameter = 4\*32 + 3\*32 (see below))
After this, the data part of the first dynamic argument, ``[0x456, 0x789]`` follows: After this, the data part of the first dynamic argument, ``[0x456, 0x789]`` follows:
- ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements of the array, 2) - ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements of the array, 2)
- ``0x0000000000000000000000000000000000000000000000000000000000000456`` (first element) - ``0x0000000000000000000000000000000000000000000000000000000000000456`` (first element)
- ``0x0000000000000000000000000000000000000000000000000000000000000789`` (second element) - ``0x0000000000000000000000000000000000000000000000000000000000000789`` (second element)
Finally, we encode the data part of the second dynamic argument, ``"Hello, world!"``: Finally, we encode the data part of the second dynamic argument, ``"Hello, world!"``:
- ``0x000000000000000000000000000000000000000000000000000000000000000d`` (number of elements (bytes in this case): 13) - ``0x000000000000000000000000000000000000000000000000000000000000000d`` (number of elements (bytes in this case): 13)
- ``0x48656c6c6f2c20776f726c642100000000000000000000000000000000000000`` (``"Hello, world!"`` padded to 32 bytes on the right) - ``0x48656c6c6f2c20776f726c642100000000000000000000000000000000000000`` (``"Hello, world!"`` padded to 32 bytes on the right)
All together, the encoding is (newline after function selector and each 32-bytes for clarity): All together, the encoding is (newline after function selector and each 32-bytes for clarity):
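As a rough illustration of the head/tail layout walked through above, the following Python sketch assembles the argument encoding by hand, leaving out the function selector. The helper names ``word``, ``pad_right`` and ``encode_example`` are made up for this sketch and are not part of any Solidity tooling.

```python
def word(n: int) -> bytes:
    """Encode a non-negative integer as one 32-byte big-endian word."""
    return n.to_bytes(32, "big")


def pad_right(data: bytes) -> bytes:
    """Zero-pad on the right to a multiple of 32 bytes."""
    return data + b"\x00" * (-len(data) % 32)


def encode_example() -> bytes:
    # Argument values taken from the example in the text.
    a = 0x123
    arr = [0x456, 0x789]
    b10 = b"1234567890"
    dyn = b"Hello, world!"

    # Data parts of the two dynamic arguments: length word, then contents.
    tail_arr = word(len(arr)) + b"".join(word(x) for x in arr)
    tail_bytes = word(len(dyn)) + pad_right(dyn)

    # Head: static values in place, dynamic values as byte offsets
    # measured from the start of the head.
    head_size = 4 * 32
    head = (
        word(a)
        + word(head_size)                  # 0x80
        + pad_right(b10)
        + word(head_size + len(tail_arr))  # 0x80 + 0x60 = 0xe0
    )
    return head + tail_arr + tail_bytes


encoded = encode_example()
words = [encoded[i:i + 32].hex() for i in range(0, len(encoded), 32)]
```

Printing ``words`` one per line gives the nine 32-byte words listed in the bullets above, in order.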
@ -348,14 +349,14 @@ with values ``([[1, 2], [3]], ["one", "two", "three"])`` but start from the most
First we encode the length and data of the first embedded dynamic array ``[1, 2]`` of the first root array ``[[1, 2], [3]]``: First we encode the length and data of the first embedded dynamic array ``[1, 2]`` of the first root array ``[[1, 2], [3]]``:
- ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements in the first array, 2; the elements themselves are ``1`` and ``2``) - ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements in the first array, 2; the elements themselves are ``1`` and ``2``)
- ``0x0000000000000000000000000000000000000000000000000000000000000001`` (first element) - ``0x0000000000000000000000000000000000000000000000000000000000000001`` (first element)
- ``0x0000000000000000000000000000000000000000000000000000000000000002`` (second element) - ``0x0000000000000000000000000000000000000000000000000000000000000002`` (second element)
Then we encode the length and data of the second embedded dynamic array ``[3]`` of the first root array ``[[1, 2], [3]]``: Then we encode the length and data of the second embedded dynamic array ``[3]`` of the first root array ``[[1, 2], [3]]``:
- ``0x0000000000000000000000000000000000000000000000000000000000000001`` (number of elements in the second array, 1; the element is ``3``) - ``0x0000000000000000000000000000000000000000000000000000000000000001`` (number of elements in the second array, 1; the element is ``3``)
- ``0x0000000000000000000000000000000000000000000000000000000000000003`` (first element) - ``0x0000000000000000000000000000000000000000000000000000000000000003`` (first element)
Then we need to find the offsets ``a`` and ``b`` for their respective dynamic arrays ``[1, 2]`` and ``[3]``. Then we need to find the offsets ``a`` and ``b`` for their respective dynamic arrays ``[1, 2]`` and ``[3]``.
To calculate the offsets we can take a look at the encoded data of the first root array ``[[1, 2], [3]]`` To calculate the offsets we can take a look at the encoded data of the first root array ``[[1, 2], [3]]``
@ -380,12 +381,12 @@ thus ``b = 0x00000000000000000000000000000000000000000000000000000000000000a0``.
Then we encode the embedded strings of the second root array: Then we encode the embedded strings of the second root array:
- ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of characters in word ``"one"``) - ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of characters in word ``"one"``)
- ``0x6f6e650000000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"one"``) - ``0x6f6e650000000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"one"``)
- ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of characters in word ``"two"``) - ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of characters in word ``"two"``)
- ``0x74776f0000000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"two"``) - ``0x74776f0000000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"two"``)
- ``0x0000000000000000000000000000000000000000000000000000000000000005`` (number of characters in word ``"three"``) - ``0x0000000000000000000000000000000000000000000000000000000000000005`` (number of characters in word ``"three"``)
- ``0x7468726565000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"three"``) - ``0x7468726565000000000000000000000000000000000000000000000000000000`` (utf8 representation of word ``"three"``)
In parallel to the first root array, since strings are dynamic elements we need to find their offsets ``c``, ``d`` and ``e``: In parallel to the first root array, since strings are dynamic elements we need to find their offsets ``c``, ``d`` and ``e``:
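The offset arithmetic for both root arrays can be sketched in a few lines of Python. The function name ``element_offsets`` is hypothetical, and the sketch assumes offsets are measured from the start of the array contents, right after the length word, as in the walkthrough.

```python
def element_offsets(encoded_sizes: list[int]) -> list[int]:
    """Return the offset of each dynamic element of an array.

    The contents start with one 32-byte offset word per element,
    followed by the encoded elements themselves in order.
    """
    offset = 32 * len(encoded_sizes)
    result = []
    for size in encoded_sizes:
        result.append(offset)
        offset += size
    return result


# [[1, 2], [3]]: [1, 2] takes 3 words (length + 2 elements), [3] takes 2 words.
a, b = element_offsets([3 * 32, 2 * 32])

# ["one", "two", "three"]: each string takes 2 words (length + padded contents).
c, d, e = element_offsets([2 * 32, 2 * 32, 2 * 32])
```

Here ``a`` and ``b`` come out as ``0x40`` and ``0xa0``, and ``c``, ``d`` and ``e`` as ``0x60``, ``0xa0`` and ``0xe0``.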
@ -416,11 +417,11 @@ and have the same encodings for a function with a signature ``g(string[],uint[][
Then we encode the length of the first root array: Then we encode the length of the first root array:
- ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements in the first root array, 2; the elements themselves are ``[1, 2]`` and ``[3]``) - ``0x0000000000000000000000000000000000000000000000000000000000000002`` (number of elements in the first root array, 2; the elements themselves are ``[1, 2]`` and ``[3]``)
Then we encode the length of the second root array: Then we encode the length of the second root array:
- ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of strings in the second root array, 3; the strings themselves are ``"one"``, ``"two"`` and ``"three"``) - ``0x0000000000000000000000000000000000000000000000000000000000000003`` (number of strings in the second root array, 3; the strings themselves are ``"one"``, ``"two"`` and ``"three"``)
Finally we find the offsets ``f`` and ``g`` for their respective root dynamic arrays ``[[1, 2], [3]]`` and Finally we find the offsets ``f`` and ``g`` for their respective root dynamic arrays ``[[1, 2], [3]]`` and
``["one", "two", "three"]``, and assemble parts in the correct order: ``["one", "two", "three"]``, and assemble parts in the correct order:
@ -529,12 +530,12 @@ i.e. ``0xcf479181``, ``uint256(0)``, ``uint256(amount)``.
The error selectors ``0x00000000`` and ``0xffffffff`` are reserved for future use. The error selectors ``0x00000000`` and ``0xffffffff`` are reserved for future use.
.. warning:: .. warning::
Never trust error data. Never trust error data.
The error data by default bubbles up through the chain of external calls, which The error data by default bubbles up through the chain of external calls, which
means that a contract may receive an error not defined in any of the contracts means that a contract may receive an error not defined in any of the contracts
it calls directly. it calls directly.
Furthermore, any contract can fake any error by returning data that matches Furthermore, any contract can fake any error by returning data that matches
an error signature, even if the error is not defined anywhere. an error signature, even if the error is not defined anywhere.
.. _abi_json: .. _abi_json:
@ -618,24 +619,24 @@ would result in the JSON:
.. code-block:: json .. code-block:: json
[{ [{
"type":"error", "type":"error",
"inputs": [{"name":"available","type":"uint256"},{"name":"required","type":"uint256"}], "inputs": [{"name":"available","type":"uint256"},{"name":"required","type":"uint256"}],
"name":"InsufficientBalance" "name":"InsufficientBalance"
}, { }, {
"type":"event", "type":"event",
"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], "inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}],
"name":"Event" "name":"Event"
}, { }, {
"type":"event", "type":"event",
"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], "inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}],
"name":"Event2" "name":"Event2"
}, { }, {
"type":"function", "type":"function",
"inputs": [{"name":"a","type":"uint256"}], "inputs": [{"name":"a","type":"uint256"}],
"name":"foo", "name":"foo",
"outputs": [] "outputs": []
}] }]
Handling tuple types Handling tuple types
-------------------- --------------------
@ -670,61 +671,61 @@ would result in the JSON:
.. code-block:: json .. code-block:: json
[ [
{ {
"name": "f", "name": "f",
"type": "function", "type": "function",
"inputs": [ "inputs": [
{ {
"name": "s", "name": "s",
"type": "tuple", "type": "tuple",
"components": [ "components": [
{ {
"name": "a", "name": "a",
"type": "uint256" "type": "uint256"
}, },
{ {
"name": "b", "name": "b",
"type": "uint256[]" "type": "uint256[]"
}, },
{ {
"name": "c", "name": "c",
"type": "tuple[]", "type": "tuple[]",
"components": [ "components": [
{ {
"name": "x", "name": "x",
"type": "uint256" "type": "uint256"
}, },
{ {
"name": "y", "name": "y",
"type": "uint256" "type": "uint256"
} }
] ]
} }
] ]
}, },
{ {
"name": "t", "name": "t",
"type": "tuple", "type": "tuple",
"components": [ "components": [
{ {
"name": "x", "name": "x",
"type": "uint256" "type": "uint256"
}, },
{ {
"name": "y", "name": "y",
"type": "uint256" "type": "uint256"
} }
] ]
}, },
{ {
"name": "a", "name": "a",
"type": "uint256" "type": "uint256"
} }
], ],
"outputs": [] "outputs": []
} }
] ]
.. _abi_packed_mode: .. _abi_packed_mode:
@ -761,18 +762,19 @@ As an example, the encoding of ``int16(-1), bytes1(0x42), uint16(0x03), string("
^^^^^^^^^^^^^^^^^^^^^^^^^^ string("Hello, world!") without a length field ^^^^^^^^^^^^^^^^^^^^^^^^^^ string("Hello, world!") without a length field
More specifically: More specifically:
- During the encoding, everything is encoded in-place. This means that there is
no distinction between head and tail, as in the ABI encoding, and the length - During the encoding, everything is encoded in-place. This means that there is
of an array is not encoded. no distinction between head and tail, as in the ABI encoding, and the length
- The direct arguments of ``abi.encodePacked`` are encoded without padding, of an array is not encoded.
as long as they are not arrays (or ``string`` or ``bytes``). - The direct arguments of ``abi.encodePacked`` are encoded without padding,
- The encoding of an array is the concatenation of the as long as they are not arrays (or ``string`` or ``bytes``).
encoding of its elements **with** padding. - The encoding of an array is the concatenation of the
- Dynamically-sized types like ``string``, ``bytes`` or ``uint[]`` are encoded encoding of its elements **with** padding.
without their length field. - Dynamically-sized types like ``string``, ``bytes`` or ``uint[]`` are encoded
- The encoding of ``string`` or ``bytes`` does not apply padding at the end without their length field.
unless it is part of an array or struct (then it is padded to a multiple of - The encoding of ``string`` or ``bytes`` does not apply padding at the end
32 bytes). unless it is part of an array or struct (then it is padded to a multiple of
32 bytes).
In general, the encoding is ambiguous as soon as there are two dynamically-sized elements, In general, the encoding is ambiguous as soon as there are two dynamically-sized elements,
because of the missing length field. because of the missing length field.
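To make the packed layout concrete, here is a small Python sketch that reproduces the example ``int16(-1), bytes1(0x42), uint16(0x03), string("Hello, world!")`` byte by byte. It only mimics the rules listed above for these four value kinds and is not a general re-implementation of ``abi.encodePacked``.

```python
packed = (
    (-1).to_bytes(2, "big", signed=True)  # int16(-1): two's complement, no padding
    + bytes([0x42])                       # bytes1(0x42)
    + (0x03).to_bytes(2, "big")           # uint16(0x03)
    + "Hello, world!".encode("utf-8")     # string: no length field, no padding
)

packed_hex = "0x" + packed.hex()
```

The result is 18 bytes long: 2 + 1 + 2 bytes for the fixed-size values and 13 bytes for the string.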
@ -784,12 +786,12 @@ for prepending a function selector. Since the encoding is ambiguous, there is no
.. warning:: .. warning::
If you use ``keccak256(abi.encodePacked(a, b))`` and both ``a`` and ``b`` are dynamic types, If you use ``keccak256(abi.encodePacked(a, b))`` and both ``a`` and ``b`` are dynamic types,
it is easy to craft collisions in the hash value by moving parts of ``a`` into ``b`` and it is easy to craft collisions in the hash value by moving parts of ``a`` into ``b`` and
vice-versa. More specifically, ``abi.encodePacked("a", "bc") == abi.encodePacked("ab", "c")``. vice-versa. More specifically, ``abi.encodePacked("a", "bc") == abi.encodePacked("ab", "c")``.
If you use ``abi.encodePacked`` for signatures, authentication or data integrity, make If you use ``abi.encodePacked`` for signatures, authentication or data integrity, make
sure to always use the same types and check that at most one of them is dynamic. sure to always use the same types and check that at most one of them is dynamic.
Unless there is a compelling reason, ``abi.encode`` should be preferred. Unless there is a compelling reason, ``abi.encode`` should be preferred.
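The collision mentioned in this warning is easy to reproduce outside the EVM. The Python sketch below models packed encoding of strings as plain concatenation. ``encode_with_lengths`` is a simplified, hypothetical stand-in for a length-aware encoding and does not reproduce the real ``abi.encode`` layout.

```python
def encode_packed(*parts: str) -> bytes:
    """Concatenate the UTF-8 bytes of each string: no length field, no padding."""
    return b"".join(p.encode("utf-8") for p in parts)


def encode_with_lengths(*parts: str) -> bytes:
    """Prefix each string with its length as a 32-byte word (illustrative only)."""
    out = b""
    for p in parts:
        raw = p.encode("utf-8")
        out += len(raw).to_bytes(32, "big") + raw
    return out


# Moving a byte from one argument to the other does not change the packed bytes...
collide = encode_packed("a", "bc") == encode_packed("ab", "c")

# ...but it does change an encoding that records where each argument ends.
distinct = encode_with_lengths("a", "bc") != encode_with_lengths("ab", "c")
```

Both ``collide`` and ``distinct`` are ``True``, which is why hashing packed data with more than one dynamic argument is unsafe.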
.. _indexed_event_encoding: .. _indexed_event_encoding:
@ -801,13 +803,13 @@ Indexed event parameters that are not value types, i.e. arrays and structs are n
stored directly but instead a keccak256-hash of an encoding is stored. This encoding stored directly but instead a keccak256-hash of an encoding is stored. This encoding
is defined as follows: is defined as follows:
- the encoding of a ``bytes`` and ``string`` value is just the string contents - the encoding of a ``bytes`` and ``string`` value is just the string contents
without any padding or length prefix. without any padding or length prefix.
- the encoding of a struct is the concatenation of the encoding of its members, - the encoding of a struct is the concatenation of the encoding of its members,
always padded to a multiple of 32 bytes (even ``bytes`` and ``string``). always padded to a multiple of 32 bytes (even ``bytes`` and ``string``).
- the encoding of an array (both dynamically- and statically-sized) is - the encoding of an array (both dynamically- and statically-sized) is
the concatenation of the encoding of its elements, always padded to a multiple the concatenation of the encoding of its elements, always padded to a multiple
of 32 bytes (even ``bytes`` and ``string``) and without any length prefix. of 32 bytes (even ``bytes`` and ``string``) and without any length prefix.
In the above, as usual, a negative number is padded by sign extension and not zero padded. In the above, as usual, a negative number is padded by sign extension and not zero padded.
``bytesNN`` types are padded on the right while ``uintNN`` / ``intNN`` are padded on the left. ``bytesNN`` types are padded on the right while ``uintNN`` / ``intNN`` are padded on the left.
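The padding rules in the last two sentences can be illustrated with a short Python sketch. The helper names ``pad_int`` and ``pad_fixed_bytes`` are assumptions made for this example only.

```python
def pad_int(value: int, signed: bool) -> bytes:
    """Pad uintNN / intNN on the left to 32 bytes.

    Signed values use two's complement, so negative numbers are
    sign-extended with 0xff bytes rather than zero-padded.
    """
    return value.to_bytes(32, "big", signed=signed)


def pad_fixed_bytes(data: bytes) -> bytes:
    """Pad a bytesNN value on the right with zero bytes to 32 bytes."""
    return data.ljust(32, b"\x00")


minus_one = pad_int(-1, signed=True)   # 32 bytes of 0xff
three = pad_int(3, signed=False)       # 31 zero bytes, then 0x03
tag = pad_fixed_bytes(b"\x42")         # 0x42, then 31 zero bytes
```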

@ -19,16 +19,16 @@ which can be used to check which bugs affect a specific version of the compiler.
Contract source verification tools and also other tools interacting with Contract source verification tools and also other tools interacting with
contracts should consult this list according to the following criteria: contracts should consult this list according to the following criteria:
- It is mildly suspicious if a contract was compiled with a nightly - It is mildly suspicious if a contract was compiled with a nightly
compiler version instead of a released version. This list does not keep compiler version instead of a released version. This list does not keep
track of unreleased or nightly versions. track of unreleased or nightly versions.
- It is also mildly suspicious if a contract was compiled with a version that was - It is also mildly suspicious if a contract was compiled with a version that was
not the most recent at the time the contract was created. For contracts not the most recent at the time the contract was created. For contracts
created from other contracts, you have to follow the creation chain created from other contracts, you have to follow the creation chain
back to a transaction and use the date of that transaction as creation date. back to a transaction and use the date of that transaction as creation date.
- It is highly suspicious if a contract was compiled with a compiler that - It is highly suspicious if a contract was compiled with a compiler that
contains a known bug and the contract was created at a time when a newer contains a known bug and the contract was created at a time when a newer
compiler version containing a fix was already released. compiler version containing a fix was already released.
The JSON file of known bugs below is an array of objects, one for each bug, The JSON file of known bugs below is an array of objects, one for each bug,
with the following keys: with the following keys:

@ -124,17 +124,17 @@ The output of the above looks like the following (trimmed):
.. code-block:: json .. code-block:: json
{ {
"returnValues": { "returnValues": {
"_from": "0x1111…FFFFCCCC", "_from": "0x1111…FFFFCCCC",
"_id": "0x50…sd5adb20", "_id": "0x50…sd5adb20",
"_value": "0x420042" "_value": "0x420042"
}, },
"raw": { "raw": {
"data": "0x7f…91385", "data": "0x7f…91385",
"topics": ["0xfd4…b4ead7", "0x7f…1a91385"] "topics": ["0xfd4…b4ead7", "0x7f…1a91385"]
} }
} }
Additional Resources for Understanding Events Additional Resources for Understanding Events
============================================== ==============================================

@ -223,14 +223,14 @@ following an internal naming schema and arguments of types not supported in the
The following identifiers are used for the types in the signatures: The following identifiers are used for the types in the signatures:
- Value types, non-storage ``string`` and non-storage ``bytes`` use the same identifiers as in the contract ABI. - Value types, non-storage ``string`` and non-storage ``bytes`` use the same identifiers as in the contract ABI.
- Non-storage array types follow the same convention as in the contract ABI, i.e. ``<type>[]`` for dynamic arrays and - Non-storage array types follow the same convention as in the contract ABI, i.e. ``<type>[]`` for dynamic arrays and
``<type>[M]`` for fixed-size arrays of ``M`` elements. ``<type>[M]`` for fixed-size arrays of ``M`` elements.
- Non-storage structs are referred to by their fully qualified name, i.e. ``C.S`` for ``contract C { struct S { ... } }``. - Non-storage structs are referred to by their fully qualified name, i.e. ``C.S`` for ``contract C { struct S { ... } }``.
- Storage pointer mappings use ``mapping(<keyType> => <valueType>) storage`` where ``<keyType>`` and ``<valueType>`` are - Storage pointer mappings use ``mapping(<keyType> => <valueType>) storage`` where ``<keyType>`` and ``<valueType>`` are
the identifiers for the key and value types of the mapping, respectively. the identifiers for the key and value types of the mapping, respectively.
- Other storage pointer types use the type identifier of their corresponding non-storage type, but append a single space - Other storage pointer types use the type identifier of their corresponding non-storage type, but append a single space
followed by ``storage`` to it. followed by ``storage`` to it.
The argument encoding is the same as for the regular contract ABI, except for storage pointers, which are encoded as a The argument encoding is the same as for the regular contract ABI, except for storage pointers, which are encoded as a
``uint256`` value referring to the storage slot to which they point. ``uint256`` value referring to the storage slot to which they point.

@ -89,21 +89,21 @@ the sequence:
.. code-block:: none .. code-block:: none
PUSH 32 PUSH 32
PUSH 0 PUSH 0
CALLDATALOAD CALLDATALOAD
PUSH 100 PUSH 100
DUP2 DUP2
MSTORE MSTORE
KECCAK256 KECCAK256
or the equivalent Yul or the equivalent Yul
.. code-block:: yul .. code-block:: yul
let x := calldataload(0) let x := calldataload(0)
mstore(x, 100) mstore(x, 100)
let value := keccak256(x, 32) let value := keccak256(x, 32)
In this case, the optimizer tracks the value at a memory location ``calldataload(0)`` and then In this case, the optimizer tracks the value at a memory location ``calldataload(0)`` and then
realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
@ -116,14 +116,14 @@ For example,
.. code-block:: yul .. code-block:: yul
let x := calldataload(0) let x := calldataload(0)
mstore(x, 100) mstore(x, 100)
// Current knowledge memory location x -> 100 // Current knowledge memory location x -> 100
let y := add(x, 32) let y := add(x, 32)
// Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32) // Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32)
mstore(y, 200) mstore(y, 200)
// This Keccak-256 can now be evaluated // This Keccak-256 can now be evaluated
let value := keccak256(x, 32) let value := keccak256(x, 32)
Therefore, modifications to storage and memory locations, of say location ``l``, must erase Therefore, modifications to storage and memory locations, of say location ``l``, must erase
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
@ -239,8 +239,8 @@ for all references to ``tag_f`` leaving it unused, s.t. it can be removed, yield
.. code-block:: text .. code-block:: text
...body of function f... ...body of function f...
...opcodes after call to f... ...opcodes after call to f...
So the call to function ``f`` is inlined and the original definition of ``f`` can be removed. So the call to function ``f`` is inlined and the original definition of ``f`` can be removed.
@ -269,11 +269,11 @@ backtracking.
All components of the Yul-based optimizer module are explained below. All components of the Yul-based optimizer module are explained below.
The following transformation steps are the main components: The following transformation steps are the main components:
- SSA Transform - SSA Transform
- Common Subexpression Eliminator - Common Subexpression Eliminator
- Expression Simplifier - Expression Simplifier
- Redundant Assign Eliminator - Redundant Assign Eliminator
- Full Function Inliner - Full Function Inliner
Optimizer Steps Optimizer Steps
--------------- ---------------
@ -281,36 +281,36 @@ Optimizer Steps
This is a list of all steps of the Yul-based optimizer, sorted alphabetically. You can find more information This is a list of all steps of the Yul-based optimizer, sorted alphabetically. You can find more information
on the individual steps and their sequence below. on the individual steps and their sequence below.
- :ref:`block-flattener`. - :ref:`block-flattener`.
- :ref:`circular-reference-pruner`. - :ref:`circular-reference-pruner`.
- :ref:`common-subexpression-eliminator`. - :ref:`common-subexpression-eliminator`.
- :ref:`conditional-simplifier`. - :ref:`conditional-simplifier`.
- :ref:`conditional-unsimplifier`. - :ref:`conditional-unsimplifier`.
- :ref:`control-flow-simplifier`. - :ref:`control-flow-simplifier`.
- :ref:`dead-code-eliminator`. - :ref:`dead-code-eliminator`.
- :ref:`equivalent-function-combiner`. - :ref:`equivalent-function-combiner`.
- :ref:`expression-joiner`. - :ref:`expression-joiner`.
- :ref:`expression-simplifier`. - :ref:`expression-simplifier`.
- :ref:`expression-splitter`. - :ref:`expression-splitter`.
- :ref:`for-loop-condition-into-body`. - :ref:`for-loop-condition-into-body`.
- :ref:`for-loop-condition-out-of-body`. - :ref:`for-loop-condition-out-of-body`.
- :ref:`for-loop-init-rewriter`. - :ref:`for-loop-init-rewriter`.
- :ref:`functional-inliner`. - :ref:`functional-inliner`.
- :ref:`function-grouper`. - :ref:`function-grouper`.
- :ref:`function-hoister`. - :ref:`function-hoister`.
- :ref:`function-specializer`. - :ref:`function-specializer`.
- :ref:`literal-rematerialiser`. - :ref:`literal-rematerialiser`.
- :ref:`load-resolver`. - :ref:`load-resolver`.
- :ref:`loop-invariant-code-motion`. - :ref:`loop-invariant-code-motion`.
- :ref:`redundant-assign-eliminator`. - :ref:`redundant-assign-eliminator`.
- :ref:`reasoning-based-simplifier`. - :ref:`reasoning-based-simplifier`.
- :ref:`rematerialiser`. - :ref:`rematerialiser`.
- :ref:`SSA-reverser`. - :ref:`SSA-reverser`.
- :ref:`SSA-transform`. - :ref:`SSA-transform`.
- :ref:`structural-simplifier`. - :ref:`structural-simplifier`.
- :ref:`unused-function-parameter-pruner`. - :ref:`unused-function-parameter-pruner`.
- :ref:`unused-pruner`. - :ref:`unused-pruner`.
- :ref:`var-decl-initializer`. - :ref:`var-decl-initializer`.
Selecting Optimizations Selecting Optimizations
----------------------- -----------------------
@ -375,7 +375,7 @@ After this step, a program has the following normal form:
.. code-block:: text .. code-block:: text
{ I F... } { I F... }
Where ``I`` is a (potentially empty) block that does not contain any function definitions (not even recursively) Where ``I`` is a (potentially empty) block that does not contain any function definitions (not even recursively)
and ``F`` is a list of function definitions such that no function contains a function definition. and ``F`` is a list of function definitions such that no function contains a function definition.
@ -589,8 +589,8 @@ For any variable ``a`` that is assigned to somewhere in the code
(variables that are declared with value and never re-assigned (variables that are declared with value and never re-assigned
are not modified) perform the following transforms: are not modified) perform the following transforms:
- replace ``let a := v`` by ``let a_i := v let a := a_i`` - replace ``let a := v`` by ``let a_i := v let a := a_i``
- replace ``a := v`` by ``let a_i := v a := a_i`` where ``i`` is a number such that ``a_i`` is yet unused. - replace ``a := v`` by ``let a_i := v a := a_i`` where ``i`` is a number such that ``a_i`` is yet unused.
Furthermore, always record the current value of ``i`` used for ``a`` and replace each Furthermore, always record the current value of ``i`` used for ``a`` and replace each
reference to ``a`` by ``a_i``. reference to ``a`` by ``a_i``.
@ -677,9 +677,9 @@ joins, the two mappings coming from the two branches are combined in the followi
Statements that are only in one mapping or have the same state are used unchanged. Statements that are only in one mapping or have the same state are used unchanged.
Conflicting values are resolved in the following way: Conflicting values are resolved in the following way:
- "unused", "undecided" -> "undecided" - "unused", "undecided" -> "undecided"
- "unused", "used" -> "used" - "unused", "used" -> "used"
- "undecided", "used" -> "used" - "undecided", "used" -> "used"
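The conflict rules above amount to keeping the "stronger" of the two states, with ``unused < undecided < used``. A minimal Python sketch of that join, assuming each branch mapping is a dictionary from a statement identifier to one of the three state strings; ``RANK`` and ``join_states`` are illustrative names and not part of the compiler:

```python
# Hypothetical model of the join step; not the compiler's actual data structures.
RANK = {"unused": 0, "undecided": 1, "used": 2}


def join_states(left, right):
    """Combine two branch mappings (statement -> state) at a control-flow join."""
    joined = dict(left)
    for stmt, state in right.items():
        if stmt in joined:
            # Conflicting values resolve to the higher-ranked state;
            # identical states are left unchanged.
            joined[stmt] = max(joined[stmt], state, key=RANK.__getitem__)
        else:
            # Statements that appear in only one mapping are used unchanged.
            joined[stmt] = state
    return joined
```

For instance, joining ``{"s1": "unused"}`` with ``{"s1": "used", "s2": "undecided"}`` keeps ``s2`` as it is and resolves ``s1`` to ``"used"``.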
For for-loops, the condition, body and post-part are visited twice, taking For for-loops, the condition, body and post-part are visited twice, taking
the joining control-flow at the condition into account. the joining control-flow at the condition into account.
@ -735,10 +735,10 @@ is side-effect free and its evaluation only depends on the values of variables
and the call-constant state of the environment. Most expressions are movable. and the call-constant state of the environment. Most expressions are movable.
The following parts make an expression non-movable: The following parts make an expression non-movable:
- function calls (might be relaxed in the future if all statements in the function are movable) - function calls (might be relaxed in the future if all statements in the function are movable)
- opcodes that (can) have side-effects (like ``call`` or ``selfdestruct``) - opcodes that (can) have side-effects (like ``call`` or ``selfdestruct``)
- opcodes that read or write memory, storage or external state information - opcodes that read or write memory, storage or external state information
- opcodes that depend on the current PC, memory size or returndata size - opcodes that depend on the current PC, memory size or returndata size
DataflowAnalyzer DataflowAnalyzer
^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
@ -836,8 +836,8 @@ ReasoningBasedSimplifier
This optimizer uses SMT solvers to check whether ``if`` conditions are constant. This optimizer uses SMT solvers to check whether ``if`` conditions are constant.
- If ``constraints AND condition`` is UNSAT, the condition is never true and the whole body can be removed. - If ``constraints AND condition`` is UNSAT, the condition is never true and the whole body can be removed.
- If ``constraints AND NOT condition`` is UNSAT, the condition is always true and can be replaced by ``1``. - If ``constraints AND NOT condition`` is UNSAT, the condition is always true and can be replaced by ``1``.
The simplifications above can only be applied if the condition is movable. The simplifications above can only be applied if the condition is movable.
@ -872,13 +872,13 @@ we cannot assign a specific value.
Current features: Current features:
- switch cases: insert "<condition> := <caseLabel>" - switch cases: insert "<condition> := <caseLabel>"
- after if statement with terminating control-flow, insert "<condition> := 0" - after if statement with terminating control-flow, insert "<condition> := 0"
Future features: Future features:
- allow replacements by "1" - allow replacements by "1"
- take termination of user-defined functions into account - take termination of user-defined functions into account
Works best with SSA form and if dead code removal has run before. Works best with SSA form and if dead code removal has run before.
@ -898,15 +898,15 @@ ControlFlowSimplifier
Simplifies several control-flow structures: Simplifies several control-flow structures:
- replace if with empty body with pop(condition) - replace if with empty body with pop(condition)
- remove empty default switch case - remove empty default switch case
- remove empty switch case if no default case exists - remove empty switch case if no default case exists
- replace switch with no cases with pop(expression) - replace switch with no cases with pop(expression)
- turn switch with single case into if - turn switch with single case into if
- replace switch with only default case with pop(expression) and body - replace switch with only default case with pop(expression) and body
- replace switch with const expr with matching case body - replace switch with const expr with matching case body
- replace ``for`` with terminating control flow and without other break/continue by ``if`` - replace ``for`` with terminating control flow and without other break/continue by ``if``
- remove ``leave`` at the end of a function. - remove ``leave`` at the end of a function.
None of these operations depend on the data flow. The StructuralSimplifier None of these operations depend on the data flow. The StructuralSimplifier
performs similar tasks that do depend on data flow. performs similar tasks that do depend on data flow.
@ -956,13 +956,13 @@ StructuralSimplifier
This is a general step that performs various kinds of simplifications on This is a general step that performs various kinds of simplifications on
a structural level: a structural level:
- replace if statement with empty body by ``pop(condition)`` - replace if statement with empty body by ``pop(condition)``
- replace if statement with true condition by its body - replace if statement with true condition by its body
- remove if statement with false condition - remove if statement with false condition
- turn switch with single case into if - turn switch with single case into if
- replace switch with only default case by ``pop(expression)`` and body - replace switch with only default case by ``pop(expression)`` and body
- replace switch with literal expression by matching case body - replace switch with literal expression by matching case body
- replace for loop with false condition by its initialization part - replace for loop with false condition by its initialization part
This component uses the Dataflow Analyzer. This component uses the Dataflow Analyzer.
@ -1008,8 +1008,8 @@ declarations inside conditional branches will not be moved out of the loop.
Requirements: Requirements:
- The Disambiguator, ForLoopInitRewriter and FunctionHoister must be run upfront. - The Disambiguator, ForLoopInitRewriter and FunctionHoister must be run upfront.
- Expression splitter and SSA transform should be run upfront to obtain better results. - Expression splitter and SSA transform should be run upfront to obtain better results.
Function-Level Optimizations Function-Level Optimizations
@ -1053,8 +1053,8 @@ remove the parameter and create a new "linking" function as follows:
.. code-block:: yul .. code-block:: yul
function f(a,b) -> x { x := div(a,b) } function f(a,b) -> x { x := div(a,b) }
function f2(a,b,c) -> x, y { x := f(a,b) } function f2(a,b,c) -> x, y { x := f(a,b) }
and replace all references to ``f`` by ``f2``. and replace all references to ``f`` by ``f2``.
The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by
@ -1089,15 +1089,15 @@ FunctionalInliner
This component of the optimizer performs restricted function inlining by inlining functions that can be This component of the optimizer performs restricted function inlining by inlining functions that can be
inlined inside functional expressions, i.e. functions that: inlined inside functional expressions, i.e. functions that:
- return a single value. - return a single value.
- have a body like ``r := <functional expression>``. - have a body like ``r := <functional expression>``.
- neither reference themselves nor ``r`` in the right hand side. - neither reference themselves nor ``r`` in the right hand side.
Furthermore, for all parameters, all of the following need to be true: Furthermore, for all parameters, all of the following need to be true:
- The argument is movable. - The argument is movable.
- The parameter is either referenced less than twice in the function body, or the argument is rather cheap - The parameter is either referenced less than twice in the function body, or the argument is rather cheap
("cost" of at most 1, like a constant up to 0xff). ("cost" of at most 1, like a constant up to 0xff).
Example: The function to be inlined has the form of ``function f(...) -> r { r := E }`` where Example: The function to be inlined has the form of ``function f(...) -> r { r := E }`` where
``E`` is an expression that does not reference ``r`` and all arguments in the function call are movable expressions. ``E`` is an expression that does not reference ``r`` and all arguments in the function call are movable expressions.


@ -56,8 +56,8 @@ used in a single modifier.
In order to compress these source mappings especially for bytecode, the In order to compress these source mappings especially for bytecode, the
following rules are used: following rules are used:
- If a field is empty, the value of the preceding element is used. - If a field is empty, the value of the preceding element is used.
- If a ``:`` is missing, all following fields are considered empty. - If a ``:`` is missing, all following fields are considered empty.
This means the following source mappings represent the same information: This means the following source mappings represent the same information:


@ -1,6 +1,6 @@
******************************** *********************************
Solidity IR-based Codegen Changes Solidity IR-based Codegen Changes
******************************** *********************************
This section highlights the main differences between the old and the IR-based codegen, This section highlights the main differences between the old and the IR-based codegen,
along with the reasoning behind the changes and how to update affected code. along with the reasoning behind the changes and how to update affected code.
@ -11,180 +11,187 @@ Semantic Only Changes
This section lists the changes that are semantic-only, thus potentially This section lists the changes that are semantic-only, thus potentially
hiding new and different behavior in existing code. hiding new and different behavior in existing code.
* When storage structs are deleted, every storage slot that contains a member of the struct is set to zero entirely. Formerly, padding space was left untouched. - When storage structs are deleted, every storage slot that contains a member of the struct is set to zero entirely. Formerly, padding space was left untouched.
Consequently, if the padding space within a struct is used to store data (e.g. in the context of a contract upgrade), you have to be aware that ``delete`` will now also clear the added member (while it wouldn't have been cleared in the past). Consequently, if the padding space within a struct is used to store data (e.g. in the context of a contract upgrade), you have to be aware that ``delete`` will now also clear the added member (while it wouldn't have been cleared in the past).
.. code-block:: solidity .. code-block:: solidity
// SPDX-License-Identifier: GPL-3.0 // SPDX-License-Identifier: GPL-3.0
pragma solidity >0.7.0; pragma solidity >0.7.0;
contract C { contract C {
struct S { struct S {
uint64 y; uint64 y;
uint64 z; uint64 z;
} }
S s; S s;
function f() public { function f() public {
// ... // ...
delete s; delete s;
// s occupies only first 16 bytes of the 32 bytes slot // s occupies only first 16 bytes of the 32 bytes slot
// delete will write zero to the full slot // delete will write zero to the full slot
} }
} }
We have the same behavior for implicit delete, for example when an array of structs is shortened. We have the same behavior for implicit delete, for example when an array of structs is shortened.
* Function modifiers are implemented in a slightly different way regarding function parameters. - Function modifiers are implemented in a slightly different way regarding function parameters.
This especially has an effect if the placeholder ``_;`` is evaluated multiple times in a modifier. This especially has an effect if the placeholder ``_;`` is evaluated multiple times in a modifier.
In the old code generator, each function parameter has a fixed slot on the stack. If the function In the old code generator, each function parameter has a fixed slot on the stack. If the function
is run multiple times because ``_;`` is used multiple times or used in a loop, then a change to the is run multiple times because ``_;`` is used multiple times or used in a loop, then a change to the
function parameter's value is visible in the next execution of the function. function parameter's value is visible in the next execution of the function.
The new code generator implements modifiers using actual functions and passes function parameters on. The new code generator implements modifiers using actual functions and passes function parameters on.
This means that multiple executions of a function will get the same values for the parameters. This means that multiple executions of a function will get the same values for the parameters.
.. code-block:: solidity .. code-block:: solidity
// SPDX-License-Identifier: GPL-3.0 // SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.0; pragma solidity >=0.7.0;
contract C { contract C {
function f(uint _a) public pure mod() returns (uint _r) { function f(uint _a) public pure mod() returns (uint _r) {
_r = _a++; _r = _a++;
} }
modifier mod() { _; _; } modifier mod() { _; _; }
} }
If you execute ``f(0)`` in the old code generator, it will return ``1``, while If you execute ``f(0)`` in the old code generator, it will return ``1``, while
it will return ``0`` when using the new code generator. it will return ``0`` when using the new code generator.
* The order of contract initialization has changed in case of inheritance. - The order of contract initialization has changed in case of inheritance.
The order used to be: The order used to be:
- All state variables are zero-initialized at the beginning.
- Evaluate base constructor arguments from most derived to most base contract.
- Initialize all state variables in the whole inheritance hierarchy from most base to most derived.
- Run the constructor, if present, for all contracts in the linearized hierarchy from most base to most derived.
New order: - All state variables are zero-initialized at the beginning.
- All state variables are zero-initialized at the beginning. - Evaluate base constructor arguments from most derived to most base contract.
- Evaluate base constructor arguments from most derived to most base contract. - Initialize all state variables in the whole inheritance hierarchy from most base to most derived.
- For every contract in order from most base to most derived in the linearized hierarchy execute: - Run the constructor, if present, for all contracts in the linearized hierarchy from most base to most derived.
1. If present at declaration, initial values are assigned to state variables.
2. Constructor, if present. New order:
- All state variables are zero-initialized at the beginning.
- Evaluate base constructor arguments from most derived to most base contract.
- For every contract in order from most base to most derived in the linearized hierarchy execute:
1. If present at declaration, initial values are assigned to state variables.
2. Constructor, if present.
This causes differences in some contracts, for example: This causes differences in some contracts, for example:
.. code-block:: solidity .. code-block:: solidity
// SPDX-License-Identifier: GPL-3.0 // SPDX-License-Identifier: GPL-3.0
pragma solidity >0.7.0; pragma solidity >0.7.0;
contract A { contract A {
uint x; uint x;
constructor() { constructor() {
x = 42; x = 42;
} }
function f() public view returns(uint256) { function f() public view returns(uint256) {
return x; return x;
} }
} }
contract B is A { contract B is A {
uint public y = f(); uint public y = f();
} }
Previously, ``y`` would be set to 0. This is due to the fact that we would first initialize state variables: First, ``x`` is set to 0, and when initializing ``y``, ``f()`` would return 0 causing ``y`` to be 0 as well. Previously, ``y`` would be set to 0. This is due to the fact that we would first initialize state variables: First, ``x`` is set to 0, and when initializing ``y``, ``f()`` would return 0 causing ``y`` to be 0 as well.
With the new rules, ``y`` will be set to 42. We first initialize ``x`` to 0, then call A's constructor which sets ``x`` to 42. Finally, when initializing ``y``, ``f()`` returns 42 causing ``y`` to be 42. With the new rules, ``y`` will be set to 42. We first initialize ``x`` to 0, then call A's constructor which sets ``x`` to 42. Finally, when initializing ``y``, ``f()`` returns 42 causing ``y`` to be 42.
* Copying ``bytes`` arrays from memory to storage is implemented in a different way. The old code generator always copies full words, while the new one cuts the byte array after its end. The old behaviour can lead to dirty data being copied after the end of the array (but still in the same storage slot). - Copying ``bytes`` arrays from memory to storage is implemented in a different way. The old code generator always copies full words, while the new one cuts the byte array after its end. The old behaviour can lead to dirty data being copied after the end of the array (but still in the same storage slot).
This causes differences in some contracts, for example: This causes differences in some contracts, for example:
.. code-block:: solidity .. code-block:: solidity
// SPDX-License-Identifier: GPL-3.0 // SPDX-License-Identifier: GPL-3.0
pragma solidity >0.8.0; pragma solidity >0.8.0;
contract C { contract C {
bytes x; bytes x;
function f() public returns (uint _r) { function f() public returns (uint _r) {
bytes memory m = "tmp"; bytes memory m = "tmp";
assembly { assembly {
mstore(m, 8) mstore(m, 8)
mstore(add(m, 32), "deadbeef15dead") mstore(add(m, 32), "deadbeef15dead")
} }
x = m; x = m;
assembly { assembly {
_r := sload(x.slot) _r := sload(x.slot)
} }
} }
} }
Previously ``f()`` would return ``0x6465616462656566313564656164000000000000000000000000000000000010`` (it has correct length, and correct first 8 elements, but then it contains dirty data which was set via assembly). Previously ``f()`` would return ``0x6465616462656566313564656164000000000000000000000000000000000010`` (it has correct length, and correct first 8 elements, but then it contains dirty data which was set via assembly).
Now it is returning ``0x6465616462656566000000000000000000000000000000000000000000000010`` (it has correct length, and correct elements, but does not contain superfluous data). Now it is returning ``0x6465616462656566000000000000000000000000000000000000000000000010`` (it has correct length, and correct elements, but does not contain superfluous data).
.. index:: ! evaluation order; expression .. index:: ! evaluation order; expression
* For the old code generator, the evaluation order of expressions is unspecified. - For the old code generator, the evaluation order of expressions is unspecified.
For the new code generator, we try to evaluate in source order (left to right), but do not guarantee it. For the new code generator, we try to evaluate in source order (left to right), but do not guarantee it.
This can lead to semantic differences. This can lead to semantic differences.
For example: For example:
.. code-block:: solidity .. code-block:: solidity
// SPDX-License-Identifier: GPL-3.0 // SPDX-License-Identifier: GPL-3.0
pragma solidity >0.8.0; pragma solidity >0.8.0;
contract C { contract C {
function preincr_u8(uint8 _a) public pure returns (uint8) { function preincr_u8(uint8 _a) public pure returns (uint8) {
return ++_a + _a; return ++_a + _a;
} }
} }
The function ``preincr_u8(1)`` returns the following values: The function ``preincr_u8(1)`` returns the following values:
- Old code generator: 3 (``1 + 2``) but the return value is unspecified in general
- New code generator: 4 (``2 + 2``) but the return value is not guaranteed
.. index:: ! evaluation order; function arguments - Old code generator: 3 (``1 + 2``) but the return value is unspecified in general
- New code generator: 4 (``2 + 2``) but the return value is not guaranteed
On the other hand, function argument expressions are evaluated in the same order by both code generators with the exception of the global functions ``addmod`` and ``mulmod``. .. index:: ! evaluation order; function arguments
For example:
.. code-block:: solidity On the other hand, function argument expressions are evaluated in the same order by both code generators with the exception of the global functions ``addmod`` and ``mulmod``.
For example:
// SPDX-License-Identifier: GPL-3.0 .. code-block:: solidity
pragma solidity >0.8.0;
contract C {
function add(uint8 _a, uint8 _b) public pure returns (uint8) {
return _a + _b;
}
function g(uint8 _a, uint8 _b) public pure returns (uint8) {
return add(++_a + ++_b, _a + _b);
}
}
The function ``g(1, 2)`` returns the following values: // SPDX-License-Identifier: GPL-3.0
- Old code generator: ``10`` (``add(2 + 3, 2 + 3)``) but the return value is unspecified in general pragma solidity >0.8.0;
- New code generator: ``10`` but the return value is not guaranteed contract C {
function add(uint8 _a, uint8 _b) public pure returns (uint8) {
return _a + _b;
}
function g(uint8 _a, uint8 _b) public pure returns (uint8) {
return add(++_a + ++_b, _a + _b);
}
}
The arguments to the global functions ``addmod`` and ``mulmod`` are evaluated right-to-left by the old code generator The function ``g(1, 2)`` returns the following values:
and left-to-right by the new code generator.
For example:
:: - Old code generator: ``10`` (``add(2 + 3, 2 + 3)``) but the return value is unspecified in general
// SPDX-License-Identifier: GPL-3.0 - New code generator: ``10`` but the return value is not guaranteed
pragma solidity >0.8.0;
contract C {
function f() public pure returns (uint256 aMod, uint256 mMod) {
uint256 x = 3;
// Old code gen: add/mulmod(5, 4, 3)
// New code gen: add/mulmod(4, 5, 5)
aMod = addmod(++x, ++x, x);
mMod = mulmod(++x, ++x, x);
}
}
The function ``f()`` returns the following values: The arguments to the global functions ``addmod`` and ``mulmod`` are evaluated right-to-left by the old code generator
- Old code generator: ``aMod = 0`` and ``mMod = 2`` and left-to-right by the new code generator.
- New code generator: ``aMod = 4`` and ``mMod = 0`` For example:
::
// SPDX-License-Identifier: GPL-3.0
pragma solidity >0.8.0;
contract C {
function f() public pure returns (uint256 aMod, uint256 mMod) {
uint256 x = 3;
// Old code gen: add/mulmod(5, 4, 3)
// New code gen: add/mulmod(4, 5, 5)
aMod = addmod(++x, ++x, x);
mMod = mulmod(++x, ++x, x);
}
}
The function ``f()`` returns the following values:
- Old code generator: ``aMod = 0`` and ``mMod = 2``
- New code generator: ``aMod = 4`` and ``mMod = 0``
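The ``addmod``/``mulmod`` results above can be reproduced by simulating the two argument evaluation orders. The following Python sketch is only a model of the described behaviour, not compiler code; ``Counter``, ``eval_args`` and ``f`` are hypothetical helpers, and Python integers stand in for ``uint256`` (no wrap-around occurs for these small values):

```python
class Counter:
    """Stands in for the local variable ``x``."""

    def __init__(self, value):
        self.value = value

    def pre_inc(self):
        # Models ``++x``: increment first, then yield the new value.
        self.value += 1
        return self.value

    def read(self):
        # Models a plain read of ``x``.
        return self.value


def eval_args(thunks, left_to_right):
    """Evaluate argument expressions in the given order, keeping their positions."""
    indices = range(len(thunks))
    order = indices if left_to_right else reversed(indices)
    values = [None] * len(thunks)
    for i in order:
        values[i] = thunks[i]()
    return values


def f(left_to_right):
    x = Counter(3)
    args = [x.pre_inc, x.pre_inc, x.read]  # (++x, ++x, x)
    a, b, m = eval_args(args, left_to_right)
    a_mod = (a + b) % m                    # addmod(a, b, m)
    a, b, m = eval_args(args, left_to_right)
    m_mod = (a * b) % m                    # mulmod(a, b, m)
    return a_mod, m_mod
```

Under these assumptions, ``f(left_to_right=False)`` models the old code generator and ``f(left_to_right=True)`` the new one.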
Internals Internals
@ -234,6 +241,7 @@ For example:
} }
The function ``f(1)`` returns the following values: The function ``f(1)`` returns the following values:
- Old code generator: (``fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffe``, ``00000000000000000000000000000000000000000000000000000000000000fe``) - Old code generator: (``fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffe``, ``00000000000000000000000000000000000000000000000000000000000000fe``)
- New code generator: (``00000000000000000000000000000000000000000000000000000000000000fe``, ``00000000000000000000000000000000000000000000000000000000000000fe``) - New code generator: (``00000000000000000000000000000000000000000000000000000000000000fe``, ``00000000000000000000000000000000000000000000000000000000000000fe``)


@ -182,7 +182,7 @@ At a global level, you can use import statements of the following form:
:: ::
import "filename"; import "filename";
The ``filename`` part is called an *import path*. The ``filename`` part is called an *import path*.
This statement imports all global symbols from "filename" (and symbols imported there) into the This statement imports all global symbols from "filename" (and symbols imported there) into the
@ -197,7 +197,7 @@ the global symbols from ``"filename"``:
:: ::
import * as symbolName from "filename"; import * as symbolName from "filename";
which results in all global symbols being available in the format ``symbolName.symbol``. which results in all global symbols being available in the format ``symbolName.symbol``.
@ -215,7 +215,7 @@ the code below creates new global symbols ``alias`` and ``symbol2`` which refere
:: ::
import {symbol1 as alias, symbol2} from "filename"; import {symbol1 as alias, symbol2} from "filename";
.. index:: virtual filesystem, source unit name, import; path, filesystem path, import callback, Remix IDE .. index:: virtual filesystem, source unit name, import; path, filesystem path, import callback, Remix IDE
@ -255,12 +255,12 @@ Single-line comments (``//``) and multi-line comments (``/*...*/``) are possible
:: ::
// This is a single-line comment. // This is a single-line comment.
/* /*
This is a This is a
multi-line comment. multi-line comment.
*/ */
.. note:: .. note::
A single-line comment is terminated by any unicode line terminator A single-line comment is terminated by any unicode line terminator


@ -166,9 +166,9 @@ Inheritance Notes
Functions without NatSpec will automatically inherit the documentation of their Functions without NatSpec will automatically inherit the documentation of their
base function. Exceptions to this are: base function. Exceptions to this are:
* When the parameter names are different. * When the parameter names are different.
* When there is more than one base function. * When there is more than one base function.
* When there is an explicit ``@inheritdoc`` tag which specifies which contract should be used to inherit. * When there is an explicit ``@inheritdoc`` tag which specifies which contract should be used to inherit.
.. _header-output: .. _header-output:


@ -44,23 +44,24 @@ where the default is no engine. Selecting the engine enables the SMTChecker on a
.. note:: .. note::
Prior to Solidity 0.8.4, the default way to enable the SMTChecker was via Prior to Solidity 0.8.4, the default way to enable the SMTChecker was via
``pragma experimental SMTChecker;`` and only the contracts containing the ``pragma experimental SMTChecker;`` and only the contracts containing the
pragma would be analyzed. That pragma has been deprecated, and although it pragma would be analyzed. That pragma has been deprecated, and although it
still enables the SMTChecker for backwards compatibility, it will be removed still enables the SMTChecker for backwards compatibility, it will be removed
in Solidity 0.9.0. Note also that now using the pragma even in a single file in Solidity 0.9.0. Note also that now using the pragma even in a single file
enables the SMTChecker for all files. enables the SMTChecker for all files.
.. note:: .. note::
The lack of warnings for a verification target represents an undisputed
mathematical proof of correctness, assuming no bugs in the SMTChecker and The lack of warnings for a verification target represents an undisputed
the underlying solver. Keep in mind that these problems are mathematical proof of correctness, assuming no bugs in the SMTChecker and
*very hard* and sometimes *impossible* to solve automatically in the the underlying solver. Keep in mind that these problems are
general case. Therefore, several properties might not be solved or might *very hard* and sometimes *impossible* to solve automatically in the
lead to false positives for large contracts. Every proven property should general case. Therefore, several properties might not be solved or might
be seen as an important achievement. For advanced users, see :ref:`SMTChecker Tuning <smtchecker_options>` lead to false positives for large contracts. Every proven property should
to learn a few options that might help proving more complex be seen as an important achievement. For advanced users, see :ref:`SMTChecker Tuning <smtchecker_options>`
properties. to learn a few options that might help proving more complex
properties.
******** ********
Tutorial Tutorial
@ -202,8 +203,9 @@ Note that in this example the SMTChecker will automatically try to prove three p
3. The assertion is always true. 3. The assertion is always true.
.. note:: .. note::
The properties involve loops, which makes it *much much* harder than the previous
examples, so beware of loops! The properties involve loops, which makes it *much much* harder than the previous
examples, so beware of loops!
All the properties are correctly proven safe. Feel free to change the All the properties are correctly proven safe. Feel free to change the
properties and/or add restrictions on the array to see different results. properties and/or add restrictions on the array to see different results.
@ -233,18 +235,18 @@ gives us:
.. code-block:: bash .. code-block:: bash
Warning: CHC: Assertion violation happens here. Warning: CHC: Assertion violation happens here.
Counterexample: Counterexample:
_a = [0, 0, 0, 0, 0] _a = [0, 0, 0, 0, 0]
= 0 = 0
Transaction trace: Transaction trace:
Test.constructor() Test.constructor()
Test.max([0, 0, 0, 0, 0]) Test.max([0, 0, 0, 0, 0])
--> max.sol:14:4: --> max.sol:14:4:
| |
14 | assert(m > _a[i]); 14 | assert(m > _a[i]);
State Properties State Properties
@ -323,26 +325,26 @@ the SMTChecker tells us exactly *how* to reach (2, 4):
.. code-block:: bash .. code-block:: bash
Warning: CHC: Assertion violation happens here. Warning: CHC: Assertion violation happens here.
Counterexample: Counterexample:
x = 2, y = 4 x = 2, y = 4
Transaction trace: Transaction trace:
Robot.constructor() Robot.constructor()
State: x = 0, y = 0 State: x = 0, y = 0
Robot.moveLeftUp() Robot.moveLeftUp()
State: x = (- 1), y = 1 State: x = (- 1), y = 1
Robot.moveRightUp() Robot.moveRightUp()
State: x = 0, y = 2 State: x = 0, y = 2
Robot.moveRightUp() Robot.moveRightUp()
State: x = 1, y = 3 State: x = 1, y = 3
Robot.moveRightUp() Robot.moveRightUp()
State: x = 2, y = 4 State: x = 2, y = 4
Robot.reach_2_4() Robot.reach_2_4()
--> r.sol:35:4: --> r.sol:35:4:
| |
35 | assert(!(x == 2 && y == 4)); 35 | assert(!(x == 2 && y == 4));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that the path above is not necessarily deterministic, as there are Note that the path above is not necessarily deterministic, as there are
other paths that could reach (2, 4). The choice of which path is shown other paths that could reach (2, 4). The choice of which path is shown
@ -367,36 +369,36 @@ anything, including reenter the caller contract.
pragma solidity >=0.8.0; pragma solidity >=0.8.0;
interface Unknown { interface Unknown {
function run() external; function run() external;
} }
contract Mutex { contract Mutex {
uint x; uint x;
bool lock; bool lock;
Unknown immutable unknown; Unknown immutable unknown;
constructor(Unknown _u) { constructor(Unknown _u) {
require(address(_u) != address(0)); require(address(_u) != address(0));
unknown = _u; unknown = _u;
} }
modifier mutex { modifier mutex {
require(!lock); require(!lock);
lock = true; lock = true;
_; _;
lock = false; lock = false;
} }
function set(uint _x) mutex public { function set(uint _x) mutex public {
x = _x; x = _x;
} }
function run() mutex public { function run() mutex public {
uint xPre = x; uint xPre = x;
unknown.run(); unknown.run();
assert(xPre == x); assert(xPre == x);
} }
} }
The example above shows a contract that uses a mutex flag to forbid reentrancy. The example above shows a contract that uses a mutex flag to forbid reentrancy.
@ -410,20 +412,20 @@ that the assertion fails:
.. code-block:: bash .. code-block:: bash
Warning: CHC: Assertion violation happens here. Warning: CHC: Assertion violation happens here.
Counterexample: Counterexample:
x = 1, lock = true, unknown = 1 x = 1, lock = true, unknown = 1
Transaction trace: Transaction trace:
Mutex.constructor(1) Mutex.constructor(1)
State: x = 0, lock = false, unknown = 1 State: x = 0, lock = false, unknown = 1
Mutex.run() Mutex.run()
unknown.run() -- untrusted external call, synthesized as: unknown.run() -- untrusted external call, synthesized as:
Mutex.set(1) -- reentrant call Mutex.set(1) -- reentrant call
--> m.sol:32:3: --> m.sol:32:3:
| |
32 | assert(xPre == x); 32 | assert(xPre == x);
| ^^^^^^^^^^^^^^^^^ | ^^^^^^^^^^^^^^^^^
.. _smtchecker_options: .. _smtchecker_options:
@ -494,12 +496,11 @@ which has the following form:
.. code-block:: none .. code-block:: none
contracts contracts
{ {
"source1.sol": ["contract1"], "source1.sol": ["contract1"],
"source2.sol": ["contract2", "contract3"] "source2.sol": ["contract2", "contract3"]
} }
.. _smtchecker_engines: .. _smtchecker_engines:


@ -128,10 +128,10 @@ The modulo operation ``a % n`` yields the remainder ``r`` after the division of
by the operand ``n``, where ``q = int(a / n)`` and ``r = a - (n * q)``. This means that modulo by the operand ``n``, where ``q = int(a / n)`` and ``r = a - (n * q)``. This means that modulo
results in the same sign as its left operand (or zero) and ``a % n == -(-a % n)`` holds for negative ``a``: results in the same sign as its left operand (or zero) and ``a % n == -(-a % n)`` holds for negative ``a``:
* ``int256(5) % int256(2) == int256(1)`` * ``int256(5) % int256(2) == int256(1)``
* ``int256(5) % int256(-2) == int256(1)`` * ``int256(5) % int256(-2) == int256(1)``
* ``int256(-5) % int256(2) == int256(-1)`` * ``int256(-5) % int256(2) == int256(-1)``
* ``int256(-5) % int256(-2) == int256(-1)`` * ``int256(-5) % int256(-2) == int256(-1)``
.. note:: .. note::
Modulo with zero causes a :ref:`Panic error<assert-and-require>`. This check can **not** be disabled through ``unchecked { ... }``. Modulo with zero causes a :ref:`Panic error<assert-and-require>`. This check can **not** be disabled through ``unchecked { ... }``.
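The sign rules above can be checked with a small contract. The following is a minimal sketch: the contract name ``ModuloSigns`` and its functions are hypothetical and only serve as illustration.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract, not part of the compiler or its tests.
    contract ModuloSigns {
        // The result has the same sign as the left operand ``a`` (or is zero).
        function remainder(int256 a, int256 n) public pure returns (int256) {
            return a % n;
        }

        // ``unchecked`` does not disable the zero check:
        // this still reverts with a Panic error when ``n == 0``.
        function uncheckedRemainder(int256 a, int256 n) public pure returns (int256 r) {
            unchecked { r = a % n; }
        }

        // Mirrors the four cases listed above.
        function check() public pure {
            assert(remainder(5, 2) == 1);
            assert(remainder(5, -2) == 1);
            assert(remainder(-5, 2) == -1);
            assert(remainder(-5, -2) == -1);
        }
    }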
@ -184,8 +184,8 @@ Address
The address type comes in two flavours, which are largely identical: The address type comes in two flavours, which are largely identical:
- ``address``: Holds a 20 byte value (size of an Ethereum address). - ``address``: Holds a 20 byte value (size of an Ethereum address).
- ``address payable``: Same as ``address``, but with the additional members ``transfer`` and ``send``. - ``address payable``: Same as ``address``, but with the additional members ``transfer`` and ``send``.
The idea behind this distinction is that ``address payable`` is an address you can send Ether to, The idea behind this distinction is that ``address payable`` is an address you can send Ether to,
while a plain ``address`` cannot be sent Ether. while a plain ``address`` cannot be sent Ether.
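As an illustration of the distinction, the sketch below converts a plain ``address`` with ``payable(x)`` before using ``transfer``. The contract ``Forwarder`` and its function are assumed names invented for this example.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract.
    contract Forwarder {
        // Forwards the Ether sent with the call to ``recipient``.
        function forward(address recipient) public payable {
            // ``recipient.transfer(...)`` would not compile, because
            // ``transfer`` is only a member of ``address payable``.
            address payable target = payable(recipient);
            target.transfer(msg.value);
        }
    }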
@ -510,15 +510,15 @@ String literals can only contain printable ASCII characters, which means the cha
Additionally, string literals also support the following escape characters: Additionally, string literals also support the following escape characters:
- ``\<newline>`` (escapes an actual newline) - ``\<newline>`` (escapes an actual newline)
- ``\\`` (backslash) - ``\\`` (backslash)
- ``\'`` (single quote) - ``\'`` (single quote)
- ``\"`` (double quote) - ``\"`` (double quote)
- ``\n`` (newline) - ``\n`` (newline)
- ``\r`` (carriage return) - ``\r`` (carriage return)
- ``\t`` (tab) - ``\t`` (tab)
- ``\xNN`` (hex escape, see below) - ``\xNN`` (hex escape, see below)
- ``\uNNNN`` (unicode escape, see below) - ``\uNNNN`` (unicode escape, see below)
``\xNN`` takes a hex value and inserts the appropriate byte, while ``\uNNNN`` takes a Unicode codepoint and inserts a UTF-8 sequence. ``\xNN`` takes a hex value and inserts the appropriate byte, while ``\uNNNN`` takes a Unicode codepoint and inserts a UTF-8 sequence.
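A few of these escapes can be seen in the sketch below. The contract ``EscapeExamples`` and its function names are hypothetical. The comparisons go through ``keccak256`` because strings cannot be compared directly.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract.
    contract EscapeExamples {
        // ``\x41`` and ``\x42`` insert the bytes 0x41 and 0x42, i.e. "A" and "B".
        function hexEscapeMatches() public pure returns (bool) {
            return keccak256(bytes("\x41\x42")) == keccak256(bytes("AB"));
        }

        // ``\u00e9`` inserts the UTF-8 encoding of U+00E9, which is
        // the two-byte sequence 0xC3 0xA9.
        function unicodeEscapeLength() public pure returns (uint) {
            return bytes("\u00e9").length;
        }

        // A tab and a newline inside a single literal.
        function tabAndNewline() public pure returns (string memory) {
            return "key\tvalue\n";
        }
    }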
@ -660,9 +660,9 @@ their parameter types are identical, their return types are identical,
their internal/external property is identical and the state mutability of ``A`` their internal/external property is identical and the state mutability of ``A``
is more restrictive than the state mutability of ``B``. In particular: is more restrictive than the state mutability of ``B``. In particular:
- ``pure`` functions can be converted to ``view`` and ``non-payable`` functions - ``pure`` functions can be converted to ``view`` and ``non-payable`` functions
- ``view`` functions can be converted to ``non-payable`` functions - ``view`` functions can be converted to ``non-payable`` functions
- ``payable`` functions can be converted to ``non-payable`` functions - ``payable`` functions can be converted to ``non-payable`` functions
No other conversions between function types are possible. No other conversions between function types are possible.
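The first rule can be sketched as follows, assuming a hypothetical contract ``FunctionTypeConversion``. A ``pure`` internal function is passed where a ``view`` function type is expected, so the conversion happens implicitly.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract.
    contract FunctionTypeConversion {
        function double(uint x) internal pure returns (uint) {
            return 2 * x;
        }

        // Accepts any internal function with a matching signature
        // whose state mutability is ``view`` or more restrictive.
        function applyTo(
            function (uint) internal view returns (uint) f,
            uint x
        ) internal view returns (uint) {
            return f(x);
        }

        function run() public view returns (uint) {
            // ``double`` is ``pure`` and converts implicitly to the ``view`` type.
            return applyTo(double, 21);
        }
    }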


@ -29,11 +29,11 @@ Suffixes like ``seconds``, ``minutes``, ``hours``, ``days`` and ``weeks``
after literal numbers can be used to specify units of time where seconds are the base after literal numbers can be used to specify units of time where seconds are the base
unit and units are considered naively in the following way: unit and units are considered naively in the following way:
* ``1 == 1 seconds`` * ``1 == 1 seconds``
* ``1 minutes == 60 seconds`` * ``1 minutes == 60 seconds``
* ``1 hours == 60 minutes`` * ``1 hours == 60 minutes``
* ``1 days == 24 hours`` * ``1 days == 24 hours``
* ``1 weeks == 7 days`` * ``1 weeks == 7 days``
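The equalities above can be written as assertions. In the sketch below, the contract ``TimeUnits`` and the function ``hasElapsed`` are assumed names. Since a suffix cannot be applied to a variable, the variable is multiplied by ``1 days`` instead.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract.
    contract TimeUnits {
        function check() public pure {
            assert(1 == 1 seconds);
            assert(1 minutes == 60 seconds);
            assert(1 hours == 60 minutes);
            assert(1 days == 24 hours);
            assert(1 weeks == 7 days);
        }

        // Returns true once ``daysAfter`` days have passed since ``start``.
        function hasElapsed(uint start, uint daysAfter) public view returns (bool) {
            return block.timestamp >= start + daysAfter * 1 days;
        }
    }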
Take care if you perform calendar calculations using these units, because Take care if you perform calendar calculations using these units, because
not every year equals 365 days and not even every day has 24 hours not every year equals 365 days and not even every day has 24 hours


@ -30,8 +30,8 @@ set it to ``--optimize-runs=1``. If you expect many transactions and do not care
output size, set ``--optimize-runs`` to a high number. output size, set ``--optimize-runs`` to a high number.
This parameter has effects on the following (this might change in the future): This parameter has effects on the following (this might change in the future):
- the size of the binary search in the function dispatch routine - the size of the binary search in the function dispatch routine
- the way constants like large numbers or strings are stored - the way constants like large numbers or strings are stored
.. index:: allowed paths, --allow-paths, base path, --base-path .. index:: allowed paths, --allow-paths, base path, --base-path
@ -136,13 +136,13 @@ key in the ``"settings"`` field:
.. code-block:: none .. code-block:: none
{ {
"sources": { ... }, "sources": { ... },
"settings": { "settings": {
"optimizer": { ... }, "optimizer": { ... },
"evmVersion": "<VERSION>" "evmVersion": "<VERSION>"
}
} }
}
Target Options Target Options
-------------- --------------
@ -781,7 +781,7 @@ It is recommended to explicitly specify the upgrade modules by using ``--modules
.. code-block:: none .. code-block:: none
$ solidity-upgrade --modules constructor-visibility,now,dotsyntax Source.sol $ solidity-upgrade --modules constructor-visibility,now,dotsyntax Source.sol
The command above applies all changes as shown below. Please review them carefully (the pragmas will The command above applies all changes as shown below. Please review them carefully (the pragmas will
have to be updated manually). have to be updated manually).


@ -157,16 +157,16 @@ where an object is expected.
Inside a code block, the following elements can be used Inside a code block, the following elements can be used
(see the later sections for more details): (see the later sections for more details):
- literals, e.g. ``0x123``, ``42`` or ``"abc"`` (strings up to 32 characters) - literals, e.g. ``0x123``, ``42`` or ``"abc"`` (strings up to 32 characters)
- calls to builtin functions, e.g. ``add(1, mload(0))`` - calls to builtin functions, e.g. ``add(1, mload(0))``
- variable declarations, e.g. ``let x := 7``, ``let x := add(y, 3)`` or ``let x`` (initial value of 0 is assigned) - variable declarations, e.g. ``let x := 7``, ``let x := add(y, 3)`` or ``let x`` (initial value of 0 is assigned)
- identifiers (variables), e.g. ``add(3, x)`` - identifiers (variables), e.g. ``add(3, x)``
- assignments, e.g. ``x := add(y, 3)`` - assignments, e.g. ``x := add(y, 3)``
- blocks where local variables are scoped inside, e.g. ``{ let x := 3 { let y := add(x, 1) } }`` - blocks where local variables are scoped inside, e.g. ``{ let x := 3 { let y := add(x, 1) } }``
- if statements, e.g. ``if lt(a, b) { sstore(0, 1) }`` - if statements, e.g. ``if lt(a, b) { sstore(0, 1) }``
- switch statements, e.g. ``switch mload(0) case 0 { revert() } default { mstore(0, 1) }`` - switch statements, e.g. ``switch mload(0) case 0 { revert() } default { mstore(0, 1) }``
- for loops, e.g. ``for { let i := 0} lt(i, 10) { i := add(i, 1) } { mstore(i, 7) }`` - for loops, e.g. ``for { let i := 0} lt(i, 10) { i := add(i, 1) } { mstore(i, 7) }``
- function definitions, e.g. ``function f(a, b) -> c { c := add(a, b) }`` - function definitions, e.g. ``function f(a, b) -> c { c := add(a, b) }``
Multiple syntactical elements can follow each other simply separated by Multiple syntactical elements can follow each other simply separated by
whitespace, i.e. there is no terminating ``;`` or newline required. whitespace, i.e. there is no terminating ``;`` or newline required.
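Several of these elements can be combined in one block. The sketch below embeds them in a Solidity inline assembly block so that it can be compiled on its own. The contract ``YulElements`` and the helper ``isEven`` are hypothetical names.

.. code-block:: solidity

    // SPDX-License-Identifier: GPL-3.0
    pragma solidity >=0.8.0;

    // Hypothetical example contract.
    contract YulElements {
        // Sums all even numbers below ``n``.
        // Arithmetic inside assembly is not checked for overflow.
        function sumEvenBelow(uint n) public pure returns (uint total) {
            assembly {
                // function definition with one input and one output
                function isEven(v) -> r { r := iszero(mod(v, 2)) }

                // for loop: init block, condition, post block, body
                for { let i := 0 } lt(i, n) { i := add(i, 1) } {
                    // if statement (there is no ``else`` in Yul)
                    if isEven(i) { total := add(total, i) }
                }
            }
        }
    }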
@ -985,9 +985,10 @@ that are not known to the Yul compiler. It also allows you to create
bytecode sequences that will not be modified by the optimizer. bytecode sequences that will not be modified by the optimizer.
The functions are ``verbatim_<n>i_<m>o("<data>", ...)``, where The functions are ``verbatim_<n>i_<m>o("<data>", ...)``, where
- ``n`` is a decimal between 0 and 99 that specifies the number of input stack slots / variables
- ``m`` is a decimal between 0 and 99 that specifies the number of output stack slots / variables - ``n`` is a decimal between 0 and 99 that specifies the number of input stack slots / variables
- ``data`` is a string literal that contains the sequence of bytes - ``m`` is a decimal between 0 and 99 that specifies the number of output stack slots / variables
- ``data`` is a string literal that contains the sequence of bytes
For example, if you want to define a function that multiplies the input For example, if you want to define a function that multiplies the input
by two, without the optimizer touching the constant two, you can use by two, without the optimizer touching the constant two, you can use
@ -1022,15 +1023,15 @@ verbatim bytecode that are not checked by
the compiler. Violations of these restrictions can result in the compiler. Violations of these restrictions can result in
undefined behaviour. undefined behaviour.
- Control-flow should not jump into or out of verbatim blocks, - Control-flow should not jump into or out of verbatim blocks,
but it can jump within the same verbatim block. but it can jump within the same verbatim block.
- Stack contents apart from the input and output parameters - Stack contents apart from the input and output parameters
should not be accessed. should not be accessed.
- The stack height difference should be exactly ``m - n`` - The stack height difference should be exactly ``m - n``
(output slots minus input slots). (output slots minus input slots).
- Verbatim bytecode cannot make any assumptions about the - Verbatim bytecode cannot make any assumptions about the
surrounding bytecode. All required parameters have to be surrounding bytecode. All required parameters have to be
passed in as stack variables. passed in as stack variables.
The optimizer does not analyze verbatim bytecode and always The optimizer does not analyze verbatim bytecode and always
assumes that it modifies all aspects of state and thus can only assumes that it modifies all aspects of state and thus can only


@ -6,5 +6,5 @@ with EVM dialect.
The main semantic differences to the legacy code generator are the following: The main semantic differences to the legacy code generator are the following:
- Arithmetic operations cause a failing assertion if the result is not in range. - Arithmetic operations cause a failing assertion if the result is not in range.
- Resizing a storage array to a length larger than 2**64 causes a failing assertion. - Resizing a storage array to a length larger than 2**64 causes a failing assertion.