From 27dba5c2d441b958ad53992d7f1e6bf9a42ae60f Mon Sep 17 00:00:00 2001 From: Daniel Kirchner Date: Thu, 11 Aug 2022 11:32:12 +0200 Subject: [PATCH] Clarify cleanup in docs. Co-authored-by: matheusaaguiar <95899911+matheusaaguiar@users.noreply.github.com> --- docs/internals/variable_cleanup.rst | 107 +++++++++++++++++++++------- 1 file changed, 83 insertions(+), 24 deletions(-) diff --git a/docs/internals/variable_cleanup.rst b/docs/internals/variable_cleanup.rst index 9fec6929f..bab3c8bef 100644 --- a/docs/internals/variable_cleanup.rst +++ b/docs/internals/variable_cleanup.rst @@ -4,9 +4,10 @@ Cleaning Up Variables ********************* -When a value is shorter than 256 bit, in some cases the remaining bits -must be cleaned. -The Solidity compiler is designed to clean such remaining bits before any operations +Ultimately, all values in the EVM are stored in 256 bit words. +Thus, in some cases, when the type of a value has less than 256 bits, +it is necessary to clean the remaining bits. +The Solidity compiler is designed to do such cleaning before any operations that might be adversely affected by the potential garbage in the remaining bits. For example, before writing a value to memory, the remaining bits need to be cleared because the memory contents can be used for computing @@ -28,25 +29,83 @@ the boolean values before they are used as the condition for In addition to the design principle above, the Solidity compiler cleans input data when it is loaded onto the stack. -Different types have different rules for cleaning up invalid values: +The following table describes the cleaning rules applied to different types, +where ``higher bits`` refers to the remaining bits in case the type has less than 256 bits. -+---------------+---------------+-------------------+ -|Type |Valid Values |Invalid Values Mean| -+===============+===============+===================+ -|enum of n |0 until n - 1 |exception | -|members | | | -+---------------+---------------+-------------------+ -|bool |0 or 1 |1 | -+---------------+---------------+-------------------+ -|signed integers|sign-extended |currently silently | -| |word |wraps; in the | -| | |future exceptions | -| | |will be thrown | -| | | | -| | | | -+---------------+---------------+-------------------+ -|unsigned |higher bits |currently silently | -|integers |zeroed |wraps; in the | -| | |future exceptions | -| | |will be thrown | -+---------------+---------------+-------------------+ ++---------------+---------------+-------------------------+ +|Type |Valid Values |Cleanup of Invalid Values| ++===============+===============+=========================+ +|enum of n |0 until n - 1 |throws exception | +|members | | | ++---------------+---------------+-------------------------+ +|bool |0 or 1 |results in 1 | ++---------------+---------------+-------------------------+ +|signed integers|higher bits |currently silently | +| |set to the |signextends to a valid | +| |sign bit |value, i.e. all higher | +| | |bits are set to the sign | +| | |bit; may throw an | +| | |exception in the future | ++---------------+---------------+-------------------------+ +|unsigned |higher bits |currently silently masks | +|integers |zeroed |to a valid value, i.e. | +| | |all higher bits are set | +| | |to zero; may throw an | +| | |exception in the future | ++---------------+---------------+-------------------------+ + +Note that valid and invalid values are dependent on their type size. +Consider ``uint8``, the unsigned 8-bit type, which has the following valid values: + +.. code-block:: none + + 0000...0000 0000 0000 + 0000...0000 0000 0001 + 0000...0000 0000 0010 + .... + 0000...0000 1111 1111 + +Any invalid value will have the higher bits set to zero: + +.. code-block:: none + + 0101...1101 0010 1010 invalid value + 0000...0000 0010 1010 cleaned value + +For ``int8``, the signed 8-bit type, the valid values are: + +Negative + +.. code-block:: none + + 1111...1111 1111 1111 + 1111...1111 1111 1110 + .... + 1111...1111 1000 0000 + +Positive + +.. code-block:: none + + 0000...0000 0000 0000 + 0000...0000 0000 0001 + 0000...0000 0000 0010 + .... + 0000...0000 1111 1111 + +The compiler will ``signextend`` the sign bit, which is 1 for negative and 0 for +positive values, overwriting the higher bits: + +Negative + +.. code-block:: none + + 0010...1010 1111 1111 invalid value + 1111...1111 1111 1111 cleaned value + +Positive + +.. code-block:: none + + 1101...0101 0000 0100 invalid value + 0000...0000 0000 0100 cleaned value