From 569afe711aa2d1902b2e726cb0b29c25e50ed30d Mon Sep 17 00:00:00 2001 From: Leonardo Alt Date: Tue, 9 Mar 2021 11:34:55 +0100 Subject: [PATCH] SMTChecker tutorial and docs cleanup --- docs/index.rst | 1 + docs/security-considerations.rst | 288 ------------ docs/smtchecker.rst | 737 +++++++++++++++++++++++++++++++ 3 files changed, 738 insertions(+), 288 deletions(-) create mode 100644 docs/smtchecker.rst diff --git a/docs/index.rst b/docs/index.rst index 81b9d2ec7..014c850de 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -148,6 +148,7 @@ Contents 080-breaking-changes.rst natspec-format.rst security-considerations.rst + smtchecker.rst resources.rst using-the-compiler.rst yul.rst diff --git a/docs/security-considerations.rst b/docs/security-considerations.rst index 54d4e9015..e807d301e 100644 --- a/docs/security-considerations.rst +++ b/docs/security-considerations.rst @@ -413,291 +413,3 @@ Ask for Peer Review The more people examine a piece of code, the more issues are found. Asking people to review your code also helps as a cross-check to find out whether your code is easy to understand - a very important criterion for good smart contracts. - -.. _formal_verification: - -******************* -Formal Verification -******************* - -Using formal verification, it is possible to perform an automated mathematical -proof that your source code fulfills a certain formal specification. -The specification is still formal (just as the source code), but usually much -simpler. - -Note that formal verification itself can only help you understand the -difference between what you did (the specification) and how you did it -(the actual implementation). You still need to check whether the specification -is what you wanted and that you did not miss any unintended effects of it. - -Solidity implements a formal verification approach based on SMT solving. The -SMTChecker module automatically tries to prove that the code satisfies the -specification given by ``require/assert`` statements. That is, it considers -``require`` statements as assumptions and tries to prove that the conditions -inside ``assert`` statements are always true. If an assertion failure is -found, a counterexample is given to the user, showing how the assertion can be -violated. - -The other verification targets that the SMTChecker checks at compile time are: - -- Arithmetic underflow and overflow (`underflow` and `overflow`). -- Division by zero (`divByZero`). -- Trivial conditions and unreachable code (`constantCondition`). -- Popping an empty array (`popEmptyArray`). -- Insufficient funds for a transfer (`balance`). - -The names after each target above can be used when specifying subsets of targets. - -It is currently an experimental feature, therefore in order to use it you need -to enable it via :ref:`a pragma directive`. - -The SMTChecker traverses the Solidity AST creating and collecting program constraints. -When it encounters a verification target, an SMT solver is invoked to determine the outcome. -If a check fails, the SMTChecker provides specific input values that lead to the failure. - -While the SMTChecker encodes Solidity code into SMT constraints, it contains two -reasoning engines that use that encoding in different ways. - -SMT Encoding -============ - -The SMT encoding tries to be as precise as possible, mapping Solidity types -and expressions to their closest `SMT-LIB `_ -representation, as shown in the table below. - -+-----------------------+--------------------------------+-----------------------------+ -|Solidity type |SMT sort |Theories (quantifier-free) | -+=======================+================================+=============================+ -|Boolean |Bool |Bool | -+-----------------------+--------------------------------+-----------------------------+ -|intN, uintN, address, |Integer |LIA, NIA | -|bytesN, enum | | | -+-----------------------+--------------------------------+-----------------------------+ -|array, mapping, bytes, |Tuple |Datatypes, Arrays, LIA | -|string |(Array elements, Integer length)| | -+-----------------------+--------------------------------+-----------------------------+ -|struct |Tuple |Datatypes | -+-----------------------+--------------------------------+-----------------------------+ -|other types |Integer |LIA | -+-----------------------+--------------------------------+-----------------------------+ - -Types that are not yet supported are abstracted by a single 256-bit unsigned -integer, where their unsupported operations are ignored. - -For more details on how the SMT encoding works internally, see the paper -`SMT-based Verification of Solidity Smart Contracts `_. - -Model Checking Engines -====================== - -The SMTChecker module implements two different reasoning engines that use the -SMT encoding above, a Bounded Model Checker (BMC) and a system of Constrained -Horn Clauses (CHC). Both engines are currently under development, and have -different characteristics. - -Bounded Model Checker (BMC) ---------------------------- - -The BMC engine analyzes functions in isolation, that is, it does not take the -overall behavior of the contract throughout many transactions into account when -analyzing each function. Loops are also ignored in this engine at the moment. -Internal function calls are inlined as long as they are not recursive, direct -or indirectly. External function calls are inlined if possible, and knowledge -that is potentially affected by reentrancy is erased. - -The characteristics above make BMC easily prone to reporting false positives, -but it is also lightweight and should be able to quickly find small local bugs. - -Constrained Horn Clauses (CHC) ------------------------------- - -The Solidity contract's Control Flow Graph (CFG) is modelled as a system of -Horn clauses, where the lifecycle of the contract is represented by a loop -that can visit every public/external function non-deterministically. This way, -the behavior of the entire contract over an unbounded number of transactions -is taken into account when analyzing any function. Loops are fully supported -by this engine. Internal function calls are supported, but external function -calls are currently unsupported. - -The CHC engine is much more powerful than BMC in terms of what it can prove, -and might require more computing resources. - -Abstraction and False Positives -=============================== - -The SMTChecker implements abstractions in an incomplete and sound way: If a bug -is reported, it might be a false positive introduced by abstractions (due to -erasing knowledge or using a non-precise type). If it determines that a -verification target is safe, it is indeed safe, that is, there are no false -negatives (unless there is a bug in the SMTChecker). - -In the BMC engine, function calls to the same contract (or base contracts) are -inlined when possible, that is, when their implementation is available. Calls -to functions in other contracts are not inlined even if their code is -available, since we cannot guarantee that the actual deployed code is the same. - -The CHC engine creates nonlinear Horn clauses that use summaries of the called -functions to support internal function calls. The same approach can and will be -used for external function calls, but the latter requires more work regarding -the entire state of the blockchain and is still unimplemented. - -Complex pure functions are abstracted by an uninterpreted function (UF) over -the arguments. - -+-----------------------------------+--------------------------------------+ -|Functions |SMT behavior | -+===================================+======================================+ -|``assert`` |Verification target | -+-----------------------------------+--------------------------------------+ -|``require`` |Assumption | -+-----------------------------------+--------------------------------------+ -|internal |BMC: Inline function call | -| |CHC: Function summaries | -+-----------------------------------+--------------------------------------+ -|external |BMC: Inline function call or | -| |erase knowledge about state variables | -| |and local storage references. | -| |CHC: Function summaries and erase | -| |state knowledge. | -+-----------------------------------+--------------------------------------+ -|``gasleft``, ``blockhash``, |Abstracted with UF | -|``keccak256``, ``ecrecover`` | | -|``ripemd160``, ``addmod``, | | -|``mulmod`` | | -+-----------------------------------+--------------------------------------+ -|pure functions without |Abstracted with UF | -|implementation (external or | | -|complex) | | -+-----------------------------------+--------------------------------------+ -|external functions without |BMC: Unsupported | -|implementation |CHC: Nondeterministic summary | -+-----------------------------------+--------------------------------------+ -|others |Currently unsupported | -+-----------------------------------+--------------------------------------+ - -Using abstraction means loss of precise knowledge, but in many cases it does -not mean loss of proving power. - -:: - - // SPDX-License-Identifier: GPL-3.0 - pragma solidity >=0.5.0; - pragma experimental SMTChecker; - // This may report a warning if no SMT solver available. - - contract Recover - { - function f( - bytes32 hash, - uint8 _v1, uint8 _v2, - bytes32 _r1, bytes32 _r2, - bytes32 _s1, bytes32 _s2 - ) public pure returns (address) { - address a1 = ecrecover(hash, _v1, _r1, _s1); - require(_v1 == _v2); - require(_r1 == _r2); - require(_s1 == _s2); - address a2 = ecrecover(hash, _v2, _r2, _s2); - assert(a1 == a2); - return a1; - } - } - -In the example above, the SMTChecker is not expressive enough to actually -compute ``ecrecover``, but by modelling the function calls as uninterpreted -functions we know that the return value is the same when called on equivalent -parameters. This is enough to prove that the assertion above is always true. - -Abstracting a function call with an UF can be done for functions known to be -deterministic, and can be easily done for pure functions. It is however -difficult to do this with general external functions, since they might depend -on state variables. - -External function calls also imply that any current knowledge that the -SMTChecker might have regarding mutable state variables needs to be erased to -guarantee no false negatives, since the called external function might direct -or indirectly call a function in the analyzed contract that changes state -variables. - -Reference Types and Aliasing -============================= - -Solidity implements aliasing for reference types with the same :ref:`data -location`. -That means one variable may be modified through a reference to the same data -area. -The SMTChecker does not keep track of which references refer to the same data. -This implies that whenever a local reference or state variable of reference -type is assigned, all knowledge regarding variables of the same type and data -location is erased. -If the type is nested, the knowledge removal also includes all the prefix base -types. - -:: - - // SPDX-License-Identifier: GPL-3.0 - pragma solidity >=0.5.0; - pragma experimental ABIEncoderV2; - pragma experimental SMTChecker; - // This will report a warning - - contract Aliasing - { - uint[] array1; - uint[][] array2; - function f( - uint[] memory a, - uint[] memory b, - uint[][] memory c, - uint[] storage d - ) internal { - array1[0] = 42; - a[0] = 2; - c[0][0] = 2; - b[0] = 1; - // Erasing knowledge about memory references should not - // erase knowledge about state variables. - assert(array1[0] == 42); - // However, an assignment to a storage reference will erase - // storage knowledge accordingly. - d[0] = 2; - // Fails as false positive because of the assignment above. - assert(array1[0] == 42); - // Fails because `a == b` is possible. - assert(a[0] == 2); - // Fails because `c[i] == b` is possible. - assert(c[0][0] == 2); - assert(d[0] == 2); - assert(b[0] == 1); - } - function g( - uint[] memory a, - uint[] memory b, - uint[][] memory c, - uint x - ) public { - f(a, b, c, array2[x]); - } - } - -After the assignment to ``b[0]``, we need to clear knowledge about ``a`` since -it has the same type (``uint[]``) and data location (memory). We also need to -clear knowledge about ``c``, since its base type is also a ``uint[]`` located -in memory. This implies that some ``c[i]`` could refer to the same data as -``b`` or ``a``. - -Notice that we do not clear knowledge about ``array`` and ``d`` because they -are located in storage, even though they also have type ``uint[]``. However, -if ``d`` was assigned, we would need to clear knowledge about ``array`` and -vice-versa. - -Real World Assumptions -====================== - -Some scenarios can be expressed in Solidity and the EVM, but are expected to -never occur in practice. -One of such cases is the length of a dynamic storage array overflowing during a -push: If the ``push`` operation is applied to an array of length 2^256 - 1, its -length silently overflows. -However, this is unlikely to happen in practice, since the operations required -to grow the array to that point would take billions of years to execute. diff --git a/docs/smtchecker.rst b/docs/smtchecker.rst new file mode 100644 index 000000000..8635dcb5b --- /dev/null +++ b/docs/smtchecker.rst @@ -0,0 +1,737 @@ +.. _formal_verification: + +################################## +SMTChecker and Formal Verification +################################## + +Using formal verification it is possible to perform an automated mathematical +proof that your source code fulfills a certain formal specification. +The specification is still formal (just as the source code), but usually much +simpler. + +Note that formal verification itself can only help you understand the +difference between what you did (the specification) and how you did it +(the actual implementation). You still need to check whether the specification +is what you wanted and that you did not miss any unintended effects of it. + +Solidity implements a formal verification approach based on +`SMT `_ and +`Horn `_ solving. +The SMTChecker module automatically tries to prove that the code satisfies the +specification given by ``require`` and ``assert`` statements. That is, it considers +``require`` statements as assumptions and tries to prove that the conditions +inside ``assert`` statements are always true. If an assertion failure is +found, a counterexample may be given to the user showing how the assertion can +be violated. If no warning is given by the SMTChecker for a property, +it means that the property is safe. + +The other verification targets that the SMTChecker checks at compile time are: + +- Arithmetic underflow and overflow. +- Division by zero. +- Trivial conditions and unreachable code. +- Popping an empty array. +- Insufficient funds for a transfer. + +The potential warnings that the SMTChecker reports are: + +- `` happens here.``. This means that the SMTChecker proved that a certain property fails. A counterexample may be given, however in complex situations it may also not show a counterexample. This result may also be a false positive in certain cases, when the SMT encoding adds abstractions for Solidity code that is either hard or impossible to express. +- `` might happen here``. This means that the solver could not prove either case within the given timeout. Since the result is unknown, the SMTChecker reports the potential failure for soundness. This may be solved by increasing the query timeout, but the problem might also simply be too hard for the engine to solve. + +It is currently an experimental feature, therefore in order to use it you need +to enable it via :ref:`a pragma directive`. + +.. note:: + The lack of warnings for a verification target represents an undisputed + mathematical proof of correctness, assuming no bugs in the SMTChecker and + the underlying solver. Keep in mind that these problems are + *very hard* and sometimes *impossible* to solve automatically in the + general case. Therefore, several properties might not be solved or might + lead to false positives for large contracts. Every proven property should + be seen as an important achievement. For advanced users, see :ref:`SMTChecker Tuning ` + to learn a few options that might help proving more complex + properties. + +******** +Tutorial +******** + +Overflow +======== + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + contract Overflow { + uint immutable x; + uint immutable y; + + function add(uint _x, uint _y) internal pure returns (uint) { + return _x + _y; + } + + constructor(uint _x, uint _y) { + (x, y) = (_x, _y); + } + + function stateAdd() public view returns (uint) { + return add(x, y); + } + } + +The contract above shows an overflow check example. +The SMTChecker will, by default, check every reachable arithmetic operation +in the contract for potential underflow and overflow. +Here, it reports the following: + +.. code-block:: bash + + Warning: CHC: Overflow (resulting value larger than 2**256 - 1) happens here. + Counterexample: + x = 1, y = 115792089237316195423570985008687907853269984665640564039457584007913129639935 + = 0 + + Transaction trace: + Overflow.constructor(1, 115792089237316195423570985008687907853269984665640564039457584007913129639935) + State: x = 1, y = 115792089237316195423570985008687907853269984665640564039457584007913129639935 + Overflow.stateAdd() + Overflow.add(1, 115792089237316195423570985008687907853269984665640564039457584007913129639935) -- internal call + --> o.sol:9:20: + | + 9 | return _x + _y; + | ^^^^^^^ + +If we add ``require`` statements that filter out overflow cases, +the SMTChecker proves that no overflow is reachable (by not reporting warnings): + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + contract Overflow { + uint immutable x; + uint immutable y; + + function add(uint _x, uint _y) internal pure returns (uint) { + return _x + _y; + } + + constructor(uint _x, uint _y) { + (x, y) = (_x, _y); + } + + function stateAdd() public view returns (uint) { + require(x < type(uint128).max); + require(y < type(uint128).max); + return add(x, y); + } + } + + +Assert +====== + +An assertion represents an invariant in your code: a property that must be true +*for all transactions, including all input and storage values*, otherwise there is a bug. + +The code below defines a function ``f`` that guarantees no overflow. +Function ``inv`` defines the specification that ``f`` is monotonically increasing: +for every possible pair ``(_a, _b)``, if ``_b > _a`` then ``f(_b) > f(_a)``. +Since ``f`` is indeed monotonically increasing, the SMTChecker proves that our +property is correct. You are encouraged to play with the property and the function +definition to see what results come out! + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + contract Monotonic { + function f(uint _x) internal pure returns (uint) { + require(_x < type(uint128).max); + return _x * 42; + } + + function inv(uint _a, uint _b) public pure { + require(_b > _a); + assert(f(_b) > f(_a)); + } + } + +We can also add assertions inside loops to verify more complicated properties. +The following code searches for the maximum element of an unrestricted array of +numbers, and asserts the property that the found element must be greater or +equal every element in the array. + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + contract Max { + function max(uint[] memory _a) public pure returns (uint) { + uint m = 0; + for (uint i = 0; i < _a.length; ++i) + if (_a[i] > m) + m = _a[i]; + + for (uint i = 0; i < _a.length; ++i) + assert(m >= _a[i]); + + return m; + } + } + +Note that in this example the SMTChecker will automatically try to prove three properties: + +1. ``++i`` in the first loop does not overflow. +2. ``++i`` in the second loop does not overflow. +3. The assertion is always true. + +.. note:: + The properties involve loops, which makes it *much much* harder than the previous + examples, so beware of loops! + +All the properties are correctly proven safe. Feel free to change the +properties and/or add restrictions on the array to see different results. +For example, changing the code to + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + contract Max { + function max(uint[] memory _a) public pure returns (uint) { + require(_a.length >= 5); + uint m = 0; + for (uint i = 0; i < _a.length; ++i) + if (_a[i] > m) + m = _a[i]; + + for (uint i = 0; i < _a.length; ++i) + assert(m > _a[i]); + + return m; + } + } + +gives us: + +.. code-block:: bash + + Warning: CHC: Assertion violation happens here. + Counterexample: + + _a = [0, 0, 0, 0, 0] + = 0 + + Transaction trace: + Test.constructor() + Test.max([0, 0, 0, 0, 0]) + --> max.sol:14:4: + | + 14 | assert(m > _a[i]); + + +State Properties +================ + +So far the examples only demonstrated the use of the SMTChecker over pure code, +proving properties about specific operations or algorithms. +A common type of properties in smart contracts are properties that involve the +state of the contract. Multiple transactions might be needed to make an assertion +fail for such a property. + +As an example, consider a 2D grid where both axis have coordinates in the range (-2^128, 2^128 - 1). +Let us place a robot at position (0, 0). The robot can only move diagonally, one step at a time, +and cannot move outside the grid. The robot's state machine can be represented by the smart contract +below. + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + + contract Robot { + int x = 0; + int y = 0; + + modifier wall { + require(x > type(int128).min && x < type(int128).max); + require(y > type(int128).min && y < type(int128).max); + _; + } + + function moveLeftUp() wall public { + --x; + ++y; + } + + function moveLeftDown() wall public { + --x; + --y; + } + + function moveRightUp() wall public { + ++x; + ++y; + } + + function moveRightDown() wall public { + ++x; + --y; + } + + function inv() public view { + assert((x + y) % 2 == 0); + } + } + +Function ``inv`` represents an invariant of the state machine that ``x + y`` +must be even. +The SMTChecker manages to prove that regardless how many commands we give the +robot, even if infinitely many, the invariant can *never* fail. The interested +reader may want to prove that fact manually as well. Hint: this invariant is +inductive. + +We can also trick the SMTChecker into giving us a path to a certain position we +think might be reachable. We can add the property that (2, 4) is *not* +reachable, by adding the following function. + +.. code-block:: Solidity + + function reach_2_4() public view { + assert(!(x == 2 && y == 4)); + } + +This property is false, and while proving that the property is false, +the SMTChecker tells us exactly *how* to reach (2, 4): + +.. code-block:: bash + + Warning: CHC: Assertion violation happens here. + Counterexample: + x = 2, y = 4 + + Transaction trace: + Robot.constructor() + State: x = 0, y = 0 + Robot.moveLeftUp() + State: x = (- 1), y = 1 + Robot.moveRightUp() + State: x = 0, y = 2 + Robot.moveRightUp() + State: x = 1, y = 3 + Robot.moveRightUp() + State: x = 2, y = 4 + Robot.reach_2_4() + --> r.sol:35:4: + | + 35 | assert(!(x == 2 && y == 4)); + | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Note that the path above is not necessarily deterministic, as there are +other paths that could reach (2, 4). The choice of which path is shown +might change depending on the used solver, its version, or just randomly. + +External Calls and Reentrancy +============================= + +Every external call is treated as a call to unknown code by the SMTChecker. +The reasoning behind that is that even if the code of the called contract is +available at compile time, there is no guarantee that the deployed contract +will indeed be the same as the contract where the interface came from at +compile time. + +In some cases, it is possible to automatically infer properties over state +variables that are still true even if the externally called code can do +anything, including reenter the caller contract. + +.. code-block:: Solidity + + pragma solidity >=0.8.0; + pragma experimental SMTChecker; + // This may report a warning if no SMT solver is available. + + interface Unknown { + function run() external; + } + + contract Mutex { + uint x; + bool lock; + + Unknown immutable unknown; + + constructor(Unknown _u) { + require(address(_u) != address(0)); + unknown = _u; + } + + modifier mutex { + require(!lock); + lock = true; + _; + lock = false; + } + + function set(uint _x) mutex public { + x = _x; + } + + function run() mutex public { + uint xPre = x; + unknown.run(); + assert(xPre == x); + } + } + +The example above shows a contract that uses a mutex flag to forbid reentrancy. +The solver is able to infer that when ``unknown.run()`` is called, the contract +is already "locked", so it would not be possible to change the value of ``x``, +regardless of what the unknown called code does. + +If we "forget" to use the ``mutex`` modifier on function ``set``, the +SMTChecker is able to synthesize the behavior of the externally called code so +that the assertion fails: + +.. code-block:: bash + + Warning: CHC: Assertion violation happens here. + Counterexample: + x = 1, lock = true, unknown = 1 + + Transaction trace: + Mutex.constructor(1) + State: x = 0, lock = false, unknown = 1 + Mutex.run() + unknown.run() -- untrusted external call, synthesized as: + Mutex.set(1) -- reentrant call + --> m.sol:32:3: + | + 32 | assert(xPre == x); + | ^^^^^^^^^^^^^^^^^ + + +.. _smtchecker_options: + +***************************** +SMTChecker Options and Tuning +***************************** + +Timeout +======= + +The SMTChecker uses a hardcoded resource limit (``rlimit``) chosen per solver, +which is not precisely related to time. We chose the ``rlimit`` option as the default +because it gives more determinism guarantees than time inside the solver. + +This options translates roughly to "a few seconds timeout" per query. Of course many properties +are very complex and need a lot of time to be solved, where determinism does not matter. +If the SMTChecker does not manage to solve the contract properties with the default ``rlimit``, +a timeout can be given in milliseconds via the CLI option ``--model-checker-timeout