Display location of invalid UTF-8 sequence in unicode literals in SyntaxChecker

Alex Beregszaszi 2020-09-23 13:21:30 +01:00
parent ca743191b7
commit 0e5abbd4a9
3 changed files with 5 additions and 3 deletions

@@ -12,6 +12,7 @@ Compiler Features:
 * SMTChecker: Support structs.
 * SMTChecker: Support ``type(T).min``, ``type(T).max``, and ``type(I).interfaceId``.
 * SMTChecker: Support ``address`` type conversion with literals, e.g. ``address(0)``.
+* Type Checker: Report position of first invalid UTF-8 sequence in ``unicode""`` literals.
 * Type Checker: More detailed error messages why implicit conversions fail.
 * Type Checker: Explain why oversized hex string literals can not be explicitly converted to a shorter ``bytesNN`` type.
 * Yul Optimizer: Prune unused parameters in functions.

@@ -219,11 +219,12 @@ bool SyntaxChecker::visit(Throw const& _throwStatement)
 bool SyntaxChecker::visit(Literal const& _literal)
 {
-	if ((_literal.token() == Token::UnicodeStringLiteral) && !validateUTF8(_literal.value()))
+	size_t invalidSequence;
+	if ((_literal.token() == Token::UnicodeStringLiteral) && !validateUTF8(_literal.value(), invalidSequence))
 		m_errorReporter.syntaxError(
 			8452_error,
 			_literal.location(),
-			"Invalid UTF-8 sequence found"
+			"Contains invalid UTF-8 sequence at position " + toString(invalidSequence) + "."
 		);
 	if (_literal.token() != Token::Number)

@@ -2,5 +2,5 @@ contract C {
     string s = unicode"À";
 }
 // ----
-// SyntaxError 8452: (28-38): Invalid UTF-8 sequence found
+// SyntaxError 8452: (28-38): Contains invalid UTF-8 sequence at position 0.
 // TypeError 7407: (28-38): Type literal_string (contains invalid UTF-8 sequence at position 0) is not implicitly convertible to expected type string storage ref.
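
Note on the two-argument validateUTF8 call: the diff shows only the call site, where the second argument receives the position of the invalid sequence. The following is a minimal sketch of how such a validator could report that position. The name validateUTF8Sketch and everything in its body are assumptions for illustration, not the Solidity implementation, and the real function may choose a different offset within a malformed sequence.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the Solidity implementation): returns true if
// `input` is well-formed UTF-8. Otherwise returns false and stores the byte
// offset where the first malformed sequence starts in `invalidPosition`.
bool validateUTF8Sketch(std::string const& input, std::size_t& invalidPosition)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static unsigned long const minimum[] = {0, 0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < input.size())
    {
        unsigned char const lead = static_cast<unsigned char>(input[i]);
        std::size_t length = 0;
        unsigned long codePoint = 0;

        // Derive the sequence length from the lead byte.
        if (lead < 0x80) { length = 1; codePoint = lead; }
        else if ((lead & 0xE0) == 0xC0) { length = 2; codePoint = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { length = 3; codePoint = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { length = 4; codePoint = lead & 0x07; }
        else { invalidPosition = i; return false; } // stray continuation or invalid lead byte

        // The sequence must not run past the end of the input.
        if (i + length > input.size()) { invalidPosition = i; return false; }

        // Every following byte must have the form 10xxxxxx.
        for (std::size_t k = 1; k < length; ++k)
        {
            unsigned char const c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) { invalidPosition = i; return false; }
            codePoint = (codePoint << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, UTF-16 surrogates and values above U+10FFFF.
        if (codePoint < minimum[length]
            || (codePoint >= 0xD800 && codePoint <= 0xDFFF)
            || codePoint > 0x10FFFF)
        { invalidPosition = i; return false; }

        i += length;
    }
    return true;
}
```

With this sketch, a string consisting of the single byte 0xC0 (what the test literal contains if the source file stores À in Latin-1, which is an assumption about the test file's encoding) fails with position 0, because the lead byte announces a two-byte sequence that never completes.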