-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement EOF #972
base: forks/prague
Are you sure you want to change the base?
Implement EOF #972
Conversation
c5318cc
to
ab31dd7
Compare
ab31dd7
to
f672f5d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review.
src/ethereum/prague/vm/__init__.py
Outdated
@@ -95,6 +111,7 @@ class Evm: | |||
error: Optional[Exception] | |||
accessed_addresses: Set[Address] | |||
accessed_storage_keys: Set[Tuple[Address, Bytes32]] | |||
eof: EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a descriptive field name. How about:
eof: EOF | |
container_format: Eof |
src/ethereum/prague/vm/eof.py
Outdated
header_end_index: Uint | ||
|
||
|
||
def get_opcode(opcode: int, eof: EOF) -> Ops: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should name this more descriptively. Some (bad) ideas:
integer_to_op
parse_op
validate_op
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated as map_int_to_op
src/ethereum/prague/vm/__init__.py
Outdated
pc: Uint | ||
stack: List[U256] | ||
memory: bytearray | ||
code: Bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create NewType for code vs. container vs. whatever other types of sections there are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am interpreting this as the code being implemented. If it is legacy, then it's just the code. If EOF, it is the code in the section being currently implemented.
src/ethereum/prague/vm/eof.py
Outdated
) | ||
|
||
|
||
def validate_header(container: bytes) -> EOFMetadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this separate from meta_from_valid_eof1_container
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combined into one function with a flag to validate
src/ethereum/prague/vm/eof.py
Outdated
raise InvalidEOF("Invalid input/output for first section") | ||
|
||
|
||
def get_valid_instructions(code: bytes) -> Set[int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have a better name for this. From just the signature, it looks like it returns a uniqued list of all the opcodes used in code
, but I think it returns offsets instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has now been integrated into the validate_code_section
function.
f672f5d
to
47180a1
Compare
47180a1
to
5869072
Compare
999635f
to
6a7de73
Compare
Rebased on latest |
6a7de73
to
0157e72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review (5/22)
src/ethereum/prague/fork.py
Outdated
except InvalidEof: | ||
output = MessageCallOutput( | ||
gas, U256(0), tuple(), set(), set(), InvalidEof(), b"" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should preserve the original exception.
except InvalidEof: | |
output = MessageCallOutput( | |
gas, U256(0), tuple(), set(), set(), InvalidEof(), b"" | |
) | |
except InvalidEof as error: | |
output = MessageCallOutput( | |
gas, U256(0), tuple(), set(), set(), error, b"" | |
) |
if data > MAX_ADDRESS_U256: | ||
raise ValueError("Address is too large") | ||
return Address(data.to_be_bytes32()[-20:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be more clear if we used to_be_bytes
, left_pad_zero_bytes
, and Address.LENGTH
?
@@ -62,7 +90,7 @@ def compute_contract_address(address: Address, nonce: Uint) -> Address: | |||
return Address(padded_address) | |||
|
|||
|
|||
def compute_create2_contract_address( | |||
def compute_contract_address_2( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps?
def compute_contract_address_2( | |
def compute_contract_address_with_salt( |
return Address(data.to_be_bytes32()[-20:]) | ||
|
||
|
||
def compute_contract_address_1(address: Address, nonce: Uint) -> Address: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def compute_contract_address_1(address: Address, nonce: Uint) -> Address: | |
def compute_contract_address_with_nonce(address: Address, nonce: Uint) -> Address: |
Perhaps?
elif isinstance(target, Address): | ||
current_target = target | ||
msg_data = data | ||
code = get_account(env.state, target).code | ||
if code_address is None: | ||
code_address = target | ||
|
||
if get_eof_version(code) == EofVersion.LEGACY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting really nested. Is there a good way to refactor this?
|
||
if get_eof_version(code) == EofVersion.LEGACY: | ||
eof = None | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a catch-all here will make this more difficult to catch if we add another EofVersion
variant.
Is mypy smart enough to let us do something like:
eof_version = get_eof_version(code)
if eof_version == EofVersion.LEGACY:
...
elif eof_version == EofVersion.EOF1:
...
else:
assert_never(eof_version)
IIRC, assert_never
is available in typing_extensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More review (14 / 22)
stack_height: Optional[OperandStackHeight] | ||
|
||
|
||
SectionMetadata = Dict[Uint, InstructionMetadata] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly because I'm just curious, but why Dict[Uint, InstructionMetadata]
over List[Optional[InstructionMetadata]]
? Is this very hole-y data?
from ..instructions import Ops | ||
|
||
eof: Eof | ||
sections: Dict[Uint, SectionMetadata] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sections: Dict[Uint, SectionMetadata] | |
sections: SectionMetadata |
Maybe?
The current validator instance. | ||
""" | ||
code = validator.current_code | ||
position = Uint(validator.current_pc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't current_pc
already a Uint
?
target_type = eof_meta.type_section_contents[target_index] | ||
target_outputs = target_type[1] | ||
|
||
if target_outputs == 0x80: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does 0x80
come from? Should it be a constant?
current_outputs = current_section_type[1] | ||
target_outputs = target_section_type[1] | ||
|
||
if target_outputs != 0x80 and target_outputs > current_outputs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is the same 0x80
, and so should be in a constant.
raise InvalidEof("Kind code not specified in header") | ||
kind_code = container[counter] | ||
counter += 1 | ||
if validate and kind_code != 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to disentangle validation from the other logic? Perhaps in two functions, maybe with a dataclass to store intermediate values?
Just seems like this function is a bit complex.
if op_metadata.stack_height.max > computed_maximum_stack_height: | ||
computed_maximum_stack_height = op_metadata.stack_height.max | ||
|
||
if computed_maximum_stack_height > 1023: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make an EOF max stack height constant?
] | ||
for index in range(len(eof_meta.container_section_contents)): | ||
if ( | ||
index in eofcreate_references |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make some intermediate variables to clean up these conditions?
Maybe:
eofcreate_seen = index in eofcreate_references
returncontract_seen = index in returncontract_references
if eofcreate_seen and returncontract_seen:
...
if ( | ||
len(container) < EOF_MAGIC_LENGTH | ||
or container[:EOF_MAGIC_LENGTH] != EOF_MAGIC | ||
): | ||
raise InvalidEof("Invalid magic") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, maybe:
if ( | |
len(container) < EOF_MAGIC_LENGTH | |
or container[:EOF_MAGIC_LENGTH] != EOF_MAGIC | |
): | |
raise InvalidEof("Invalid magic") | |
if len(container) < EOF_MAGIC_LENGTH: | |
raise InvalidEof("Too short for magic") | |
if container[:EOF_MAGIC_LENGTH] != EOF_MAGIC: | |
raise InvalidEof("Invalid magic") |
Personally I find separate if
blocks easier to read than a single long if
condition.
|
||
# PROGRAM COUNTER | ||
relative_offset = int.from_bytes( | ||
evm.code[evm.pc + 1 : evm.pc + 3], "big", signed=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ever possible to get an IndexError
here, or is this safe because of the EOF validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last bit
if evm.eof is None and Uint(return_data_start_position) + Uint(size) > len( | ||
evm.return_data | ||
): | ||
raise OutOfBoundsRead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe?
if evm.eof is None and Uint(return_data_start_position) + Uint(size) > len( | |
evm.return_data | |
): | |
raise OutOfBoundsRead | |
if evm.eof is None: | |
if Uint(return_data_start_position) + Uint(size) > len(evm.return_data): | |
raise OutOfBoundsRead |
charge_gas(evm, copy_gas_cost + extend_memory.cost) | ||
|
||
# OPERATION | ||
assert evm.eof is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume there's a check somewhere else that enforces datacopy
is only used in EOF contracts? Should we leave a comment explaining that?
b234ede
to
0dbb07f
Compare
Rebased on current |
This commit updates to the latest test releases and fixes some minor bugs exposed by the increased coverage. In particular, the bugs related to faulty return flag in the type section are fixed. Any code section that has RETF instruction or has a JUMPF to a returning section is considered a returning section. Otherwise, the code section is considered non-returning. The output in the type section should reflect this behaviour. If not, the container is invalid
0dbb07f
to
9c12043
Compare
What was wrong?
The current specs do not support
EOF
Related to Issue #971
How was it fixed?
Implement the 11 EOF EIPs
Cute Animal Picture