core/state: move state log mechanism to a separate layer #30569
core/blockchain.go
Outdated
var wStateDb = vm.StateDB(statedb)
if w := statedb.Wrapped(); w != nil {
	wStateDb = w
}
This is one of the hacks, going from *state.StateDB to vm.StateDB
ec06411 to 4a25e24
After having considered it some more, I am even more convinced that the approach of #30441, adding read-hooks inside the It does not discriminate between event sources.
The "solution" to these sorts of problems would be to, in certain situations, disable the With the layered solution, there are no such complexities, as long as we can switch between the logging-statedb and the non-logging-statedb.
Switching between one and the other can probably be done in many ways, I'm open to suggestions. One way would be to have two interfaces:

type LoggingEnabled interface {
	WithLoggingDisabled() vm.StateDB
}
type LoggingDisabled interface {
	WithLoggingEnabled() vm.StateDB
}

Example of how that would look, going from a dual-layered logging statedb to a single-layered raw statedb:

if evm.Config.Tracer != nil && evm.Config.Tracer.OnTxStart != nil {
	ctx := evm.GetVMContext()
	ctxCopy := *ctx // shallow copy (note: &(*ctx) would just alias ctx)
	newctx := &ctxCopy
	if sdb, ok := ctx.StateDB.(vm.LoggingEnabled); ok {
		newctx.StateDB = sdb.WithLoggingDisabled()
	}
	evm.Config.Tracer.OnTxStart(newctx, tx, msg.From)
	if evm.Config.Tracer.OnTxEnd != nil {
		defer func() {
			evm.Config.Tracer.OnTxEnd(receipt, err)
		}()
	}
}
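To make the two-interface idea concrete, here is a minimal, self-contained sketch with toy types. Everything in it apart from the names `LoggingEnabled`, `LoggingDisabled`, `WithLoggingDisabled` and `WithLoggingEnabled` is an assumption made for illustration: `StateDB` is a one-method stand-in for `vm.StateDB`, and `rawState`, `loggingState` and `unlogged` are hypothetical types, not code from this PR.

```go
package main

import "fmt"

// StateDB is a toy stand-in for vm.StateDB (assumption: a single method).
type StateDB interface {
	AddBalance(addr string, amount uint64)
}

// The two switching interfaces proposed in the comment above.
type LoggingEnabled interface {
	WithLoggingDisabled() StateDB
}
type LoggingDisabled interface {
	WithLoggingEnabled() StateDB
}

// rawState is the hypothetical non-logging layer.
type rawState struct {
	balances map[string]uint64
}

func (r *rawState) AddBalance(addr string, amount uint64) {
	r.balances[addr] += amount
}

// loggingState wraps rawState and reports every balance change to a hook.
type loggingState struct {
	inner *rawState
	hook  func(addr string, prev, next uint64)
}

func (l *loggingState) AddBalance(addr string, amount uint64) {
	prev := l.inner.balances[addr]
	l.inner.AddBalance(addr, amount)
	if l.hook != nil {
		l.hook(addr, prev, prev+amount)
	}
}

// WithLoggingDisabled returns a view on the same state that emits no events.
func (l *loggingState) WithLoggingDisabled() StateDB {
	return &unlogged{rawState: l.inner, parent: l}
}

// unlogged is the raw view; it remembers its parent so logging can be restored.
type unlogged struct {
	*rawState
	parent *loggingState
}

func (u *unlogged) WithLoggingEnabled() StateDB { return u.parent }

func main() {
	raw := &rawState{balances: map[string]uint64{}}
	logged := &loggingState{inner: raw, hook: func(addr string, prev, next uint64) {
		fmt.Printf("balance of %s: %d -> %d\n", addr, prev, next)
	}}
	logged.AddBalance("alice", 10) // observed by the hook
	quiet := logged.WithLoggingDisabled()
	quiet.AddBalance("alice", 5) // same underlying state, no event
	fmt.Println("final balance:", raw.balances["alice"])
}
```

Both views share one underlying state, so handing the unlogged view to a tracer lets it query state without generating events of its own.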
cae550b to 68a0aff
Minus the ugliness regarding swapping between shimmed and non-shimmed state, this PR is mostly done. Ideas for how to make the switching nicer are appreciated.
My understanding is that the main use-case for the read hooks is to collect the prestate for the transaction/call. So the ordering, and how often we emit an OnReadBalance, doesn't matter for the tracers. If that were to matter, you are right: we would then need to add a reason to specify what the read is about.
This was an interesting realization. I think the friction point here is statedb emitting logs for the same methods that are exposed to the tracers via a statedb instance. Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb. But I'd ask for opinions from users before committing to that. Generally I am ok with your approach if it allows us to keep the read hooks :)
Honestly I don't like so much that we are exposing statedb to the tracers. We had to do it exactly to fetch prestate values. So IMO if we add read hooks we can drop the statedb.
Well, the prestate tracer is perhaps the first driving usecase, but querying state from a tracer is *very* useful and powerful. I would prefer it to remain, definitely! Trying to collect state by catching per-scope readhooks sounds like a nightmare in comparison :)
Generally I am ok with your approach if it allows us to keep the read hooks :)
With my approach as a basis, I won't object to you read-hooking the entire statedb interface.
I think if it's possible to remove state read access from tracers, we should do it. Full state access will become impossible later with stateless clients, so it will have to be removed at that time anyway.
Well, I would assume users of somewhat advanced tracing to be using stateful clients. Anyway, it is a decision unrelated to this PR. It is related to the other PR, since that one doesn't have the ability to switch between logging/nonlogging statedb. If we want to remove state access, let someone make a PR and we can discuss it then, IMO. @fjl, you are a type artist. Any ideas for making the hacks in this PR neater?
@rjl493456442 implemented an alternative state overriding here: #29950. In this implementation, we switch out the backend from underneath the

// Reader defines the interface for accessing accounts and storage slots
// associated with a specific state.
type Reader interface {
// Account retrieves the account associated with a particular address.
//
// - Returns a nil account if it does not exist
// - Returns an error only if an unexpected issue occurs
// - The returned account is safe to modify after the call
Account(addr common.Address) (*types.StateAccount, error)
// Storage retrieves the storage slot associated with a particular account
// address and slot key.
//
// - Returns an empty slot if it does not exist
// - Returns an error only if an unexpected issue occurs
// - The returned storage slot is safe to modify after the call
Storage(addr common.Address, slot common.Hash) (common.Hash, error)
// Copy returns a deep-copied state reader.
Copy() Reader
}

But if the core parts of So we could have e.g. I am not really sure what are the pros and cons with either approach. @rjl493456442 any thoughts?
0a4002e to 54d7790
core/state/statedb_logger.go
Outdated
func (s *stateDBLogger) SetCode(address common.Address, code []byte) {
	s.StateDB.SetCode(address, code)
	if s.hooks.OnCodeChange != nil {
		s.hooks.OnCodeChange(address, types.EmptyCodeHash, nil, crypto.Keccak256Hash(code), code)
Huh, for some reason I thought selfdestruct code removal was also going through this pathway. Then those prevCodehash and prevCode fields are mostly useless. I guess we can keep them for consistency.
LGTM
core/state/statedb_logger.go
Outdated
)

type stateDBLogger struct {
	*StateDB
undo this
…te layer
core/state: wip move state log mechanism in to a separate layer
core/state: fix tests
core/state: fix miscalc in OnBalanceChange
eth/tracers: fix tests
core/state: re-enable verkle witness generation
internal/ethapi: fix simulation + new logging schema
…g burn
core/state, core/vm: refactor statedb hooking
core/state: trace consensus finalize and system calls
eth/tracers/internal/tracetest: fix tests after refactor
core/state: some renaming and cleanup of statedb-hooking system
core/state: remove unecessary methods, implement hooked subbalance, more testing
16fa089 to 8ca0cd5
@@ -98,7 +99,10 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg
		receipts = append(receipts, receipt)
		allLogs = append(allLogs, receipt.Logs...)
	}

	var tracingStateDB = vm.StateDB(statedb)
Why don't we create a hookedStatedb in the first place, ahead of transaction execution?
Now we create a hooked one for each transaction, which looks wasteful to me. Perhaps I'm missing something?
defer func() {
	evm.Config.Tracer.OnTxEnd(receipt, err)
}()
var tracingStateDB = vm.StateDB(statedb)
Why don't we allocate a hooked statedb and reuse it for all transactions?
res, err := applyMessageWithEVM(ctx, evm, msg, timeout, gp)
// If an internal state error occurred, let that have precedence. Otherwise,
// a "trie root missing" type of error will masquerade as e.g. "insufficient gas"
if err := state.Error(); err != nil {
I don't understand why we need this change.
Originally, in applyMessageWithEVM, state.Error is checked first and the error is returned if it's non-nil.
Now the state.Error check is removed from applyMessageWithEVM and performed here instead.
These two approaches should be totally equivalent, shouldn't they?
	return prev
}

func (s *hookedStateDB) SetNonce(address common.Address, nonce uint64) {
Should we rename this SetNonce to IncreaseNonce? The semantics always expect the nonce to be incremented by 1.
core/state/statedb_hooked.go
Outdated
func (s *hookedStateDB) Selfdestruct6780(address common.Address) uint256.Int {
	prev := s.StateDB.Selfdestruct6780(address)
	if s.hooks.OnBalanceChange != nil && !prev.IsZero() {
It could be wrong here.
SelfDestruct6780 will only change the state if the account is newly created within the same transaction; otherwise it's a no-op.
We should somehow determine whether the state has changed, and only then invoke the hook.
Right. I guess if the SelfDestruct6780(self) happens, and it's not in same-tx, there's no balance-change. This code can just check the current balance before calling the hook?
Yeah, we could check the balance beforehand, and not invoke the hooks if the balance is already zero.
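A rough sketch of the concern raised in this thread, using toy types rather than the real `StateDB`. The `changed` return value on the inner method is an assumption introduced here to make the "did the state change?" question explicit; the thread itself only suggests checking the balance. `account`, `state` and `hooked` are hypothetical names.

```go
package main

import "fmt"

// account is a toy account record.
type account struct {
	balance     uint64
	createdInTx bool // true if the account was created in the current transaction
	destructed  bool
}

// state is a toy inner (non-hooked) state.
type state struct {
	accounts map[string]*account
}

// selfdestruct6780 only takes effect for accounts created in the same
// transaction. It returns the previous balance and whether state changed.
func (s *state) selfdestruct6780(addr string) (prev uint64, changed bool) {
	acc := s.accounts[addr]
	if acc == nil {
		return 0, false
	}
	prev = acc.balance
	if !acc.createdInTx {
		return prev, false // no-op for pre-existing accounts
	}
	acc.balance = 0
	acc.destructed = true
	return prev, true
}

// hooked wraps state and emits a balance-change event only when one occurred.
type hooked struct {
	inner           *state
	onBalanceChange func(addr string, prev, next uint64)
}

func (h *hooked) selfdestruct6780(addr string) uint64 {
	prev, changed := h.inner.selfdestruct6780(addr)
	// Skip the hook if nothing changed or if the balance was already zero.
	if h.onBalanceChange != nil && changed && prev != 0 {
		h.onBalanceChange(addr, prev, 0)
	}
	return prev
}

func main() {
	s := &state{accounts: map[string]*account{
		"old": {balance: 5},
		"new": {balance: 7, createdInTx: true},
	}}
	h := &hooked{inner: s, onBalanceChange: func(addr string, prev, next uint64) {
		fmt.Printf("%s: %d -> %d\n", addr, prev, next)
	}}
	h.selfdestruct6780("old") // no event: the call is a no-op
	h.selfdestruct6780("new") // one event: the balance is cleared
}
```

Checking only `prev != 0`, as in the excerpt under review, would fire the hook for the pre-existing account even though its balance is untouched.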
@rjl493456442 your commit 8c7526c undoes an intentional change:
350d4bb to 21f5a77
@holiman I undid my cleanup commit and pushed various fixes on top.
In this PR, I have moved the logging-facilities out of *state.StateDB, into a wrapping struct which implements vm.StateDB instead.

In most places, it was pretty straight-forward.

Some internal code uses the direct object-accessors to mutate the state, particularly in testing and in setting up state overrides, which means that these changes are unobservable for the hooked layer. This is fine: how we configure the overrides is not necessarily part of the API we want to publish.

The trickiest part about the layering is that when the selfdestructs are finally deleted during Finalise, there's the possibility that someone sent some ether to it, which is burnt at that point, and thus needs to be logged. The hooked layer reaches into the inner layer to figure out these events.

In package vm, the conversion from state.StateDB + hooks into a hooked vm.StateDB is performed where needed.
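The layering described above can be illustrated with a small, self-contained sketch. It is not the PR's code: `rawState` and `hookedState` are hypothetical toy types, balances are plain integers, the hook signature and the "transfer"/"burn" reason strings are assumptions, and only `AddBalance` and `Finalise` are hooked.

```go
package main

import "fmt"

// rawState is a toy inner state with no knowledge of hooks.
type rawState struct {
	balances   map[string]uint64
	destructed map[string]bool
}

func (s *rawState) AddBalance(addr string, amount uint64) { s.balances[addr] += amount }

// SelfDestruct zeroes the balance and marks the account for deletion.
func (s *rawState) SelfDestruct(addr string) {
	s.balances[addr] = 0
	s.destructed[addr] = true
}

// Finalise deletes all accounts that were marked as self-destructed.
func (s *rawState) Finalise() {
	for addr := range s.destructed {
		delete(s.balances, addr)
	}
	s.destructed = map[string]bool{}
}

// hookedState wraps rawState; unhooked methods are promoted by embedding.
type hookedState struct {
	*rawState
	onBalanceChange func(addr string, prev, next uint64, reason string)
}

func (h *hookedState) AddBalance(addr string, amount uint64) {
	prev := h.rawState.balances[addr]
	h.rawState.AddBalance(addr, amount)
	if h.onBalanceChange != nil {
		h.onBalanceChange(addr, prev, prev+amount, "transfer")
	}
}

// Finalise peeks into the inner layer: ether sent to an already
// self-destructed account is burnt on deletion and must be reported.
func (h *hookedState) Finalise() {
	if h.onBalanceChange != nil {
		for addr := range h.rawState.destructed {
			if bal := h.rawState.balances[addr]; bal != 0 {
				h.onBalanceChange(addr, bal, 0, "burn")
			}
		}
	}
	h.rawState.Finalise()
}

func main() {
	raw := &rawState{balances: map[string]uint64{}, destructed: map[string]bool{}}
	h := &hookedState{rawState: raw, onBalanceChange: func(addr string, prev, next uint64, reason string) {
		fmt.Printf("%s: %d -> %d (%s)\n", addr, prev, next, reason)
	}}
	h.SelfDestruct("contract")
	h.AddBalance("contract", 3) // ether sent after the selfdestruct
	raw.balances["override"] = 9 // direct mutation: invisible to the hooks
	h.Finalise()                 // reports the burn, then deletes the account
}
```

The direct write to `raw.balances` shows why object-accessor mutations are unobservable for the hooked layer: they never pass through a wrapped method.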