Conversation

@gustavo-grieco
Collaborator

This branch contains experimental code for multi-armed bandit strategies for fuzzing smart contracts in Echidna. The code is not working yet, and it will require some changes in hbandit (which are not included in this PR yet).

The idea is to treat each sequence of transactions in the corpus as an arm, and update its reward based on whether mutations of that sequence lead to new coverage. This allows the fuzzer to adaptively prioritize sequences that are empirically more likely to yield interesting behaviors, rather than relying solely on recency-based heuristics.
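The arm/reward loop described above can be sketched with a textbook UCB1 policy. This is an illustrative Python sketch, not the Haskell/hbandit code in the branch; the class and method names are placeholders:

```python
import math

class SequenceBandit:
    """UCB1 bandit over corpus transaction sequences (illustrative sketch).

    Each arm is a sequence id; reward is 1 when mutating that sequence
    produced new coverage, 0 otherwise.
    """

    def __init__(self):
        self.counts = {}   # arm -> times selected
        self.rewards = {}  # arm -> cumulative reward
        self.total = 0     # total selections across all arms

    def add_arm(self, seq_id):
        # New corpus entries start with no history.
        self.counts.setdefault(seq_id, 0)
        self.rewards.setdefault(seq_id, 0.0)

    def select(self):
        # Play every arm once before applying the UCB1 formula.
        for arm, n in self.counts.items():
            if n == 0:
                return arm

        def ucb(arm):
            mean = self.rewards[arm] / self.counts[arm]
            bonus = math.sqrt(2 * math.log(self.total) / self.counts[arm])
            return mean + bonus

        return max(self.counts, key=ucb)

    def update(self, arm, new_coverage):
        # Binary reward: did the mutated sequence hit new coverage?
        self.counts[arm] += 1
        self.total += 1
        self.rewards[arm] += 1.0 if new_coverage else 0.0
```

Under this policy, a sequence whose mutations keep finding new coverage is selected more often, while the exploration bonus guarantees that stale sequences are still revisited occasionally.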

Key points:

  • Sequence selection: Each transaction sequence in the corpus is represented as a bandit arm. Rewards are updated when a mutation of a sequence produces new coverage. This is intended to focus fuzzing effort on sequences that are more productive, independently of their age in the corpus.
  • Reward assignment: Only sequences that actually produce new coverage are rewarded. Early in the campaign, when the corpus is small or empty, rewards are sparse and must be carefully interpreted to avoid bias from initial coverage gains.
  • Sequence length considerations: Long sequences can produce sparse feedback, making reward assignment noisy. Experiments may explore shorter sequences initially, then incrementally extend sequences that are proven productive, following principles similar to AFL/libFuzzer incremental input sizing.
  • Mutation strategy: While the current implementation applies generic mutations, the architecture allows for potential extension to mutation-level bandits, where different mutation operators are prioritized based on coverage yield.
  • Integration with existing scheduling: The existing recency-based selection (weighted by epochs/(corpus size)) is retained for early experimentation. The bandit system can be compared against it to evaluate its impact on coverage efficiency and on the discovery of deep contract paths.
  • Limitations: This is an experimental branch; credit assignment, reward propagation, and sequence length heuristics may require tuning. Extensive experimentation will be necessary to assess whether bandit-guided selection improves coverage and reduces wasted fuzzing effort compared to the current heuristic scheduling.
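The incremental sequence-sizing idea from the list above can be sketched as a small heuristic: keep sequences short until their empirical coverage-hit rate justifies extending them, in the spirit of AFL/libFuzzer input growth. The function, its parameters, and the 0.3 threshold are all assumptions for illustration, not values from the branch:

```python
def maybe_extend(sequence, pulls, hits, gen_tx, max_len=32, threshold=0.3):
    """Grow a productive transaction sequence by one transaction.

    `sequence` is a list of transactions, `pulls`/`hits` are how often the
    sequence was fuzzed and how often that yielded new coverage, and
    `gen_tx` is a caller-supplied generator for a fresh transaction.
    Hypothetical helper; Echidna's real sizing logic may differ.
    """
    hit_rate = hits / pulls if pulls else 0.0
    if len(sequence) < max_len and hit_rate >= threshold:
        # Productive enough: earn one more transaction at the end.
        return sequence + [gen_tx()]
    # Unproductive or already at the cap: keep the sequence as-is.
    return sequence
```

This keeps reward feedback dense early on (short sequences, quick attribution) and only pays the cost of long sequences once they have demonstrated value.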
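The mutation-level extension mentioned above could reuse the same machinery with mutation operators as arms. A minimal epsilon-greedy sketch, where the operator names are placeholders and not Echidna's actual mutators:

```python
import random

class MutationBandit:
    """Epsilon-greedy bandit over mutation operators (speculative sketch)."""

    def __init__(self, operators, epsilon=0.1):
        self.epsilon = epsilon
        self.stats = {op: [0, 0.0] for op in operators}  # op -> [pulls, reward]

    def pick(self):
        # Explore a random operator with probability epsilon,
        # otherwise exploit the best empirical mean (unseen ops first).
        if random.random() < self.epsilon:
            return random.choice(list(self.stats))
        return max(self.stats, key=lambda op: (
            self.stats[op][1] / self.stats[op][0]
            if self.stats[op][0] else float("inf")
        ))

    def report(self, op, new_coverage):
        # Same binary coverage reward as for sequence arms.
        stats = self.stats[op]
        stats[0] += 1
        stats[1] += 1.0 if new_coverage else 0.0
```

A two-level scheme (first pick a sequence arm, then a mutation arm) would let both layers share the coverage signal, at the cost of even sparser per-arm feedback.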

Overall, this PR aims to set the foundation for adaptive fuzzing using multi-armed bandits in Echidna, enabling systematic exploration of whether learned prioritization of sequences can outperform simple recency-based heuristics.
