Skip to content

Conversation

@taco-paco
Copy link
Contributor

@taco-paco taco-paco commented Nov 13, 2025

Summary by CodeRabbit

  • Refactor
    • Optimized ledger truncation process to execute asynchronously, reducing blocking operations and improving system responsiveness during maintenance operations.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 13, 2025

Walkthrough

The truncation loop in the ledger truncator is moved from synchronous execution to a blocking thread pool via tokio::task::spawn_blocking. The chunked deletion, error handling, flush operations, and compaction steps are preserved, with execution deferred to a separate thread to prevent async runtime blocking.

Changes

Cohort / File(s) Summary
Async truncation refactoring
magicblock-ledger/src/ledger_truncator.rs
Offload ledger truncation loop to blocking thread using tokio::task::spawn_blocking with cloned ledger handle; move delete_slot_range and flush calls inside spawned task; await task completion and handle cancellation errors; retain post-truncation compaction

Sequence Diagram(s)

sequenceDiagram
    participant Async Runtime
    participant Thread Pool
    participant Ledger

    rect rgb(200, 240, 255)
    Note over Async Runtime,Ledger: Before: Blocking in async context
    Async Runtime->>Ledger: delete_slot_range (chunks)
    Ledger-->>Async Runtime: per-chunk results
    Async Runtime->>Ledger: flush()
    Ledger-->>Async Runtime: flush result
    end

    rect rgb(240, 220, 255)
    Note over Async Runtime,Ledger: After: Non-blocking via spawn_blocking
    Async Runtime->>Thread Pool: spawn_blocking task
    activate Thread Pool
    Thread Pool->>Ledger: delete_slot_range (chunks)
    Ledger-->>Thread Pool: per-chunk results
    Thread Pool->>Ledger: flush()
    Ledger-->>Thread Pool: flush result
    deactivate Thread Pool
    Thread Pool-->>Async Runtime: task result / cancellation
    end

    Async Runtime->>Ledger: compact_slot_range()
    Ledger-->>Async Runtime: compaction result
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify that cloning the ledger handle is safe and maintains proper reference semantics
  • Confirm error handling for delete_slot_range chunks is identical to the original logic
  • Ensure flush() error logging is preserved within the spawned task
  • Check that compact_slot_range() is correctly awaited after the blocking task completes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the primary change: moving the delete operation (truncation loop) from synchronous execution to a separate blocking thread using tokio::task::spawn_blocking.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/ledger-truncator/delete-non-blocking

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 235d41f and d62eae2.

📒 Files selected for processing (1)
  • magicblock-ledger/src/ledger_truncator.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/locks.rs:110-122
Timestamp: 2025-11-04T10:53:50.922Z
Learning: In magicblock-processor, the TransactionScheduler runs in a single, dedicated thread and will always remain single-threaded. The `next_transaction_id()` function in scheduler/locks.rs uses `unsafe static mut` which is safe given this architectural guarantee.
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
🧬 Code graph analysis (1)
magicblock-ledger/src/ledger_truncator.rs (1)
magicblock-api/src/magic_validator.rs (1)
  • ledger (735-737)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run_make_ci_test
  • GitHub Check: run_make_ci_lint
  • GitHub Check: Build Project
🔇 Additional comments (2)
magicblock-ledger/src/ledger_truncator.rs (2)

206-235: Good use of spawn_blocking to prevent async runtime blocking.

Moving the deletion loop and flush operations to a blocking thread is the correct approach for RocksDB I/O, which can block for significant periods. The logic is correctly preserved with chunked deletion and error handling.

One behavioral note: if the parent async task is cancelled while the blocking task is running, the blocking task will continue executing until completion. The await on line 233 will return a JoinError which you log, but the deletion operations won't be interrupted. This seems acceptable given the periodic nature of truncation.


233-237: Compaction proceeding after deletion failures is intentional and already documented.

The code at line 136 includes an explicit comment: "We will still compact." This confirms that the behavior on line 237—proceeding with compaction despite potential deletion task failures—is intentional, defensive design. The same pattern appears earlier at line 139, where compaction proceeds after a flush error. No preconditions prevent compaction from running, and the resilient behavior is by design.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

Manual Deploy Available

You can trigger a manual deploy of this PR branch to testnet:

Deploy to Testnet 🚀

Alternative: Comment /deploy on this PR to trigger deployment directly.

⚠️ Note: Manual deploy requires authorization. Only authorized users can trigger deployments.

Comment updated automatically when the PR is synchronized.

Copy link
Contributor

@bmuddha bmuddha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, seems like it has fixed the startup freeze.

@taco-paco taco-paco merged commit ded9c50 into master Nov 14, 2025
20 checks passed
@taco-paco taco-paco deleted the fix/ledger-truncator/delete-non-blocking branch November 14, 2025 05:48
thlorenz added a commit that referenced this pull request Nov 14, 2025
* master:
  fix: move delete onto separate thread (#629)
  feat: return ledger + accountsdb metrics (#624)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants