@starius starius commented Nov 6, 2025

lnd v0.20.0-rc3 delays ChainNotifier startup which causes Loop to hit "chain notifier RPC is still in the process of starting" during initial subscriptions (LND commit lightningnetwork/lnd@c6f458e). Add a shared retry helper in lndclient so block epoch, confirmation and spend registrations transparently retry until the sub-server is ready, along with regression tests covering the behavior.

With lnd PR lightningnetwork/lnd#10352 the server now returns gRPC codes.Unavailable instead of codes.Unknown, so the helper accepts either signal (status code or a string).

Pull Request Checklist

  • PR is opened against correct version branch.
  • Version compatibility matrix in the README and minimal required version
    in lnd_services.go are updated.
  • Update macaroon_recipes.go if your PR adds a new method that is called
    differently than the RPC method it invokes.

@mohamedawnallah mohamedawnallah left a comment

Thanks! The changeset LGTM overall. I have left one main comment as a thought; otherwise it seems good to go!

// chainNotifierRetryBackoff defines the delay between successive
// subscription attempts while waiting for the ChainNotifier sub-server
// to become operational.
chainNotifierRetryBackoff = 500 * time.Millisecond

I like that fixed-delay backoff value. It seems like a good heuristic 👍

// v0.20.0-rc3+ when a ChainNotifier RPC is invoked before the
// sub-server finishes initialization.
chainNotifierStartupMessage = "chain notifier RPC is still in the " +
"process of starting"

observation: It seems the goal here is to check that the ChainRPC sub-server is available before registering a spend notification. That is achieved by attempting to register the spend notification and checking for that exact error message. If the registration returns no error, we assume the server has started, which is why the call succeeded.

@mohamedawnallah mohamedawnallah Nov 6, 2025

thought: What about periodically checking the wallet's synced_to_chain field from the LND GetInfo RPC call? We know that synced_to_chain would be true after the server has started (with almost no delay, since we got the most recent height), as seen in the code block referenced below. That way we don't need to match on exact raw strings, which can be volatile. It could also let us remove a fair amount of code from the retry mechanism below.

https://github.com/lightningnetwork/lnd/blob/096ab65b1d5b1f5b79b4e3ea2659e904de0eeda2/lnd.go#L703-L745

Contributor Author

Thanks for the idea! I found that LND returns gRPC code Unknown with that error. I sent a PR to LND, lightningnetwork/lnd#10352, so that it returns Unavailable, which suits this condition better (the server is not ready to serve the request yet, try again later).

On the lndclient side I kept both checks for now. When the LND PR is merged and that version is released, we can drop the string-matching code, leaving only the code comparison, which is much more reliable.

Thanks for the idea! I found that LND returns gRPC code Unknown with that error. I sent a PR to LND, lightningnetwork/lnd#10352, so that it returns Unavailable, which suits this condition better (the server is not ready to serve the request yet, try again later).

On the lndclient side I kept both checks for now. When the LND PR is merged and that version is released, we can drop the string-matching code, leaving only the code comparison, which is much more reliable.

Sounds good

@starius starius force-pushed the fix-chainnotifier-lnd-20-rc3 branch from 6265f73 to 0e692ef Compare November 7, 2025 01:24
lnd v0.20.0-rc3 delays ChainNotifier startup which causes Loop to hit
"chain notifier RPC is still in the process of starting" during initial
subscriptions (LND commit c6f458e478f9ef2cf1d394972bfbc512862c6707).
Add a shared retry helper in lndclient so block epoch, confirmation and spend
registrations transparently retry until the sub-server is ready, along with
regression tests covering the behavior.

With lnd PR lightningnetwork/lnd#10352 the server now
returns gRPC codes.Unavailable instead of codes.Unknown, so the helper accepts
either signal (status code or a string).
@starius starius force-pushed the fix-chainnotifier-lnd-20-rc3 branch from 0e692ef to 8dc4a0c Compare November 7, 2025 01:53
@mohamedawnallah mohamedawnallah left a comment

LGTM, Thanks!

ziggie1984 commented Nov 7, 2025

why are we not using this in the lndclient before doing any calls: https://lightning.engineering/api-docs/api/lnd/state/subscribe-state/#grpc ?

Waiting until the server is Active ?

bhandras commented Nov 7, 2025

why are we not using this in the lndclient before doing any calls: https://lightning.engineering/api-docs/api/lnd/state/subscribe-state/#grpc ?

Waiting until the server is Active ?

The usual pattern is to call GetInfo to ensure LND is ready, but I agree that maybe integrating the state RPC is better.
