-
Notifications
You must be signed in to change notification settings - Fork 47
chainrpc: retry notifier RPCs during startup lag #253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
mohamedawnallah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! The changeset LGTM overall, I have left one main comment as thought otherwise seems good to go!
| // chainNotifierRetryBackoff defines the delay between successive | ||
| // subscription attempts while waiting for the ChainNotifier sub-server | ||
| // to become operational. | ||
| chainNotifierRetryBackoff = 500 * time.Millisecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chainNotifierRetryBackoff defines the delay between successive subscription attempts while waiting for the ChainNotifier sub-server to become operational.
Like that fixed-delay backoff value. Seems a good heuristic value 👍
| // v0.20.0-rc3+ when a ChainNotifier RPC is invoked before the | ||
| // sub-server finishes initialization. | ||
| chainNotifierStartupMessage = "chain notifier RPC is still in the " + | ||
| "process of starting" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
observation: It seems the goal here is to check the availability of ChainRPC sub-server before registering spend notification and looks like that is achieved by attempting to registering spend notification and checking on that exact error message if no error returned form the register notification we assume the server started that is why it succeeded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: What about asserting periodically on the synced_to_chain for wallet from LND GetInfo RPC call we know that synced_to_chain is would be true after the server is started (with almost no delay since we got the most recent height) as seen in the referenced code block down below. That way we don't need to check on raw exact strings which can be volatile also this gives a chance to perhaps remove a fair amount of code in the retry mechanism down below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the idea! I found that LND returns gRPC code Unknown with that error. I sent a PR to LND lightningnetwork/lnd#10352 so it returns Unavailable, which suits better this condition (the server is not ready to serve the request yet, try again later).
On lndclient side I kept both checks for now. When LND PR is merged and that version is released, we can drop the string matching code, leaving only code comparison, which is much more reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the idea! I found that LND returns gRPC code Unknown with that error. I sent a PR to LND lightningnetwork/lnd#10352 so it returns Unavailable, which suits better this condition (the server is not ready to serve the request yet, try again later).
On lndclient side I kept both checks for now. When LND PR is merged and that version is released, we can drop the string matching code, leaving only code comparison, which is much more reliable.
Sounds good
6265f73 to
0e692ef
Compare
lnd v0.20.0-rc3 delays ChainNotifier startup which causes Loop to hit "chain notifier RPC is still in the process of starting" during initial subscriptions (LND commit c6f458e478f9ef2cf1d394972bfbc512862c6707). Add a shared retry helper in lndclient so block epoch, confirmation and spend registrations transparently retry until the sub-server is ready, along with regression tests covering the behavior. With lnd PR lightningnetwork/lnd#10352 the server now returns gRPC codes.Unavailable instead of codes.Unknown, so the helper accepts either signal (status code or a string).
0e692ef to
8dc4a0c
Compare
mohamedawnallah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks!
|
why are we not using this in the lndclient before doing any calls: https://lightning.engineering/api-docs/api/lnd/state/subscribe-state/#grpc ? Waiting until the server is Active ? |
The usual pattern is to call |
lnd v0.20.0-rc3 delays ChainNotifier startup which causes Loop to hit "chain notifier RPC is still in the process of starting" during initial subscriptions (LND commit lightningnetwork/lnd@c6f458e). Add a shared retry helper in lndclient so block epoch, confirmation and spend registrations transparently retry until the sub-server is ready, along with regression tests covering the behavior.
With lnd PR lightningnetwork/lnd#10352 the server now returns gRPC codes.Unavailable instead of codes.Unknown, so the helper accepts either signal (status code or a string).
Pull Request Checklist
in
lnd_services.goare updated.macaroon_recipes.goif your PR adds a new method that is calleddifferently than the RPC method it invokes.