Skip to content

Conversation

@ozkanturidi
Copy link

(feat) add retry and fail logic for bootstrap keeper with configurable timeout and maximum number of retries variables

Copy link
Contributor

@v0lkan v0lkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized a blocking issue if the api request never succeeds; recommended possible solution.

Feel free to take it, update it, mutate it.

Thanks for your contribution.

Comment on lines +32 to +33
// API. The function retries retries with exponential backoff using a configurable timeout and a maximum
// number of retry attempts per keeper. If a keeper cannot be reached within the timeout, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we limit the line lenght to 80 chars pls.

"keeper_url", keeperURL,
)

err := api.Contribute(keeperShare, keeperID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this request hangs:

  1. The keeperCtx deadline expires after timeout
  2. The context's Done() channel gets closed
  3. Nothing happens — because nothing is listening to that channel
  4. api.Contribute continues blocking forever

Suggestion: wrap this in a goroutine:

func contributeWithContext(ctx context.Context, api *spike.API, share []byte, keeperID string) error {
    done := make(chan error, 1)
    
    go func() {
        err := api.Contribute(share, keeperID)
        done <- err
    }()
    
    select {
    case err := <-done:
        return err
    case <-ctx.Done():
        return ctx.Err()
    }
}

and

err := contributeWithContext(keeperCtx, api, keeperShare, keeperID)

@v0lkan
Copy link
Contributor

v0lkan commented Dec 17, 2025

@ozkanturidi lint an unit tests are failing. — can you check them out?

Don't mind integration tests, it's a different bug; but lint and units need to pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants