Conversation
This is an attempt to fix #332 in a simple manner (not using anything fancy like urllib3.Retry). I think it should improve download performance significantly on datasets with a large number of 404 images, but I have not done much benchmarking.
I haven't found any best practices (such as RFCs) on which HTTP codes to retry, but the following should be a reasonable list (a sketch of such a retry loop follows below):

- 408 Request Timeout
- 429 Too Many Requests (respect the Retry-After header if it is given in seconds and is less than 10)
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
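For illustration only, here is a minimal sketch of what a retry loop along these lines could look like using just the Python standard library. The function name, retry count, and default backoff are invented for this example and are not taken from the actual patch to img2dataset/downloader.py.

```python
import time
import urllib.error
import urllib.request

# Status codes considered worth retrying (the list above).
RETRIABLE_STATUS = {408, 429, 500, 502, 503, 504}


def download_with_retries(url, retries=3, timeout=10, default_wait=1.0):
    """Fetch url, retrying only on the retriable status codes.

    Any other HTTP error (e.g. 404) is raised immediately instead of
    being retried.
    """
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code not in RETRIABLE_STATUS or attempt == retries:
                raise  # non-retriable code, or out of attempts
            wait = default_wait
            if err.code == 429:
                # Respect Retry-After only when it is a plain number of
                # seconds and less than 10.
                retry_after = err.headers.get("Retry-After", "")
                if retry_after.isdigit() and int(retry_after) < 10:
                    wait = int(retry_after)
            time.sleep(wait)
```

The key point of filtering on status codes is that anything outside the set above (most importantly 404) fails immediately instead of burning retry attempts.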
Owner
Try it out; if benchmark results look good, it could be a good option.