I am using a commercial proxy service that works in the following way: you get a so-called "super-proxy" URL, which, when used as the proxy for your requests, is supposed to route every request through a different, randomly chosen proxy IP. The problem I have is that this does not seem to be the case with an async client that also does retries. The following is my implementation of a retrying async client:

```python
import asyncio
import logging
from typing import Any

import httpx

logger = logging.getLogger(__name__)


class AsyncRetryingClient(httpx.AsyncClient):
    def __init__(self, name: str, throttle_response_code: int = 403, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        self.name = name
        self.throttle_response_code = throttle_response_code
        # Counters kept purely for diagnostics/logging.
        self.requests_total = 0
        self.requests_timedout = 0
        self.requests_errored = 0
        self.requests_accepted = 0
        self.requests_throttled = 0

    async def send(self, request: httpx.Request, *args: Any, **kwargs: Any) -> httpx.Response:
        tries = 10
        delay_seconds = 1  # initial backoff in seconds; doubled after each throttled response
        while tries > 0:
            self.requests_total += 1
            try:
                response = await super().send(request, *args, **kwargs)
            except httpx.TimeoutException as e:
                self.requests_timedout += 1
                logger.error(f"Timeout {self.timeout} reached while requesting {request.url}: {e}. Retrying...")
                tries -= 1
                continue
            except Exception as e:
                self.requests_errored += 1
                logger.error(f"Unexpected error while requesting {request.url}: {e}. Retrying...")
                tries -= 1
                continue
            if response.status_code == self.throttle_response_code:
                self.requests_throttled += 1
            else:
                self.requests_accepted += 1
            logger.debug(
                f"{self.__class__.__name__}({self.name}): "
                f"requests total: {self.requests_total}, "
                f"requests timed out: {self.requests_timedout}, "
                f"requests errored: {self.requests_errored}, "
                f"requests succeeded: {self.requests_accepted}, "
                f"requests throttled: {self.requests_throttled}",
            )
            # if response.status_code >= 400:  # TODO: 404?
            if response.status_code < 400:
                return response
            tries -= 1
            await asyncio.sleep(delay_seconds)
            delay_seconds *= 2
        else:  # all tries exhausted
            raise RuntimeError("Boom!")


client = AsyncRetryingClient(
    name="my_client",
    throttle_response_code=403,
    timeout=httpx.Timeout(5),
    base_url="https://example.com/",
    headers={"User-Agent": "httpx"},
    proxy="https://username:password@my-super-proxy.com",
)
```

If there is a timeout, or any other request error, or if I get a 403 back (meaning I am being throttled), I'd want to retry the request, hoping that the retry will be sent through a new proxy address. This unfortunately does not seem to work, because if I get a 403 once, on every retry I get a 403, too.
Before I start digging any further, I would like to understand whether what I am trying to do would actually work. Firstly, would every retried request actually go out through a new connection to the super-proxy (and therefore through a new proxy IP), or would httpx reuse a pooled keep-alive connection to the proxy?
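A minimal sketch of how the client above could be driven (the endpoint path is just a placeholder):

```python
import asyncio

async def main() -> None:
    # ".get()" builds the request and goes through the overridden send(),
    # so timeouts, errors and 403s are retried as described above.
    async with client:
        response = await client.get("/some/endpoint")
        print(response.status_code)

asyncio.run(main())
```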
Replies: 1 comment
The behaviour I see is indeed caused by connection pooling, because if I apply `httpx.Limits(max_keepalive_connections=0)` I am not getting 403s anymore. Which means that the keep-alive connection does include the connection to the proxy, which in its turn is once again proven by the fact that in `httpcore`, `HTTPProxy` is a subclass of `ConnectionPool`.

My guess is it would not be possible to use connection pooling with a "super-proxy" if you want the "super-proxy" to do what it is supposed to do: give you a random proxy IP every time.
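If that is the case, a minimal sketch of a workaround, assuming the AsyncRetryingClient class from the question (the proxy URL and credentials are placeholders), is to disable keep-alive connections so that every attempt has to open a fresh connection to the super-proxy:

```python
import httpx

client = AsyncRetryingClient(
    name="my_client",
    throttle_response_code=403,
    timeout=httpx.Timeout(5),
    base_url="https://example.com/",
    headers={"User-Agent": "httpx"},
    proxy="https://username:password@my-super-proxy.com",
    # No keep-alive connections: each request (and each retry) opens a new
    # connection to the super-proxy, which should then assign a new exit IP.
    limits=httpx.Limits(max_keepalive_connections=0),
)
```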