Is the Session object from Python's Requests library thread safe?


Python Problem Overview


Python's popular Requests library (http://docs.python-requests.org/en/latest/) is said to be thread-safe on its home page, but no further details are given. If I call requests.session(), can I then safely pass this object to multiple threads like so:

import threading
import requests

session = requests.session()
for i in range(thread_count):
    # thread_count and target are defined elsewhere
    threading.Thread(
        target=target,
        args=(session,),
        kwargs={}
    ).start()

and make requests using the same connection pool in multiple threads?

If so, is this the recommended approach, or should each thread be given its own connection pool? (Assuming the total size of all the individual connection pools summed to the size of what would be one big connection pool, like the one above.) What are the pros and cons of each approach?

Python Solutions


Solution 1 - Python

After reviewing the source of requests.session, I'm going to say the session object might be thread-safe, depending on the implementation of CookieJar being used.

Session.prepare_request reads from self.cookies, and Session.send calls extract_cookies_to_jar(self.cookies, ...), and that calls jar.extract_cookies(...) (jar being self.cookies in this case).

The source for Python 2.7's cookielib acquires a lock (threading.RLock) while it updates the jar, so it appears to be thread-safe. On the other hand, the documentation for cookielib says nothing about thread-safety, so maybe this feature should not be depended on?
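
The locking can be exercised with a small experiment. The sketch below uses http.cookiejar, the Python 3 successor to cookielib, and has several threads add distinct cookies to one jar. The make_cookie and fill_jar helpers and the example.com domain are made up for illustration. A clean run does not prove thread safety; it only shows the jar surviving concurrent writes.

```python
import threading
from http.cookiejar import Cookie, CookieJar


def make_cookie(name: str, value: str) -> Cookie:
    """Build a minimal session cookie for a hypothetical domain."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def fill_jar(jar: CookieJar, worker_id: int, count: int) -> None:
    # Each worker writes cookies with names unique to that worker.
    for i in range(count):
        jar.set_cookie(make_cookie(f"w{worker_id}-c{i}", "x"))


jar = CookieJar()
threads = [
    threading.Thread(target=fill_jar, args=(jar, worker_id, 50))
    for worker_id in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 8 workers x 50 distinct names each
cookie_count = len(jar)
```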

UPDATE

If your threads mutate attributes of the session object (headers, proxies, stream, and so on), call its mount method, or use the session in a with statement, then it is not thread-safe.
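
One way to avoid mutating shared state is to configure the session once before any thread starts and pass everything else per request. The sketch below is only an illustration: the build_request helper, the header names, and the URL are assumptions, and it uses prepare_request so that it runs without network access. In real code the same headers argument can be passed to session.get.

```python
import requests

session = requests.Session()
# Configure shared defaults once, before any worker thread exists.
session.headers["User-Agent"] = "example-client/1.0"


def build_request(url: str, request_id: str) -> requests.PreparedRequest:
    # Per-request headers are merged with the session defaults into a new
    # mapping; session.headers itself is left untouched.
    req = requests.Request("GET", url, headers={"X-Request-ID": request_id})
    return session.prepare_request(req)


prepared = build_request("https://example.com/items", "abc-123")
```

Each thread then only reads from the session's attributes, which sidesteps the mutation problem but says nothing about the other concerns raised in this thread.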

Solution 2 - Python

https://github.com/psf/requests/issues/1871 implies that Session is not thread-safe, and that at least one maintainer recommends one Session per thread.

I just opened https://github.com/psf/requests/issues/2766 to clarify the documentation.
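
The one-Session-per-thread recommendation can be sketched with threading.local. This is a minimal sketch, not something taken from the linked issues; get_session and worker are hypothetical names.

```python
import threading
import requests

_thread_local = threading.local()


def get_session() -> requests.Session:
    """Return the calling thread's own Session, creating it on first use."""
    session = getattr(_thread_local, "session", None)
    if session is None:
        session = requests.Session()
        _thread_local.session = session
    return session


def worker(results: list, index: int) -> None:
    # Two lookups in the same thread should give back the same object.
    results[index] = (get_session(), get_session())


results = [None] * 4
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The cost is that each thread keeps its own connection pool, so connections are not reused across threads.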

Solution 3 - Python

I faced the same question and went to the source code to find a solution that suited me. In my opinion, the Session class has several problems:

  1. It initializes the default HTTPAdapter in the constructor and leaks it if you mount another one to 'http' or 'https'.
  2. The HTTPAdapter implementation maintains the connection pool, which I don't think should be created on every Session instantiation.
  3. Session closes its HTTPAdapter, so you can't reuse the connection pool between different Session instances.
  4. The Session class doesn't seem to be thread-safe, according to various discussions.
  5. HTTPAdapter internally uses urllib3.PoolManager. I didn't find any obvious thread-safety problem in its source code, so I would rather trust the documentation, which says that urllib3 is thread-safe.
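
Point 1 can be observed without making any network calls. A minimal sketch, with a pool size of 20 and the example.com URL chosen only for illustration:

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# The constructor has already mounted default adapters for both schemes.
default_adapter = session.get_adapter("https://example.com")

shared_adapter = HTTPAdapter(pool_maxsize=20)
session.mount("https://", shared_adapter)
session.mount("http://", shared_adapter)

# After mounting, the session resolves both schemes to the shared adapter;
# the adapter created by the constructor is no longer used by the session.
uses_shared = session.get_adapter("https://example.com") is shared_adapter
```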

Given the points above, I didn't find anything better than overriding the Session class:

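from collections import OrderedDict

from requests import Session
from requests.adapters import HTTPAdapter
from requests.cookies import cookiejar_from_dict
from requests.hooks import default_hooks
from requests.models import DEFAULT_REDIRECT_LIMIT
from requests.utils import default_headers


class HttpSession(Session):
    def __init__(self, adapter: HTTPAdapter):
        # Session.__init__ is deliberately not called, so no default
        # HTTPAdapter is created; the attributes are set by hand instead.
        self.headers = default_headers()
        self.auth = None
        self.proxies = {}
        self.hooks = default_hooks()
        self.params = {}
        self.stream = False
        self.verify = True
        self.cert = None
        self.max_redirects = DEFAULT_REDIRECT_LIMIT
        self.trust_env = True
        self.cookies = cookiejar_from_dict({})
        self.adapters = OrderedDict()
        self.mount('https://', adapter)
        self.mount('http://', adapter)

    def close(self) -> None:
        # No-op: the shared adapter is closed by its owner, not by the session.
        pass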

and creating a session factory like this:

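from urllib3.util.retry import Retry

# Placeholder values; the original answer does not show these constants.
DEFAULT_CONNECTION_POOL_MAX_SIZE = 10
DEFAULT_RETRY_POLICY = Retry(total=3, backoff_factor=0.5)


class HttpSessionFactory:
    def __init__(self,
                 pool_max_size: int = DEFAULT_CONNECTION_POOL_MAX_SIZE,
                 retry: Retry = DEFAULT_RETRY_POLICY):
        self.__http_adapter = HTTPAdapter(pool_maxsize=pool_max_size, max_retries=retry)

    def session(self) -> Session:
        return HttpSession(self.__http_adapter)

    def close(self):
        self.__http_adapter.close()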

Finally, somewhere in the code I can write:

with self.__session_factory.session() as session:
    response = session.get(request_url)

All my session instances then reuse the same connection pool, and I can close the HttpSessionFactory when the application shuts down. I hope this helps somebody.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | DJG | View Question on Stackoverflow
Solution 1 - Python | millerdev | View Answer on Stackoverflow
Solution 2 - Python | Greg Ward | View Answer on Stackoverflow
Solution 3 - Python | vatuska | View Answer on Stackoverflow