Asynchrony in Flask

You probably know Flask, a Python microframework for web applications. I guess it's not the first tool that comes to mind when someone asks about asynchronous web applications. However, it's easy to add asynchrony to Flask.

The story begins with a task: fetch large files from a remote web server and serve them to the clients of a web application written with Flask. The application is deployed with uWSGI, and since the task is network-bound, a handful of HTTP requests can quickly occupy all available uWSGI workers and block other requests from being processed.

That is why I started reading the chapter on asynchronous/non-blocking modes in the uWSGI documentation, which says that uWSGI supports the Gevent loop engine. This is not the only option, but I was also looking for an asynchronous HTTP client and had heard that Gevent can monkey patch Python sockets. This led me to the great "gevent For the Working Python Developer" tutorial written by the Gevent community. Its "Real World Applications" section shows an example of a realtime chat room implemented with Flask. It replaces the Flask WSGI server with the Gevent implementation:

import flask
import gevent.pywsgi

flask_app = flask.Flask(__name__)

# ...

if __name__ == '__main__':
    gevent_server = gevent.pywsgi.WSGIServer(('', 5000), flask_app)
    gevent_server.serve_forever()  # instead of

This is not enough, though: the commented ellipsis must be replaced with a handler that downloads a large file and streams it to the client. Let the file be the photo of Seattle from Wikimedia Commons, 17,870 × 4,198 pixels and 14.59 MB in size.

It took some time playing in IPython with the Python requests library to find out how to read a response in chunks:

import requests

CHUNK_SIZE = 1024  # bytes

seattle_photo = ''  # URL of the photo
response = requests.get(seattle_photo, stream=True)  # stream=True defers downloading the body
for chunk in response.iter_content(CHUNK_SIZE):
    pass  # stream the chunk
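The chunked-read pattern can be tried without a network connection. The sketch below is only an illustration: iter_chunks is a hypothetical helper (not part of requests), and an in-memory io.BytesIO stream stands in for the HTTP response body.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield chunks of at most chunk_size bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# 2500 bytes of dummy data stand in for the downloaded file.
payload = io.BytesIO(b'x' * 2500)
sizes = [len(chunk) for chunk in iter_chunks(payload, 1024)]
print(sizes)
```

Only one chunk is held in memory at a time, which is what makes this approach suitable for large files.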

Streaming can be done in Flask if the handler returns flask.Response(iterable). Before the handler can be written, though, gevent.monkey.patch_all() must be called so that the Python requests library works asynchronously (flask_app is renamed to app below for convenience and convention):

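import gevent.monkey
gevent.monkey.patch_all()  # patch the standard library before requests is imported

import requests
from flask import Flask, Response

app = Flask(__name__)

CHUNK_SIZE = 1024  # bytes

@app.route('/seattle')  # hypothetical route path
def seattle():
    url = ''
    response = requests.get(url, stream=True)
    def downloader():
        yield ''  # give other greenlets a chance to run
        for chunk in response.iter_content(CHUNK_SIZE):
            yield chunk
    return Response(downloader(), mimetype='image/jpeg')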

The only non-obvious statement is yield ''. It makes the greenlet switch so that other requests can be accepted. Without it, the app does not seem to switch context to another request until the current download completes.
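The idea of a response body that gives up control at every yield can be illustrated with plain generators. The sketch below is only an analogy and is not how Gevent schedules greenlets: body and serve_round_robin are hypothetical names, and a simple round-robin loop stands in for the server. It shows that when each body yields chunk by chunk, two downloads can be interleaved instead of running one after the other.

```python
from collections import deque


def body(chunks):
    """Hypothetical response body: an empty chunk first, then the data."""
    yield b''
    for chunk in chunks:
        yield chunk


def serve_round_robin(bodies):
    """Pull one chunk at a time from each body in turn; return the send order."""
    sent = []
    queue = deque(bodies.items())
    while queue:
        name, generator = queue.popleft()
        try:
            chunk = next(generator)
        except StopIteration:
            continue  # this body is finished, drop it from the queue
        sent.append((name, chunk))
        queue.append((name, generator))
    return sent


order = serve_round_robin({
    'client-1': body([b'a1', b'a2']),
    'client-2': body([b'b1']),
})
for name, chunk in order:
    print(name, chunk)
```

Both clients receive their first (empty) chunk before either receives any data, so neither download has to finish before the other one starts.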

The full example is available online. It's a bit more elaborate in order to show concurrency in a terminal:

# install Python requirements
pip install flask gevent requests
# download Python example
curl -O
# launch the app
python &
# test with Apache benchmark
ab -n 3 -c 3

The output is available on GitHub.