A Flask Full of Whiskey (WSGI)

Serving up python web applications has never been easier with the suite of WSGI servers currently at our disposal. Both uWSGI and gunicorn behind Nginx are excellent performers for serving up a Flask app…

Yup, what more could you ask for in life right? There are a number of varieties too, to suit one’s preference. Joking aside, this article is about configuring and stress testing a few WSGI (Web Server Gateway Interface) alternatives for serving up a Python web application. Here is what we cover in this post

A simple application is written with the Flask web development framework. The only API exposed is for generating a random quotation by querying a backend resource. In this case, it is Elasticsearch that has indexed a large number of quotations.
Look at the following standalone WSGI webservers – gunicorn, uWSGI, and the default werkzeug that Flask is bundled with.
Look at the benefit of using Nginx to front the client requests that are proxied back to the above.
Use supervisor to manage the WSGI servers and Locust to drive the load test.

We go through some code/config snippets here for illustration, but the full code can be obtained from github.

1. WSGI Servers

Unless a web site is entirely static, the webserver needs a way to engage external applications to get some dynamic data. Over time many approaches have been implemented to make this exercise lean, efficient and easy. We had the good old CGI that spawned a new process for each request. Then came mod_python that embedded Python into the webserver, followed by FastCGI that allowed the webserver to tap into a pool of long-running processes to dispatch the request to. They all have their strengths and weaknesses. See the discussion and links on this stackoverflow page for example.

The current favorite is the WSGI protocol that allows for a complete decoupling of webservers and the applications they need to access. Here is a general schematic.

Figure 1. The web server and the python application communicate via an intermediate WSGI server that translates between the http and wsgi protocols. The WSGI server is not just a translator of course. It is threaded to distribute the incoming requests over multiple instances of the Flask app.

The WSGI servers are Http enabled on their own, so the client/Nginx can talk to them via Http. In the case of the uWSGI server, there is the option of uwsgi protocol as well for Nginx, and uwsgi_curl to test from command line.
Nginx proxies the request back to a WSGI server configured for that URI.
The WSGI server is configured with the Python application to call with the request. The results are relayed all the way back.

2. Application

The application is simple. The application is all of just one file – quotes.py. It allows for a single GET request.

/quotes/byId?id=INTEGER_NUMBER

1	/quotes/byId?id=INTEGER_NUMBER

The app fetches the quotation document from an Elasticsearch index with that INTEGER_NUMBER as the document ID, and renders them as follows.

Figure 2. What are we learning from this blog post?

The images and CSS are served by Nginx when available.

from flask import Flask, request, render_template, logging
from elasticsearch import Elasticsearch
client = Elasticsearch([{'host':'localhost','port':9200}])
app = Flask(__name__)

@app.route('/quotes/byId', methods=['GET'])
def getById():
    docId = request.args.get('id')
    quote = client.get(index=index, id=docId)
    return render_template('quote.html',quote=quote)

from flask import Flask, request, render_template, logging

from elasticsearch import Elasticsearch

client = Elasticsearch([{'host':'localhost','port':9200}])

app = Flask(__name__)

@app.route('/quotes/byId', methods=['GET'])

def getById():

docId = request.args.get('id')

quote = client.get(index=index, id=docId)

return render_template('quote.html',quote=quote)

In the absence of Nginx, they are sent from the static folder.

@app.route('/css/&lt;path:path>')
def css(path):
    return send_from_directory(app.static_folder + '/css/', path, mimetype='text/css')
@app.route('/images/&lt;path:path>')
def image(path):
    return send_from_directory(app.static_folder + '/images/', path, mimetype='image/jpg')

@app.route('/css/<path:path>')

def css(path):

return send_from_directory(app.static_folder + '/css/', path, mimetype='text/css')

@app.route('/images/<path:path>')

def image(path):

return send_from_directory(app.static_folder + '/images/', path, mimetype='image/jpg')

That is the entirety of the application. Identical no matter which WSGI server we choose to use. Here is the directory & file layout.

.
├── config.py     # Config options for gunicorn
├── quotes.py
├── static
│   ├── css
│   │   └── quote.css
│   ├── favicon.ico
│   └── images
│       ├── eleanor-roosevelt.jpg
│       ├── martha washington.jpg
│       └── maya angelou.jpg
├── templates
│   └── quote.html
└── wsgi.py       # Used by uWSGI

├── config.py # Config options for gunicorn

├── quotes.py

├── static

│ ├── css

│ │ └── quote.css

│ ├── favicon.ico

│ └── images

│ ├── eleanor-roosevelt.jpg

│ ├── martha washington.jpg

│ └── maya angelou.jpg

├── templates

│ └── quote.html

└── wsgi.py # Used by uWSGI

When using the built-in werkzeug as the WSGI server, we supply the runtime config in quotes.py module. Config for uWSGI and gunicorn is supplied at the time of invoking the service. We will cover that next along with that for Nginx, and for the service manager supervisord.

3. Configuration

The number of concurrent processes/workers in use by any of the WSGI servers has an impact on performance. The recommended value is about twice the number of cores but can be larger if it does not degrade the performance. We do not mess with threads per worker here as the memory footprint of our application is small. See this post for some discussion on the use of workers vs threads.

We start with 6 workers and vary it to gauge the impact. We use the same number for both gunicorn and uWSGI servers so the comparison is apples to apples. Unfortunately, there does not seem to be a way to do the same with the werkzueg server.

3.1 Supervisor

We use supervisord to manage the WSGI server processes. This allows for easier configuration, control, a clean separation of logs by app/wsgi and a UI to boot. The configuration file for each server is placed at /etc/supervisor/conf.d, and the supervisord service is started up.

[/etc/supervisor] ls conf.d/*
conf.d/gunicorn.conf  conf.d/uwsgi.conf  conf.d/uwsgi-http.conf  conf.d/werkzeug.conf

[/etc/supervisor] sudo systemctl start supervisor.service

[/etc/supervisor] ls conf.d/*

conf.d/gunicorn.conf conf.d/uwsgi.conf conf.d/uwsgi-http.conf conf.d/werkzeug.conf

[/etc/supervisor] sudo systemctl start supervisor.service

Here is a screenshot of UI (by default at localhost:9001) that shows the running WSGI servers, and controls to stop/start, tail the logs and such.

Figure 3. Supervisor service enables clean and simple management of WSGI servers

The difference between uwsgi and uwsgi-http is that the latter has a Http endpoint while the former works with the binary uwsgi protocol. We talked about this in the context of Figure 1. Let us look at the configuration files for each. Note that the paths in the config files below are placeholders with ‘…’ to be replaced appropriately as per the exact path on the disk.

3.2 gunicorn

The command field in the config invokes gunicorn. The gunicorn server works with the app object in quotes.py and makes the web api available at port 9999. Here is the config file /etc/conf.d/gunicorn.conf

[program:gunicorn]
command=.../virtualenvs/.../bin/gunicorn -c .../quoteserver/config.py -b 0.0.0.0:9999 quotes:app
directory=.../quoteserver
autostart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/gunicorn.log

[program:gunicorn]

command=.../virtualenvs/.../bin/gunicorn -c .../quoteserver/config.py -b 0.0.0.0:9999 quotes:app

directory=.../quoteserver

autostart=true

redirect_stderr=true

stdout_logfile=/var/log/supervisor/gunicorn.log

A separate file config.py is used to supply the number of threads, logging details and such.

workers = 6
threads = 1
worker_class = 'sync'
loglevel = 'ERROR'
error_log = '-' # means stdout
access_log = '-' # means stdout
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'

workers = 6

threads = 1

worker_class = 'sync'

loglevel = 'ERROR'

error_log = '-' # means stdout

access_log = '-' # means stdout

access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'

3.3 uWSGI

The uWSGI server can offer either a Http or a uwsgi endpoint as we mentioned earlier. Using the uwsgi endpoint is recommended when the uWSGI server is behind a webserver like Nginx. The configuration below is for the Http endpoint. For the uwsgi endpoint, we replace “–http 127.0.0.1:9997” with “–socket 127.0.0.1:9998“

[program:uwsgi-http]
command=.../virtualenvs/.../bin/uwsgi --chdir .../quoteserver --wsgi-file wsgi.py --http 127.0.0.1:9997 --processes 6 --threads 1 --master --disable-logging --log-4xx --log-5xx
directory=.../quoteserver
autostart=true
stdout_logfile=/var/log/supervisor/uwsgi-http.log
redirect_stderr=true

[program:uwsgi-http]

command=.../virtualenvs/.../bin/uwsgi --chdir .../quoteserver --wsgi-file wsgi.py --http 127.0.0.1:9997 --processes 6 --threads 1 --master --disable-logging --log-4xx --log-5xx

directory=.../quoteserver

autostart=true

stdout_logfile=/var/log/supervisor/uwsgi-http.log

redirect_stderr=true

The config is similar to that for gunicorn, but we do not use a separate config file. The key difference is the argument ‘–wsgi-file’ that points to the module with application object to be used by uWSGI server.

[uwsgi] more wsgi.py 

from quotes import app as application

if __name__ == "__main__":
    application.run()

[uwsgi] more wsgi.py

from quotes import app as application

if __name__ == "__main__":

application.run()

3.4 werkzeug

The options for the default werkzeug server are given as part of the app.run (…) call in quotes.py module. We disable logging in order to not impact performance numbers.

# in quotes.py
if __name__ == "__main__":
    app.logger.disabled = True
    log = logging.getLogger('werkzeug')
    log.disabled = True
    app.run(host='localhost', port=9996, debug=False, threaded=True)

# in quotes.py

if __name__ == "__main__":

app.logger.disabled = True

log = logging.getLogger('werkzeug')

log.disabled = True

app.run(host='localhost', port=9996, debug=False, threaded=True)

The only thing for left for supervisord to do is to make it run as a daemon.

[program:werkzeug]
command=.../virtualenvs/.../bin/python .../quoteserver/quotes.py
directory=.../quoteserver
autostart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/werkzeug.log

[program:werkzeug]

command=.../virtualenvs/.../bin/python .../quoteserver/quotes.py

directory=.../quoteserver

autostart=true

redirect_stderr=true

stdout_logfile=/var/log/supervisor/werkzeug.log

3.5 Nginx

When Nginx is used, we need it to correctly route the requests to the above WSGI servers. We use the URI signature to decide which WSGI server should be contacted. Here is the relevant configuration from nginx.conf.

    location /werkzeug/ {
      proxy_pass http://127.0.0.1:9996/;
    }
    location /uwsgi-http/ {
      proxy_pass http://127.0.0.1:9997/;
    }
    location /uwsgi/ {
      rewrite ^/uwsgi/(.*) /$1 break;
      include uwsgi_params;
      uwsgi_pass 127.0.0.1:9998;
    }
    location /gunicorn/ {
      proxy_pass http://127.0.0.1:9999/;
    }

location /werkzeug/ {

proxy_pass http://127.0.0.1:9996/;

}

location /uwsgi-http/ {

proxy_pass http://127.0.0.1:9997/;

}

location /uwsgi/ {

rewrite ^/uwsgi/(.*) /$1 break;

include uwsgi_params;

uwsgi_pass 127.0.0.1:9998;

}

location /gunicorn/ {

proxy_pass http://127.0.0.1:9999/;

}

We identify the WSGI server by the leading part of the URI and take care to proxy it back to its correct port we defined that server to be listening on.

3.5 Summary

With all this under the belt here is a summary diagram with the flow of calls from the client to backend when Nginx is in place.

Figure 4. The flow and routing of requests from the client when Nginx fronts the WSGI servers.

In the absence of Nginx, the client sends requests directly to the Http endpoints enabled by the WSGI servers. Clear enough – no need for another diagram.

4. Load testing with Locust

Locust is a load testing framework for Python. The tests are conveniently defined in code and the stats collected as csv files. Here is a simple script that engages Locust while also collecting system metrics at the same time. We use cmonitor_collector for gathering load, and memory usage metrics.

#!/bin/bash
/usr/bin/cmonitor_collector --sampling-interval=3 --output-filename=./system_metrics.json # start gathering memory/cpu usage sats
sleep 10
pipenv run locust -f ./load_tests.py --host localhost --csv ./results -c 500 -r 10 -t 60m --no-web
sleep 10
pid=$(ps -ef | grep 'cmonitor_collector' | grep -v 'grep' | awk '{ print $2 }')
kill -15 $pid # stop the monitor

#!/bin/bash

/usr/bin/cmonitor_collector --sampling-interval=3 --output-filename=./system_metrics.json # start gathering memory/cpu usage sats

sleep 10

pipenv run locust -f ./load_tests.py --host localhost --csv ./results -c 500 -r 10 -t 60m --no-web

sleep 10

pid=$(ps -ef | grep 'cmonitor_collector' | grep -v 'grep' | awk '{ print $2 }')

kill -15 $pid # stop the monitor

Start a system monitor to collect the load, memory usage, etc… stats
Run the tests described in load_tests.py on the localhost
Save the results to files ‘results_stats.csv’ and ‘results_stats_history.csv’.
A total of 500 users are simulated with 10 users/second added as the test starts
The test runs for 60 minutes
Locust enables a UI as well (localhost:5557) with plots and such but not used here
Stop the system monitor
Post-process the csv data from Locust, and the system metrics data to generate graphics that can be compared across the different WSGI alternatives

The only test we have to define is hitting the single API that we have exposed – …/quotes/byId?id=xxxx

from locust import HttpLocust, TaskSet, task, between
import random, sys, json

# We have 7 ways of engaging the API via Http
url = "http://localhost/gunicorn/quotes" # for nginx + gunicorn
#url = "http://localhost/uwsgi/quotes" # for nginx + uwsgi
#url = "http://localhost/uwsgi-http/quotes" # for nginx + uwsgi-http
#url = "http://localhost/werkzeug/quotes" # for nginx + werkzeug
#url = "http://localhost:9999/quotes" # for direct gunicorn
#url = "http://localhost:9997/quotes" # for direct uwsgi-http
#url = "http://localhost:9996/uwsgi/quotes" # for direct werkzeug

class UserBehaviour(TaskSet):
    @task(1)
    def getAQuote (self):
        quoteId = random.randint(1,900000)
        self.client.get(url + "/byId?id="+str(quoteId))

class WebsiteUser(HttpLocust):
    task_set = UserBehaviour
    wait_time = between(1, 3) # wait between 1 - 3 seconds

from locust import HttpLocust, TaskSet, task, between

import random, sys, json

# We have 7 ways of engaging the API via Http

url = "http://localhost/gunicorn/quotes" # for nginx + gunicorn

#url = "http://localhost/uwsgi/quotes" # for nginx + uwsgi

#url = "http://localhost/uwsgi-http/quotes" # for nginx + uwsgi-http

#url = "http://localhost/werkzeug/quotes" # for nginx + werkzeug

#url = "http://localhost:9999/quotes" # for direct gunicorn

#url = "http://localhost:9997/quotes" # for direct uwsgi-http

#url = "http://localhost:9996/uwsgi/quotes" # for direct werkzeug

class UserBehaviour(TaskSet):

@task(1)

def getAQuote (self):

quoteId = random.randint(1,900000)

self.client.get(url + "/byId?id="+str(quoteId))

class WebsiteUser(HttpLocust):

task_set = UserBehaviour

wait_time = between(1, 3) # wait between 1 - 3 seconds

The code simulates a user that waits between 1 and 3 seconds before hitting the API again with a random integer as the ID of the quote to fetch.

5. Results

Finally time for some results. Took a while to get here for sure, but we have quite a few moving pieces. Plotting the collected data is straightforward (I used matplotlib here) so we will skip the code for that. You can get plots.py from github.

We have two series of runs – (a) with 6 workers, and (b) with 60 workers for WSGI. Each series has 7 locust runs as shown in the code snippet above. Locust generates data for a variety of metrics – the number of requests, failures, response times, etc… as a function of time. Likewise, cmonitor collects data on the load, memory usage etc… of the hardware. Figure 5 below shows the results with workers.

Figure 5. Performance results with 6 workers. gunicorn and uWSGI (uwsgi protcol) perform the best with/without Nginx

The main conclusions of interest from Figure 5 (E & F) are the following.

Performance: We consider the average response time (Figure 5F). The performance of the uWSGI and gunicorn servers is comparable with/without Nginx. The default server werkzeug that Flask comes with is the worst, one of the reasons for their recommendation – do NOT use it in production. Also if you like uWSGI, go for the binary uwsgi protocol and put it behind Nginx, as it is the best. Here is the explicit order.
1. uWSGI server (uwsgi) behind Nginx
2. gunicorn server without Nginx
3. uWSGI server (Http) behind Nginx
4. gunicorn server behind Nginx
5. uWSGI server (Http) without Nginx
6. werkzeug, with/without Nginx
Why is the response time increasing? The reason is NOT because the server performance is degrading with time. Rather it is the starbucks phenomena for early morning coffee, at work! What Locust is reporting here is the total elapsed time between when the request is fired and the response is received. The rate at which we are firing requests is bigger than the rate at which the server is clearing them. So the requests get queued, and the line gets longer and longer with time. The requests that get in the line early have a smaller wait time than the ones that join the line later. This, of course, manifests as the longer response time for later requests.
Why the step increase for median and smooth for the average? The median (or any percentile) is just one integer number (milliseconds) whereas the average is, of course, the average (a float) of all the numbers. The percentile is based on the counts on either side of its current value and given the randomness – increases slowly and by quantum jumps. The average, on the other hand, increases continuously.

But there is more we can learn from here from the figures A-D.

(A) The total number of requests increases with time – of course! Kind of linear but not quite, and there is some variation between the runs too. All that is simply because the wait_time between successive requests from the simulated users has been randomized in the code snippet above.
(B) There are some failures but very, very few compared to the total number of requests served. Perhaps not enough to draw big conclusions.
(C & D) There is plenty of free memory in all cases and not much load. But it does seem that when Nginx is not used, larger memory is being consumed with the server experiencing a slightly higher load.

We are clearly not taxing the server with 6 workers. Let us bump up the workers to 60 and see what we get. This is in Figure 6 below.

Figure 6. Performance with 60 workers is qualitatively the same as with 6 workers. gunicorn and uWSGI (uwsgi protcol) are still the best with/without Nginx.

We have clearly increased the load and the memory usage (C & D) but all of our conclusions from Figure 5 still hold with the uWSGI server as the leader, followed by gunicorn. Before closing this post, let us look at response time results with 6 and 60 workers on the same plot focusing only on uWSGI and gunicorn alone.

Figure 7. Increasing the number of workers does not have a huge impact in our case. The uWSGI server behind Nginx is the best performer, followed by gunicorn with/without Nginx.

6. Conclusions

We have learned in this post that:

the uWSGI server behind Nginx is the best performer
we cannot go wrong with gunicorn either with/without Nginx
we want to avoid placing the uWSGI Http server behind Nginx as they recommend on their website
we do not use the default werkzeug server in production!

Happy learning!

A Flask Full of Whiskey (WSGI)

1. WSGI Servers

2. Application

3. Configuration

3.1 Supervisor

3.2 gunicorn

3.3 uWSGI

3.4 werkzeug

3.5 Nginx

3.5 Summary

4. Load testing with Locust

5. Results

6. Conclusions

Like this:

Related

Leave a Reply Cancel reply