DEV Community

Chris White

Python Deployment: WSGI

In the last installment I covered various protocols that can be used to connect with a Python application. What's missing from all this is how to structure the actual application code. This article looks at WSGI as one way of solving that problem.

WSGI Basics

The Web Server Gateway Interface (WSGI) standard was first established with PEP 333, and PEP 3333 later became the current version. It standardizes a few callable signatures and outlines how the application and server should interact with each other. To help with implementation, Python includes a wsgiref module showing the basic format of a WSGI application.

Simple WSGI Server

The first useful feature of the wsgiref is a basic server that can be spun up with minimal code:

from wsgiref.simple_server import make_server, demo_app

with make_server('localhost', 8000, demo_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

Along with a simple client:

import requests

response = requests.get('http://localhost:8000')
print(response.text)

The demo_app returns "Hello world!" followed by a dump of all the environment variables:

$ python simple_client.py 
Hello world!
<snip>
QUERY_STRING = ''
REMOTE_ADDR = '127.0.0.1'
REMOTE_HOST = ''
REQUEST_METHOD = 'GET'
SCRIPT_NAME = ''
SERVER_NAME = 'localhost'
SERVER_PORT = '8000'
SERVER_PROTOCOL = 'HTTP/1.1'
SERVER_SOFTWARE = 'WSGIServer/0.2'
<snip>
wsgi.errors = <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>
wsgi.file_wrapper = <class 'wsgiref.util.FileWrapper'>
wsgi.input = <_io.BufferedReader name=4>
wsgi.multiprocess = False
wsgi.multithread = False
wsgi.run_once = False
wsgi.url_scheme = 'http'
wsgi.version = (1, 0)

Along with CGI-like variables there are WSGI-specific ones, as outlined by the standard. Now we'll look at a more expanded example to see what's going on behind the scenes.

Application Object

The application object is the callable that a WSGI server invokes for each request. A more expanded version could look something like this:

from wsgiref.simple_server import make_server
from wsgiref.validate import validator


def application(env, start_response):
    content = []
    content.append(b'Hello World ')
    content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
    content_length = sum(len(i) for i in content)

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]
    start_response(status, response_headers)

    return content

validated_app = validator(application)

with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

This returns "Hello World 1.0": a string followed by the WSGI version taken from the wsgi.version tuple. The validator wraps our application and checks that various parts of it are WSGI compliant, which is useful during development for catching potential issues quickly. In production, however, you'd be better off performance-wise exposing the bare application. The first important part of this is the application signature:

def application(env, start_response):

It takes two arguments: an environment dictionary and a start_response callable. Next is the content to pass back to the client. The WSGI standard defines this as an iterable, for which we'll use a list:

def application(env, start_response):
    content = []
    content.append(b'Hello World ')
    content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
    content_length = sum(len(x) for x in content)

Another consideration here is that each item returned must be bytes. This is why the formatted string is encoded as UTF-8. The content length is then calculated for the header value. With that, the response can be sent:

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]
    start_response(status, response_headers)

    return content

Response headers are passed in as a list of tuples. start_response is the callable that was passed into the application; supplying it is handled by the wsgiref simple server. From there the content is returned to the server, which sends it on to the client. Generator functions can be used as an alternative:

from wsgiref.simple_server import make_server
from wsgiref.validate import validator


def application(env, start_response):
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
    ]
    start_response(status, response_headers)

    def generate_content():
        yield b'Hello World '
        yield f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}\n".encode('utf-8')

    return generate_content()

validated_app = validator(application)

with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

Generators are best suited to large or expensive-to-produce responses, where building the whole body in memory first could exhaust it.
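
To illustrate the memory point, here is a minimal sketch of an application that produces a large body lazily. The generate_rows helper and the row count are made up for illustration, and the application is called directly with a throwaway environment from wsgiref.util.setup_testing_defaults rather than through a server:

```python
from wsgiref.util import setup_testing_defaults


def generate_rows(count):
    # Hypothetical data source: each row is built only when it is requested
    for number in range(count):
        yield f"row {number}\n".encode('utf-8')


def application(env, start_response):
    # No Content-Length: the total size is not known before generating
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return generate_rows(1_000_000)


env = {}
setup_testing_defaults(env)
body = application(env, lambda status, headers, exc_info=None: None)

# Only the rows actually consumed are ever built
first_rows = [next(body) for _ in range(3)]
print(first_rows)
```

Nothing beyond the three requested rows is generated here, which is the same behavior a server relies on when it sends each chunk as it is produced.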

Application Caller

So looking at all of this what's actually happening with the caller? What's providing start_response and wsgi.version? The server itself handles this generally one of two ways:

  1. The server is pure python, setting up the wsgi environment variables and then importing the application module to call it with the generated environment and start_response callable
  2. Mostly the same as 1 save that there is some kind of embedded interpreter / tie in with the Python C API (or cffi)

gunicorn is an example of the first approach, and mod_wsgi is an example of the second. If you're just doing development work I'd highly recommend the first. Because the server is a long-running Python process, it can also benefit from JIT optimizations if you're using PyPy.
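
To make the caller's role more concrete, here is a rough sketch of what a pure Python server does for a single request, minus the networking. The call_application helper is hypothetical and not part of any server; it leans on wsgiref.util.setup_testing_defaults to fill in the environment:

```python
from wsgiref.util import setup_testing_defaults


def application(env, start_response):
    body = b'Hello World'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


def call_application(app, method='GET', path='/'):
    """Play the role of the server for one request (hypothetical helper)."""
    env = {'REQUEST_METHOD': method, 'PATH_INFO': path}
    # Fills in any missing keys: wsgi.version, wsgi.input, SERVER_NAME, ...
    setup_testing_defaults(env)

    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would hold these until the first body chunk is written
        captured['status'] = status
        captured['headers'] = headers

    result = app(env, start_response)
    try:
        body = b''.join(result)
    finally:
        # PEP 3333: the server must call close() on the result if it has one
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], body


status, headers, body = call_application(application)
print(status, headers, body)
```

A helper along these lines is also handy for exercising an application in unit tests without opening a socket.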

File Wrapper

For serving files, WSGI also defines an optional wsgi.file_wrapper environment variable. When the server provides it, it streams a file-like object to the client in blocks:

from os.path import getsize
from wsgiref.simple_server import make_server
from wsgiref.validate import validator

def application(env, start_response):
    status = '200 OK'
    content_length = str(getsize('large-file.json'))
    response_headers = [
        ('Content-Type', 'application/json'),
        ('Content-Length', content_length)
    ]
    start_response(status, response_headers)
    return env['wsgi.file_wrapper'](open('large-file.json', 'rb'))

validated_app = validator(application)

with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

In this case I know wsgi.file_wrapper is available, but that's not always the case. Where it's missing, iter(lambda: filelike.read(block_size), b'') can be used as a replacement; the sentinel is b'' rather than '' because the file is opened in binary mode. os.path.getsize (or os.stat().st_size, which it uses behind the scenes) can be used to obtain the Content-Length value.
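
A small sketch of that fallback follows. The file_response_body helper is a name I made up, and the demonstration uses a temporary file with an empty environment to simulate a server that doesn't offer wsgi.file_wrapper:

```python
import os
import tempfile


def file_response_body(env, fileobj, block_size=8192):
    """Use the server's file wrapper when offered, else read in blocks."""
    wrapper = env.get('wsgi.file_wrapper')
    if wrapper is not None:
        return wrapper(fileobj, block_size)
    # Sentinel is b'' because the file is opened in binary mode
    return iter(lambda: fileobj.read(block_size), b'')


# Demonstration with a throwaway file and an environ lacking wsgi.file_wrapper
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 20000)

with open(tmp.name, 'rb') as fp:
    chunks = list(file_response_body({}, fp))
os.remove(tmp.name)

print([len(chunk) for chunk in chunks])
```

One difference to keep in mind: the plain iterator has no close() method, so with the fallback the application is responsible for closing the file itself.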

Input Stream

Input from the client is obtained via the wsgi.input environment variable. While it could be stdin, some solutions may utilize a buffer of some kind instead. I'll use an echo client and server to demonstrate this:

wsgi_input_client.py

import requests

post_data = {
    'test1': 'foobar',
    'test2': 'foobar',
    'test3': 'foobar'
}

r = requests.post('http://localhost:8000/', data=post_data)
print(r.content)

wsgi_input_server.py

from wsgiref.simple_server import make_server
from wsgiref.validate import validator


def application(env, start_response):
    # CONTENT_LENGTH can be missing or empty, so fall back to 0
    content_length = int(env.get('CONTENT_LENGTH') or 0)
    data = env['wsgi.input'].read(content_length)

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(data))),
    ]
    start_response(status, response_headers)
    return [data]

validated_app = validator(application)

with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

Running the server and then calling the client against it:

$ python wsgi_input_client.py 
b'test1=foobar&test2=foobar&test3=foobar'

So the way of handling input is fairly standard and not much different from how you'd implement it in a CGI script.
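
Since the body arrives as raw urlencoded bytes, a common next step is turning it into a dictionary. Here is a minimal sketch using urllib.parse.parse_qs from the standard library; the read_form helper is hypothetical, and the environment is built by hand to simulate the POST from wsgi_input_client.py without starting a server:

```python
import io
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def read_form(env):
    """Read a urlencoded request body from wsgi.input (hypothetical helper)."""
    # CONTENT_LENGTH may be missing or empty, for example on a GET request
    length = int(env.get('CONTENT_LENGTH') or 0)
    raw = env['wsgi.input'].read(length)
    return parse_qs(raw.decode('utf-8'))


# Simulate the POST made by wsgi_input_client.py
payload = b'test1=foobar&test2=foobar&test3=foobar'
env = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),
}
setup_testing_defaults(env)

form = read_form(env)
print(form)
```

parse_qs returns a list for every key because a form field can legally appear more than once.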

Exceptions

An alternative call for start_response can be used for exceptions in case something goes wrong:

import sys
from wsgiref.simple_server import make_server
from wsgiref.validate import validator


def application(env, start_response):
    try:
        content = []
        content.append(b'Hello World ')
        content.append(f"{env['wsgi.version'][0]}.{env['wsgi.version'][1]}".encode('utf-8'))
        content_length = sum(len(i) for i in content)

        status = '200 OK'
        response_headers = [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(content_length))
        ]
        start_response(status, response_headers)
        1/0  # deliberately raise ZeroDivisionError after start_response
        return content
    except Exception:
        status = '500 Internal Server Error'
        response_headers = [
            ('Content-Type', 'text/plain')
        ]
        start_response(status, response_headers, sys.exc_info())
        return [b'An error has occurred']

validated_app = validator(application)

with make_server('localhost', 8000, validated_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

In this case the client will receive the "An error has occurred" message along with a 500 status due to the division by zero. The third argument to start_response should be sys.exc_info(), and it should only be supplied while handling an exception. This is also the only situation in which start_response may be called a second time.
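
On the server side, the rule looks roughly like the sketch below, which is modeled on the sample start_response in PEP 3333. The ResponseState class is purely illustrative and not how any particular server is implemented:

```python
import sys


class ResponseState:
    """Tracks what a server would know about one response (illustrative only)."""

    def __init__(self):
        self.headers_set = None    # (status, headers) once start_response ran
        self.headers_sent = False  # True once the first body bytes went out

    def start_response(self, status, headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the original error
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid keeping a reference cycle alive
        elif self.headers_set is not None:
            raise AssertionError('start_response called twice without exc_info')
        self.headers_set = (status, headers)


state = ResponseState()
state.start_response('200 OK', [('Content-Type', 'text/plain')])

try:
    1 / 0
except ZeroDivisionError:
    # Headers not sent yet, so the 500 status replaces the 200
    state.start_response('500 Internal Server Error',
                         [('Content-Type', 'text/plain')], sys.exc_info())

print(state.headers_set[0])
```

If the headers had already gone out, the second call would re-raise the exception instead, since the status line can no longer be changed.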

Chunked Input With wsgi.input_terminated

While wsgiref.simple_server is useful for basic cases it does have one particular area that developers struggled with: chunked input. MDN has an example of what chunked encoding looks like:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

7\r\n
Mozilla\r\n
11\r\n
Developer Network\r\n
0\r\n
\r\n

Data is sent in chunks, each consisting of the data length as a hexadecimal value followed by the data itself. The end is indicated by a chunk of size 0 followed by \r\n on its own line. Unfortunately the wsgiref server doesn't handle this. To work around this limitation, Armin Ronacher, the creator of Flask, proposed the wsgi.input_terminated flag. When set, it tells the application that the server ends the input stream at the end of the request body, so it is safe to read until EOF without relying on CONTENT_LENGTH. Many WSGI servers have already implemented it. To showcase this I'll use the werkzeug library, which provides various WSGI utilities and can be installed via pip install werkzeug:

from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    with open('test.json', 'wb') as stream_fp:
        stream_fp.write(request.stream.read())

    resp = Response('Hello World!', mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123

    httpd = make_server(HOST, PORT, application)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')

In this example werkzeug provides wrappers around the request and response to make it easier to access certain properties of each. To test this out I'll be chunk posting a 25MB JSON file, which will then be written to the working directory of the server. "Hello World!" will be returned to the client when the process is done. To see more of the request I'll be using curl instead of the usual Python requests script:

$ curl -v -H "Transfer-Encoding: chunked" -d @large-file.json http://127.0.0.1:8123/
*   Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> POST / HTTP/1.1
> Host: 127.0.0.1:8123
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 10:55:16 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 12
< Connection: close
< 
* Closing connection 0
Hello World!

So the request was written, but the code didn't change too much from the standard WSGI version. That's because things are handled behind the scenes:

if environ.get("HTTP_TRANSFER_ENCODING", "").strip().lower() == "chunked":
    environ["wsgi.input_terminated"] = True
    environ["wsgi.input"] = DechunkedInput(environ["wsgi.input"])

So this will set wsgi.input_terminated if it finds a Transfer-Encoding header with the value of chunked. The DechunkedInput class handles reading the chunk segments:

line = self._rfile.readline().decode("latin1")
_len = int(line.strip(), 16)

So here the length value is read in, which is a hex encoded integer followed by \r\n. Then readinto handles reading based on that value and also checking for the terminating size 0 with an \r\n on its own line after that.
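
To make the wire format concrete, here is a simplified, self-contained sketch of encoding and decoding a chunked body. This is not werkzeug's implementation: encode_chunked and decode_chunked are names I made up, and the decoder assumes well-formed input with no trailer headers:

```python
import io


def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked body."""
    out = bytearray()
    for part in parts:
        if part:  # a zero-length chunk would signal the end of the body
            out += f"{len(part):x}".encode('ascii') + b'\r\n' + part + b'\r\n'
    out += b'0\r\n\r\n'
    return bytes(out)


def decode_chunked(rfile):
    """Read a chunked body from a binary file-like object."""
    body = bytearray()
    while True:
        line = rfile.readline().decode('latin1')
        # Ignore optional chunk extensions after ';'
        size = int(line.split(';', 1)[0].strip(), 16)
        if size == 0:
            rfile.readline()  # consume the final blank line
            return bytes(body)
        body += rfile.read(size)
        rfile.readline()  # consume the \r\n that follows the chunk data


encoded = encode_chunked([b'Mozilla', b'Developer Network'])
print(encoded)
print(decode_chunked(io.BytesIO(encoded)))
```

The encoded value matches the body of the MDN example shown earlier, with 11 being the hexadecimal length of the 17 byte "Developer Network" chunk.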

Chunked Response

When the werkzeug server has threading or multiprocessing enabled, it will utilize HTTP/1.1 and can return chunked responses as well. This helps with dynamic data, where working out the content length ahead of time can be tedious. As an example:

from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    with open('test.json', 'wb') as stream_fp:
        stream_fp.write(request.stream.read())

    def generate_response():
        yield "Line 1"
        yield "Line 2"
        yield "Line 3"
        yield "Line 4"

    resp = Response(generate_response(), mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123

    httpd = make_server(HOST, PORT, application, threaded=True)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')

Running curl with raw mode will show chunked data that it normally abstracts away from us:

$ curl -iv --raw -H "Transfer-Encoding: chunked" -d @large-file.json http://127.0.0.1:8123/
*   Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> POST / HTTP/1.1
> Host: 127.0.0.1:8123
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
HTTP/1.1 100 Continue

* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
HTTP/1.1 100 Continue

* Signaling end of chunked upload via terminating chunk.
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 11:22:23 GMT
Date: Sun, 20 Aug 2023 11:22:23 GMT
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
< Transfer-Encoding: chunked
Transfer-Encoding: chunked
< Connection: close
Connection: close

< 
6
Line 1
6
Line 2
6
Line 3
6
Line 4
0

* Closing connection 0

The return now utilizes Transfer-Encoding: chunked and the chunked data can be seen as well.

Range Requests

Range is an HTTP feature that lets a client request a specific portion of a resource. Its primary use is resuming a transfer from a certain byte. As an example:

from werkzeug.serving import make_server
from werkzeug.wrappers import Request, Response

def application(environ, start_response):
    request = Request(environ)
    start, end = request.range.ranges[0]

    with open('large-file.json', 'rb') as stream_fp:  # binary mode so offsets are bytes
        stream_fp.seek(start)
        data = stream_fp.read(end - start)

    resp = Response(data, mimetype='text/plain')
    return resp(environ, start_response)

if __name__ == '__main__':
    HOST = '127.0.0.1'
    PORT = 8123

    httpd = make_server(HOST, PORT, application, threaded=True)
    print(f'Serving on http://{HOST}:{PORT}')
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print('^C')

request.range.ranges is a list of tuples because a single request can declare multiple ranges. This is a controlled session where I know there will be only one. Using curl with its range option, we can see that the requested 200 bytes were returned:

$ curl -v -r 1000-1199 http://127.0.0.1:8123/
*   Trying 127.0.0.1:8123...
* Connected to 127.0.0.1 (127.0.0.1) port 8123 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8123
> Range: bytes=1000-1199
> User-Agent: curl/7.74.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: Werkzeug/2.3.6 Python/3.10.12
< Date: Sun, 20 Aug 2023 11:48:04 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 200
< Connection: close
< 
* Closing connection 0
cacfcd","before":"437c03652caa0bc4a7554b18d5c0a394c2f3d326","commits":[{"sha":"6b089eb4a43f728f0a594388092f480f2ecacfcd","author":{"email":"5c682c2d1ec4073e277f9ba9f4bdf07e5794dabe@rspt.ch","name":"rs

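Note that 1000-1199 yields 200 bytes because the Range header is inclusive on both ends, while werkzeug stores each range with an exclusive end, so end - start gives the correct length. In practice you'll want to iterate through the ranges list to support multi-range requests, and to check for a missing Range header, since request.range is None in that case. A standards-compliant partial response would also use the 206 Partial Content status with a Content-Range header, which this simplified example omits.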
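
As a rough sketch of iterating over several ranges, the standard-library-only helper below reads each requested slice and builds the matching Content-Range value. The read_ranges name is hypothetical. It assumes (start, stop) tuples with an exclusive stop, as werkzeug provides them, treats a stop of None as "to the end of the file", and does not handle suffix ranges with a negative start. Stitching the parts into a multipart/byteranges response is left out:

```python
import os
import tempfile


def read_ranges(path, ranges):
    """Yield (content_range, data) for each (start, stop) pair; stop is exclusive."""
    total = os.path.getsize(path)
    with open(path, 'rb') as fp:
        for start, stop in ranges:
            stop = total if stop is None else min(stop, total)
            fp.seek(start)
            data = fp.read(stop - start)
            # The Content-Range header uses an inclusive last-byte position
            yield f"bytes {start}-{stop - 1}/{total}", data


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 8)  # 2048 bytes of sample data

parts = list(read_ranges(tmp.name, [(0, 10), (1000, 1200)]))
os.remove(tmp.name)

for content_range, data in parts:
    print(content_range, len(data))
```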

Conclusion

This concludes our look at WSGI and at using the werkzeug library to simplify some of the more advanced HTTP-related features. Because WSGI is a PEP standard, you won't have much trouble finding software that supports it. In the next part of the series I'll be looking at WSGI server solutions to deliver WSGI content.

Top comments (3)

Robson Grangeiro

I'm still digesting the previous post... super excited to find out what's next!

Chris White

Thanks for the comment! The protocols post is something I wrote more out of curiosity on how things work. Practically speaking though many modern solutions work off HTTP and the ones that do work off the other protocols handle everything for you. If you just want to mess around with something SCGI is pretty easy to work with.

Srikanth

It is empowering to see how we can create a webserver with very simple amount of code. Thanks for the great post.