I've been doing a lot with IBM Cloud Functions recently. So far I've always been passing up simple query string values or json-encoded data. But what if you want to upload an actual binary file? The usual way, in the web world, is to use the multipart/form-data
content type.
But how does Apache Openwhisk (that IBM Cloud Functions are based on) handle this?
If you set the function to raw
type, rather than getting the JSON-decoded values passed to your function as a dictionary, you get a dictionary that gives you the raw data. The data itself if base64 encoded in __ow_body
and the headers in __ow_headers
.
There are several ways to process this in Python. The requests_toolbelt
package has a MultipartDecoder
class. But I'm going to use one from the cgi
library that is included in Python itself:
from cgi import parse_multipart, parse_header
from io import BytesIO
from base64 import b64decode
def main(args):
c_type, p_dict = parse_header(args['__ow_headers']['content-type'])
decoded_string = b64decode(args['__ow_body'])
p_dict['boundary'] = bytes(p_dict['boundary'], "utf-8")
p_dict['CONTENT-LENGTH'] = len(decoded_string)
form_data = parse_multipart(BytesIO(decoded_string), p_dict)
# Do something with the data. In this simple example
# we will just return a dict with the part name and
# length of the content of that part
ret = {}
for key, value in form_data.items():
ret[key] = len(value[0])
return ret
The parse_multipart
method expects to find a header called CONTENT-LENGTH
so we need to add that in to our headers before passing it to the parser.
To create this cloud function, we need to set --web raw
flag to tell Openwhisk not to try and parse the payload as JSON for us:
% ibmcloud fn action create upload upload.py --web raw
We can then get the URL for it:
% ic fn action get upload --url
ok: got action upload
https://eu-gb.functions.appdomain.cloud/api/v1/web/1d0ffa5a-835d-4c40-ac80-77ca4a35f028/upload
And we can then call it using cURL and upload some data to it. In this case we are passing both a simple string value (foo
) and the contents of a binary WAV audio file:
% curl -F id=foo -F audio=@test.wav https://eu-gb.functions.appdomain.cloud/api/v1/web/1d0ffa5a-835d-4c40-ac80-77ca4a35f028/upload.json
{
"audio": 3840102,
"id": 3
}
It seems that IBM Cloud functions has a limit of 5MB for the payload. I guess anything larger than this and you'd be better to upload it to IBM Cloud Object Storage (COS) then pass the URL of that uploaded item to the cloud function.
Top comments (1)
Thank you sir :-) Today I was just struggling on how AWS Api Gateway HttpApi could handle multipart/form-data and this same IBM example works perfectly there as well.