When a file is uploaded from the client to the server, how can you save the uploaded file without reading it into memory (in a WSGI Python application)?
And also do this without using a third-party module or framework. The Python standard library has a cgi module that can parse POST form data, including files uploaded with enctype="multipart/form-data". Here is a minimal code example.

Code snippet of the HTML form:

    return """
    <!doctype html>
    <html>
    <h1>Upload new File</h1>
    <form method=post enctype=multipart/form-data>
      <input type=file name=file>
      <input type=submit value=Upload>
    </form>
    </html>
    """

Code snippet of processing the uploaded file with Python:

    # Note: 'environ' is the variable that contains environment & request data
    # that the WSGI server passes to application()
    import cgi

    field_storage = cgi.FieldStorage(
        fp=environ['wsgi.input'],
        environ=environ,
        keep_blank_values=True
    )

    for item in field_storage.list:
        # if it's a POST file
        if item.filename:
            storage_file_path = '/path/to/storage_dir/' + item.filename
            # Read the uploaded file
            file_content = item.file.read()
            # Save file
            with open(storage_file_path, 'wb') as file:
                file.write(file_content)

The problem with this, though, is that the whole file content is read into memory before the file is saved, because of the line file_content = item.file.read(). This is a problem because very large uploads will use too much memory or exhaust it entirely.

This problem can be fixed by reading in chunks, so that only one chunk of the file content is held in memory at a time instead of the whole file:

    import cgi

    field_storage = cgi.FieldStorage(
        fp=environ['wsgi.input'],
        environ=environ,
        keep_blank_values=True
    )

    for item in field_storage.list:
        # if it's a POST file
        if item.filename:
            storage_file_path = '/path/to/storage_dir/' + item.filename
            # Save file (in chunks - 100000 byte chunks)
            # Note: the previous chunk is released when 'chunk' is rebound on
            # the next iteration, so you don't need 'del chunk' at the end of
            # the loop
            with open(storage_file_path, 'wb') as file:
                while True:
                    chunk = item.file.read(100000)
                    if not chunk:
                        break
                    file.write(chunk)

This works, but is it possible to save an uploaded file without having to read any of the file contents into memory? There will still be a problem of using too much memory if many users upload files at the same time. Any help appreciated.
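
For reference, the chunked loop can also be written with shutil.copyfileobj from the standard library, which copies between two file objects using a bounded buffer. The sketch below is only an illustration under some assumptions: save_upload, storage_dir, and chunk_size are hypothetical names, and an io.BytesIO object stands in for item.file so the snippet runs without the cgi module (which was deprecated in Python 3.11 and removed in 3.13). It does not avoid memory use entirely. Some buffer is always needed to move bytes from the request to disk, but peak memory per upload stays around chunk_size rather than growing with the file size.

```python
import io
import os
import shutil
import tempfile


def save_upload(src, storage_dir, filename, chunk_size=64 * 1024):
    """Copy a binary file-like object into storage_dir in bounded chunks.

    Hypothetical helper: 'src' plays the role of item.file.
    """
    # Keep only the last path component so a client-supplied name such as
    # '../../etc/passwd' cannot point outside storage_dir.
    safe_name = os.path.basename(filename.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("unusable file name: %r" % filename)

    dest_path = os.path.join(storage_dir, safe_name)
    with open(dest_path, "wb") as dest:
        # copyfileobj reads and writes at most chunk_size bytes at a time.
        shutil.copyfileobj(src, dest, chunk_size)
    return dest_path


if __name__ == "__main__":
    # Stand-in for an uploaded file; in the WSGI handler this would be item.file.
    fake_upload = io.BytesIO(b"x" * 300000)
    with tempfile.TemporaryDirectory() as tmp_dir:
        saved = save_upload(fake_upload, tmp_dir, "example.bin")
        print(os.path.getsize(saved))
```

With many simultaneous uploads, the total buffer memory under this approach is roughly the number of concurrent requests multiplied by chunk_size, so a smaller chunk_size trades some throughput for a lower ceiling.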