Python Forum
Save a file uploaded from client-side without having to read into memory
#1
When a file is uploaded from the client-side to the server, how can you save this uploaded file without it being read into memory (on a WSGI Python application)?

I would also like to do this without using a third-party module or framework. The Python standard library has a cgi module that can parse POST form data, including files uploaded with enctype="multipart/form-data". Here is a minimal code example:

Code snippet of the HTML form:

return """
    <!doctype html>
    <html>
        <h1>Upload new File</h1>
        <form method=post enctype=multipart/form-data>
            <input type=file name=file>
            <input type=submit value=Upload>
        </form>
    </html>
"""
Code snippet for processing the uploaded file with Python:

    # Note: 'environ' is the dict of environment and request data that the WSGI server passes to application()

    import cgi
    field_storage = cgi.FieldStorage(
        fp=environ['wsgi.input'],
        environ=environ,
        keep_blank_values=True
    )

    for item in field_storage.list:

        # if it's a POST file
        if item.filename:
        
            storage_file_path = '/path/to/storage_dir/' + item.filename
            
            # Read the uploaded file
            file_content = item.file.read()
            
            # Save file
            with open(storage_file_path, 'wb') as file:
                file.write(file_content)

The problem with this is that the whole file is read into memory before it is saved, because of the line file_content = item.file.read(). Very large uploads could therefore use up all of the available memory.

This can be mitigated by reading in chunks, so that only one chunk of the file is held in memory at a time instead of the whole file.

    import cgi
    import os
    field_storage = cgi.FieldStorage(
        fp=environ['wsgi.input'],
        environ=environ,
        keep_blank_values=True
    )
    
    for item in field_storage.list:
    
        # if it's a POST file
        if item.filename:
        
            # basename() drops any directory part sent by the client (e.g. '../../etc/passwd')
            storage_file_path = os.path.join('/path/to/storage_dir', os.path.basename(item.filename))
    
            # Save file (in chunks - 100000 byte chunks)
            # Note: Rebinding 'chunk' on each iteration releases the previous chunk, so you don't need to use 'del chunk' at end of loop
            with open(storage_file_path, 'wb') as file:
                while True:
                    chunk = item.file.read(100000)
                    if not chunk:
                        break
                    file.write(chunk)
This works, but is it possible to save an uploaded file without reading any of its contents into memory? Memory use could still be a problem if many users upload files at the same time. Any help appreciated.
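
For what it's worth, the standard library already wraps the read/write loop above: shutil.copyfileobj() copies between two file objects in fixed-size chunks. Below is a minimal sketch under some assumptions: the function name save_upload and the 64 KiB chunk size are made up for illustration, and an io.BytesIO object stands in for item.file so the example runs on its own. It still holds one chunk in memory at a time, so it tidies the loop rather than removing the buffer.

```python
import io
import os
import shutil
import tempfile


def save_upload(source, storage_dir, client_filename, chunk_size=64 * 1024):
    """Copy the file-like object 'source' into storage_dir in fixed-size chunks.

    'save_upload' is a hypothetical helper, not part of any library.
    """
    # Keep only the last path component so the client cannot write outside storage_dir
    safe_name = os.path.basename(client_filename.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("unusable file name: %r" % client_filename)

    dest_path = os.path.join(storage_dir, safe_name)
    with open(dest_path, "wb") as out:
        # copyfileobj reads and writes at most chunk_size bytes at a time
        shutil.copyfileobj(source, out, chunk_size)
    return dest_path


# Stand-in for item.file: 200000 bytes of dummy upload data
demo_dir = tempfile.mkdtemp()
saved_path = save_upload(io.BytesIO(b"x" * 200000), demo_dir, "../../evil.bin")
```

In the WSGI handler this would be called as save_upload(item.file, '/path/to/storage_dir', item.filename).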
#2
Quote:When a file is uploaded from the client-side to the server, how can you save this uploaded file without it being read into memory (on a WSGI Python application)?
If you have uploaded the file to the server, it's already in memory. That's where it goes when uploaded!
#3
Is it possible to make the uploaded file go straight to the hard drive instead of memory though?

A bit like how a swap file works (it uses space on the hard drive when the memory is full), except that the uploaded file content would go to the hard drive straight away, without any of it being held in memory.
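
To make the idea concrete, here is a rough sketch of writing the request body to disk as it arrives. It assumes the client sends the file as the raw request body (for example a PUT request) rather than as multipart/form-data, so nothing has to be parsed first. The name stream_body_to_file and the 64 KiB chunk size are hypothetical, and io.BytesIO stands in for the real wsgi.input stream. The bytes still pass through one small buffer in the Python process; the sketch only keeps that buffer bounded.

```python
import io
import os
import tempfile


def stream_body_to_file(environ, dest_path, chunk_size=64 * 1024):
    """Write the raw request body to dest_path, holding at most chunk_size bytes.

    'stream_body_to_file' is a hypothetical helper for illustration.
    """
    # WSGI servers report the body length as a string; treat missing/empty as 0
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    stream = environ["wsgi.input"]
    written = 0
    with open(dest_path, "wb") as out:
        while remaining > 0:
            # Never ask for more than the declared length
            chunk = stream.read(min(chunk_size, remaining))
            if not chunk:
                break  # the client stopped sending early
            out.write(chunk)
            written += len(chunk)
            remaining -= len(chunk)
    return written


# Fake WSGI environ with a 150000-byte body
fake_environ = {
    "CONTENT_LENGTH": "150000",
    "wsgi.input": io.BytesIO(b"a" * 150000),
}
demo_path = os.path.join(tempfile.mkdtemp(), "upload.bin")
bytes_written = stream_body_to_file(fake_environ, demo_path)
```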
#4
(Nov-21-2019, 06:58 AM)andym118 Wrote: Is it possible to make the uploaded file go straight to the hard drive instead of memory though?

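If the system supports sendfile(2), you can do a zero-copy transfer between two file descriptors, so the data is copied inside the kernel and never passes through the memory of your process. Note that on Linux the input descriptor cannot be a socket, so this covers the file-to-socket direction (sending a file to a client) rather than receiving an upload.
Here is a module that supports it: https://pypi.org/project/pysendfile/
Under the hood, the input file has to support mmap-like operations.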

Another very important thing is the following:
Quote:Also, it must be clear that the file can only be sent “as is” (e.g. you can’t modify the content while transmitting). There might be problems with non regular filesystems such as NFS, SMBFS/Samba and CIFS. For this please refer to proftpd documentation.

This means that you can't modify the stream on the fly.

EDIT: It seems that this has been available in the standard library, as os.sendfile(), since Python 3.3: http://michaldul.com/python/sendfile/
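
A minimal sketch of calling os.sendfile() directly, under some assumptions: it copies one regular file to another, which Linux allows but most other platforms do not (there the output must be a socket), so it falls back to an ordinary chunked copy if the call is missing or refused. The name copy_file_zero_copy and the 1 MiB block size are made up for illustration.

```python
import os
import shutil
import tempfile


def copy_file_zero_copy(src_path, dst_path, block_size=1024 * 1024):
    """Copy src_path to dst_path with os.sendfile() where the platform allows it.

    'copy_file_zero_copy' is a hypothetical helper for illustration.
    Returns the size of the destination file in bytes.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        try:
            while remaining > 0:
                # The kernel moves the bytes; they are not read into a Python object
                sent = os.sendfile(dst.fileno(), src.fileno(), offset,
                                   min(block_size, remaining))
                if sent == 0:
                    break
                offset += sent
                remaining -= sent
        except (AttributeError, OSError):
            # os.sendfile is missing or does not accept a regular file as output:
            # continue from the current offset with a plain chunked copy
            src.seek(offset)
            dst.seek(offset)
            shutil.copyfileobj(src, dst, block_size)
    return os.path.getsize(dst_path)


# Demo: copy a 300000-byte temporary file
work_dir = tempfile.mkdtemp()
src_demo = os.path.join(work_dir, "source.bin")
dst_demo = os.path.join(work_dir, "copy.bin")
with open(src_demo, "wb") as f:
    f.write(b"z" * 300000)
copied_size = copy_file_zero_copy(src_demo, dst_demo)
```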