Apr-13-2017, 03:13 PM
You could keep an ancillary file and, after every successfully processed file, append the path of its bzip2 archive to it. When the script starts, load this file and check against it while iterating. os.walk traverses in an arbitrary order, so when you restart the script with some files already processed, start your counter from the number of processed files, and extract (and increment the counter for) only the files that are not listed in the ancillary file.
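Here is a rough sketch of that idea, not a drop-in replacement for your script. The names load_processed and extract_all are made up for illustration, and I am assuming the archives end in .bz2 and that each one should be extracted next to itself with the suffix stripped; adjust that to whatever your script really does.

```python
import bz2
import os
import shutil


def load_processed(log_path):
    """Return the set of archive paths recorded by earlier runs."""
    if not os.path.exists(log_path):
        return set()
    with open(log_path, encoding="utf-8") as log:
        return {line.rstrip("\n") for line in log if line.strip()}


def extract_all(root, log_path):
    """Extract every not-yet-processed .bz2 file under root.

    Returns the total number of processed files, including those
    recorded by earlier runs.
    """
    processed = load_processed(log_path)
    counter = len(processed)  # resume counting where the last run stopped
    with open(log_path, "a", encoding="utf-8") as log:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".bz2"):
                    continue
                src = os.path.join(dirpath, name)
                if src in processed:
                    continue  # already handled by an earlier run
                dst = src[:-len(".bz2")]  # assumed output name
                with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
                    shutil.copyfileobj(fin, fout)
                # Record the archive only after extraction succeeded.
                log.write(src + "\n")
                log.flush()
                counter += 1
    return counter
```

Because the path is written only after the extraction finished, a file that was interrupted halfway is not in the ancillary file and simply gets extracted again (overwriting the partial output) on the next run. Call it with the same root string every time, otherwise the stored paths will not match.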
But perhaps the simplest solution would be to remove your initial problem with the connection... I don't know exactly what you use to connect to your instance, but if you use ssh, then install tmux or screen on your instance and run your script inside it. With tmux/screen you can detach from your session and log out without stopping your script, or reattach to a running session if you got disconnected. And if you don't use ssh, you should start using it.