Python Forum
Watch SFTP Folder
#1


Hi Guys,

I'm sure you will be able to tell from reading this, but I am quite new to Python :)

What I'm trying to do is connect to an SFTP server to download files. The files are not always available at the same time each day, so I'd like to connect at 2pm and stay connected, polling the directory until all the files are available, and then download them.

I have no issues downloading the files if they exist when the script is executed, but because of the delays sometimes they are not there and I'll end up needing to execute it again.

I run a query using cx_Oracle and put the results into a list. These are all of the filenames that are expected/required to be downloaded, for example [file1.csv, file2.csv etc]

The part I'm stuck on / not sure how to do is making the script stay connected to the server until all files are available.

I've got the code below to work on a local folder to tell me when files are added / removed, so I was thinking I might be able to adapt it to build a list of files that have been added since my connection was made, and then compare that list to the expected list I mentioned above?

import os, time
path_to_watch = "."
# Snapshot of the directory contents before polling starts
before = set(os.listdir(path_to_watch))
while True:
    time.sleep(10)
    after = set(os.listdir(path_to_watch))
    # Set differences give the names that appeared or disappeared since the last poll
    added = sorted(after - before)
    removed = sorted(before - after)
    if added:
        print("Added:", ", ".join(added))
    if removed:
        print("Removed:", ", ".join(removed))
    before = after

Hoping someone is kind enough to give me some pointers / some sample code that I may be able to adapt...


Many thanks in advance!
#2
It depends on what package you use to connect to SFTP. For example, pysftp has
pysftp.Connection.listdir()
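
As a rough sketch of how a listdir() result could be compared against the expected filenames: the helper name missing_files, the suffix parameter and the sample listing below are made up for illustration, and no real server is contacted.

```python
def missing_files(expected, remote_names, suffix=""):
    """Return the expected names (sorted) whose file is not in the remote listing."""
    remote = set(remote_names)  # set membership keeps each lookup cheap
    return sorted(name for name in expected if name + suffix not in remote)


# Stand-in for what a listdir() call might return (assumed sample data)
listing = ["file1.csv", "file3.csv", "unrelated.txt"]
print(missing_files(["file1.csv", "file2.csv"], listing))  # ['file2.csv']
```

With an already-open pysftp connection (called sftp here as an assumption), the listing would come from sftp.listdir(...) on the remote directory, and an empty result would mean every expected file has arrived.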

You may also want to use watchdog to monitor for file system changes.

Here is some example code for syncing between local and remote (SFTP server):

https://codereview.stackexchange.com/que...local-ones



EDIT: I see the example syncs the other way, from local to remote, and you want the opposite.
Just google "watchdog python ftp".
#3
Thanks for the reply, Buran...
I ended up getting it to work a different way. I'm sure there is a much better way to do this, but it is working the way I need, so it'll do for now until I learn more.

            # Needs logging, sys and time imported at the top of the script
            t0 = time.time()
            while total_files != total_files_required:
                time.sleep(10)
                # Abort after 2 hours; use >= because the elapsed time will never be exactly 7200
                if time.time() - t0 >= 7200:
                    logging.warning("Robot has been waiting for all of the expected extract files for 2 hours & has now aborted.")
                    update_job_log_table("subject for email notification", "job status", "email template to use", job_number)
                    sys.exit(1)
                remote_files = sftp.listdir("REMOTE_DIRECTORY_TO_WATCH")
                logging.info("Robot is currently waiting for the total matched filenames to = the total expected matches, current number of matches is: " + str(total_files))
                for filename in expected_files:
                    logging.info("Looking for: " + filename + " in the remote directory.")
                    if filename + SUFFIX_TO_FETCH in remote_files:
                        logging.info("I've found filename: " + filename + " in the remote directory & will remove it from the list of required files variable so i'm only looking for files that are still missing.")
                        # Rebuild the list without the found name (loop variable no longer shadows the list)
                        expected_files = [name for name in expected_files if name != filename]
                        total_files = total_files + 1
So basically, a query that runs before this part gives me the filenames expected in the remote folder, plus the count of expected files.
The loop looks for these filenames; when one is found, it adds 1 to the total_files variable, removes that file from the expected_files list, and keeps looping over whatever is left in expected_files.
Once total_files equals the number of expected files from the query, the download starts.
If it has been waiting for 2 hours, it calls update_job_log_table to update a table in a DB and send a notification email about the failure, and then exits.
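
The same waiting logic can be sketched more compactly with a set and a deadline. This is only an illustration under assumptions, not the script above: wait_for_files, list_remote, clock and sleep are invented names, and the listing is passed in as a function so the loop can be tried without a server.

```python
import time


def wait_for_files(list_remote, expected, timeout=7200, interval=10,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll list_remote() until every expected name is listed.

    Returns the set of names still missing when polling stops:
    empty on success, non-empty if the timeout elapsed first.
    """
    missing = set(expected)
    deadline = clock() + timeout
    while True:
        missing -= set(list_remote())  # drop every name that has arrived
        if not missing or clock() >= deadline:
            return missing
        sleep(interval)


# Simulated listings: one more file shows up on each poll (no real waiting)
listings = iter([[], ["file1.csv"], ["file1.csv", "file2.csv"]])
result = wait_for_files(lambda: next(listings),
                        ["file1.csv", "file2.csv"],
                        sleep=lambda seconds: None)
print(result)  # set()
```

In a real run, list_remote could be something like lambda: sftp.listdir("REMOTE_DIRECTORY_TO_WATCH"), with the suffix already appended to each expected name. A non-empty return value would be the cue to log the failure and exit, and the elapsed-time check uses >= so it cannot be skipped between polls.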

The only part I have not tested is the time part to exit after 2 hours; everything else, however, works fine.

If someone reads this and wants to suggest better ways of doing this, I'm more than happy to have a read and give it a shot.