Python Forum
Copy Paste excel files based on the first letters of the file name
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Copy Paste excel files based on the first letters of the file name
#1
Hi together,

I have been learning Python for just 2 months. So basicially I am still a beginner.
However, I am already trying to automate one of my working streams.

For this automiziation I need to copy paste excel files from one path to another path, whereas the "source" path is always a bit different and also the file name within the respective source path is always a bit different (only the first letters of the file names are identical).

E.g:

Files I would like to copy (when I enter "Q4"):
C:\Reports\Oct 23\19) Volumes 1423423_1323.xlsx
C:\Reports\Nov 23\19) Volumes 21313_323.xlsx
C:\Reports\Dec 23\19) Volumes 23434_2132.xlsx

Destination:
C:\Calculation\Q4

My idea is to create 4 lists, when input is e.g. Q4 then go through the Q4 list using it for the source path.
How can I implement it?

Then secondly I would like to say when the first letters of the file name is "19)" then copy it.
By they in the source path there are also hundred other files.

Here is my current working status:

import shutil

Q1 = ["Jan", "Feb", "Mar"]
Q2 = ["Apr", "May", "Jun"]
Q3 = ["Jul", "Aug", "Sep"]
Q4 = ["Oct", "Nov", "Dec"]

print("Which quarter do you want to pull?")
quarter_input = input("Please enter the quarter Q1, Q2, Q3 or Q4: ")


if quarter_input == str("Q4"):
    
    for x in Q4:
    
        file_to_copy = fr"C:\Reports\{x} 23\19) Volumes 1423423_1323.xlsx"

        destination_directory = fr"C:\Calculation\19) Volumes 1423423_1323.xlsx"

        shutil.copy(file_to_copy, destination_directory)
Reply
#2
I find the best and easiest way to copy and back up files is to use a programme called rsync. rsync always checks if files have been modified since the last time you copied and only copies files which have changed.

You can use rsync over networks, to anywhere, even to your usb stick!

pathlib is a module to make switching from Linux type paths to Windows type paths, and vice versa, easy. So if you make programmes which should work in either Linux or Windows, use pathlib. If you only use Windows, or only use Linux, you can just use your normal paths.

When you copy files the second time, there is no need to copy files which have not changed. You can get the last modification time from

Quote:os.stat(file).st_mtime

and you can get the file name when using pathlib from: Path.name

from pathlib import Path
import os
src = Path('/home/pedro/myPython/openpyxl/xlsx_files/5000/')
dst = Path('/home/pedro/tmp/5000/')
for file in src.glob("*"):
    #get the file name from the Path file
    file_name = file.name
    print(f'source file is {file_name}, modification time, st_mtime is: {os.stat(file).st_mtime}')
So, if you only want files with 19 in the name you can do this:

for file in src.glob("*"):
    #get the file name from file
    file_name = file.name
    if '19' in file_name:
        print(f'source file is {file_name}, modification time, st_mtime is: {os.stat(file).st_mtime}
the output:

Output:
source file is random_data 19.txt, modification time, st_mtime is: 1707303135.7392948
You should use shutil.copy2() because that will not alter file metadata. Files in the source directory, modified after you copied with shutil.copy2(), will have a different st_mtime to the file in the destination directory. After the first time you copy, you can check os.stat(file).st_mtime before you copy. Only copy files which have changed.

from pathlib import Path
import shutil
import os
src = Path('/home/pedro/myPython/openpyxl/xlsx_files/5000/')
dst = Path('/home/pedro/tmp/5000/')
# to check if a file exists in pathlib
# if my_file.is_file()
for file in src.glob("*"):
    #get the file name from file
    file_name = file.name
    print(f'source file is {file_name}, modification time, st_mtime is: {os.stat(file).st_mtime}')
    # dst is a directory
    # join file_name to dst
    dfile = dst.joinpath(file_name)
    print('destination file is', dfile)
    # the first time there are no files in dst which will throw an error below
    # if the file is not in the destination folder, copy it over and continue
    if not dfile.is_file():
        shutil.copy2(file, dst)
        continue
    else:
        print(f'the file {dfile} exists, so check modification time.')
    # If more than 1 second difference, set this to any number you like
    print(f'modification time difference = {os.stat(file).st_mtime - os.stat(dfile).st_mtime}')
    if os.stat(file).st_mtime - os.stat(dfile).st_mtime > 1:
        # copy2() will not alter the metadata
        print('Modification times are different so copy to destination ... ')
        shutil.copy2(file, dst)
But like I said, if you learn to use rsync, if that works in Windoze, well, why bother with Python except for learning how?

This command copies files from this laptop to my old laptop over the LAN at home. After the first time, only files which have changed will be copied over.

Quote:rsync -av -e "ssh" --progress /home/pedro/Documents [email protected]:/home/pedro/

So I can use my old laptop like a giant usb stick to back up any folders and files!

PS: I would avoid file names with spaces, that can cause trouble when you want to do things in a bash terminal!
Reply
#3
Thanks for your response.... I will look at it later

In the meantime I have also found a solution.... I am pretty sure it is not the most elegant solution but it works :)


import shutil
import os

Q1 = ["Jan", "Feb", "Mar"]
Q2 = ["Apr", "May", "Jun"]
Q3 = ["Jul", "Aug", "Sep"]
Q4 = ["Oct", "Nov", "Dec"]

print("Which quarter do you want to pull?")
quarter_input = input("Please enter the quarter Q1, Q2, Q3 or Q4: ")


if quarter_input == str("Q4"):
         
    for x in Q4:
        
        path = fr"C:\Reports\{x} 23"
        files = os.listdir(fr"C:\Reports\{x} 23")
        
        files_txt = [i for i in files if i.startswith(str(19))]
        
        files_txt = str(files_txt)[:-2][2:]      
                      
        file_to_copy = fr"C:\Reports\{x} 23\{files_txt}"

        destination_directory = fr"C:\Calculation\{x} Volume Report.xlsx"

        shutil.copy(file_to_copy, destination_directory)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  docx file to pandas dataframe/excel iitip92 1 487 Jun-27-2024, 05:28 AM
Last Post: Pedroski55
  Copy and Paste Files - Command MicheliBarcello 2 411 Jun-25-2024, 05:04 AM
Last Post: rodiongork
  Why is the copy method name in python list copy and not `__copy__`? YouHoGeon 2 660 Apr-04-2024, 01:18 AM
Last Post: YouHoGeon
  Python openyxl not updating Excel file MrBean12 1 808 Mar-03-2024, 12:16 AM
Last Post: MrBean12
  What script to paste folders thenewcoder 1 955 Nov-29-2023, 09:40 AM
Last Post: Pedroski55
  Search Excel File with a list of values huzzug 4 1,745 Nov-03-2023, 05:35 PM
Last Post: huzzug
  Updating sharepoint excel file odd results cubangt 1 1,302 Nov-03-2023, 05:13 PM
Last Post: noisefloor
  Move Files based on partial Match mohamedsalih12 2 1,304 Sep-20-2023, 07:38 PM
Last Post: snippsat
  Copy data from Excel and paste into Discord (Midjourney) Joe_Wright 4 2,742 Jun-06-2023, 05:49 PM
Last Post: rajeshgk
  Making a question answering chatbot based on the files I upload into python. Joejones 1 1,801 May-19-2023, 03:09 PM
Last Post: deanhystad

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020