Python Forum

Full Version: Read all csv files, and store the last line from each folder
Hi,
I want to find all CSV files, read each one, and capture the last line from each file.
I use the code below, but I am stuck.

import os
from glob import glob

# Raw string so the backslashes are not treated as escape sequences.
PATH = r"D:\data\project\*\*\InputFiles"
EXT = "*.csv"
# os.walk() does not expand wildcards, so let glob() match the whole pattern.
all_csv_files = glob(os.path.join(PATH, EXT))
print(all_csv_files)
With pathlib (written here without testing, but I think it's right or close to it):
from pathlib import Path

# iterdir() does not expand wildcards, so the '*' parts go into a glob() pattern instead.
p = Path(r"D:\data\project")
csv_files = [csvfile for csvfile in p.glob("*/*/InputFiles/*.csv") if csvfile.is_file()]
for filename in csv_files:
    print(filename)
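The listings so far only find the files; the "last line" part of the question is still open. A minimal sketch of that step, assuming the files are UTF-8 text and that the folder layout is the `*\*\InputFiles` one from the first post (`last_line` and `last_lines_by_file` are hypothetical helper names, not from any post here):

```python
from collections import deque
from pathlib import Path


def last_line(csv_path):
    """Return the last line of a text file, or None if the file is empty."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        # deque(maxlen=1) keeps only the most recent line while iterating,
        # so the whole file is never held in memory at once.
        tail = deque(fh, maxlen=1)
    return tail[0].rstrip("\r\n") if tail else None


def last_lines_by_file(root, pattern="*/*/InputFiles/*.csv"):
    """Map each CSV file matched under root to its last line."""
    return {path: last_line(path) for path in Path(root).glob(pattern)}
```

Something like last_lines_by_file(r"D:\data\project") would then give a dict of path -> last line; I have not run it against that drive layout.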
I use the code below. There will be 4 levels of subfolders, so I use 4 wildcards.

Example: Rootfolder/project/category. After this I do not know the folder names, but there will be 4 more subfolders, and my file exists in the last folder.

My file name is: .Dailycollection.log

If the file exists below Rootfolder/project/category, I want to capture the full path of the file.
p = Path(r"D:\Mekala_Backupdata")
# One '*' per unknown subfolder level; the last part of the pattern matches the file name.
log_files = [f for f in p.glob("*/*/*/*/*.Dailycollection.log") if f.is_file()]
for filename in log_files:
    print(filename)
I can't run the following code as I don't use Windows, but this should list all CSV files in all directories
within and below a root directory of D:\data\project\*\*\InputFiles (you will have to supply actual values for '*'):
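from pathlib import Path


def walk_dir(starting_dir, flist=None):
    """Recursively collect every .csv file in and below starting_dir."""
    if flist is None:
        flist = []
    for path in Path(starting_dir).iterdir():
        if path.is_file():
            if path.suffix == '.csv':
                flist.append(path)
        elif path.is_dir():
            # Pass the same list down so files found in subfolders are kept.
            walk_dir(path, flist)
    return flist


if __name__ == '__main__':
    # Replace each '*' with a real folder name; iterdir() does not expand wildcards.
    start_path = r'D:\data\project\*\*\InputFiles'
    import os
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    srcdir = Path('.')
    savefile = srcdir / 'allcsvfiles.txt'
    # Print each path and also write it to the save file, one per line.
    with savefile.open('w') as fp:
        for file in walk_dir(start_path):
            print(file)
            fp.write(f'{file}\n')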
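For comparison, pathlib can do the recursion itself through Path.rglob. A minimal sketch, assuming the starting folder is a real directory (wildcards in the starting path are not expanded); `find_csv_files` and `save_listing` are hypothetical names used only for illustration:

```python
from pathlib import Path


def find_csv_files(starting_dir):
    """Return every .csv file in and below starting_dir, sorted by path."""
    # rglob("*.csv") descends into all subdirectories on its own.
    return sorted(p for p in Path(starting_dir).rglob("*.csv") if p.is_file())


def save_listing(paths, savefile):
    """Write one path per line to savefile."""
    text = "".join(f"{p}\n" for p in paths)
    Path(savefile).write_text(text, encoding="utf-8")
```

Calling save_listing(find_csv_files(start_path), 'allcsvfiles.txt') would give roughly the same listing file as the recursive function.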
Actually, the name of the last-level (sub)directory is not known (it is not fixed), but the file will be saved into this last folder. Hence I want to use a wildcard.
The file is located at: D:\Mekala_Backupdata\*\*\*\*\ (the file is in this last folder).
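Since only the depth is known and not the folder names, one option is to build the glob pattern from the number of levels. A minimal sketch, assuming exactly four unknown folder levels below the root and a file name ending in .Dailycollection.log (`find_at_depth` is a hypothetical helper, and both defaults are assumptions taken from the posts above):

```python
from pathlib import Path


def find_at_depth(root, depth=4, name_pattern="*.Dailycollection.log"):
    """Return full paths of files matching name_pattern exactly `depth` folders below root."""
    # One '*' per unknown folder level, e.g. "*/*/*/*/*.Dailycollection.log".
    pattern = "/".join(["*"] * depth + [name_pattern])
    return [p.resolve() for p in Path(root).glob(pattern) if p.is_file()]
```

For example, find_at_depth(r"D:\Mekala_Backupdata") should return the full path of every matching file four folders down, and an empty list if nothing matches.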