Find specific subdir, open files and find specific lines that are missing from a file - Printable Version
+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: Find specific subdir, open files and find specific lines that are missing from a file (/thread-29208.html)

Find specific subdir, open files and find specific lines that are missing from a file - tester_V - Aug-22-2020

Hi,
I have a directory with a bunch of subdirectories. Each subdirectory holds exactly one file, and I need to process files only from the subdirectories that have the letter "H" in their name.
Each file will contain lines with the words "CELL-1", "CELL-2", up to "CELL-12". Those are the lines I'm interested in.
I'd like to scan each file line by line and report which "CELL-XX" lines are present in the file and which ones are missing from it.
Something like this:

output_file_a.write()
line CELL-1 - missing
line CELL-2 - infile
line CELL-3 - missing
and so on.....

output_file_b.write()
line CELL-1 - infile
line CELL-2 - infile
line CELL-3 - missing
and so on.....

I can find all the files and print out the "CELL-xx" lines that are in each file, but not the ones that are missing.
Thank you.

    import os
    import pathlib

    path = 'c:/path_tosubdirs/'
    mytof = 'H'

    for file in os.listdir(path):
        hdir_f = os.path.join(path, file)
        # Directories with 'H' in the name: test the entry name, not the
        # joined path, so an 'H' in the parent path cannot match by accident.
        if mytof in file:
            path2 = hdir_f
            for file1 in os.listdir(path2):
                hdir_f1 = os.path.join(path2, file1)
                # Forward slashes here: "\f" in a normal string is a form feed.
                print("DIR/path/file ->>", hdir_f1)
                with open(hdir_f1) as cells_file:
                    for el in cells_file:
                        if 'CELL-' in el:
                            el = el.rstrip()
                            print("CELL-xx ", el)

RE: Find specific subdir, open files and find specific lines that are missing from a file - ndc85430 - Aug-23-2020

Out of curiosity, is it necessary to write this yourself? Does Windows not have a tool like grep?
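
For the directory-selection step described in the first post, a minimal sketch using pathlib could look like the following. The function name files_in_h_subdirs and the throwaway directory names in the demo (H01, X02, boxH) are made up for illustration. The sketch also assumes that "H in a name" means the subdirectory's own name, matched case-sensitively.

```python
import tempfile
from pathlib import Path


def files_in_h_subdirs(root, marker="H"):
    """Yield each file directly inside a subdirectory of root whose name contains marker.

    Hypothetical helper: the name and the default marker are illustrative only.
    """
    for subdir in sorted(Path(root).iterdir()):
        # Test the subdirectory's own name, not its full path.
        if subdir.is_dir() and marker in subdir.name:
            for entry in sorted(subdir.iterdir()):
                if entry.is_file():
                    yield entry


# Demo on a temporary directory tree so nothing on disk is assumed to exist.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    for name in ("H01", "X02", "boxH"):
        (demo_root / name).mkdir()
        (demo_root / name / "log.txt").write_text("CELL-1 ok\n")
    for path in files_in_h_subdirs(demo_root):
        print(path.relative_to(demo_root))
```

Only the files under H01 and boxH are yielded; X02 is skipped because its name contains no "H".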

RE: Find specific subdir, open files and find specific lines that are missing from a file - Gribouillis - Aug-23-2020

You could try something along the lines of

    import re

    with open(hdir_f1) as cells_file:
        inum = set(
            int(match.group(1)) for match in (
                re.search(r"CELL\-(\d+)", line) for line in cells_file)
            if match)
    for i in range(1, 13):
        print('line CELL-{} - {}'.format(
            i, 'infile' if i in inum else 'missing'))

RE: Find specific subdir, open files and find specific lines that are missing from a file - tester_V - Aug-23-2020

To 'ndc85430': the solution I'm looking for will be part of a "bigger" script. I'd like to keep it all in Python.

RE: Find specific subdir, open files and find specific lines that are missing from a file - millpond - Aug-24-2020

I believe that the proper way to parse the file (and using the matching suggested) would be with readlines. Only going line by line can tell you whether the expression does *not* exist on a given line. I'm not familiar enough with re.search to know whether it defaults to line-by-line.

RE: Find specific subdir, open files and find specific lines that are missing from a file - tester_V - Aug-24-2020

(Aug-24-2020, 06:53 AM)millpond Wrote: I believe that the proper way to parse the file (and using the matching suggested) would be with readlines.

Do you think you can show me how to do this? I'd like to know how I could do this. I'm sure there are many other ways to accomplish the task, I just do not see any of them...
Thank you!

RE: Find specific subdir, open files and find specific lines that are missing from a file - tester_V - Aug-24-2020

To Gribouillis.
That snippet you shared works! Thank you for your help.
One more request for you: could you explain the code, please? I think I understand it, but probably not.
Thank you!
Tester_V

RE: Find specific subdir, open files and find specific lines that are missing from a file - Gribouillis - Aug-24-2020

tester_V Wrote: Could you explain the code please?

Well, the expression

    re.search(r"CELL\-(\d+)", line)

returns either None or a match object in the sense of the re module. There is a match object if a substring such as CELL-8 was found in the line. With the match object, one can get the number 8: that's the value returned by the expression int(match.group(1)). Thus the sequence

    sequence = (re.search(r"CELL\-(\d+)", line) for line in cells_file)

is a sequence such as None, None, match, None, match,... with one item per line of the file. The expression

    inum = set(int(match.group(1)) for match in sequence if match)

computes the set of all integers found in the above matches, hence all the integers i such that CELL-i was found in the file (actually, only the first occurrence on each line is taken into account). Then there is a loop, equivalent to

    for i in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]:
        if i in inum:
            print(f"CELL-{i} infile")
        else:
            print(f"CELL-{i} missing")

RE: Find specific subdir, open files and find specific lines that are missing from a file - tester_V - Aug-25-2020

To Gribouillis: Outstanding! Thank you for the code and the coaching. I really appreciate it, and probably many others do too...
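
The explanation above can be condensed into a small reusable function. This is a sketch under stated assumptions, not the code posted in the thread. It swaps re.search for re.findall so that several CELL-n tokens on one line are all counted, whereas the original snippet only sees the first per line. The names CELL_RE, cell_report and the sample lines are hypothetical.

```python
import re

# Hypothetical module-level pattern; the captured group is the cell number.
CELL_RE = re.compile(r"CELL-(\d+)")


def cell_report(lines, first=1, last=12):
    """Map each cell number in first..last to 'infile' or 'missing'.

    lines may be any iterable of strings, including an open text file.
    """
    # findall returns every captured number on the line, not just the first.
    found = {int(num) for line in lines for num in CELL_RE.findall(line)}
    return {i: ("infile" if i in found else "missing")
            for i in range(first, last + 1)}


# In-memory sample standing in for one of the files.
sample = ["header\n", "CELL-2 pass CELL-5 pass\n", "CELL-12 fail\n"]
for number, status in cell_report(sample).items():
    print(f"line CELL-{number} - {status}")
```

Because iterating over an open file yields one line at a time, a file object can be passed directly as lines, for example cell_report(cells_file) inside the with block, so a separate readlines() call is not needed. Matching the digits with \d+ also keeps CELL-12 from being counted as CELL-1, which a plain substring test such as 'CELL-1' in line would do.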