Extracting information from a file
Python Forum (https://python-forum.io), Python Coding, General Coding Help, thread /thread-22995.html
Extracting information from a file - lokhtar - Dec-06-2019

Hello guys, I have a directory with a lot of text files. I need to loop through them and extract a certain section from each. The text files are formatted in a standard way, like this:

Quote: ABC: this is a text

What I need to do is extract the "section 5" portion of the text. I know the split method, but splitting the file by ":" and then further splitting by "*" doesn't quite seem right:

    import glob

    # pattern matching all the text files
    path = "reports/*.txt"
    file_id = 0

    # loop through files, one at a time
    for file_name in glob.glob(path):
        file_id += 1
        with open(file_name, 'rt') as myfile:
            current_file = myfile.read()
            section_list = current_file.split(':')
            for list_section in section_list:
                further_split = list_section.split('*')
                for x in further_split:
                    print("An item in list :" + str(further_split))

Is there a more elegant/better way to get to what I need? What I am really after is this: within the section that I care about, I want to loop through each of the subsections, which are delineated by "*", and work with those strings. I would appreciate any help!

RE: Extracting information from a file - Larz60+ - Dec-06-2019

Please attach a small test file.

RE: Extracting information from a file - lokhtar - Dec-06-2019

It won't let me upload files on this forum for some reason, so I uploaded it to this website: http://s000.tinyupload.com/download.php?file_id=55964519954061249833&t=5596451995406124983370453

RE: Extracting information from a file - Larz60+ - Dec-06-2019

You need 5 posts before you can upload, so you should be able to do so soon. The same applies to editing your posts. I used your link this time. I'll get back to you as soon as I get a chance to examine the code.

RE: Extracting information from a file - Larz60+ - Dec-06-2019

This will get you started. I switched to pathlib rather than glob, as it is object-oriented (the code below requires Python 3.6 or newer), and then just printed each line without modification.
You can add your parsing back in from here:

    import os
    from pathlib import Path


    def read_files():
        # ensure that the working directory is the script's directory
        os.chdir(os.path.abspath(os.path.dirname(__file__)))
        file_id = 0

        # list of all the text files
        scriptpath = Path('.')
        reportpath = scriptpath / 'reports'
        # print(f"working directory: {reportpath.resolve()}")

        # Get list of text files
        textfiles = [filename for filename in reportpath.iterdir()
                     if filename.is_file() and filename.suffix == '.txt']
        print()

        # loop through files, one at a time
        for filename in textfiles:
            # for file_name in glob.glob(path):
            file_id += 1
            with filename.open() as myfile:
                for line in myfile:
                    line = line.strip()
                    print(f"{line}")
                    # section_list = current_file.split(':')
                    # for list_section in section_list:
                    #     further_split = list_section.split('*')
                    #     for x in further_split:
                    #         print("An item in list :" + str(further_split))


    if __name__ == '__main__':
        read_files()

Output:
RE: Extracting information from a file - lokhtar - Dec-09-2019

Thank you!

RE: Extracting information from a file - snippsat - Dec-09-2019

An example showing one way to parse the lines in SECTION 5.

    flag = 1
    with open('sample_file.txt') as f:
        for line in f:
            if line.startswith('SECTION 5:'):
                flag = 0
                #next(f) # Will skip first line in SECTION 5
            if line.startswith('SECTION 6:'):
                flag = 1
            if not flag and not line.startswith('SECTION 5:'):
                print(line.strip())
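
The flag approach above prints the lines of SECTION 5, while the original question also asked for the "*"-delimited subsections inside that section. The following is only a minimal sketch of how the two could be combined, not a tested solution for the actual report files. It assumes the section headers start a line exactly as "SECTION 5:" and "SECTION 6:" (as in the example above) and that every "*" marks the start of a subsection. The function names `extract_subsections` and `subsections_per_file` and the sample text are made up for illustration.

```python
from pathlib import Path


def extract_subsections(text, start="SECTION 5:", end="SECTION 6:"):
    """Return the '*'-delimited pieces found between the start and end headers."""
    collected = []
    inside = False
    for line in text.splitlines():
        if line.startswith(start):
            inside = True
            # Keep anything that follows the header on the same line.
            line = line[len(start):]
        elif line.startswith(end):
            # Assumption: the wanted section ends where the next header begins.
            break
        if inside:
            collected.append(line)
    body = "\n".join(collected)
    # Split on '*' and drop pieces that are empty after stripping whitespace.
    return [part.strip() for part in body.split("*") if part.strip()]


def subsections_per_file(folder="reports"):
    """Map each .txt file name in the folder to its list of subsections."""
    return {
        path.name: extract_subsections(path.read_text())
        for path in sorted(Path(folder).glob("*.txt"))
    }


if __name__ == "__main__":
    # Hypothetical sample text, only to show the call; real files may differ.
    sample = (
        "SECTION 4:\n"
        "not wanted\n"
        "SECTION 5:\n"
        "* first subsection\n"
        "* second subsection\n"
        "SECTION 6:\n"
        "* not wanted either\n"
    )
    for part in extract_subsections(sample):
        print(part)
```

Calling `subsections_per_file()` would run the same extraction over every .txt file in the reports folder, so each subsection string can then be processed in an ordinary loop. If a "*" can also appear inside ordinary text, splitting on it this way would need a stricter rule, for example only treating "*" at the start of a line as a delimiter.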