Dec-06-2019, 03:39 PM
Hello guys,
I have a directory with a lot of text files. I need to loop through them and extract a certain section from each one. The text files all follow a standard format, like this:
ABC: this is a text
SECTION 2: this is more text
ANOTHER SECTION: blah blah blah. This is another section
SECTION 4:
SECTION 5:
* A list
* Another list. I need this
YET ANOTHER SECTION: A bunch. of sentences. exist here.
OTHER FINDINGS: None.
FINAL
THIS IS NOT IMPORTANT
What I need to do is extract the "SECTION 5" portion of the text. I know the split method, but splitting the file by ":" and then further splitting by "*" doesn't quite seem right:
import glob

# list of all the text files
path = "reports/*.txt"
file_id = 0

# loop through files, one at a time
for file_name in glob.glob(path):
    file_id += 1
    with open(file_name, 'rt') as myfile:
        current_file = myfile.read()
    section_list = current_file.split(':')
    for list_section in section_list:
        further_split = list_section.split('*')
        for x in further_split:
            print("An item in list :" + str(x))

Is there a more elegant way to get what I need? What I am really after is this: within the section that I care about, I want to loop through each of the subsections, which are delimited by "*", and work with those strings.
I would appreciate any help!
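For illustration, here is one possible line-based approach. It is only a sketch, and it assumes that every section header is a line beginning with uppercase letters, digits, or spaces followed by a colon (as in the sample above). The names HEADER, extract_section, bullet_items, and process_reports are made up for this example; only glob and re come from the standard library.

```python
import glob
import re

# Assumption: a header line looks like "SECTION 5:" or "ABC: some text",
# i.e. uppercase letters, digits, or spaces, then a colon.
HEADER = re.compile(r"^([A-Z][A-Z0-9 ]*):[ \t]*(.*)$")


def extract_section(text, name):
    """Return the body lines of the section called `name` ([] if absent)."""
    body = []
    inside = False
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # A new header starts; we are inside only if it is the wanted one.
            inside = match.group(1) == name
            if inside and match.group(2):
                body.append(match.group(2))  # text on the header line itself
        elif inside:
            body.append(line)
    return body


def bullet_items(lines):
    """Return the text of every line that starts with '*', marker removed."""
    return [ln.strip()[1:].strip() for ln in lines if ln.strip().startswith("*")]


def process_reports(pattern="reports/*.txt", name="SECTION 5"):
    """Map each matching file name to the bullet items of its section."""
    results = {}
    for file_name in glob.glob(pattern):
        with open(file_name, "rt") as myfile:
            results[file_name] = bullet_items(extract_section(myfile.read(), name))
    return results
```

Calling process_reports() would give a dictionary keyed by file name, so each "*" item can be handled as its own string. Lines without a colon (such as FINAL) are not treated as headers here, so if one directly followed the wanted section it would land in the section body; bullet_items would still skip it because it does not start with "*".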