Dec-06-2019, 03:39 PM
Hello guys,
I have a directory with a lot of text files. I need to loop through them and extract a certain section from each one. The text files all follow a standard format, like this:
Quote:
ABC: this is a text
SECTION 2: this is more text
ANOTHER SECTION: blah blah blah. This is another section
SECTION 4:
SECTION 5:
* A list
* Another list. I need this
YET ANOTHER SECTION: A bunch. of sentences. exist here.
OTHER FINDINGS: None.
FINAL
THIS IS NOT IMPORTANT
What I need to do is extract the "SECTION 5" portion of the text. I know the split method, but splitting the file by ":" and then splitting each piece again by "*" doesn't quite seem right:
```python
import glob

# Pattern matching all the text files
path = "reports/*.txt"
file_id = 0

# Loop through the files, one at a time
for file_name in glob.glob(path):
    file_id += 1
    with open(file_name, "rt") as myfile:
        current_file = myfile.read()

    # Split on ":" first, then split each piece again on "*"
    section_list = current_file.split(":")
    for list_section in section_list:
        further_split = list_section.split("*")
        for x in further_split:
            print("An item in list: " + str(x))
```
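Is there a more elegant or better way to get to what I need? What I am really after is to loop through each of the subsections within the section I care about, which are delimited by "*", and work with those strings.
I would appreciate any help!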
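One possible direction, as a rough sketch rather than a definitive answer: read the file line by line and track which section the current line belongs to, instead of chaining split calls. This assumes every section header starts a line, is written in capitals (letters, digits, spaces), and ends with a colon, as in the sample above. The names HEADER, section_items, and iter_report_items are made up for illustration.

```python
import glob
import re

# Assumption: a header looks like "SECTION 5:" or "ABC: text", i.e. capitals,
# digits, and spaces at the start of a line, followed by a colon.
HEADER = re.compile(r"^([A-Z][A-Z0-9 ]*):[ \t]*(.*)$")


def section_items(text, name="SECTION 5"):
    """Return the "*" items found in the section called `name`."""
    items = []
    inside = False
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # A new header starts; we are inside only if it is the one we want.
            inside = match.group(1) == name
            continue
        stripped = line.strip()
        if inside and stripped.startswith("*"):
            # Drop the leading "*" and surrounding whitespace.
            items.append(stripped[1:].strip())
    return items


def iter_report_items(pattern="reports/*.txt", name="SECTION 5"):
    """Yield (file_name, item) pairs for every matching file."""
    for file_name in glob.glob(pattern):
        with open(file_name, "rt") as myfile:
            for item in section_items(myfile.read(), name):
                yield file_name, item
```

Looping over iter_report_items() would then give one string per "*" subsection. Note the caveat: a body line that happens to look like a header (capitals followed by a colon) would end the section early, so the pattern may need adjusting for the real files.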