Python Forum
Extract a string between 2 words from a text file




Extract a string between 2 words from a text file - OscarBoots - Nov-01-2021

Hi Forum,

I have a very long script containing SAS PROC SQL code. There's a lot of excess code that I don't need, and I know that the individual scripts will begin with 'proc sql' and end with 'quit;'.

I've read up and tried a few things but can't get any useful results.

Can anyone give me an idea of a Python 3 script that would read a .txt file and output a list of the individual scripts?

Thanks, Peter


RE: Extract a string between 2 words from a text file - Larz60+ - Nov-01-2021

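Pseudo code (p is a pathlib.Path pointing at your script directory):
from pathlib import Path

p = Path('path/to/script/dir')   # placeholder: replace with your script directory
python_filelist = [x for x in p.iterdir() if x.is_file() and x.suffix == '.py']
# or, if you only want file names without paths:
python_filelist = [x.name for x in p.iterdir() if x.is_file() and x.suffix == '.py']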
example:
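from pathlib import Path
import os


# Work from the directory that contains this script.
os.chdir(os.path.abspath(os.path.dirname(__file__)))
p = Path('.')

# __file__ may be a full path, so compare against its file name only
# when excluding this script from the list.
python_filelist = [x for x in p.iterdir() if x.is_file() and x.suffix == '.py'
    and x.name != Path(__file__).name]

print()
for file in python_filelist:
    print(file)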
Output:
AddressCorrection.py
BadgeDbModel.py
BadgePaths.py
common.py
ConsolidateCsvFiles.py
CreateDict.py
ExtractBusinessListings.py
MakePretty.py
ParseBusinessListings.py
PrettifyPage.py
prog.py
ShowFiles.py
SplitAddresses.py
TryAddressCorrection.py
ziggy.py
__init__.py



RE: Extract a string between 2 words from a text file - ibreeden - Nov-02-2021

So you have a large file with code and you want to extract only "proc sql" parts to a text file.
I would read the lines from the source file one by one and check for a line containing "proc sql". When one is found, set a boolean variable, say "do_copy", to True, and as long as this variable is True, copy the lines to your target file.
Also check for lines containing "quit". When one is found, set the boolean to False so the copying stops.

sourcefile = "venv/file1.txt"   # Replace this with your source file.
targetfile = "venv/file2.txt"   # Replace this with your target file

do_copy = False     # Boolean: copy line to target or not

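with open(sourcefile, "r") as source, open(targetfile, "w") as target:
    for sourceline in source:
        if "proc sql" in sourceline.lower():
            do_copy = True      # start of a block: begin copying
        if do_copy:
            target.write(sourceline)
            # End of a block: add a blank separator line and stop copying.
            # Checked only while copying, so a "quit" elsewhere is ignored.
            if "quit" in sourceline.lower():
                target.write("\n")
                do_copy = False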
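
If you would rather have the blocks as a Python list than in a target file, a regular expression is another option. This is only a minimal sketch, assuming every block starts with "proc sql" and ends at the first "quit;" that follows it. The helper name extract_proc_sql_blocks and the sample text are made up for illustration.

```python
import re

# Assumption: a block runs from "proc sql" to the first following "quit;".
# IGNORECASE matches "PROC SQL" / "QUIT;" too; DOTALL lets "." span line breaks.
PROC_SQL_BLOCK = re.compile(r"proc\s+sql\b.*?\bquit\s*;", re.IGNORECASE | re.DOTALL)


def extract_proc_sql_blocks(text):
    """Return every 'proc sql ... quit;' block found in text, in order."""
    return PROC_SQL_BLOCK.findall(text)


# Made-up sample input standing in for the contents of the real file.
sample = """\
data work.a; set work.b; run;
proc sql;
    create table t1 as select * from work.a;
quit;
%put done;
PROC SQL noprint;
    select count(*) into :n from t1;
QUIT;
"""

blocks = extract_proc_sql_blocks(sample)
for block in blocks:
    print(block)
    print("-" * 20)
```

For a real file, read its contents into one string first (for example with pathlib.Path(sourcefile).read_text()) and pass that string to the function. Note that the non-greedy match stops at the first "quit;" it sees, so a "quit;" inside a comment or string literal would end the block early.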