(May-08-2021, 06:32 AM)Gribouillis Wrote: [ -> ]Here is one trick: for many years, there has been a module in Pypi named grin. It is some sort of grep command written in pure Python. It could be a fruitful idea to go and see how they solve the same problem (now there is also grin3
Looking at it, grin is a more powerful version of what I have done in my code with regex,
with command-line availability added using argparse/regex.
Thinking about it, I may turn my code into a project and use
Click (my clear favorite) to add command-line functionality.
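A rough sketch of what such a command-line wrapper could look like. It uses the standard-library argparse instead of Click, only to keep the sketch free of third-party dependencies. The option names (pattern, path, --file-type) and the helper names build_parser and search are made up for illustration, and unlike the script further down it reports each matching line once instead of once per match.

```python
import argparse
import re
from pathlib import Path


def build_parser():
    # Hypothetical option names, chosen for this sketch only.
    parser = argparse.ArgumentParser(description='Search files for a regex.')
    parser.add_argument('pattern', help='regular expression to search for')
    parser.add_argument('path', help='folder to search in')
    parser.add_argument('--file-type', default='.txt',
                        help='file extension to include (default: .txt)')
    return parser


def search(folder, pattern, file_type):
    # Yield (file name, line number, stripped line) for every matching line.
    regex = re.compile(pattern)
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file() and entry.name.endswith(file_type):
            with open(entry, encoding='utf-8') as f:
                for index, line in enumerate(f, 1):
                    if regex.search(line):
                        yield entry.name, index, line.strip()


def main(argv=None):
    # argv=None makes argparse read the real command line (sys.argv).
    args = build_parser().parse_args(argv)
    for name, index, line in search(args.path, args.pattern, args.file_type):
        print(f'Found <{line}> in file <{name}> on line <{index}>')


# Parsing an explicit argument list shows how the options map to attributes.
demo = build_parser().parse_args([r'\bbeautiful Soup\b', '.'])
print(demo.pattern, demo.path, demo.file_type)
```

Saved as a script, it would also need a call to main() under the usual if __name__ == '__main__': guard. With Click the same three options would be declared as decorators on main instead.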
With just a simple regex change in my script, testing on
alice_in_wonderland.txt.
How many times, and where, does Alice in Wonderland mention
beautiful Soup?
👀 Could be a Python reference.
One change in the code, using word boundaries (find a whole-word exact match).
pattern = re.compile(r'\bbeautiful Soup\b')
Output:
Found <Soup of the evening, beautiful Soup!> in file <alice_in_wonderland.txt> on line <2988>
Found <Soup of the evening, beautiful Soup!> in file <alice_in_wonderland.txt> on line <2989>
Found <Beautiful, beautiful Soup!> in file <alice_in_wonderland.txt> on line <2993>
Found <Pennyworth only of beautiful Soup?> in file <alice_in_wonderland.txt> on line <2998>
Found <Pennyworth only of beautiful Soup?> in file <alice_in_wonderland.txt> on line <2999>
Found <Beautiful, beautiful Soup!'> in file <alice_in_wonderland.txt> on line <3018>
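A small sketch of what the word boundaries change, and one way to get the "how many times" number. The first three lines are taken from the output above; the last line is made up to show a case where only the pattern without \b matches.

```python
import re

lines = [
    'Soup of the evening, beautiful Soup!',
    'Beautiful, beautiful Soup!',
    'Pennyworth only of beautiful Soup?',
    'unbeautiful Soupy weather',  # hypothetical line, not from the book
]

plain = re.compile(r'beautiful Soup')
whole = re.compile(r'\bbeautiful Soup\b')

for line in lines:
    # plain also matches inside longer words, whole does not
    print(f'{line!r}: plain={bool(plain.search(line))} '
          f'whole={bool(whole.search(line))}')

# Count whole-word matches over all lines.
total = sum(len(whole.findall(line)) for line in lines)
print(total)  # 3
```

The pattern is case-sensitive, so the capitalised "Beautiful," at the start of the second line is not counted. Passing re.IGNORECASE to re.compile would include it.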
How many chapters are there in Alice in Wonderland?
pattern = re.compile(r'CHAPTER.*?')
Output:
Found <CHAPTER I> in file <alice_in_wonderland.txt> on line <12>
Found <CHAPTER II> in file <alice_in_wonderland.txt> on line <247>
Found <CHAPTER III> in file <alice_in_wonderland.txt> on line <471>
Found <CHAPTER IV> in file <alice_in_wonderland.txt> on line <731>
Found <CHAPTER V> in file <alice_in_wonderland.txt> on line <1017>
Found <CHAPTER VI> in file <alice_in_wonderland.txt> on line <1329>
Found <CHAPTER VII> in file <alice_in_wonderland.txt> on line <1671>
Found <CHAPTER VIII> in file <alice_in_wonderland.txt> on line <2030>
Found <CHAPTER IX> in file <alice_in_wonderland.txt> on line <2354>
Found <CHAPTER X> in file <alice_in_wonderland.txt> on line <2693>
Found <CHAPTER XI> in file <alice_in_wonderland.txt> on line <3022>
Found <CHAPTER XII> in file <alice_in_wonderland.txt> on line <3301>
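A side note on the pattern, as a sketch: a lazy .*? at the very end of a pattern matches zero characters, so the match itself is only CHAPTER. The full heading shows up in the output because the script prints the whole line. If the Roman numeral itself is wanted, or the headings should simply be counted, an anchored pattern is one option. The sample text below is made up for the demo and is not the layout of the real file.

```python
import re

# Hypothetical sample text; the last line mentions CHAPTER mid-sentence.
sample = """\
CHAPTER I
Down the Rabbit-Hole
Alice was beginning to get very tired
CHAPTER II
The Pool of Tears
the next CHAPTER of her adventures
"""

lazy = re.compile(r'CHAPTER.*?')
heading = re.compile(r'^CHAPTER ([IVXLC]+)\b')

# The lazy quantifier adds nothing to the match.
print(lazy.search('CHAPTER VIII').group())  # CHAPTER

chapters = []
for line in sample.splitlines():
    match = heading.match(line)
    if match:
        chapters.append(match.group(1))

print(chapters, len(chapters))  # ['I', 'II'] 2
```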
Whole code.
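import os
import re

def find_files(path, file_type):
    """Yield the names of files in path that end with file_type."""
    # Change into the folder so the bare file names can be opened later.
    os.chdir(path)
    with os.scandir(path) as it:
        for entry in it:
            if entry.name.endswith(file_type) and entry.is_file():
                yield entry.name

def find_in_file(files, pattern):
    """Print every matching line with its file name and line number."""
    for file in files:
        with open(file, encoding='utf-8') as f:
            for index, line in enumerate(f, 1):
                # Prints once per match, so a line with two matches shows twice.
                for match in re.finditer(pattern, line):
                    print(f'Found <{line.strip()}> in file <{file}> on line <{index}>')

if __name__ == '__main__':
    path = r'E:\div_code\new\finditer_any'
    pattern = re.compile(r'CHAPTER.*?')
    file_type = '.txt'
    # path is passed in explicitly instead of being read as a global.
    files = find_files(path, file_type)
    find_in_file(files, pattern)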