Help with generating regx Pattern please

Python Forum (https://python-forum.io), Forum: Data Science, Thread: /thread-42866.html
Help with generating regx Pattern please - lastyle - Aug-26-2024

Hi all,

I need to extract "names" from a text file, which I would like to do with a regex, but I can't come up with the correct pattern. Can somebody please help me out?

I need to extract everything after the second number up to the end of the line, i.e. turn

    ˜16524:˜ 1 §š£™££LÁ-SÔÙÌE£™£š£¥
    ˜11241:˜ 159 —.M˜A›CX.

into

    §š£™££LÁ-SÔÙÌE£™£š£¥
    —.M˜A›CX.

but my current output is

    ♣:˜ 1 ▼§š£™££♣LÁ-SÔÙÌE£™£š£▼¥
    ♣:˜ 159 —.M˜A›CX♣.

A sample file with 21 lines is attached, and my current code looks like this:

    import re

    def parse_username(content):
        print(f"Content after seconds: {content.strip()}")

    def process_file(file_path, encoding='utf-8'):
        pattern = re.compile(r'\d+(.*)')
        try:
            with open(file_path, 'r', encoding=encoding) as file:
                for line in file:
                    match = pattern.search(line)
                    if match:
                        content_after_number = match.group(1)
                        parse_username(content_after_number)
        except UnicodeDecodeError as e:
            print(f"Error decoding file: {e}")
        except FileNotFoundError:
            print(f"File not found: {file_path}")
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

    process_file('names.txt')

Thanks in advance

RE: Help with generating regx Pattern please - DeaD_EyE - Aug-27-2024

    import re
    from io import StringIO

    text_file_like = StringIO(
        """
    ˜16524:˜ 1 §š£™££LÁ-SÔÙÌE£™£š£¥
    ˜11241:˜ 159 —.M˜A›CX.
    """
    )

    # maybe a regex could solve the problem
    # but it could make the problem harder to solve
    # test it on https://regex101.com/
    REGEX = re.compile(r"˜\d{5}:˜ \d+ (.+)")

    def get_names(fd):
        for line in fd:
            if match := REGEX.search(line):
                yield match.group(1)

    for name in get_names(text_file_like):
        print(name)
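A note on why the pattern in the question overshoots: `re.search` with `\d+(.*)` succeeds at the first run of digits, so the group starts right after the first number rather than the second. The sketch below contrasts it with a pattern that spells out both numbers. The second pattern is an illustrative variant, not one posted in the thread, and the sample line is the visible text from the question.

```python
import re

# Visible text of the second sample line from the question.
line = "˜11241:˜ 159 —.M˜A›CX."

# Pattern from the question: the match begins at the FIRST run of digits,
# so group(1) is everything after the first number.
loose = re.search(r"\d+(.*)", line)

# Illustrative variant (an assumption, not from the thread): first number,
# some non-digits, second number, one space, then the captured name.
strict = re.search(r"\d+\D+\d+ (.+)", line)

print(repr(loose.group(1)))   # still contains the colon and the second number
print(repr(strict.group(1)))  # only the part after the second number
```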
RE: Help with generating regx Pattern please - Pedroski55 - Aug-28-2024

Maybe like this:

    import re

    s = "˜16524:˜ 1 §š£™££LÁ-SÔÙÌE£™£š£¥"
    t = "˜11241:˜ 159 —.M˜A›CX."

    # Lookbehind: start right after the first digit that is followed by
    # whitespace, then take everything up to the end of the string.
    e = re.compile(r'(?<=\d\s)([\w\W]*)')

    res_s = e.search(s)
    # res_s gives:
    # <re.Match object; span=(12, 37), match='\x1f§š£™£\x9e£\x05LÁ-SÔÙÌE\x9e£™£š£\x1f¥'>
    print(res_s.group())

    res_t = e.search(t)
    # res_t gives:
    # <re.Match object; span=(14, 25), match='—.M˜A›C\x9eX\x05.'>
    print(res_t.group())
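The escapes in the `Match` reprs above (`\x1f`, `\x9e`, `\x05`) show that the names carry control characters that the forum rendering hides. The `♣` and `▼` in the question's output are likely how the console drew `\x05` and `\x1f`. If those characters are not wanted in the result, one option is to drop them after matching. This is only a sketch: `strip_controls` is a hypothetical helper, and whether the codes should be removed at all is an assumption, since they may be meaningful to whatever produced the file.

```python
import unicodedata

# Name copied from the second Match repr above, escapes included.
name = "—.M˜A›C\x9eX\x05."

def strip_controls(text):
    """Drop characters whose Unicode category is Cc (C0/C1 control codes)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")

print(repr(name))                  # repr() makes the hidden characters visible
print(repr(strip_controls(name)))  # same name without the control codes
```

Printing with `repr()` rather than plain `print()` is a quick way to see what a captured group really contains.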
RE: Help with generating regx Pattern please - snippsat - Aug-28-2024

    import re

    text = '''\
    ˜16524:˜ 1 §š£™££LÁ-SÔÙÌE£™£š£¥
    ˜11241:˜ 159 —.M˜A›CX.
    ˜11243:˜ 90 š™ÄŸIšDI
    ˜11245:˜ 89 BITSHAKER
    ˜11247:˜ 20 °À³™È›OŸLY˜ œÍO˜SEŸS›«™ÀÀ®’
    ˜11248:˜ 11 EAGLEWING
    ˜11252:˜ 2 ›ÔAPE˜R/ÔÒÉ—ÁÄ
    ˜11257:˜ 103 ÒEDÁLERT
    ˜11260:˜ 30 ÓT0RMFR0NT
    ˜11268:˜ 189 NFODIZ
    ˜11269:˜ 74 ÓENTINEL/ÅXCESS
    ˜11270:˜ 90 š™ÄŸIšDI
    ˜11272:˜ 13 ¤¤¬»PCOLLI–NS¬»¤¤
    ˜11276:˜ 75 ×EASEL
    ˜11278:˜ 82 ÊAZZCAT
    ˜11286:˜ 105 Ï™š™›™˜šĞŸÔ˜™Ià ™ÆŸÒšÅÅ›šÚŸÅ
    ˜11290:˜ 172 ROTTEROY
    ˜11294:˜ 185 ĞITCHER
    ˜11299:˜ 121 MORPHFROG
    ˜11300:˜ 156 ÂLACK ÂEARD
    ˜11317:˜ 80 HEDNING'''

    pattern = r'^\˜\d+:˜ \d+ (.+)'
    for match in re.finditer(pattern, text, re.MULTILINE):
        print(match.group(1))
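One caveat when running these patterns against the real file rather than the pasted text: the spans in Pedroski55's reply (the match starts at index 12, although the visible prefix `˜16524:˜ 1 ` is only 11 characters long) and the `♣` before the colon in the question's output both suggest a hidden `\x05` between the first number and the colon. A pattern that expects the colon directly after the digits would not match such a line. The sketch below uses a more tolerant pattern. `NAME_RE`, `extract_names` and the sample lines are assumptions made up for illustration, with the second sample line reconstructed from the reprs in the thread.

```python
import re

# Tolerant pattern (an assumption, not from the thread): anything up to the
# first number, any non-digits up to the second number, one literal space,
# then capture the rest of the line.
NAME_RE = re.compile(r"^\D*\d+\D+\d+ (.+)$")

def extract_names(lines):
    """Yield the text after the second number of each matching line."""
    for line in lines:
        match = NAME_RE.match(line.rstrip("\r\n"))
        if match:
            yield match.group(1)

# Hypothetical raw lines; the second one has a \x05 before the colon.
sample = [
    "˜11245:˜ 89 BITSHAKER\n",
    "˜11241\x05:˜ 159 —.M˜A›C\x9eX\x05.\n",
    "no numbers here\n",
]

for name in extract_names(sample):
    print(repr(name))
```

`extract_names` accepts any iterable of lines, so an open file object can be passed in directly. The file's encoding is not stated in the thread, so it would have to be determined before opening the file. A literal space is used after the second number instead of `\s+` because Python's `\s` also matches `\x1f`, which would swallow a leading control character of the name.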