Python Forum
[SOLVED] Read text file from some point till EOF?
#1
Hello,

I need to jump to a given string in a text file, and grab everything from that point to the end of the file for further processing using a regex.

I'm not sure why Python isn't happy:
with open(INPUTFILE,encoding="utf-8") as reader:
	content = reader.read()

content = re.search(content,'^SessionData.+',re.MULTILINE|re.DOTALL)

c:\temp>list.files.py
Traceback (most recent call last):
  File "C:\list.files.py", line 10, in <module>
    content = re.search(content,'^SessionData.+',re.MULTILINE|re.DOTALL)
  File "C:\Python38-32\lib\re.py", line 201, in search
    return _compile(pattern, flags).search(string)
  File "C:\Python38-32\lib\re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python38-32\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python38-32\lib\sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Python38-32\lib\sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "C:\Python38-32\lib\sre_parse.py", line 554, in _parse
    code1 = _class_escape(source, this)
  File "C:\Python38-32\lib\sre_parse.py", line 349, in _class_escape
    raise source.error('bad escape %s' % escape, len(escape))
re.error: bad escape \T at position 1494 (line 76, column 27)
Here's an example of what is found in the input file:
FilePath = C:\Temp\SomeFile.txt
Is it because Python interprets \T as a wrongly formatted tab sequence?

I've also tried to open the file in binary mode, but get the same error:
with open(INPUTFILE,"rb") as reader:
What would be the right way to extract part of a file for further parsing?

Thank you.
#2
Found it:

import re

INPUTFILE = "whole.txt"
with open(INPUTFILE,encoding="utf-8") as reader:
	content = reader.read()

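# Keep everything from the first "SessionData" to the end of the file.
# str.index() raises ValueError if the marker is missing.
subset = content[content.index("SessionData"):]

# No ^ or $ anchors here; without re.DOTALL, .+ already stops at the end of each line.
files = re.findall(r'FilePath = (.+)', subset)
for file in files:
	print(file)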
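For anyone landing here with the same traceback: the error in post #1 most likely comes from the argument order. re.search() takes the pattern first and the string second, so in the original call the whole file was compiled as a regex, and the \T in C:\Temp was rejected as a bad escape. Opening the file in binary mode doesn't help, because the content is still being compiled as the pattern. Below is a minimal sketch of the corrected call; the sample text is made up to stand in for the real file, and the names swapped_error and files are only for illustration.

```python
import re

# Hypothetical sample standing in for the real input file.
content = (
    "Header = ignored\n"
    "FilePath = C:\\Temp\\Before.txt\n"
    "SessionData\n"
    "FilePath = C:\\Temp\\SomeFile.txt\n"
    "FilePath = C:\\Temp\\Other.txt\n"
)

# Swapped arguments, as in post #1: the file text is compiled as the
# pattern, and the "\T" in "C:\Temp" is not a valid regex escape.
try:
    re.search(content, r'^SessionData.+', re.MULTILINE | re.DOTALL)
    swapped_error = None
except re.error as exc:
    swapped_error = exc

# Correct order: pattern first, then the string to search.
# re.MULTILINE lets ^ match at the start of any line;
# re.DOTALL lets .+ run across newlines to the end of the text.
match = re.search(r'^SessionData.+', content, re.MULTILINE | re.DOTALL)
subset = match.group(0) if match else ""

# Without re.DOTALL, .+ captures only the rest of each line.
files = re.findall(r'FilePath = (.+)', subset)
for file in files:
    print(file)
```

The slicing approach in the post above works just as well; the regex version is only worth it if the marker has to sit at the start of a line.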