Python Forum
[SOLVED] Read text file from some point till EOF?
#1
Hello,

I need to jump to a given string in a text file, and grab everything from that point to the end of the file for further processing using a regex.

I'm not sure why Python rejects the following code:
with open(INPUTFILE,encoding="utf-8") as reader:
	content = reader.read()

content = re.search(content,'^SessionData.+',re.MULTILINE|re.DOTALL)

c:\temp>list.files.py
Traceback (most recent call last):
  File "C:\list.files.py", line 10, in <module>
    content = re.search(content,'^SessionData.+',re.MULTILINE|re.DOTALL)
  File "C:\Python38-32\lib\re.py", line 201, in search
    return _compile(pattern, flags).search(string)
  File "C:\Python38-32\lib\re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python38-32\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python38-32\lib\sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Python38-32\lib\sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "C:\Python38-32\lib\sre_parse.py", line 554, in _parse
    code1 = _class_escape(source, this)
  File "C:\Python38-32\lib\sre_parse.py", line 349, in _class_escape
    raise source.error('bad escape %s' % escape, len(escape))
re.error: bad escape \T at position 1494 (line 76, column 27)
Here's an example of what is found in the input file:
FilePath = C:\Temp\SomeFile.txt
Is it because Python interprets \T as a wrongly formatted tab sequence?

I've also tried to open the file in binary mode, but get the same error:
with open(INPUTFILE,"rb") as reader:
What would be the right way to extract part of a file for further parsing?

Thank you.
#2
Found it: slice the content from the marker with index(), then run the regex on that subset only.

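import re

INPUTFILE = "whole.txt"
with open(INPUTFILE, encoding="utf-8") as reader:
    content = reader.read()

# Keep everything from the first "SessionData" to the end of the file.
# str.index() raises ValueError if the marker is missing.
subset = content[content.index("SessionData"):]

# No ^ or $ here: without re.MULTILINE they only match at the very
# start/end of the string, not on each line.
files = re.findall(r'FilePath = (.+)', subset)
for file in files:
    print(file)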
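
For what it's worth, the traceback in post #1 looks like the result of swapped arguments: re.search() expects the pattern first and the string second, so the whole file was being compiled as a regex and the \T in C:\Temp was rejected as a bad escape. That would also explain why binary mode gave the same error. Here is a minimal sketch of the regex route with the arguments in the right order. The sample text and the tail_from_marker helper are made up for illustration and are not part of the original script:

```python
import re

# Hypothetical sample standing in for the real input file.
sample = (
    "Header line\n"
    "FilePath = C:\\Ignored\\Before.txt\n"
    "SessionData\n"
    "FilePath = C:\\Temp\\SomeFile.txt\n"
    "FilePath = C:\\Temp\\Other.txt\n"
)

def tail_from_marker(text, marker="SessionData"):
    """Return text from the first line starting with marker to the end, or '' if absent."""
    # re.search takes the pattern first and the string second.
    # re.escape keeps the marker literal; MULTILINE lets ^ match at each line
    # start, and DOTALL lets .* run across newlines up to the end of the text.
    pattern = r"^" + re.escape(marker) + r".*"
    match = re.search(pattern, text, re.MULTILINE | re.DOTALL)
    return match.group(0) if match else ""

subset = tail_from_marker(sample)

# Without DOTALL, (.+) stops at the end of each line.
files = re.findall(r"FilePath = (.+)", subset)
for file in files:
    print(file)
```

Slicing with content.index() is simpler when the marker is a plain string; the regex form is mainly useful when the marker has to sit at the start of a line.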