Python Forum
Matching string from a file
#1
Greetings!
I’d like to match strings in files. It seems simple, but I’m failing to do this…
Each line has multiple whitespace characters before the words Start Time or End Time, followed by the date and time of the event.
String “                                                                    Start Time  2/28/2024 8:34:34 AM ”
I tried :
  
if re.search("\s+\Start\s\Time",el) : #  < ------------- el is a line from the file ,,,
    print(f" START LN {el}")
And got an error message “ bad escape \T at position 11”
Then I tried:
if re.search("\s+\Start\s\",el) : #  < ------------- el is a line from the file ,,,
    print(f" START LN {el}")
This one prints tons of other lines I do not care about.
I was sure that using “\s+” would filter the line I wanted, but it does not.
Would you help me with this?

Thank you.
#2
Quote:
if re.search("\s+\Start\s\",el) : #  < ------------- el is a line from the file ,,,
    print(f" START LN {el}")
This one prints tons of other lines I do not care about.
No, there is a syntax error that would prevent the program from running. You cannot have a single backslash at the end of a string literal: the \" escapes the closing quote, so the string is never terminated.

You need to protect against "\" being interpreted as the start of an escape sequence. I would use raw strings.

I don't think you fully understand what \ does in a regex pattern. Why are you using \Start in your pattern? \S is "match any non-whitespace character". \T doesn't have a special meaning in a re pattern. That's why you got an error.
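
To make the two backslash problems concrete, here is a small sketch. The sample text is shortened from the first post and the variable names are only for illustration. It shows that \S consumes any non-whitespace character rather than the letter S, and that the re module rejects \T when it compiles the pattern.

```python
import re

line = "      Start Time  2/28/2024 8:34:34 AM "

# \S matches ANY single non-whitespace character, so r"\Start" means
# "one non-whitespace character followed by 'tart'".
matches_start = re.search(r"\s+\Start", line) is not None
matches_xtart = re.search(r"\s+\Start", "   Xtart") is not None
print(matches_start, matches_xtart)

# \T is not a defined escape, so compiling the pattern raises re.error.
error_message = None
try:
    re.compile(r"\s\Time")
except re.error as exc:
    error_message = str(exc)
print(error_message)
```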

Quote: I was sure that using “\s+” would filter the line I wanted, but it does not.
Putting \s+ at the start of the pattern only requires at least one whitespace character before Start Time. To reject lines that contain your pattern along with other text, anchor the pattern to the start (^) and end ($) of the string. You might want to use "match" instead of "search". Match only succeeds if the pattern matches at the beginning of the string (use fullmatch, or end the pattern with $, if the pattern must cover the entire string). Search is happy if it finds your pattern anywhere in the string.
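
A quick sketch of how search, match and fullmatch differ. The three sample strings and the pattern are invented for this comparison and are not taken from the poster's file.

```python
import re

pattern = r"\s*Start Time.*[AP]M\s*"

clean = "   Start Time  2/28/2024 8:34:34 AM \n"
prefixed = "log: Start Time  2/28/2024 8:34:34 AM\n"
suffixed = "Start Time  2/28/2024 8:34:34 AM (restarted)"

# search() finds the pattern anywhere, even after the "log:" prefix.
search_prefixed = re.search(pattern, prefixed) is not None

# match() only tries at position 0, so the "log:" prefix makes it fail...
match_prefixed = re.match(pattern, prefixed) is not None
# ...but it does not care about text left over after the match.
match_suffixed = re.match(pattern, suffixed) is not None

# fullmatch() requires the pattern to cover the whole string.
full_suffixed = re.fullmatch(pattern, suffixed) is not None
full_clean = re.fullmatch(pattern, clean) is not None

print(search_prefixed, match_prefixed, match_suffixed, full_suffixed, full_clean)
```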

This might work:
import re

with open("test.txt", "r") as file:
    for index, line in enumerate(file):
        # [AP]M matches AM or PM; [A|PM] would match a single A, |, P or M
        result = re.match(r"\s*(Start Time.*?[AP]M)\s*$", line)
        if result:
            print(f"{index:3}: ({result.start()}, {result.end()}) {result.group(1)}")
Or you could just strip all the leading and trailing whitespace and assume any line that starts with "Start Time" is a line you are looking for.
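
The strip-based alternative needs no regex at all. A minimal sketch, assuming the lines are already in a list (the helper name is_start_line and the sample lines are made up for illustration):

```python
def is_start_line(line: str) -> bool:
    """Return True if the line, ignoring surrounding whitespace, begins with 'Start Time'."""
    return line.strip().startswith("Start Time")


sample_lines = [
    "        Start Time  2/28/2024 8:34:34 AM \n",
    "        End Time  2/28/2024 9:10:02 AM \n",
    "some other text mentioning Start Time\n",
]

# Keep only the stripped lines that begin with "Start Time".
start_lines = [line.strip() for line in sample_lines if is_start_line(line)]
print(start_lines)
```

With a real file you would loop over the open file object instead of sample_lines.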
#3
Always use raw strings when you create patterns for the re module, for example
if re.search(r"^\s+Start\s+Time", el):  # <-- note the r" syntax
    print(f" START LN {el}")
By the way, \T is not a valid escape in the re syntax, and \S matches any non-whitespace character rather than S alone, so neither backslash belongs in front of Start or Time.
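
Here is a small illustration of why the r prefix matters, using \b as the example because Python itself gives it a meaning in ordinary strings. The variable names are only for this sketch.

```python
import re

plain = "\bStart"   # Python turns \b into a backspace character (one char)
raw = r"\bStart"    # backslash and b are kept, so re sees a word boundary

print(len(plain), len(raw))

text = "    Start Time  2/28/2024 8:34:34 AM"

# The non-raw pattern looks for a literal backspace before "Start".
plain_found = re.search(plain, text) is not None
# The raw pattern looks for "Start" at a word boundary.
raw_found = re.search(raw, text) is not None
print(plain_found, raw_found)
```

Sequences such as "\s" happen to survive in ordinary strings because Python does not recognise them, but recent Python versions emit a warning for that, so the raw string is the safer habit.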
« We can solve any problem by introducing an extra level of indirection »
#4
If you want to be really picky:
import re

with open("test.txt", "r") as file:
    for line in file:
        # Groups: whole "Start Time ... AM/PM" text, the date, the time
        result = re.match(r"\s*(Start Time {1,2}(\d{1,2}/\d{1,2}/\d{4}) (\d{1,2}:\d{1,2}:\d{1,2} [AP]M))\s*$", line)
        if result:
            print(result.groups())
match() forces the pattern to start at the start of the string (here, each line).
r"" makes the pattern a raw string, so you don't have to worry about escape sequences.
\s* matches any number of whitespace characters, including none.
() creates groups. This pattern has a group for the "Start Time...PM" part, the date part and the time part.
Start Time matches Start Time.
{1,2} matches one or two spaces.
\d{1,2} matches 1 or 2 digits.
/ matches /.
: matches :.
[AP]M matches AM or PM.
\s*$ matches whitespace up to the end of the line.
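
Once the date and time are captured as groups, they can be turned into a datetime object. This is only a sketch: the function name parse_start_time is made up, the pattern is a trimmed version of the one above (without the outer group), and the %p directive assumes an English-style AM/PM locale.

```python
import re
from datetime import datetime

# Same idea as the pattern above, capturing only the date and the time.
START_RE = re.compile(
    r"\s*Start Time {1,2}(\d{1,2}/\d{1,2}/\d{4}) (\d{1,2}:\d{1,2}:\d{1,2} [AP]M)\s*$"
)


def parse_start_time(line):
    """Return a datetime for a 'Start Time' line, or None if the line does not match."""
    result = START_RE.match(line)
    if result is None:
        return None
    date_text, time_text = result.groups()
    # Month/day/year with a 12-hour clock, as in the sample line from the first post.
    return datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %I:%M:%S %p")


parsed = parse_start_time("        Start Time  2/28/2024 8:34:34 AM ")
print(parsed)
```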
#5
You guys are great! That is what I was looking for: some code and the explanation...
#6
(Mar-04-2024, 09:07 PM)tester_V Wrote: Greetings!

It seems like you're encountering issues with your regular expression syntax. Here's how you can correct it:

import re

# Sample line from the file
el = "                                                                    Start Time  2/28/2024 8:34:34 AM"

# Use raw string literal to avoid escaping issues
if re.search(r"\s+Start\s+Time", el):
    print(f"START LN {el}")
In this corrected version:

I used a raw string literal (r"...") for the regular expression to avoid issues with backslashes.
I adjusted the regular expression to \s+Start\s+Time, which matches one or more whitespace characters before "Start" and between "Start" and "Time".
I hope this correctly filters the lines containing "Start Time" as you intended.
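
Since the original question mentions both Start Time and End Time, the same idea can be extended with an alternation and named groups. This is a sketch only: the names EVENT_RE and classify are invented here, and the pattern allows zero leading whitespace (\s*), which is a slightly looser choice than the \s+ used above.

```python
import re

# kind captures "Start" or "End"; stamp captures the rest of the line, trimmed.
EVENT_RE = re.compile(r"^\s*(?P<kind>Start|End)\s+Time\s+(?P<stamp>.+?)\s*$")


def classify(line):
    """Return (kind, timestamp text) for a Start/End Time line, else None."""
    result = EVENT_RE.match(line)
    if result is None:
        return None
    return result["kind"], result["stamp"]


start_event = classify("        Start Time  2/28/2024 8:34:34 AM ")
end_event = classify("        End Time  2/28/2024 9:01:02 PM ")
other = classify("Restart Time  2/28/2024 9:01:02 PM")
print(start_event, end_event, other)
```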

Best regards,
Danish Hafeez | QA Assistant
buran write Mar-05-2024, 05:55 AM:
Clickbait link removed