Python Forum
Python Regex
#1
I have the following text file:
Output:
S = 0, X are: (1, 0, 1, 1, 0, 0, )S = 0, Z are: (0, 1, 1, 1, 0, 1, )Data bits measurement:0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 0, 0, 0, 1, )S = 0, Z are: (1, 1, 0, 0, 1, 1, )Data bits measurement:1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, Measured l0: 0, Measured l1: 1
S = 0, X are: (1, 1, 0, 0, 1, 1, )S = 0, Z are: (1, 1, 0, 0, 1, 1, )Data bits measurement:1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 1, 0, 1, 1, )S = 0, Z are: (0, 0, 0, 1, 0, 0, )Data bits measurement:0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, Measured l0: 1, Measured l1: 0
S= 0, X are: (1, 1, 0, 1, 1, 0, )S = 0, Z are: (0, 0, 0, 1, 0, 0, )Data bits measurement:0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 0, 0, 1, 1, )S = 0, Z are: (1, 1, 1, 0, 0, 1, )Data bits measurement:1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, Measured l0: 1, Measured l1: 0
Now I want to count how many times each Z value occurs, and the total of the "Measured l0" values for that Z value.
For example:
Output:
011101 - 1 and l0 1 (in total)
110011 - 2 and l0 1 (in total)
000100 - 2 and l0 2 (in total)
111001 - 1 and l0 1 (in total)
I thought regex would be a good fit, but I could not get it to work.
Here is my code:
with open("f.dat") as file:
    lines = file.readlines()
    #Z=re.search("(^Z*)",line)
    for i in re.findall(r'Z are: (=(\d+)&',lines):
           print(i)
And I got this error:
Error:
raise source.error("missing ), unterminated subpattern",
re.error: missing ), unterminated subpattern at position 17
#2
Parentheses have special meaning in regex (they delimit groups) and must appear in pairs. If you want to match literal ( or ) characters, you need to escape each one with a backslash: \( or \).
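A minimal sketch of the difference, using a shortened piece of the text from the first post. The variable names are only for illustration. The first pattern is the one from the question and fails to compile because its first "(" is never closed; the second escapes the parenthesis so it matches the literal character.

```python
import re

text = "S = 0, Z are: (0, 1, 1, 1, 0, 1, )Data bits measurement:0, 0"

# An unescaped "(" opens a group, so a pattern without the matching ")"
# is rejected when it is compiled.
try:
    re.compile(r"Z are: (=(\d+)&")
    compiled = True
except re.error as exc:
    compiled = False
    error_message = str(exc)

# An escaped "\(" matches the literal parenthesis in the text.
literal = re.search(r"Z are: \(", text)

print(compiled)
print(literal.group(0))
```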

I don't understand some of your pattern.

Why does it contain "&"? This character does not appear in your string and has no special meaning in a regex pattern that I know of. Did you mean "*" (fat finger mistake)?

A digit (\d) appears multiple times in the text, but the digits are separated by commas and spaces, so the repeating unit is "\d, ", not "\d". Should "(\d+)&" be "(?:\d, )*"?

If you have a repeated group like "(?:(\d), )+", you can only get the last match for the group. You cannot use grouping to get "011101" out of "0, 1, 1, 1, 0, 1, "; you could only retrieve the last match for the group, "1". I think you should match everything between the parentheses and remove all ", " from the string.
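A short sketch of that behaviour, assuming the text between the parentheses has already been isolated (the variable names are illustrative only). The inner group keeps just its final repetition, while capturing the whole run and stripping the separators keeps every digit.

```python
import re

bits = "0, 1, 1, 1, 0, 1, "

# The inner group (\d) is overwritten on every repetition,
# so only the digit from the last repetition is left.
repeated = re.fullmatch(r"(?:(\d), )+", bits)
last_only = repeated.group(1)

# Capturing the whole run and removing ", " keeps all the digits.
whole = re.fullmatch(r"((?:\d, )+)", bits)
joined = whole.group(1).replace(", ", "")

print(last_only)
print(joined)
```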

I think you want to do this:
import re

data = "S = 0, X are: (1, 0, 1, 1, 0, 0, )S = 0, Z are: (0, 1, 1, 1, 0, 1, )Data bits measurement:0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, Measured l0: 1, Measured l1: 0"

pattern = r"Z are: \((.*?)\)"  # \( and \) match literal parentheses.  (.*?) is the group we want returned; the ? stops it at the first closing parenthesis
match = re.search(pattern, data)
group = match.group(1)  # Get the first group.  This will be something like "0, 1, 1, 1, 0, 1, "
print(group, group.replace(", ", ""))  # Strip out the ", " to get the desired string.

# or
matches = re.findall(pattern, data)  # findall returns a list with the captured group of every match
group = matches[0]
print(group, group.replace(", ", ""))
Output:
0, 1, 1, 1, 0, 1, 011101
0, 1, 1, 1, 0, 1, 011101
What does the "1 and 10" mean in your example output "011101 - 1 and 10 1 (in total)"?
#3
(Sep-21-2022, 08:13 PM)deanhystad Wrote: Parentheses have special meaning in regex and must appear in pairs. If you want to include the () characters in the pattern you need to precede with \: \( or \).

What does the "1 and 10" mean in your example output "011101 - 1 and 10 1 (in total)"?

Thanks a lot for the answer :) It is not "10" but "l0: 1" (a lowercase L followed by zero). It is the measurement result, and I want to keep track of it too.
Your answer is very nice. I will try it first and see how far I can get from there. I will get back to you if I run into any trouble :)
Thanks
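
For anyone following up, here is one possible sketch of the full tally. It assumes that "l0 (in total)" means the sum of the "Measured l0" values for each Z value, which agrees with the example output in the first post. The names RECORD, tally and sample are made up for illustration, and sample stands in for the contents of f.dat.

```python
import re
from collections import defaultdict

# The six records from the first post (stand-in for the contents of f.dat).
sample = """\
S = 0, X are: (1, 0, 1, 1, 0, 0, )S = 0, Z are: (0, 1, 1, 1, 0, 1, )Data bits measurement:0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 0, 0, 0, 1, )S = 0, Z are: (1, 1, 0, 0, 1, 1, )Data bits measurement:1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, Measured l0: 0, Measured l1: 1
S = 0, X are: (1, 1, 0, 0, 1, 1, )S = 0, Z are: (1, 1, 0, 0, 1, 1, )Data bits measurement:1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 1, 0, 1, 1, )S = 0, Z are: (0, 0, 0, 1, 0, 0, )Data bits measurement:0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, Measured l0: 1, Measured l1: 0
S= 0, X are: (1, 1, 0, 1, 1, 0, )S = 0, Z are: (0, 0, 0, 1, 0, 0, )Data bits measurement:0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, Measured l0: 1, Measured l1: 0
S = 0, X are: (1, 1, 0, 0, 1, 1, )S = 0, Z are: (1, 1, 1, 0, 0, 1, )Data bits measurement:1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, Measured l0: 1, Measured l1: 0
"""

# Group 1: everything inside the Z parentheses.
# Group 2: the "Measured l0" value that follows in the same record.
RECORD = re.compile(r"Z are: \(([^)]*)\).*?Measured l0: (\d+)")


def tally(text):
    """Count each Z value and sum its Measured l0 values."""
    counts = defaultdict(int)
    l0_totals = defaultdict(int)
    for m in RECORD.finditer(text):
        z = m.group(1).replace(", ", "")  # "0, 1, 1, 1, 0, 1, " -> "011101"
        counts[z] += 1
        l0_totals[z] += int(m.group(2))
    return counts, l0_totals


counts, l0_totals = tally(sample)
for z in counts:
    print(f"{z} - {counts[z]} and l0 {l0_totals[z]} (in total)")
```

To run it on the real file, replace sample with the text read from disk, for example open("f.dat").read().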