Python Forum
Regex find string then return 10 character after it
#1
Hi,
I've been looking at regex documentation and cheat sheets. I can find how to search for a string and group the matches, but it would be useful for me to be able to set a number of variables.

For example

varX = "x: "
varY = "y: "
varZ = "z: "

And then tell Python's regex that I want it to return the 10 characters that follow.

Can someone point me to the terms to search for in the manuals so I can read up on the syntax?

Thanks for your help
Reply
#2
You may be overcomplicating this (or maybe I'm oversimplifying it) because you could achieve the objective with a simple string slice:

text_string = "For some reason x:is in this string"

varX = "x:"

index = text_string.find(varX)+len(varX)

print(text_string[index:index+10])
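
One caveat with the slice above: str.find returns -1 when the marker is not in the string, so the slice would quietly return the wrong characters instead of failing. Here is a rough sketch of a guarded version. The helper name chars_after and its parameters are made up for illustration and are not part of the original answer.

```python
def chars_after(text, marker, count=10):
    """Return up to `count` characters following the first `marker`, or None."""
    pos = text.find(marker)
    if pos == -1:  # str.find signals "not found" with -1
        return None
    start = pos + len(marker)
    return text[start:start + count]


text_string = "For some reason x:is in this string"
print(chars_after(text_string, "x:"))  # is in this
print(chars_after(text_string, "q:"))  # None
```

Returning None for a missing marker is just one choice; raising an exception would work as well if a missing marker should be treated as an error.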
Reply
#3
Your question is unclear. Forget about regex and Python and describe your problem. Something like:
Quote:This is an example of some text I want to parse (provide example). I want my program to find strings that look like this (provide example) and I want to do this with the matching string(describe processing performed on string).
I have no idea what you mean by "group them" or "useful for me to be able to set a number of variables".
Reply
#4
Rob101 understood it and got the answer right.
Thanks for your help though.
The personalities of commenters you find in the forums with replies like this match the Arduino forums.
I'm not sure if it is a game to send something back to lift message counts, or if it's like men who need to drive big cars to compensate for lacking authority in other areas of their lives.

(Aug-03-2022, 06:05 PM)deanhystad Wrote: Your question is unclear. Forget about regex and Python and describe your problem. Something like:
Quote:This is an example of some text I want to parse (provide example). I want my program to find strings that look like this (provide example) and I want to do this with the matching string(describe processing performed on string).
I have no idea what you mean by "group them" or "useful for me to be able to set a number of variables".
Reply
#5
Brilliant, thanks @rob101 good work, nice guy

(Aug-02-2022, 05:28 PM)rob101 Wrote: You may be overcomplicating this (or maybe I'm oversimplifying it) because you could achieve the objective with a simple string slice:

text_string = "For some reason x:is in this string"

varX = "x:"

index = text_string.find(varX)+len(varX)

print(text_string[index:index+10])
Reply
#6
Honestly, I did not get what you were trying to say (still don't really). Are you trying to parse text that contains patterns like this?

"This is my pattern x:0123456789 and I want to get the 0123456789"

If that is the case, what if the text contains something like this?

"This has an x:0123 but there is a blank. Do I just take the next 10 characters or do I stop at the blank, or is this not a match?"
"This does not have enough characters after the pattern. Should it match x:1234"
"These patterns overlap. Should they count? x:123 y:4657"

And I really don't get what you mean by "group them". Do you want a regex that looks for "x:", "y:" or "z:"? You could do that with [xyz]:\w{10}:
import re

text = "this is x:0123,  some things y:0123456, that may or may not z:0123456789, match the pattern x:abcdefghijk, y:123 z:4567"
# raw string, so that \w reaches the regex engine unchanged
pattern = re.compile(r"[xyz]:\w{10}")

print(re.findall(pattern, text))
Output:
['z:0123456789', 'x:abcdefghij']
This grabs the two strings that start with x:, y: or z: and are followed by 10 word characters (letters, digits or underscore).
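
A side note on character classes: \w only matches letters, digits and underscore, so punctuation right after the marker breaks the match. If "anything that is not whitespace" is what is wanted, \S is the class for that. A small sketch with a made-up sample string (the variable names are illustrative only):

```python
import re

sample = "x:0123-45678 y:abc_defghi z:12 345"

# \w{10}: exactly 10 letters, digits or underscores after the marker
word_hits = re.findall(r"[xyz]:\w{10}", sample)

# \S{10}: exactly 10 non-whitespace characters, punctuation included
nonspace_hits = re.findall(r"[xyz]:\S{10}", sample)

print(word_hits)      # ['y:abc_defghi']
print(nonspace_hits)  # ['x:0123-45678', 'y:abc_defghi']
```

The hyphen in x:0123-45678 is what stops the \w version from matching it.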

If changed to just grab the next 10 characters, it does this:
import re

text = "this is x:0123,  some things y:0123456, that may or may not z:0123456789, match the pattern x:abcdefghijk, y:123 z:4567"
pattern = re.compile("[xyz]:.{10}")

print(re.findall(pattern, text))
Output:
['x:0123,  som', 'y:0123456, t', 'z:0123456789', 'x:abcdefghij', 'y:123 z:4567']
Now I get all the x:, y: and z:'s that are followed by 10 characters and don't overlap with an earlier match.
Note that z:4567 at the end does not count as a match of its own: it is swallowed by the y:123 match, and z: is not followed by 10 characters anyway.
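
If only the 10 characters are wanted (without the marker), or if overlapping hits should count too, something along these lines might work. This is a sketch under my own assumptions about what is wanted: the sample text and variable names are invented, and it relies on re.findall returning tuples when the pattern has more than one group, and on a lookahead (?=...) consuming no characters.

```python
import re

sample = "pattern x:abcdefghijk, y:123 z:4567890123"

# Two groups: the marker letter and the 10 characters after the colon.
# Matches consume text, so the z: inside the y: match is skipped.
pairs = re.findall(r"([xyz]):(.{10})", sample)

# Wrapping everything in a zero-width lookahead lets matches overlap,
# so the z: that sits inside the y: match is reported as well.
overlapping = re.findall(r"(?=([xyz]):(.{10}))", sample)

print(pairs)        # [('x', 'abcdefghij'), ('y', '123 z:4567')]
print(overlapping)  # [('x', 'abcdefghij'), ('y', '123 z:4567'), ('z', '4567890123')]
```

Keep in mind that . does not match a newline unless re.DOTALL is passed, so a marker near the end of a line can fail to match.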

And what did you mean by this?
varX = "x: "
varY = "y: "
varZ = "z: "
Is varY supposed to be a variable referencing the "y:" pattern, or is it a variable used to reference the 10 characters following "y:" in the text you are searching?

If your posts on the Arduino forum are this vague I understand why you get more questions than answers. Remember, you have all the details and we know nothing except the info in your post. Write posts as though the audience has no idea what you are talking about, because that is True. My favorite format is "Here is my input, this is my expected output, additional information." Then I can test my reply against the input and output to see if I'm answering the question correctly. If you can't do that, provide lots of details. Details are important, especially with something like regular expressions where two patterns that look almost the same will give radically different results.
Reply
#7
You can get the start and end positions of any pattern with re

pattern = re.compile('x:')
# this returns a tuple with the position of the pattern in your string
start_stop = pattern.search(astring).span()
You could do it like this:

import re

astring = 'aaaax:bbbbbbbbbbbbbbby:cccccccccccccccccccz:ddddddddddddddddd'
myvars = ['x:', 'y:', 'z:', 'dog']
length = 10
results = []
for item in myvars:
    # escape the item so it is matched literally
    pattern = re.compile(re.escape(item))
    # find where the search string starts
    match = pattern.search(astring)
    # if the pattern is not found, search returns None
    if match is not None:
        # get the (start, end) span tuple
        tup = match.span()
        # slice from the end of the match, so markers of any length work
        wanted = astring[tup[1]:tup[1] + length]
        results.append(wanted)
print(results)
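
The loop above only looks at the first occurrence of each marker. If every occurrence matters, one possible variation is to build a single alternation from the list and walk the string with finditer. The function name following_chars and the sample text are made up for this sketch.

```python
import re


def following_chars(text, markers, length=10):
    """Map each marker found in `text` to the `length` characters after each hit."""
    # re.escape keeps characters such as "." or "+" in a marker literal
    alternation = "|".join(re.escape(m) for m in markers)
    pattern = re.compile(f"({alternation})")
    found = {}
    for match in pattern.finditer(text):
        start = match.end()  # first character after the marker
        found.setdefault(match.group(1), []).append(text[start:start + length])
    return found


sample = "x:0123456789abc y:short x:second-hit!"
print(following_chars(sample, ["x:", "y:", "z:", "dog"]))
# {'x:': ['0123456789', 'second-hit'], 'y:': ['short x:se']}
```

Because the pattern matches only the marker itself, a marker that sits inside another marker's 10-character window is still found. Markers that never occur (z: and dog here) are simply absent from the result.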
Reply

