Python Forum
Recursive regular expressions in Python
#1
Hi!

Today I ran into a problem where I had to extract JSON embedded in a string inside a YAML file (don't ask me why, LOL). Curly braces indicate a parameter that should be captured, so a string such as Variable {var}, where the value is { {"test": {"a": 1}} }, should, let's say, add that value as an embedded object in a Python dictionary.

While doing some research on how to come up with a regex that could match balanced curly braces, I ran into this one: \((?:[^()]|(?R))*\), which is supposed to match balanced parentheses. The way I see it, every time we encounter either an opening or a closing parenthesis, a recursive call is made and the regex is evaluated from the beginning (it goes back to trying to match an opening parenthesis at \(). However, if we also start a recursive call when we find a closing parenthesis, that recursive call will not match, because it won't start with an opening parenthesis. That doesn't make sense to me. What would make sense to me is not having the closing parenthesis inside the [^()] character group, but then the regex no longer captures only balanced parentheses. Can someone please help me understand why?

P.S.: I'm using Python's regex module.
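
The reasoning in the question can be checked step by step with a small hand-written matcher. The sketch below is not how the regex module is implemented; it is a plain-Python emulation of \((?:[^()]|(?R))*\) without backtracking, and the names match_balanced and log are made up for illustration. It suggests that a closing parenthesis does trigger an attempt at the recursive alternative, but the attempt fails immediately at \( without consuming anything, so the * loop simply stops and the trailing \) consumes the parenthesis.

```python
def match_balanced(s, i=0, log=None):
    # Hypothetical emulation of \((?:[^()]|(?R))*\) starting at s[i].
    # Returns the index just past the match, or None if there is no match.

    # \( : the pattern (and every recursive call) must start with "(".
    if i >= len(s) or s[i] != "(":
        return None
    i += 1

    # (?: ... )* : repeat while one of the two alternatives succeeds.
    while i < len(s):
        # Alternative 1, [^()] : any character that is not a parenthesis.
        if s[i] not in "()":
            i += 1
            continue
        # Alternative 2, (?R) : run the whole pattern again from here.
        end = match_balanced(s, i, log)
        if log is not None:
            log.append((i, end is not None))  # (index, did recursion match?)
        if end is None:
            break  # both alternatives failed, so the star stops repeating
        i = end

    # \) : the closing parenthesis that ended the loop is consumed here.
    if i < len(s) and s[i] == ")":
        return i + 1
    return None


attempts = []
print(match_balanced("(a(b)c)", log=attempts))
print(attempts)
```

In the log, the entries for indexes 4 and 6 (both closing parentheses) are failed recursion attempts, while index 2 (an opening parenthesis) is the only one that succeeds.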
#2
I'm pretty sure you saw the explanation here:

https://stackoverflow.com/questions/2638...n-in-regex

I thought it was pretty good.
#3
(Jul-25-2023, 11:07 AM)deanhystad Wrote: I'm pretty sure you saw the explanation here:

https://stackoverflow.com/questions/2638...n-in-regex

I thought it was pretty good.

Hi!

Yes, I came across that at some point. However, what I don’t understand is why we need the closing bracket in the character group. Why can’t the regex be \{(?:[^{]|(?R))*\}? Aren’t we starting a recursive call every time we encounter a closing bracket as well? Why is that desired? Shouldn’t we start a recursive call only when we find an opening bracket, and expect it to be matched by the \} part of the regex? I can’t understand why the regex with the closing bracket in the character group does not lead to infinite recursion: when we encounter the closing bracket, we make a recursive call, and that closing bracket is not matched again, leading to a new recursive call and so on. What am I missing here?

Thanks a lot for replying.
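
To compare the two character groups, here is a rough backtracking emulation in plain Python. It is only a sketch under the assumption that the engine tries the alternatives in order and backtracks in the usual way; it does not use the regex module, and the names _ends, _body and first_match are invented for this example. The excluded argument plays the role of the characters listed in the negated group.

```python
def _ends(s, i, excluded):
    # Emulates \{ (?: [^excluded] | (?R) )* \} starting at s[i] and yields
    # every index where a match could end, in the order a greedy
    # backtracking engine would try them.
    if i < len(s) and s[i] == "{":
        yield from _body(s, i + 1, excluded)


def _body(s, j, excluded):
    # Greedy star: first try to consume one more item, then try to close.
    if j < len(s) and s[j] not in excluded:
        yield from _body(s, j + 1, excluded)   # alternative 1: the group
    for k in _ends(s, j, excluded):            # alternative 2: (?R)
        yield from _body(s, k, excluded)
    if j < len(s) and s[j] == "}":
        yield j + 1                            # the final \}


def first_match(s, excluded):
    # Return the first match anchored at index 0, or None.
    end = next(_ends(s, 0, excluded), None)
    return None if end is None else s[:end]


text = "{a} b}"
print(first_match(text, "{}"))  # emulates \{(?:[^{}]|(?R))*\}
print(first_match(text, "{"))   # emulates \{(?:[^{]|(?R))*\}
```

With } in the group, the first } cannot be swallowed by the loop, so the match stops at the balanced {a}. Without it, the group happily consumes } as an ordinary character, and backtracking only gives back the last one, so the whole unbalanced {a} b} is matched. In neither case does recursion run forever in this emulation: a recursive call that does not begin with { fails on the spot, which just ends the loop rather than starting another call.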