Python Forum
Who enjoys Py RegEx? re.sub() isn't working
From all I've read, these two approaches should produce the same result, given a regex pattern with two groups. I'd like to know whether I'm using re.sub() incorrectly or whether I've found a bug.

match = re.search(pattern, text)
result1 = match.group(1) + match.group(2)       # join groups 1 & 2 by hand
result2 = re.sub(pattern, r"\g<1>\g<2>", text)  # replace with groups 1 & 2


Can you think of any reason re.sub() would pull in a bunch of garbage that isn't in either of the groups? Take a statement like
import re
re.sub(regexpattern, r"\g<1>\g<2>", SourceText)
and, for instance, this line of source text:

Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf.md

where Bodywork and .md are identified as groups 1 and 2. I expected re.sub() to put them together as Bodywork.md, but it doesn't! I've used match.groups() from the same library as a sanity check.
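A minimal sketch of what seems to be going on, assuming the pattern from the sample code further down: re.sub() only replaces the span that the pattern actually matched and copies everything outside that span through unchanged. The variable names here (pattern, source, untouched_prefix, rebuilt) are just for illustration.

```python
import re

# Same pattern as in the sample code, written as a raw string.
pattern = r"([\w\s]+)\s\w{32}(\.md|\.csv)$"
source = "Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf.md"

match = re.search(pattern, source)
start, end = match.span()          # boundaries of the matched text only

untouched_prefix = source[:start]  # text before the match; re.sub() keeps it
matched_text = source[start:end]   # the only part re.sub() replaces

# Rebuild by hand what re.sub() does: prefix + expanded template + suffix.
rebuilt = untouched_prefix + match.expand(r"\g<1>\g<2>") + source[end:]

print("prefix :", untouched_prefix)
print("matched:", matched_text)
print("rebuilt:", rebuilt)
print("same as re.sub():", rebuilt == re.sub(pattern, r"\g<1>\g<2>", source))
```

The match cannot start before the "/" because [\w\s] does not match a slash, so the leading directory is never part of the match and survives the substitution.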

I've put together some sample code with some text to search, based on a conversion I'm trying to do for a small project.

Here's the output first. Thanks for looking!

Output:
index: 1
Source : Projects bf587944624a417c83475fdb67c176ba.md
Groups : ('Projects', '.md')
Result1: Projects.md
Result2: Projects.md

index: 3
Source : Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf.md
Groups : ('Bodywork', '.md')
Result1: Bodywork.md
Result2: Projects bf587944624a417c83475fdb67c176ba/Bodywork.md

index: 5
Source : Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise 4871ab1851074a1cb7aebe0851669345.csv
Groups : ('Home Exercise', '.csv')
Result1: Home Exercise.csv
Result2: Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise.csv

import re

paths = ['Projects bf587944624a417c83475fdb67c176ba/',
 'Projects bf587944624a417c83475fdb67c176ba.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise 4871ab1851074a1cb7aebe0851669345/',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise 4871ab1851074a1cb7aebe0851669345.csv',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise 4871ab1851074a1cb7aebe0851669345/Abs da0050d8459345419d1a16062273cfac.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Home Exercise 4871ab1851074a1cb7aebe0851669345/Core 82039eb85d5d46bc99e8504427d203c4.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Micronutrient Smoothie 21e2b0c0922d46f387c8b353a17ff734.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0/',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0.csv',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0/Cuboid physical therapy ff8d7937722a4af6aa2ce1ce8c45672b.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0/Extending hamstrings faeba9f5302340f1945b898c6291aa86.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0/Knee care 8265d491502a49b0abf2922d9e7764e3.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self Bodywork 0045821b69f445678e07d49b5c80b9d0/Shoulder therapy massage motion 49e3a56cbbfc4733a0ddda272c504912.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self care weekly splits 861d60286a1e48dbb7ed7556d4214622/',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self care weekly splits 861d60286a1e48dbb7ed7556d4214622.md',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Self care weekly splits 861d60286a1e48dbb7ed7556d4214622/Self Bodywork 5c1104d3456a4106872eee9dc531e182.csv',
 'Projects bf587944624a417c83475fdb67c176ba/Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/Workout weekly splits 4b9d808f79544f5489ef063f1048109a.md',
 'Projects bf587944624a417c83475fdb67c176ba/PROJECTS TEMPLATE a6292c48f0d343c9a2913c0adf97bbf2.md',
 'Routine 60ee969daa894c4d9abdb0d58166f5d4/',
 'Routine 60ee969daa894c4d9abdb0d58166f5d4.csv',
 'Routine 60ee969daa894c4d9abdb0d58166f5d4/Evening Routine 29e2c5282db04c76a59d0053eb9e85ee.md',
 'Routine 60ee969daa894c4d9abdb0d58166f5d4/Morning Routine 4570adb138b7412a8bbe948746585924.md',
 'Routine 60ee969daa894c4d9abdb0d58166f5d4/Physical Activity 8b8ba3700a194ba7ad6330802ecccdf5.md']

# Raw string so \w, \s and \. reach the regex engine unchanged.
# Group 1 is the name, group 2 the extension; the 32-character ID between them is not captured.
filenamepattern = r"([\w\s]+)\s\w{32}(\.md|\.csv)$"

# Create an indexed list of new filenames
index = []
fname1 = []
fname2 = []

for i, path in enumerate(paths):
    match = re.search(filenamepattern, path)  # search only
    if match:
        index.append(i)  # save index for later path changes

        # Result 1: join the two captured groups by hand
        fname1.append(match.group(1) + match.group(2))
        # Result 2: search & replace with re.sub() on the full path
        fname2.append(re.sub(filenamepattern, r"\g<1>\g<2>", path))

        if len(index) <= 3:  # print a few for comparison
            print("index:", index[-1])
            print("Source : " + path)
            print("Groups :", match.groups())
            print("Result1: " + fname1[-1])
            print("Result2: " + fname2[-1])
            print()
I've put the regex up with the same sample data at regexr dot com. I don't think I'm allowed to add links here as a new member, but if you want to modify it and see results right away, just add /568jc to the end of the URL. I'm not affiliated with the site at all. It's just a cool website!
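For comparison, here is a hedged sketch of two ways to end up with only the bare filename, assuming that is the goal. The helper names filename_via_expand and filename_via_basename are hypothetical, not part of the re module. The first expands the template against the match object instead of the whole string; the second strips the directory with posixpath.basename() before substituting.

```python
import posixpath
import re

FILENAME_PATTERN = re.compile(r"([\w\s]+)\s\w{32}(\.md|\.csv)$")

def filename_via_expand(path):
    """Return groups 1 and 2 joined, or None if the path does not match."""
    match = FILENAME_PATTERN.search(path)
    # Match.expand() fills in the template from the match alone,
    # so text outside the match never appears in the result.
    return match.expand(r"\g<1>\g<2>") if match else None

def filename_via_basename(path):
    """Drop the directory part first, then substitute on the filename only."""
    name = posixpath.basename(path)  # the sample paths use "/" separators
    return FILENAME_PATTERN.sub(r"\g<1>\g<2>", name)

sample = ("Projects bf587944624a417c83475fdb67c176ba/"
          "Bodywork 731fe478ea6048e1ac0df8c7f7ed95bf/"
          "Home Exercise 4871ab1851074a1cb7aebe0851669345.csv")

print(filename_via_expand(sample))
print(filename_via_basename(sample))
```

Note the two helpers differ on input that does not match: the first returns None, while the second returns the basename unchanged (an empty string for the directory entries ending in "/").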
Messages In This Thread
Who enjoys Py RegEx? re.sub() isn't working - by goodsignal - Jun-07-2020, 10:11 PM