Python Forum
Unexpected (?) result with regular expressions
#1
Hi!

I'm running Python 3 on Manjaro Linux 18.1.5 and I want to "emulate" sed. Here's an example in Bash using sed; this is the result I want:
Output:
~ $ echo "libreoffice-still 6.2.8-4" | sed -r 's/[^0-9]*([0-9\.\-]*)/LibreOffice \1/'
LibreOffice 6.2.8-4
I thought this would work in Python, but obviously it doesn't:
import re
# Raw strings keep the backslashes in the pattern from being treated as string escapes
print(re.sub(r'[^0-9]*([0-9\.\-]*)', r'LibreOffice \1', "libreoffice-still 6.2.8-4"))
Output:
LibreOffice 6.2.8-4LibreOffice
As you can see, I used exactly the same regular expression, but the result is different, and I don't understand why or what to do about it.
What am I doing wrong that makes "LibreOffice" appear at the end of the result? I'm obviously missing something here…
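A likely explanation, offered as a sketch rather than a definitive diagnosis: sed's s command replaces only the first match unless the g flag is given, whereas re.sub replaces every match by default. Every part of this pattern is optional, so after consuming the whole string it can match once more as an empty string at the very end, and that empty match gets replaced too (this assumes Python 3.7 or newer, where an empty match adjacent to a previous match is substituted). Passing count=1 should mimic the sed behaviour; the variable names below are only illustrative.

```python
import re

text = "libreoffice-still 6.2.8-4"
pattern = r'[^0-9]*([0-9\.\-]*)'
replacement = r'LibreOffice \1'

# Default behaviour: every match is replaced, including the empty
# match that the pattern finds at the end of the string.
all_matches = re.sub(pattern, replacement, text)

# count=1 stops after the first match, like sed's s/// without the g flag.
first_only = re.sub(pattern, replacement, text, count=1)

print(repr(all_matches))
print(repr(first_only))
```

Printing with repr makes any trailing whitespace in the results visible.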
#2
Try this regex:
re.sub(r'libreoffice-still [^0-9]*([0-9\.\-]*)', r'LibreOffice \1', "libreoffice-still 6.2.8-4")
The regex now begins with the literal text libreoffice-still.
#3
I found that if I just add one or more characters of the original string at the beginning of the search pattern, I get my desired output:
import re
print(re.sub(r'l[^0-9]*([0-9\.\-]*)', r'LibreOffice \1', 'libreoffice-still 6.2.8-4'))
Output:
LibreOffice 6.2.8-4
But why?

(Jan-18-2020, 02:33 PM)DeaD_EyE Wrote: Try this regex:
re.sub(r'libreoffice-still [^0-9]*([0-9\.\-]*)', r'LibreOffice \1', "libreoffice-still 6.2.8-4")
The regex begins now with libreoffice-still

Yes, I found that out too, but you were a bit faster. It also works when just adding the first letter:
re.sub(r'l[^0-9]*([0-9\.\-]*)', r'LibreOffice \1', "libreoffice-still 6.2.8-4")
I just don't see why. "[^0-9]*" should take care of everything ahead of the first digit that appears, shouldn't it?
I read it as "0 or more non-digits, followed by 0 or more digits, periods or dashes, and remember those digits, periods and dashes", and that seems to be how sed reads it as well.
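That reading looks right, and it may also explain the difference: "0 or more" followed by "0 or more" can match nothing at all. A minimal sketch (the helper name match_spans is made up for illustration, and Python 3.7 or newer is assumed) lists where each pattern matches. The original pattern should report a second, empty match at the end of the string, while the version with a leading l cannot match there because it needs a literal l to start.

```python
import re

text = "libreoffice-still 6.2.8-4"

def match_spans(pattern):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

# Original pattern: every part is optional, so it can also match "".
print(match_spans(r'[^0-9]*([0-9\.\-]*)'))

# With a required literal 'l' in front, an empty match is impossible.
print(match_spans(r'l[^0-9]*([0-9\.\-]*)'))
```

A span whose start equals its end is an empty match; re.sub inserts the replacement text at that position as well, which would account for the extra "LibreOffice".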