Python Forum
Regex: find XYZ when it occurs after ABC with stuff in between?
#1
Given a string containing letters, numbers, spaces, and symbols, such as web page source:

strStuff = "lmnoqABCrstuvwXYZdefghijk)"

and ignoring case, how would you write a regular expression that returns XYZ when it occurs after ABC with stuff in between?

Thanks for any help.
#2
/ABC[^XYZ]*(XYZ)/
I really like using regex pal to test these things out, while you play around with them to find something that works: http://www.regexpal.com/

That's ABC, then any number of characters that aren't X, Y, or Z, then XYZ in a capture group (there's probably a way to do it with lookahead assertions, but my regex fu isn't that powerful).

Testing it out:
>>> import re
>>> tests = [
... 'lmnoqABCrstuvwXYZdefghijk)',
... 'missing_start_groupXYZ',
... 'missing_end_ABC_group',
... 'missing_both'
... ]
>>> for test in tests:
...   match = re.search("ABC[^XYZ]*(XYZ)", test, re.IGNORECASE)
...   print(match)
...   if match:
...     print(match.groups())
...
<_sre.SRE_Match object; span=(5, 17), match='ABCrstuvwXYZ'>
('XYZ',)
None
None
None
>>>
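
The lookahead idea mentioned in this post can be sketched roughly as follows. This is only one way to write it, not necessarily what the poster had in mind: each character between ABC and XYZ is consumed only if XYZ does not start at that position. The names PATTERN and find_xyz_after_abc are made up for illustration.

```python
import re

# Hypothetical names, for illustration only.
# (?:(?!XYZ).)* consumes one character at a time, but only while
# "XYZ" does not start at the current position.
PATTERN = re.compile(r"ABC(?:(?!XYZ).)*(XYZ)", re.IGNORECASE | re.DOTALL)


def find_xyz_after_abc(text):
    """Return the first XYZ found after an ABC, or None if there is none."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(find_xyz_after_abc("lmnoqABCrstuvwXYZdefghijk)"))       # XYZ
print(find_xyz_after_abc("missing_start_groupXYZ"))           # None
print(find_xyz_after_abc("abc then some z and y, then xyz"))  # xyz
```

Unlike a character class, this rejects only the full sequence XYZ, so a lone X, Y, or Z between the two markers does not block the match.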
#3
(Aug-13-2017, 11:41 PM)Fran_3 Wrote: like a web page source
Web page source is a bad example for regex, because for HTML/XML you should use a parser such as BeautifulSoup or lxml.
I have a tutorial about this here: Web-Scraping part-1.
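
To illustrate the point, here is a minimal sketch that parses first and only then searches the extracted text. It deliberately swaps in the standard library's html.parser instead of BeautifulSoup or lxml so that it runs without third-party installs. The class name TextCollector, the helper visible_text, and the sample markup are all made up for this example.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text nodes of a document, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags, never for attribute values.
        self.chunks.append(data)


def visible_text(html):
    """Return the concatenated text content of an HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Made-up sample markup: the first "abc" is an attribute value, not text.
html = '<p class="abc">intro</p><p>ABC then <b>some</b> filler XYZ</p>'
text = visible_text(html)
match = re.search(r"ABC.*?(XYZ)", text, re.IGNORECASE | re.DOTALL)

print(text)
print(match.group(1) if match else None)
```

Run against the raw markup, the same pattern would start matching at the class="abc" attribute, which is the kind of surprise a parser avoids.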
#4
/ABC[^XYZ]*(XYZ)/ fails on 'ABCwZYXwXYZ'. What about /ABC.*?(XYZ)/?
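
A quick way to check both patterns against that string, as a sketch (the variable names are just for illustration):

```python
import re

text = "ABCwZYXwXYZ"

# [^XYZ] is a character class: it rejects any single X, Y or Z,
# so the stray "ZYX" in the middle blocks the match.
char_class = re.search(r"ABC[^XYZ]*(XYZ)", text, re.IGNORECASE)

# .*? is lazy: it grows one character at a time until XYZ can match.
lazy = re.search(r"ABC.*?(XYZ)", text, re.IGNORECASE)

print(char_class)     # None
print(lazy.group(1))  # XYZ
print(lazy.span())    # (0, 11)
```

Note that . does not match a newline by default, so for multi-line input such as page source the lazy version would likely also need re.DOTALL.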
Craig "Ichabod" O'Brien - xenomind.com
I wish you happiness.
#5
/ABC[^\1]*(XYZ)/ ?

I try to avoid .* if I can, as it doesn't really make it clear right away what you're expecting to happen.
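
One caveat on the [^\1] suggestion, shown as a small sketch: in Python's re module, numeric escapes inside [ ] are treated as characters, not backreferences, so [^\1] means "any character except chr(1)" rather than "anything that isn't group 1". In practice it then behaves much like a greedy "match anything". The sample strings below are made up.

```python
import re

pattern = re.compile(r"ABC[^\1]*(XYZ)", re.IGNORECASE)

# The class accepts almost everything, and * is greedy,
# so the match runs to the LAST XYZ, not the first one.
m = pattern.search("ABC one XYZ two XYZ")
print(m.group(0))  # ABC one XYZ two XYZ

# The only character the class rejects is chr(1).
print(pattern.search("ABC\x01XYZ"))  # None
```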
#6
Thanks, guys. I'll play with ichabod801's sample later today.

Meanwhile I came up with either:
- using the regex search method with () to end up with groups
or
- using the regex findall method to return tuples

This way I can get to the data I want and ignore the other parts.
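
The two approaches might look something like this. The post doesn't show the actual pattern, so the three-group pattern and the sample text below are assumptions made for illustration.

```python
import re

# Assumed pattern: one group each for ABC, the stuff in between, and XYZ.
pattern = re.compile(r"(ABC)(.*?)(XYZ)", re.IGNORECASE | re.DOTALL)
text = "ABC first XYZ ... abc second xyz"

# search() stops at the first match; groups() returns its captures.
m = pattern.search(text)
if m:
    print(m.groups())  # ('ABC', ' first ', 'XYZ')
    print(m.group(3))  # XYZ

# findall() returns one tuple of captures per non-overlapping match.
pairs = pattern.findall(text)
print(pairs)
# [('ABC', ' first ', 'XYZ'), ('abc', ' second ', 'xyz')]
```

With search you pick out the group you care about by number; with findall you index into each tuple instead.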