Python Forum
Regular Expression for matching words
#1
Hello. I'm not sure about the regex pattern I defined for this exercise, especially how to match the header text between the == on both sides so that the whole header is removed. Could you give me any tips? Thank you! Angel

My exercise:

Wikipedia uses two or more equal signs == to mark headers and subheaders in the articles (e.g. ==, ===, ====).

In all cases, the equal signs and the actual header text are separated by spaces on both sides, e.g. == History == or === Further reading ===.

Import the re module and define a regular expression that removes all headers and subheaders from the articles. Store this regular expression under a variable named pattern.

Apply the regular expression under pattern to each article (string object) in the list wiki_articles. Store each processed article into a new list named cleaned_articles.


My answer:

import re

# Compile once, outside the loop; wiki_articles is provided by the exercise
pattern = re.compile(r'={2,}.+')     # not sure about the defined pattern
cleaned_articles = []
for string_object in wiki_articles:
    processed = pattern.sub(repl='', string=string_object)
    cleaned_articles.append(processed)
#2
You should make test strings to see what happens, and leave out the loop until you have tested the pattern on a single string first.
import re

string_object = '''\
== History ==
=== Further reading ===
My car is blue
2 + 2 = 4
++= & hello='''

pattern = re.compile(r'={2,}.+')
processed = pattern.sub(repl='', string=string_object)
print(processed.strip())
Output:
My car is blue
2 + 2 = 4
++= & hello=
So your regex should work fine. I added .strip() to remove the leading newlines that sub leaves behind where the headers were.
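One caveat: r'={2,}.+' removes everything from the first == to the end of the line, so a body line that contains == (for example a code snippet with a comparison) would be cut as well, and .strip() only trims the ends of the string, not blank lines in the middle. Here is a stricter sketch, assuming each header sits on its own line in the form described in the exercise (equal signs, space, text, space, equal signs). The names header_pattern, remove_headers and sample are made up for this illustration, and wiki_articles is only a stand-in for the list the exercise provides.

```python
import re

# Assumption: a header occupies a whole line, e.g. "== History ==".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
# \1 requires the closing run of "=" to be the same as the opening run.
# The trailing \n? also consumes the line break, so no blank line remains.
header_pattern = re.compile(r'^(={2,}) .+ \1[ \t]*$\n?', flags=re.MULTILINE)


def remove_headers(article):
    """Return the article text with whole-line headers removed."""
    return header_pattern.sub('', article)


# Hypothetical test article: two headers plus body lines containing "=".
sample = '''\
== History ==
Python appeared in 1991.
=== Further reading ===
if a == b: print("same")
2 + 2 = 4'''

wiki_articles = [sample]  # stand-in for the list given in the exercise
cleaned_articles = [remove_headers(article) for article in wiki_articles]
print(cleaned_articles[0])
```

With this pattern the line containing a == b is kept, because ^ only lets a match start at the beginning of a line. If the real articles have headers that are not on their own line, this pattern will miss them, so test it on a real article first.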