Python Forum
regex pattern to extract relevant sentences
#1
Hi All,

I want to extract sentences that contain a combination of given words separated by up to "n" other words. When I run the code below, the output contains only two sentences, but I expect four sentences to match the regex pattern.

What I observed is that when the pattern matches two or more times in a sentence, that sentence is missing from the output. I am not sure how to fix this. Can anyone please tell me how to get the desired output shown below?

import re
txt = "The present disclosure is directed to an electrosurgical pencil with integrated ligasure tweezers. In accordance with one aspect of the present disclosure the electrosurgical pencils includes an elongated housing having an open distal end and including an actuator operatively associated therewith. First and second jaws members extend distally through the open distal end of the elongated housing and are transitionable between a closed position and an open position upon actuation of an actuator. One or both of the jaw members are configured to treat tissue with monopolar energy and both jaw members are configured to treat tissue with bipolar energy. One or more switches are operably coupled to a controller disposed in the housing and configured to activate the first and second jaw members to treat tissue with monopolar and bipolar energy. FIGS. 17-22 show an inner shaft 2310 that includes one or more levers 2316 attached at a fulcrum point 2318 for assisting in opening jaw members 2330, 2340. Lever 2316 may extend through housing 2200 and may be pivotably mounted to housing 2200 at a pivot point 2315 such that when a physician actuates lever 2316 at an end 2319, lever 2316 pivots about pivot point 2315 and applies force to inner shaft 2310 at fulcrum point 2318 for opening and closing jaw members 2330 and 2340. Lever 2316 allows a physician to generate additional force at fulcrum point 2318 for opening jaw members 2330, 2340."
# Compile the pattern once, outside the loop.
reg_compiler = re.compile(r'\b(jaw[a-z]+|electrosurgical|pencil[a-z]+)(?:\W+\w+){1,15}?\W+(monopolar|bipolar|open[a-z]+|hous[a-z]+)\b')
# Naive sentence split on '.'; note that this also splits abbreviations such as "FIGS."
sentences = txt.strip().split('.')
for sentence in sentences:
    sentence = sentence.strip()
    if sentence:
        rel_sent = reg_compiler.search(sentence)
        if rel_sent:
            print(f"\n{sentence}")
Code Output

Output:
In accordance with one aspect of the present disclosure the electrosurgical pencils includes an elongated housing having an open distal end and including an actuator operatively associated therewith

First and second jaws members extend distally through the open distal end of the elongated housing and are transitionable between a closed position and an open position upon actuation of an actuator
Desired output:

Output:
In accordance with one aspect of the present disclosure the electrosurgical pencils includes an elongated housing having an open distal end and including an actuator operatively associated therewith

First and second jaws members extend distally through the open distal end of the elongated housing and are transitionable between a closed position and an open position upon actuation of an actuator

One or both of the jaw members are configured to treat tissue with monopolar energy and both jaw members are configured to treat tissue with bipolar energy

One or more switches are operably coupled to a controller disposed in the housing and configured to activate the first and second jaw members to treat tissue with monopolar and bipolar energy
#2
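I replaced jaw[a-z]+ with jaw[a-z]* and it seems to work better. The + requires at least one letter after "jaw", so the bare word "jaw" (as in "jaw members") could never match, while "jaws" could. With *, zero extra letters are allowed as well. The number of matches per sentence is not the cause.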
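A quick way to check this is to run both versions of the pattern against one sentence that was printed and one that was missing. This is only a rough sketch: the two sample sentences are copied from the text in the question, and the names TAIL, OLD, NEW and matching_sentences are made up for this illustration.

```python
import re

# Shared second half of the pattern from the question (unchanged).
TAIL = r'(?:\W+\w+){1,15}?\W+(monopolar|bipolar|open[a-z]+|hous[a-z]+)\b'

# OLD: jaw[a-z]+ needs at least one letter after "jaw" ("jaws" yes, "jaw" no).
OLD = re.compile(r'\b(jaw[a-z]+|electrosurgical|pencil[a-z]+)' + TAIL)
# NEW: jaw[a-z]* also accepts the bare word "jaw".
NEW = re.compile(r'\b(jaw[a-z]*|electrosurgical|pencil[a-z]+)' + TAIL)


def matching_sentences(pattern, text):
    """Return the '.'-separated sentences in which the pattern is found."""
    return [s.strip() for s in text.split('.') if s.strip() and pattern.search(s)]


# Two sentences taken from the question: one with "jaws", one with "jaw".
sample = (
    "First and second jaws members extend distally through the open distal "
    "end of the elongated housing. "
    "One or both of the jaw members are configured to treat tissue with "
    "monopolar energy."
)

old_hits = matching_sentences(OLD, sample)
new_hits = matching_sentences(NEW, sample)
print(len(old_hits), len(new_hits))  # prints: 1 2
```

The same thing applies to pencil[a-z]+ and open[a-z]+, which skip the bare words "pencil" and "open". Whether that is intended depends on which forms you want to catch.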
#3
(Jul-05-2021, 08:00 PM)Gribouillis Wrote: I replaced jaw[a-z]+ with jaw[a-z]* and it seems to work better.

Thank you so much @Gribouillis