Python Forum
Extract specific sentences from text file - Printable Version




Extract specific sentences from text file - Bubly - May-31-2021

Hi all,
I am a beginner in Python. How can I extract the sentences that contain a specific set of words (or combination of words) from a text file? For example, the text file contains the following text.

"Extracorporeal therapies have been used to remove toxins from the body for over 50 years and have a greater role than ever before in the treatment of poisonings. Improvements in technology have resulted in increased efficacy of removing drugs and other toxins with hemodialysis, and newer extracorporeal therapy modalities have expanded the role of extracorporeal supportive care of poisoned patients. However, despite these changes, for at least the past three decades the most frequently dialyzed poisons remain salicylates, toxic alcohols, and lithium; in addition, the extracorporeal treatment of choice for therapeutic removal of nearly all poisonings remains intermittent hemodialysis. For the clinician, consideration of extracorporeal therapy in the treatment of a poisoning depends upon the characteristics of toxins amenable to extracorporeal removal (e.g., molecular mass, volume of distribution, protein binding), choice of extracorporeal treatment modality for a given poisoning, and when the benefit of the procedure justifies additive risk. Given the relative rarity of poisonings treated with extracorporeal therapies, the level of evidence for extracorporeal treatment of poisoning is not robust; however, extracorporeal treatment of a number of individual toxins have been systematically reviewed within the current decade by the Extracorporeal Treatment in Poisoning workgroup, which has published treatment recommendations with an improved evidence base. Some of these recommendations are discussed, as well as management of a small number of relevant poisonings where extracorporeal therapy use may be considered."

Task: I want to extract the sentences that contain all three of these words: extracorporeal, therapy/therapies, treatment

Output: Below are the three sentences that contain all three words:

Extracorporeal therapies have been used to remove toxins from the body for over 50 years and have a greater role than ever before in the treatment of poisonings.

For the clinician, consideration of extracorporeal therapy in the treatment of a poisoning depends upon the characteristics of toxins amenable to extracorporeal removal (e.g., molecular mass, volume of distribution, protein binding), choice of extracorporeal treatment modality for a given poisoning, and when the benefit of the procedure justifies additive risk.

Given the relative rarity of poisonings treated with extracorporeal therapies, the level of evidence for extracorporeal treatment of poisoning is not robust; however, extracorporeal treatment of a number of individual toxins have been systematically reviewed within the current decade by the Extracorporeal Treatment in Poisoning workgroup, which has published treatment recommendations with an improved evidence base.


RE: Extract specific sentences from text file - Larz60+ - May-31-2021

What have you tried? Show your code.


RE: Extract specific sentences from text file - Bubly - May-31-2021

(May-31-2021, 03:29 PM)Larz60+ Wrote: What have you tried? Show your code.

Hi Larz60,

I have absolutely no idea how to do this or which library to use. I only know how to do this for a single word, not for a combination of words. I would greatly appreciate it if you could give me some ideas; then I can try it and come back with the code I have written.

Appreciate your help.


RE: Extract specific sentences from text file - Larz60+ - May-31-2021

Start by splitting the file into sentences.

For example:
mydoc = "On the other hand, we denounce with righteous indignation and " \
    "dislike men who are so beguiled and demoralized by the charms of " \
    "pleasure of the moment, so blinded by desire, that they cannot foresee " \
    "the pain and trouble that are bound to ensue; and equal blame belongs to " \
    "those who fail in their duty through weakness of will, which is the same " \
    "as saying through shrinking from toil and pain. These cases are " \
    "perfectly simple and easy to distinguish. In a free hour, when our " \
    "power of choice is untrammelled and when nothing prevents our being " \
    "able to do what we like best, every pleasure is to be welcomed and " \
    "every pain avoided. But in certain circumstances and owing to the claims " \
    "of duty or the obligations of business it will frequently occur that " \
    "pleasures have to be repudiated and annoyances accepted. The wise man " \
    "therefore always holds in these matters to this principle of selection: " \
    "he rejects pleasures to secure other greater pleasures, or else he " \
    "endures pains to avoid worse pains."

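# Naive split on '.': this also breaks on periods inside abbreviations
# such as "e.g." or in decimals, so treat it as a starting point only.
sentences = mydoc.strip().split('.')

for n, sentence in enumerate(sentences):
    sentence = sentence.strip()
    # Skip the empty string left after the final period.
    if len(sentence):
        print(f"\nsentence {n}: {sentence}")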
This produces:
Output:

sentence 0: On the other hand, we denounce with righteous indignation and dislike men who are so beguiled and demoralized by the charms of pleasure of the moment, so blinded by desire, that they cannot foresee the pain and trouble that are bound to ensue; and equal blame belongs to those who fail in their duty through weakness of will, which is the same as saying through shrinking from toil and pain

sentence 1: These cases are perfectly simple and easy to distinguish

sentence 2: In a free hour, when our power of choice is untrammelled and when nothing prevents our being able to do what we like best, every pleasure is to be welcomed and every pain avoided

sentence 3: But in certain circumstances and owing to the claims of duty or the obligations of business it will frequently occur that pleasures have to be repudiated and annoyances accepted

sentence 4: The wise man therefore always holds in these matters to this principle of selection: he rejects pleasures to secure other greater pleasures, or else he endures pains to avoid worse pains

Next, search for the sentences that contain all three words, and that's it.
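The remaining step, keeping only the sentences that contain all three words, could be sketched roughly as follows. This is one possible approach rather than the only way to do it: the names `PATTERNS`, `split_sentences`, and `find_sentences` are made up for illustration, the sample text is an abridged adaptation of the abstract from the first post, and the splitter is a simple regex heuristic (split after `.`, `!` or `?` when whitespace and a capital letter follow) swapped in for `split('.')` so that "e.g.," does not cut a sentence in half.

```python
import re

# Hypothetical names; the patterns encode the three words from the question.
PATTERNS = [
    r"\bextracorporeal\b",
    r"\btherap(?:y|ies)\b",  # "therapy" or "therapies", but not "therapeutic"
    r"\btreatment\b",
]


def split_sentences(text):
    """Split after '.', '!' or '?' when whitespace and a capital letter follow."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [part for part in parts if part]


def find_sentences(text, patterns=PATTERNS):
    """Return the sentences in which every pattern matches (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [s for s in split_sentences(text) if all(c.search(s) for c in compiled)]


# Abridged sample adapted from the abstract in the first post (an assumption
# made to keep the example short; it is not the full original text).
sample = (
    "Extracorporeal therapies have a greater role than ever before in the "
    "treatment of poisonings. "
    "Newer extracorporeal therapy modalities have expanded supportive care. "
    "Consideration of extracorporeal therapy in the treatment of a poisoning "
    "depends on the characteristics of the toxin (e.g., molecular mass). "
    "The treatment of choice remains intermittent hemodialysis."
)

for sentence in find_sentences(sample):
    print(sentence)
    print()
```

For a real file, the text could be loaded first with something like `pathlib.Path("abstract.txt").read_text()`, where the file name is only a placeholder. The splitting heuristic still makes mistakes on text such as "Dr. Smith", so a dedicated sentence tokenizer may be a better fit for messier input.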