Python Forum

Full Version: Extracting Text
I am trying to create some functionality that will harvest text data from a text file. The text file is basically a series of questions and answers, so I have a lot of this:

Does the patient abuse alcohol?
YES

Does the patient see the doctor as directed?
YES

Does the patient have suicidal ideation?
NO


These questions are not numbered, and some have a ":" after the question while others do not. Ideally I would like to be able to "flag" answers that are significant. So, an answer of "NO" for alcohol abuse is not significant to my analysis, but a "NO" to the second question (failing to see a doctor) is significant. In other words, each question has a default answer that basically means "not significant". Any ideas? Thanks.
You showed us the input, what should the output look like?
Perhaps that is part of my problem! I want to use this data to prepare a Word document, a report. As I said, some of the answers are very significant, others not so much. So I guess if the individual answered "YES" to suicidal ideation, I would like that noted in the report.

(Nov-01-2021, 08:59 PM) Gribouillis wrote: You showed us the input, what should the output look like?
Why not use
filtered = [(question, answer) for question, answer in series if is_significant(question, answer)]
Sorry, I am not following how that would play out specifically.
I mean:
  1. Write a function that reads the file and produces a sequence of pairs (question, answer).
  2. Write a function is_significant(question, answer) that returns True or False depending on whether the answer is significant for that question.
  3. Use the above list comprehension to remove the superfluous answers and keep only the significant ones.
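
Here is one way the three steps above might look in practice. This is only a sketch under some assumptions: every question occupies one line and is followed by its answer on the next non-blank line, and the "default" (non-significant) answer for each question is kept in a hand-written table. The names DEFAULT_ANSWERS, normalize and read_pairs are made up for illustration; only is_significant and the list comprehension come from the posts above.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical table: normalized question text -> default (non-significant) answer.
DEFAULT_ANSWERS = {
    "does the patient abuse alcohol": "NO",
    "does the patient see the doctor as directed": "YES",
    "does the patient have suicidal ideation": "NO",
}


def normalize(question: str) -> str:
    """Drop surrounding whitespace and any trailing ':' or '?', then lowercase."""
    return question.strip().rstrip(":?").strip().lower()


def read_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Step 1: yield (question, answer) pairs, assuming strict alternation."""
    cleaned = [line.strip() for line in lines if line.strip()]
    for question, answer in zip(cleaned[0::2], cleaned[1::2]):
        yield question, answer.upper()


def is_significant(question: str, answer: str) -> bool:
    """Step 2: an answer is significant when it differs from the default."""
    default = DEFAULT_ANSWERS.get(normalize(question))
    # Assumption: a question missing from the table is flagged rather than dropped.
    return default is None or answer != default


sample = """Does the patient abuse alcohol?
YES

Does the patient see the doctor as directed?
YES

Does the patient have suicidal ideation?
NO
""".splitlines()

series = list(read_pairs(sample))
# Step 3: keep only the significant answers.
filtered = [(question, answer) for question, answer in series
            if is_significant(question, answer)]
print(filtered)  # [('Does the patient abuse alcohol?', 'YES')]
```

To read from a real file, pass the open file object instead of the sample lines, for example `with open(path) as f: series = list(read_pairs(f))`. The resulting filtered list is what would go into the report.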