Python Forum

Full Version: Data structure question
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I am creating an app that will take a pdf that represents a set of medical treatment records for a given patient. I want to submit the text to processing in medspacy. The desired output is to present a chronological history of the patent - his disease, his treatment. I also want to be able to present the results of key diagnostic tests, the efficacy of treatment, the severity fo symptoms. In pre-processing the text, before going through medspacy, I want to save the text as chunks that represent the text of a given section in the medical record. I want to preserve information about where that section was in the original pdf (page, and the pdf comes bookmarked) and well as the date of the treatment note from which that text came.

My app would have a high priority on speed. It would be kind of an "impulse buy" for the user (see a detailed analysis of the claimant's case). So what type of data structures would be best to use in python? So I would be stored the actual text as information about it (page, date of of treatment, etc)?
A quick glance at MedSpaCy suggests it works with strings. Sounds like constructing a class with string elements would be useful.