Python Forum
Search Results
Post Author Forum Replies Views Posted [asc]
    Thread: Address Extraction
Post: RE: Address Extraction

Sorry, I am a bit confused. Yes, I can always rely on the text "PROVIDER:" and can always get its coordinates, but the text I need, the provider, will be at different coordinates and even in a diff...
standenman Data Science 7 490 Apr-10-2024, 04:00 PM
    Thread: Address Extraction
Post: RE: Address Extraction

I am just using Adobe Acrobat Pro to OCR the file. (Apr-07-2024, 09:36 AM)DPaul Wrote: If "PROVIDER" is not always in the exact same place, bbox won't help. What module are you using to OCR the pdf...
standenman Data Science 7 490 Apr-07-2024, 12:43 PM
    Thread: Address Extraction
Post: Address Extraction

I am trying to map out a strategy for extracting text data from PDF files. The files are semi-structured and would be created by the Social Security Administration using iText. I have attached an ...
standenman Data Science 7 490 Apr-06-2024, 03:47 PM
    Thread: Strategy for data extraction
Post: Strategy for data extraction

I am trying to come up with a strategy for extracting key data from generic letters for different clients. This is the format of the letter I want to parse. It should look the same for every client,...
standenman Data Science 1 532 Feb-22-2024, 10:52 PM
    Thread: Transform a list
Post: RE: Transform a list

Yes thank you I have done something like that. I now have a python dictionary result: {'Title': 'Medical Evidence of Record (MER) Src.: HELEN HASKELL HOBBS Tmt. Dt.: Unknown - Unknown (10 pages)'...
standenman Data Science 3 498 Feb-19-2024, 11:23 PM
    Thread: Transform a list
Post: Transform a list

I have a list of the bookmarks in pdf that I wish to transform. The list prints out in the form: [[2, 'Medical Evidence of Record (MER) Src.: HELEN HASKELL HOBBS Tmt. Dt.: Unknown - Unknown (10 ...
standenman Data Science 3 498 Feb-19-2024, 07:53 PM
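A minimal sketch of one way such a list could be transformed, assuming each bookmark entry starts with a nesting level followed by a title string. The 'Title' key comes from the follow-up post; the 'Level' key, the function name, and the sample titles are hypothetical.

```python
def bookmarks_to_dicts(bookmarks):
    """Turn [level, title, ...] bookmark entries into dictionaries.

    Assumes entry[0] is the nesting level and entry[1] is the title;
    any further items in an entry are ignored.
    """
    return [{"Level": entry[0], "Title": entry[1]} for entry in bookmarks]


# Hypothetical sample data, not the bookmarks from the post.
sample = [[1, "Example section"], [2, "Example record (10 pages)"]]
records = bookmarks_to_dicts(sample)
for record in records:
    print(record)
```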
    Thread: Lost Modules
Post: RE: Lost Modules

(Jun-21-2023, 09:58 PM)snippsat Wrote: Read my post here again. You are now not using a virtual environment, but Python 3.11.4 64-bit from the Microsoft Store. When it uses a virtual environment it will sho...
standenman General Coding Help 2 740 Jun-22-2023, 12:18 PM
    Thread: Lost Modules
Post: Lost Modules

I cannot access the modules I have pip installed in my virtual environment. I have checked the interpreter and it seems to be OK. In the example, I cannot import pyPDF2 even though pip list shows it is ...
standenman General Coding Help 2 740 Jun-21-2023, 09:44 PM
    Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value

Here's my output: Error:Page 1 unknown widths : [0, IndirectObject(3121, 0, 2905784995472)] unknown widths : [0, IndirectObject(3115, 0, 2905784995472)] unknown widths : [0, IndirectObject(3110, 0, 2...
standenman Data Science 11 2,203 Jun-13-2023, 09:37 PM
    Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value

OK. I will try it. (Jun-13-2023, 07:14 PM)deanhystad Wrote: What "gives you this stuff"? Those are not error messages. Is your program printing something, or is this a message from PyPDF2? Whe...
standenman Data Science 11 2,203 Jun-13-2023, 07:16 PM
    Thread: Development Environment Problems
Post: Development Environment Problems

I am having so many problems with my Python development that I am about to pull my hair out. I have created a virtual environment. I am accessing it via VS Code. All of a sudden stuff stops working - s...
standenman General Coding Help 1 534 Jun-13-2023, 07:15 PM
    Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value

Gives me this stuff: Error:[0, IndirectObject(3121, 0, 2465755860368)] unknown widths : [0, IndirectObject(3115, 0, 2465755860368)] unknown widths : [0, IndirectObject(3110, 0, 2465755860368)] unknow...
standenman Data Science 11 2,203 Jun-13-2023, 07:03 PM
    Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value

Interesting! Thanks so much for your help and feedback. I just found that the first set of code you gave me just to see if I am getting dates fails on one pdf, but works on another, leading me to qu...
standenman Data Science 11 2,203 Jun-13-2023, 06:00 PM
    Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value

OK. Thanks very much for your help. So after eliminating the "/" in the file name, yes, the code runs, but it makes only one new file. But in this target pdf we have 4 or 5 office visits - changes in the value of "Visi...
standenman Data Science 11 2,203 Jun-13-2023, 02:41 PM
    Thread: Splitt PDF at regex value
Post: Splitt PDF at regex value

I am trying to create code that will split a pdf into multiple files based upon a regex value in the pdf text. Specifically, I want to split this pdf into discrete PDFs that represent a patient...
standenman Data Science 11 2,203 Jun-13-2023, 12:39 PM
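The grouping step behind this kind of split can be sketched without any PDF library, assuming the text of each page has already been extracted into a list of strings. The date pattern, function names, and sample page texts are hypothetical; the actual marker in the poster's PDF is not shown in the preview. Each resulting group of page indices could then be written out as one file with whatever PDF library is in use.

```python
import re

# Hypothetical pattern: a date such as 06/13/2023. The real marker is an assumption.
DATE_RE = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b")


def group_pages_by_match(page_texts, pattern=DATE_RE):
    """Group consecutive page indices that share the same first regex match.

    A page with no match inherits the value of the page before it, so
    continuation pages stay with the visit they belong to.
    """
    groups = []
    current = None
    for index, text in enumerate(page_texts):
        match = pattern.search(text)
        value = match.group(1) if match else current
        if groups and value == current:
            groups[-1][1].append(index)
        else:
            groups.append((value, [index]))
            current = value
    return groups


def safe_name(value):
    """Replace "/" so the matched value can be used in a file name."""
    return value.replace("/", "-")


# Hypothetical page texts standing in for extracted PDF text.
pages = ["Seen on 01/02/2023", "continued notes", "Seen on 01/02/2023", "Seen on 03/04/2023"]
groups = group_pages_by_match(pages)
for value, indices in groups:
    print(safe_name(value), indices)
```

Starting a new group only when the matched value changes is what yields one output file per visit rather than a single file for the whole document.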
    Thread: Langchain
Post: Langchain

I am trying to use langchain to query a pdf document with chatgpt. import os import openai from langchain.document_loaders import PyPDFLoader from langchain.text_splitter import RecursiveCharacterTex...
standenman Data Science 2 1,602 Jun-07-2023, 06:02 PM
    Thread: Data structure question
Post: Data structure question

I am creating an app that will take a pdf that represents a set of medical treatment records for a given patient. I want to submit the text to processing in medspacy. The desired output is to presen...
standenman Data Science 1 654 Jun-02-2023, 09:54 PM
    Thread: Index out of range error
Post: Index out of range error

I am trying to simply load a pdf doc and use langchain to process so I could query it with ChatGPT. I cannot get past loading the pdf doc. Cannot figure out what I am doing wrong here. Tried this co...
standenman Data Science 0 1,108 May-22-2023, 10:35 PM
    Thread: Cannot import Langchain and openai.
Post: RE: Cannot import Langchain and openai.

Thanks snippsat - that worked! Thanks for your help
standenman Data Science 2 5,112 May-22-2023, 03:00 PM
    Thread: Cannot import Langchain and openai.
Post: Cannot import Langchain and openai.

I have a python virtual environment set up. I have installed langchain and openai. The cmd command "(MedSpacyVenv) C:\Users\stand\MedSpacyVenv>pip freeze grep" yields: openai==0.27.7, openapi-sc...
standenman Data Science 2 5,112 May-21-2023, 09:39 PM
