Search Results
|
Post |
Author |
Forum |
Replies |
Views |
Posted
|
|
|
Thread: Address Extraction
Post: RE: Address Extraction
Sorry, I am a bit confused. Yes, I can always rely on the text "PROVIDER:" and can always get its coordinates, but the text I need, the provider, will be at different coordinates and even in a diff... |
|
standenman |
Data Science |
7 |
490 |
Apr-10-2024, 04:00 PM |
|
|
Thread: Address Extraction
Post: RE: Address Extraction
I am just using Adobe Acrobat Pro to OCR the file.
(Apr-07-2024, 09:36 AM)DPaul Wrote: If "PROVIDER" is not always in the exact same place, bbox won't help.
What module are you using to OCR the pdf... |
|
standenman |
Data Science |
7 |
490 |
Apr-07-2024, 12:43 PM |
|
|
Thread: Address Extraction
Post: Address Extraction
I am trying to map out a strategy for extracting text data from PDF files. The files are semi-structured and would be created by the Social Security Administration using iText.
I have attached an ... |
|
standenman |
Data Science |
7 |
490 |
Apr-06-2024, 03:47 PM |
|
|
Thread: Strategy for data extraction
Post: Strategy for data extraction
I am trying to come up with a strategy for extracting key data from generic letters for different clients. This is the format of the letter I want to parse. It should look the same for every client,... |
|
standenman |
Data Science |
1 |
532 |
Feb-22-2024, 10:52 PM |
|
|
Thread: Transform a list
Post: RE: Transform a list
Yes thank you I have done something like that. I now have a python dictionary result:
{'Title': 'Medical Evidence of Record (MER) Src.: HELEN HASKELL HOBBS Tmt. Dt.: Unknown - Unknown (10 pages)'... |
|
standenman |
Data Science |
3 |
498 |
Feb-19-2024, 11:23 PM |
|
|
Thread: Transform a list
Post: Transform a list
I have a list of the bookmarks in a PDF that I wish to transform. The list prints out in the form:
[[2, 'Medical Evidence of Record (MER) Src.: HELEN HASKELL HOBBS Tmt. Dt.: Unknown - Unknown (10 ... |
|
standenman |
Data Science |
3 |
498 |
Feb-19-2024, 07:53 PM |
|
|
Thread: Lost Modules
Post: RE: Lost Modules
(Jun-21-2023, 09:58 PM)snippsat Wrote: Read my post here again.
You are now not using a virtual environment, but Python 3.11.4 64-bit from the Microsoft Store.
When it uses a virtual environment it will sho... |
|
standenman |
General Coding Help |
2 |
740 |
Jun-22-2023, 12:18 PM |
|
|
Thread: Lost Modules
Post: Lost Modules
I cannot access the modules I have pip installed in my virtual environment. I have checked the interpreter and it seems to be ok. In the example, I cannot import pyPDF2 though a pip list shows it is ... |
|
standenman |
General Coding Help |
2 |
740 |
Jun-21-2023, 09:44 PM |
|
|
Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value
Here's my output:
Error:Page 1
unknown widths :
[0, IndirectObject(3121, 0, 2905784995472)]
unknown widths :
[0, IndirectObject(3115, 0, 2905784995472)]
unknown widths :
[0, IndirectObject(3110, 0, 2... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 09:37 PM |
|
|
Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value
OK. I will try it.
(Jun-13-2023, 07:14 PM)deanhystad Wrote: What "gives you this stuff"?
Those are not error messages. Is your program printing something, or is this a message from PyPDF2? Whe... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 07:16 PM |
|
|
Thread: Development Environment Problems
Post: Development Environment Problems
I am having so many problems with my Python development that I am about to pull my hair out. I have created a virtual environment. I am accessing it via VS Code. All of a sudden stuff stops working - s... |
|
standenman |
General Coding Help |
1 |
534 |
Jun-13-2023, 07:15 PM |
|
|
Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value
Gives me this stuff:
Error:[0, IndirectObject(3121, 0, 2465755860368)]
unknown widths :
[0, IndirectObject(3115, 0, 2465755860368)]
unknown widths :
[0, IndirectObject(3110, 0, 2465755860368)]
unknow... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 07:03 PM |
|
|
Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value
Interesting! Thanks so much for your help and feedback. I just found that the first set of code you gave me just to see if I am getting dates fails on one pdf, but works on another, leading me to qu... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 06:00 PM |
|
|
Thread: Splitt PDF at regex value
Post: RE: Splitt PDF at regex value
OK. Thanks very much for your help. So after eliminating the "/" in the file name, yes, the code runs but makes only one new file. But in this target PDF we have 4 or 5 office visits - changes in the value of "Visi... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 02:41 PM |
|
|
Thread: Splitt PDF at regex value
Post: Splitt PDF at regex value
I am trying to create code that will split a PDF into multiple files based upon a regex value in the PDF text. Specifically, I want to split this PDF into discrete PDFs that represent a patient... |
|
standenman |
Data Science |
11 |
2,203 |
Jun-13-2023, 12:39 PM |
|
|
Thread: Langchain
Post: Langchain
I am trying to use langchain to query a pdf document with chatgpt.
import os
import openai
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTex... |
|
standenman |
Data Science |
2 |
1,602 |
Jun-07-2023, 06:02 PM |
|
|
Thread: Data structure question
Post: Data structure question
I am creating an app that will take a pdf that represents a set of medical treatment records for a given patient. I want to submit the text to processing in medspacy. The desired output is to presen... |
|
standenman |
Data Science |
1 |
654 |
Jun-02-2023, 09:54 PM |
|
|
Thread: Index out of range error
Post: Index out of range error
I am trying to simply load a pdf doc and use langchain to process so I could query it with ChatGPT. I cannot get past loading the pdf doc. Cannot figure out what I am doing wrong here. Tried this co... |
|
standenman |
Data Science |
0 |
1,108 |
May-22-2023, 10:35 PM |
|
|
Thread: Cannot import Langchain and openai.
Post: RE: Cannot import Langchain and openai.
Thanks snippsat - that worked! Thanks for your help |
|
standenman |
Data Science |
2 |
5,112 |
May-22-2023, 03:00 PM |
|
|
Thread: Cannot import Langchain and openai.
Post: Cannot import Langchain and openai.
I have a python virtual environment set up. I have installed langchain and openai. The cmd command "(MedSpacyVenv) C:\Users\stand\MedSpacyVenv>pip freeze grep" yields: openai==0.27.7, openapi-sc... |
|
standenman |
Data Science |
2 |
5,112 |
May-21-2023, 09:39 PM |