Oct-03-2023, 06:38 AM
Hi,
I could use some advice! Prayer cards are an endless source of challenges.
In our centre, over the years, zillions of documents have been filed on a server,
according to a certain (dir/subdir/etc...) system that made sense at the time.
Today, new OCR techniques have led to new insights and a different approach.
I have an SQL database with the OCRed content
of a certain % of the documents on the server, and growing.
To keep it simple:
the tables in the database have a field with the document filename, but not the server path to it.
(Of course I have a temp solution for current users, but that is not sustainable, given the volumes involved)
My plan:
I can write a program (style "for root, dirs, files in os.walk(...)") that inserts the path to every document into an indexed table.
That table is then used as a GPS to find the doc and show it to the user.
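A minimal sketch of what I have in mind, not tested at real volumes. It assumes SQLite as a stand-in for the actual SQL database, and the table name doc_paths, its columns, and the two function names are hypothetical, made up for illustration. The same filename may exist in several directories, so the lookup returns a list.

```python
import os
import sqlite3


def build_path_index(root_dir, conn):
    """Walk root_dir and store (filename, full path) for every file found."""
    # Hypothetical schema: one row per file, path is unique so reruns are safe.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_paths ("
        "filename TEXT NOT NULL, path TEXT NOT NULL UNIQUE)"
    )
    # The index on filename is what makes the later lookup fast.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_doc_paths_filename "
        "ON doc_paths (filename)"
    )
    # os.walk yields (root, dirs, files) for every directory under root_dir.
    rows = (
        (name, os.path.join(root, name))
        for root, _dirs, files in os.walk(root_dir)
        for name in files
    )
    # INSERT OR IGNORE skips paths that are already in the table.
    conn.executemany(
        "INSERT OR IGNORE INTO doc_paths (filename, path) VALUES (?, ?)", rows
    )
    conn.commit()


def find_paths(conn, filename):
    """Return every stored path for a filename, sorted by path."""
    cur = conn.execute(
        "SELECT path FROM doc_paths WHERE filename = ? ORDER BY path",
        (filename,),
    )
    return [row[0] for row in cur]
```

Usage would be something like build_path_index(server_root, conn) once (and again whenever documents are added), then find_paths(conn, filename_from_ocr_table) when a user asks for a document.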
The question: is this a good plan, or does Python offer other (better) possibilities?
I don't need a document management system: I don't care about fields like "author, size, issue date, number of pages, etc.".
I just want to know where it is on the server.
Any insights you'd like to share?
Thanks
Paul
It is more important to do the right thing, than to do the thing right. (P. Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.