Python Forum
What Python skills for a fraud detective?
#4
(Mar-06-2022, 07:28 AM)MaartenRo Wrote: Can I also use the os module or pathlib for searching for keywords in files with text, like Word, Excel or PDF? Or can I use another module for this?
You will need additional modules, used alone or in combination with the tools you mention.
These are binary files, so you need modules that can convert them into text.
Example for .pdf in this Thread
import pdfplumber

pdf_file = "sample.pdf"
search_word = 'text'
with pdfplumber.open(pdf_file) as pdf:
    pages = pdf.pages
    # Number pages from 1 so they match what a PDF viewer shows
    for page_nr, pg in enumerate(pages, 1):
        # A page without a text layer (e.g. a scan) may yield no text
        content = pg.extract_text() or ''
        if search_word in content:
            print(f'<{search_word}> found at page number <{page_nr}> '
                  f'at index <{content.index(search_word)}>')
Output:
<text> found at page number <1> at index <119>
<text> found at page number <2> at index <56>
Regex is also a tool you should look into more; you saw me use it in my last post.
Regex is very powerful for all kinds of things, e.g. finding an exact match of a word, or part of it, in a file.
grep does similar things from the command line.
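To make the difference between an exact word match and a partial match concrete, here is a minimal sketch using only the standard re module. The function name find_whole_word and the sample sentence are made up for illustration; they are not from the earlier post.

```python
import re

def find_whole_word(text, word):
    """Return the start index of every whole-word match of `word` in `text`."""
    # \b is a word boundary, so "text" will not match inside "context".
    # re.escape() keeps characters like "." in the keyword literal.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [m.start() for m in pattern.finditer(text)]

sample = "The context matters, but this text mentions Text twice."

# Whole-word, case-insensitive search
whole_word_hits = find_whole_word(sample, "text")
# Plain substring search, which also hits "context"
substring_hits = [m.start() for m in re.finditer("text", sample)]

print(whole_word_hits)
print(substring_hits)
```

The same pattern can be applied to the content string of each PDF page instead of a plain `in` test when partial hits are not wanted.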

For Word, use python-docx.

For Excel I use Pandas, which makes it easy to read (pd.read_excel()) and write (df.to_excel()).
Once the file is read in, you also get a DataFrame that looks similar to the Excel sheet.
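As a rough sketch of how a keyword search could look once a sheet is in a DataFrame: the helper find_keyword and the column names below are hypothetical, and the small DataFrame built in code stands in for what pd.read_excel() would return from a real file.

```python
import pandas as pd

def find_keyword(df, keyword):
    """Return (row label, column name) for every cell containing `keyword`."""
    hits = []
    for column in df.columns:
        # Cast to str so numeric and date cells can be searched too
        cells = df[column].astype(str)
        # regex=False treats the keyword as plain text; skip empty cells
        mask = cells.str.contains(keyword, case=False, regex=False) & df[column].notna()
        hits.extend((row, column) for row in cells[mask].index)
    return hits

# Hypothetical stand-in for: df = pd.read_excel("some_file.xlsx")
df = pd.DataFrame({
    "supplier": ["Acme Ltd", "Globex", "Acme Holding"],
    "note": ["paid", "refund to acme", "pending"],
})

hits = find_keyword(df, "acme")
print(hits)
```

Hits are reported column by column, so the order follows the columns of the sheet rather than the rows.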

Other modules for Excel are e.g. openpyxl and pyexcel.


Messages In This Thread
RE: What Python skills for a fraud detective? - by snippsat - Mar-06-2022, 10:49 AM