Jan-17-2022, 09:16 PM
Part 1 accomplished: prepared the file list of my 1611 scrapes (folders and files) for parsing the HTML with Python and then sending the payload to MariaDB (Part 2)
Source/Tutorials:
https://unix.stackexchange.com/questions/47858/how-can-i-search-a-wild-card-name-in-all-subfolders
IRC Network - Libera.Chat - #linux:
- Dan39: Keyed me into the find command with sample syntax.
- Danilo82: Keyed me into an alternative to the find command called "fzf", which I haven't explored yet.
My working directory for the WGET scrape holds the target folder "www.kingjamesbibleonline.org", which contains over 160k files and folders.
I executed the following command to create "1611_3.txt", which now lists all the files and folders with "1611" in their names.
/WGET-11.02.2021.www.kingjamesbibleonline.org$ find www.kingjamesbibleonline.org/ -name '*1611*' > 1611_3.txt
Now here is a sample output of the "1611_3.txt" file generated:
www.kingjamesbibleonline.org/Luke-Chapter-24_Original-1611-KJV
www.kingjamesbibleonline.org/Iohn_13_1611
www.kingjamesbibleonline.org/The-Epistle-to-the-Romanes_12_1611
www.kingjamesbibleonline.org/Reuelation_21_1611
www.kingjamesbibleonline.org/Prouerbs_22_1611
www.kingjamesbibleonline.org/Ecclesiastes_3_1611
www.kingjamesbibleonline.org/1-Corinthians_13_1611
www.kingjamesbibleonline.org/Psalmes_16_1611
www.kingjamesbibleonline.org/Ephesians_5_1611
www.kingjamesbibleonline.org/Discussion-Thread-101611
www.kingjamesbibleonline.org/John-Chapter-15_Original-1611-KJV
www.kingjamesbibleonline.org/Discussion-Thread-161149
www.kingjamesbibleonline.org/Revelation-Chapter-22_Original-1611-KJV
www.kingjamesbibleonline.org/2-Samuel-Chapter-22_Original-1611-KJV
www.kingjamesbibleonline.org/1-Chronicles-Chapter-16_Original-1611-KJV
www.kingjamesbibleonline.org/Job-Chapter-19_Original-1611-KJV
www.kingjamesbibleonline.org/Job-Chapter-22_Original-1611-KJV
www.kingjamesbibleonline.org/Psalms-Chapter-23_Original-1611-KJV
www.kingjamesbibleonline.org/Romans-Chapter-12_Original-1611-KJV
www.kingjamesbibleonline.org/Revelation-Chapter-21_Original-1611-KJV
www.kingjamesbibleonline.org/Galatians-Chapter-6_Original-1611-KJV
www.kingjamesbibleonline.org/Isaiah-Chapter-26_Original-1611-KJV
www.kingjamesbibleonline.org/Colossians-Chapter-3_Original-1611-KJV
www.kingjamesbibleonline.org/Lamentations-Chapter-3_Original-1611-KJV
www.kingjamesbibleonline.org/2-Corinthians-Chapter-3_Original-1611-KJV
Each folder has 1 "index.html" inside for parsing.
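The '*1611*' pattern also matches entries such as "Discussion-Thread-101611", which only happen to contain those digits. Below is a minimal sketch of filtering the list in Python. It assumes the wanted pages end in "_1611" or "_Original-1611-KJV", a rule guessed from the sample listing rather than confirmed. The names is_1611_page and filter_1611 are hypothetical helpers.

```python
from pathlib import PurePosixPath

# Assumption: real 1611 pages end in one of these suffixes (guessed from the sample).
WANTED_SUFFIXES = ("_1611", "_Original-1611-KJV")


def is_1611_page(path_line: str) -> bool:
    """Return True if the last path component ends with a wanted suffix."""
    name = PurePosixPath(path_line.strip()).name
    return name.endswith(WANTED_SUFFIXES)


def filter_1611(lines):
    """Keep only the non-empty lines that look like 1611 pages."""
    return [line.strip() for line in lines if line.strip() and is_1611_page(line)]


# Three entries taken from the sample listing.
sample = [
    "www.kingjamesbibleonline.org/Iohn_13_1611",
    "www.kingjamesbibleonline.org/Discussion-Thread-101611",
    "www.kingjamesbibleonline.org/Job-Chapter-19_Original-1611-KJV",
]
print(filter_1611(sample))
```

With the three sample entries, the discussion thread is dropped and the two chapter pages are kept. Because filter_1611 accepts any iterable of lines, an open file handle for "1611_3.txt" can be passed in the same way.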
What is the best way to start a Python script that loops through each directory listed in "1611_3.txt" and runs bs4 against the "index.html" inside it?
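One possible starting point, offered as a sketch rather than a definitive answer: read "1611_3.txt" line by line, build the path to each "index.html" with pathlib, and hand the file to BeautifulSoup. It assumes the script runs from the directory that holds "1611_3.txt", that bs4 is installed, and that the page title is a useful first thing to extract. The function names index_files, page_title, and main are hypothetical.

```python
from pathlib import Path


def index_files(list_file, base_dir="."):
    """Yield the index.html path for each directory named in list_file."""
    with open(list_file, encoding="utf-8") as fh:
        for line in fh:
            entry = line.strip()
            if not entry:
                continue
            candidate = Path(base_dir) / entry / "index.html"
            # find lists plain files too, so skip entries without an index.html.
            if candidate.is_file():
                yield candidate


def page_title(html_path):
    """Parse one index.html with bs4 and return its <title> text, or None."""
    # Imported here so index_files() stays usable where bs4 is not installed.
    from bs4 import BeautifulSoup

    with open(html_path, encoding="utf-8", errors="replace") as fh:
        soup = BeautifulSoup(fh, "html.parser")
    return soup.title.get_text(strip=True) if soup.title else None


def main(list_file="1611_3.txt", base_dir="."):
    """Print each index.html path alongside its page title."""
    for html_path in index_files(list_file, base_dir):
        print(html_path, "->", page_title(html_path))
```

Calling main() from the WGET working directory walks the whole list. Replacing page_title with a function that pulls the verse text would be the next step before sending anything to MariaDB.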
Thank you everyone for this forum!
Best Regards,
Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)
“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)
#LetHISPeopleGo