Jan-13-2022, 10:52 PM
(This post was last modified: Jan-13-2022, 10:52 PM by BrandonKastning. Edit Reason: forgot code + gratitude)
snippsat,
Thank you for sharing this code and the new knowledge. Wonderful example!
I have not applied your approach to my code yet, but I found a workaround that managed to pull the data in.
I disabled the index=False flag/parameter.
Then I had to do the following using LibreOffice Calc.
Once it was disabled, Python picked up the correct table with tables[0] (strange; I am still unsure what index=False really does at this time).
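For what it is worth, here is a minimal sketch of what index=False changes when a DataFrame is written out. It uses to_csv on two made-up sample rows so that it runs without the odf engine; the assumption is that to_excel treats its index parameter the same way (write the row labels or not).

```python
import pandas as pd

# Made-up sample rows, only for illustration.
df = pd.DataFrame({"City": ["Abbeville", "Adamsville"],
                   "County": ["Henry", "Jefferson"]})

# Default (index=True): the row labels 0, 1, ... become an extra first column.
with_index = df.to_csv()

# index=False: only the real columns are written.
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)
```

So index=False should only decide whether the row numbers are saved as an extra first column; it should not influence which table read_html returns.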
The generated .ods had an additional row of header names, and the two header rows differed from each other.
I tried removing the duplicate row, which had 3 column headings under a single header name. LibreOffice Calc gave me an error, so I decided to try copy and paste instead, and the following worked great for a CSV save in 7 steps.
The default settings in the Save As dialogs were used and worked fine!
Then I used step2.py as the payload after manually naming the column headers in LibreOffice Calc.
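The extra header row most likely comes from pandas reading a two-level table header as MultiIndex columns. As a possible alternative to renaming the headers by hand, the levels can be joined into single names before saving. A minimal sketch on a made-up frame (the column names and numbers are assumptions, not the real Wikipedia table, and flatten is a hypothetical helper):

```python
import pandas as pd

# Made-up two-level header: three sub-headings under "Population".
columns = pd.MultiIndex.from_tuples([
    ("Name", "Name"),
    ("Population", "2020"),
    ("Population", "2010"),
    ("Population", "Change"),
])
df = pd.DataFrame([["Sampletown", 1000, 1100, "-9.1%"]], columns=columns)


def flatten(col):
    """Join the header levels with '_' and skip repeated level names."""
    parts = []
    for level in col:
        level = str(level)
        if level not in parts:
            parts.append(level)
    return "_".join(parts)


# Replace the MultiIndex with one flat name per column.
df.columns = [flatten(col) for col in df.columns]
print(list(df.columns))
```

After this the frame has a single header row, so a later to_excel or to_csv call would write only one line of column names.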
step1.py code (to generate the .ods):
```python
import pandas as pd

url = "https://en.wikipedia.org/wiki/List_of_municipalities_in_Alabama"
tables = pd.read_html(url)  # one DataFrame per table found on the page
df = tables[0]
# df = df.drop('Map', axis=1)
# df.to_excel("AL_Alabama_Cities.ods", index=False, engine="odf")
df.to_excel("AL_Alabama_Cities.ods", engine="odf")  # index is written (the default)
```
step2.py code:
```python
import pandas as pd
import mysql.connector  # driver behind the mysql+mysqlconnector URL below
from sqlalchemy import create_engine

myd = pd.read_csv('AL_Alabama_Cities.CSV.csv')
engine = create_engine('mysql+mysqlconnector://brandon:[email protected]/Exodus_J3x_Dev_Bronson')
# Replace the table if it already exists; do not store the DataFrame index as a column.
myd.to_sql(name='AL_Cities_CSV', con=engine, if_exists='replace', index=False)
```
Thank you again, snippsat and everyone, for your time and expertise on this forum!
Best Regards,
Brandon Kastning
(Jan-13-2022, 02:46 PM)snippsat Wrote:
(Jan-13-2022, 04:56 AM)BrandonKastning Wrote: Thank you again for this forum! How do I determine the tables[#]? Is it a guessing game, or is there an attribute or property within the browser code that could aid me in finding the correct tables[#]?
A web site can have many tables, so you have to look at the site (count the tables) or test out tables[0], tables[1], tables[6], ... and see if you get the wanted result.
There is a match parameter in pandas.read_html that takes a string or regex to match text in the wanted table.
Example: Timeline of programming languages. Let's say we want the table that contains Python; we can match on the name Guido van Rossum.
```python
import pandas as pd

df = pd.read_html('https://en.wikipedia.org/wiki/Timeline_of_programming_languages', match='Guido van Rossum')
df[0].head(13)
```
Output:
```
    Year  Name           Chief developer, company                           Predecessor(s)
0   1990  Sather         Steve Omohundro                                    Eiffel
1   1990  AMOS BASIC     François Lionet and Constantin Sotiropoulos        STOS BASIC
2   1990  AMPL           Robert Fourer, David Gay and Brian Kernighan a...  NaN
3   1990  Object Oberon  H Mössenböck, J Templ, R Griesemer                 Oberon
4   1990  J              Kenneth E. Iverson, Roger Hui at Iverson Software  APL, FP
5   1990  Haskell        NaN                                                Miranda
6   1990  EuLisp         NaN                                                Common Lisp, Scheme
7   1990  Z Shell (zsh)  Paul Falstad at Princeton University               ksh
8   1991  GNU E          David J. DeWitt, Michael J. Carey                  C++
9   1991  Oberon-2       Hanspeter Mössenböck, Wirth                        Object Oberon
10  1991  Oz             Gert Smolka and his students                       Prolog
11  1991  Q              Albert Gräf                                        NaN
12  1991  Python         Guido van Rossum                                   ABC, C
```
So if there is a match, it will always be df[0].
Without match it would be table 9:
```python
df[9].head(13)
```
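To try both approaches without hitting Wikipedia, here is a small offline sketch. The HTML snippet is a made-up two-table page (an assumption for illustration), and it assumes an HTML parser that read_html can use (lxml, or bs4 with html5lib) is installed, as the code above already requires.

```python
import io
import pandas as pd

# Made-up page with two tables; only the second mentions Guido van Rossum.
html = """
<table>
  <tr><th>Year</th><th>Name</th></tr>
  <tr><td>1972</td><td>C</td></tr>
</table>
<table>
  <tr><th>Year</th><th>Name</th><th>Chief developer</th></tr>
  <tr><td>1991</td><td>Python</td><td>Guido van Rossum</td></tr>
</table>
"""

# Without match: every table comes back, so list them to find the right index.
all_tables = pd.read_html(io.StringIO(html))
for i, table in enumerate(all_tables):
    print(i, table.shape, list(table.columns))

# With match: only tables containing the text are returned, so the wanted one is [0].
matched = pd.read_html(io.StringIO(html), match="Guido van Rossum")
print(matched[0])
```

Printing the index, shape, and column names of each table is one way to avoid guessing the tables[#] number on a page with many tables.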
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)
“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)
#LetHISPeopleGo