Python Forum
Scraping Columns with Pandas (Column Entries w/ more than 1 word writes two columns)
#7
(Jan-13-2022, 04:56 AM)BrandonKastning Wrote: Thank you again for this forum! How do I determine the tables[#]? Is it a guessing game or is there an attribute or property within the browser code that could aid me in finding the correct tables[#]?
A web site can have many tables, so you have to look at the page (count the tables) or test indexes like tables[0], tables[1], tables[6]... and see which one gives the wanted result.
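To take some of the guessing out of it, a small loop can print the index, shape and column names of every table that read_html finds. This is only a sketch: the HTML string and the summarize_tables helper are made up for illustration, so the example runs without a network connection. With a real site you would pass the URL instead of the StringIO object.

```python
from io import StringIO

import pandas as pd

# Hypothetical inline page with two tables, standing in for a real URL.
HTML = """
<table>
  <thead><tr><th>Year</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>1972</td><td>C</td></tr>
    <tr><td>1991</td><td>Python</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>Editor</th><th>License</th><th>Written in</th></tr></thead>
  <tbody>
    <tr><td>Vim</td><td>Vim License</td><td>C</td></tr>
  </tbody>
</table>
"""


def summarize_tables(html):
    """Return (index, shape, column names) for every table in the HTML."""
    tables = pd.read_html(StringIO(html))  # always a list of DataFrames
    return [(i, t.shape, list(t.columns)) for i, t in enumerate(tables)]


summary = summarize_tables(HTML)
for index, shape, columns in summary:
    print(index, shape, columns)
```

The index printed next to the columns you recognise is the number to use in tables[#].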

There is a match parameter in pandas.read_html that takes a string or regex and keeps only the tables containing matching text.
Example: Timeline of programming languages. Let's say we want the table with Python in it; we can match on the name Guido van Rossum.
import pandas as pd

# read_html returns a list of DataFrames; match keeps only tables containing this text
df = pd.read_html('https://en.wikipedia.org/wiki/Timeline_of_programming_languages', match='Guido van Rossum')
df[0].head(13)
Output:
    Year  Name           Chief developer, company                           Predecessor(s)
 0  1990  Sather         Steve Omohundro                                    Eiffel
 1  1990  AMOS BASIC     François Lionet and Constantin Sotiropoulos        STOS BASIC
 2  1990  AMPL           Robert Fourer, David Gay and Brian Kernighan a...  NaN
 3  1990  Object Oberon  H Mössenböck, J Templ, R Griesemer                 Oberon
 4  1990  J              Kenneth E. Iverson, Roger Hui at Iverson Software  APL, FP
 5  1990  Haskell        NaN                                                Miranda
 6  1990  EuLisp         NaN                                                Common Lisp, Scheme
 7  1990  Z Shell (zsh)  Paul Falstad at Princeton University               ksh
 8  1991  GNU E          David J. DeWitt, Michael J. Carey                  C++
 9  1991  Oberon-2       Hanspeter Mössenböck, Wirth                        Object Oberon
10  1991  Oz             Gert Smolka and his students                       Prolog
11  1991  Q              Albert Gräf                                        NaN
12  1991  Python         Guido van Rossum                                   ABC, C
So with match, the wanted table will be df[0], as long as the matched text appears in only one table on the page.
Without match it would be table 9:
df[9].head(13)
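Here is a small offline sketch of how match behaves with a plain string, with a compiled regex, and when nothing matches. The HTML string is a made-up stand-in for a real page, and the names by_text, by_regex and no_match_error are only for illustration.

```python
import re
from io import StringIO

import pandas as pd

# Hypothetical inline page with two tables, standing in for a real URL.
HTML = """
<table>
  <thead><tr><th>Year</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>1972</td><td>C</td></tr>
    <tr><td>1991</td><td>Python</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>Editor</th><th>License</th></tr></thead>
  <tbody>
    <tr><td>Vim</td><td>Vim License</td></tr>
  </tbody>
</table>
"""

# A plain string: only tables containing the text "Python" are returned.
by_text = pd.read_html(StringIO(HTML), match="Python")
print(len(by_text), list(by_text[0].columns))

# A compiled regex works the same way.
by_regex = pd.read_html(StringIO(HTML), match=re.compile(r"Vim|Emacs"))
print(len(by_regex), list(by_regex[0].columns))

# If no table contains the text, read_html raises ValueError.
no_match_error = None
try:
    pd.read_html(StringIO(HTML), match="Guido van Rossum")
except ValueError as exc:
    no_match_error = exc
print("No match:", no_match_error)
```

Since the result is still a list, it can be worth checking its length when the matched text might appear in more than one table.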
Messages In This Thread
RE: Scraping Columns with Pandas (Column Entries w/ more than 1 word writes two columns) - by snippsat - Jan-13-2022, 02:46 PM