Python Forum
web scraping to csv formatting problems
#3
I don't know the specifics of this task, but it looks like an HTML table needs to be scraped. In that case I would skip the low-level coding and let pandas handle it. Something along these lines:

>>> import pandas as pd
>>> # read_html returns a list of DataFrames, one per table found on the page
>>> tables = pd.read_html("https://en.wikipedia.org/wiki/Comparison_of_programming_languages")
>>> tables[1].to_csv('comparison_table.csv')
This code grabs the second table (135 rows and 11 columns) on the webpage https://en.wikipedia.org/wiki/Comparison..._languages and writes it to comparison_table.csv in the current working directory.

Output:
,Language,Intended use,Imperative,Object-oriented,Functional,Procedural,Generic,Reflective,Event-driven,Other paradigm(s),Standardized?
0,1C:Enterprise,"Application, RAD, business, general, web, mobile",Yes,,Yes,Yes,Yes,Yes,Yes,"Object-based, Prototype-based programming",No
1,ActionScript 3.0,"Application, client-side, web",Yes,Yes,Yes,,,,Yes,,"1996, ECMA"
2,Ada,"Application, embedded, realtime, system",Yes,Yes[2],,Yes[3],Yes[4],,,"concurrent,[5] distributed,[6]","1983, 2005, 2012, ANSI, ISO, GOST 27831-88[7]"
3,Aldor,"Highly domain-specific, symbolic computing",Yes,Yes,Yes,,,,,,No
4,ALGOL 58,Application,Yes,,,,,,,,No
/.../
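The output above has two things that may or may not be wanted in the final file: an unnamed leading index column and footnote markers such as Yes[2]. If they get in the way, something like the sketch below could tidy them up. The names strip_footnotes, table_to_csv and FOOTNOTE are my own, not part of pandas, and the regular expression assumes the markers are always digits in square brackets.

```python
import re

import pandas as pd

# Assumption: footnote markers look like "[2]" (digits in square brackets).
FOOTNOTE = re.compile(r"\[\d+\]")


def strip_footnotes(table: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with markers like "[2]" removed from every text cell."""
    def clean(value):
        # Only strings can carry markers; leave NaN and numbers untouched.
        return FOOTNOTE.sub("", value).strip() if isinstance(value, str) else value

    return table.apply(lambda column: column.map(clean))


def table_to_csv(url: str, position: int, path: str) -> None:
    """Read the table at `position` on the page and write it without the index."""
    tables = pd.read_html(url)  # list of DataFrames, one per table on the page
    strip_footnotes(tables[position]).to_csv(path, index=False)
```

Calling table_to_csv with the Wikipedia URL from the snippet above, position 1 and 'comparison_table.csv' should write the same table, minus the index column and the bracketed markers. Column headers are left as they are.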
For further information, see the forum thread pandas library tricks.


Messages In This Thread
RE: web scraping to csv formatting problems - by perfringo - Jul-03-2019, 09:47 AM
