I don't know the specifics of this task, but it looks like you need to scrape an HTML table. In that case I would skip the low-level coding and let pandas handle it. Something along these lines:
>>> import pandas as pd
>>> df = pd.read_html("https://en.wikipedia.org/wiki/Comparison_of_programming_languages")
>>> df[1].to_csv('comparison_table.csv')
This code grabs the second table (135 rows and 11 columns) on the webpage https://en.wikipedia.org/wiki/Comparison..._languages and writes it to comparison_table.csv in the present working directory. pd.read_html returns a list of DataFrames, one per table it finds, so df[1] is the second table.
Output:
,Language,Intended use,Imperative,Object-oriented,Functional,Procedural,Generic,Reflective,Event-driven,Other paradigm(s),Standardized?
0,1C:Enterprise,"Application, RAD, business, general, web, mobile",Yes,,Yes,Yes,Yes,Yes,Yes,"Object-based, Prototype-based programming",No
1,ActionScript 3.0,"Application, client-side, web",Yes,Yes,Yes,,,,Yes,,"1996, ECMA"
2,Ada,"Application, embedded, realtime, system",Yes,Yes[2],,Yes[3],Yes[4],,,"concurrent,[5] distributed,[6]","1983, 2005, 2012, ANSI, ISO, GOST 27831-88[7]"
3,Aldor,"Highly domain-specific, symbolic computing",Yes,Yes,Yes,,,,,,No
4,ALGOL 58,Application,Yes,,,,,,,,No
/.../
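The empty first header field and the leading 0, 1, 2, ... in the output above are the DataFrame index, which to_csv writes by default. If that column is not wanted, passing index=False should drop it. A minimal sketch, assuming a small hand-built frame (the name table and its three columns are only a stand-in for df[1], with two rows copied from the output above):

```python
import pandas as pd

# Hypothetical stand-in for df[1]: two rows and three columns taken from the output above.
table = pd.DataFrame(
    {
        "Language": ["Aldor", "ALGOL 58"],
        "Intended use": ["Highly domain-specific, symbolic computing", "Application"],
        "Imperative": ["Yes", "Yes"],
    }
)

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = table.to_csv()                # default: the index is the first column
without_index = table.to_csv(index=False)  # index column left out

print(with_index.splitlines()[0])     # header row including the unnamed index field
print(without_index.splitlines()[0])  # header row without it
```

The same keyword works when writing to a file, e.g. df[1].to_csv('comparison_table.csv', index=False).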
For further information, see the forum thread pandas library tricks.