Python Forum
BeautifulSoup - Table
#1
Hi,

I ran the following code and got the error message: "IndexError: list index out of range"

import csv
from urllib.request import urlopen 
from bs4 import BeautifulSoup

html = urlopen("http://en.wikipedia.org/wiki/Comparison_of_text_editors") 
bsObj = BeautifulSoup(html) 
table = bsObj.findAll("table",{"class":"wikitable"})[0]
So I reran the code, changing only the last line to:

table = bsObj.findAll("table")
However, when I inspect the object table I get an empty list, [], and len(table) returns 0. Shouldn't I get a table, or the contents of the table?
#2
hello,

I tried

table = bsObj.findAll("table",{"class":"wikitable"})
and it worked (Python 3.4, Windows 7 Pro).
#3
I can confirm that your code works without modification on 3.5.2. What version of Python are you using? Also, try printing out the contents of the HTML before you pass it to BS, in case there was a network issue and the document is simply empty.
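A minimal sketch of that check, using only the standard library. The helper name summarize_html and the two sample byte strings are made up for illustration; in the real script you would pass it the bytes returned by urlopen(...).read().

```python
def summarize_html(raw):
    """Return a few quick facts about downloaded bytes before parsing them."""
    text = raw.decode("utf-8", errors="replace")
    return {
        "bytes": len(raw),                          # 0 means nothing was downloaded
        "has_table_tag": "<table" in text.lower(),  # crude check for any table markup
        "preview": text[:80],                       # first characters, for eyeballing
    }


# Hypothetical stand-ins for a normal response and an empty one.
good = b'<html><body><table class="wikitable"><tr><td>x</td></tr></table></body></html>'
empty = b""

print(summarize_html(good))
print(summarize_html(empty))
```

If "bytes" is 0 or "has_table_tag" is False, the empty list from findAll most likely comes from the download rather than from BeautifulSoup. Note that read() consumes the response, so pass those same bytes on to BeautifulSoup instead of the response object.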
#4
Hi,

I returned to my computer the next day, ran the same code, and it worked. I never restarted my computer, and I can confirm that there was no typo or syntax error when I ran the code the first time. I use Python 3.5.1.

Nevertheless, I also have a problem with the following code. I wonder if it is due to memory and/or if both issues are related.

import csv
csvFile = open("C:/Users/AppData/Local/Programs/Python/Python35-32/myScript/editors.csv", 'wt')
writer = csv.writer(csvFile)
When I ran the above code line by line in the Python Shell, it executed perfectly. However, when I saved it as a .py file and tried to run it, I got the following error message:

Quote:Warning (from warnings module):
  File "C:\Users\AppData\Local\Programs\Python\Python35-32\lib\site-packages\bs4\__init__.py", line 166
    markup_type=markup_type))
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python35-32\myScript\csv2.py", line 1, in <module>
    import csv
  File "C:\Users\AppData\Local\Programs\Python\Python35-32\myScript\csv.py", line 11, in <module>
    writer = csv.writer(csvFile)
AttributeError: module 'csv' has no attribute 'writer'

Nowhere in the code was I calling the BeautifulSoup library. Nevertheless, it was part of the error message.
#5
Quote:Nowhere in the code was I calling the BeautifulSoup library. Nevertheless, it was part of the error message.

I can't think of a common situation that would cause this.
Are you sure it's not in your source file? Because your OP has BS in it. Are you importing this source file? What IDE are you running, and/or how are you running the source files?
Quote:When I ran the above code line by line in the Python Shell, it executed perfectly. However, when I saved it as a .py file and tried to run it, I got the following error message:
This is a big difference. An interpreter that you open manually can be completely different from an interpreter run from an IDE, for example. IDEs might do something automatically, and it might even be a different interpreter.
Quote:File "C:\Users\AppData\Local\Programs\Python\Python35-32\myScript\csv.py", line 11, in <module>
Based on this line, it looks like you named your file the same as the module, which is a big no-no. The import csv in csv2.py is loading your own csv.py from the same folder instead of the standard library module, which is why csv.writer is missing. If that csv.py holds your earlier scraping code, that would also explain the BeautifulSoup warning.
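One way to check for this kind of name clash is to ask Python which file it would load for a given module name. The sketch below uses importlib.util.find_spec from the standard library; the helper names module_origin and looks_shadowed, and the paths passed to looks_shadowed, are hypothetical and only for illustration.

```python
import importlib.util
import os


def module_origin(name):
    """Return the path of the file Python would load for `import name`, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


def looks_shadowed(origin, script_dir):
    """Return True if the module file sits in the script's own folder."""
    module_dir = os.path.normpath(os.path.dirname(origin))
    return module_dir == os.path.normpath(script_dir)


# Where does "import csv" come from on this machine?
print(module_origin("csv"))

# Hypothetical paths: a local csv.py next to the script vs. a library copy.
print(looks_shadowed("/home/user/myScript/csv.py", "/home/user/myScript"))
print(looks_shadowed("/usr/lib/python3/csv.py", "/home/user/myScript"))
```

If the printed path for csv points into your own script folder rather than the Python installation's lib directory, a local file is hiding the real module.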
#6
(Oct-12-2016, 02:38 AM)metulburr Wrote: Are you sure it's not in your source file? Because your OP has BS in it.

This is a big difference. An interpreter that you open manually can be completely different from an interpreter run from an IDE, for example. IDEs might do something automatically, and it might even be a different interpreter.
Hi,

It works now.
After restarting my computer, I deleted csv.py and saved the script simply as test.py. I ran the code again and it worked.

Anyhow, what does OP mean?

Also, what do you mean by "Interpreter that you open manually is completely different from an interpreter run from an IDE"?

I downloaded Python from https://www.python.org/downloads/windows/
I normally bring up the Python 3.5.1 Shell, which I think is the Python interpreter.

I run each piece of code in the Shell, then copy and paste it into a New File that I open from the Shell and save as a .py file.

I am self-taught in Python, so there are gaps in my knowledge. I hope to fill those gaps.

Thank you!
#7
(Oct-21-2016, 01:19 AM)tkj80 Wrote: I deleted the csv.py and renamed it simply as test.py
Yeah, don't name your files the same as something you are going to import. Python searches the script's own folder first, so a local csv.py hides the standard library csv module.

(Oct-21-2016, 01:19 AM)tkj80 Wrote: Anyhow, what does OP mean?
Original post: the first post of the thread.

(Oct-21-2016, 01:19 AM)tkj80 Wrote: Also, what do you mean by "Interpreter that you open manually is completely different from an interpreter run from an IDE"?
The interpreter that you use on a command line/terminal is not necessarily the same one that is linked to whatever IDE you are using. Many people have multiple Python versions installed.
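A quick way to see which interpreter is actually running a script is to print sys.executable and the version. This is only a sketch; the helper name interpreter_info is made up for illustration. Run it once from the Shell and once from the saved .py file, then compare the output.

```python
import sys


def interpreter_info():
    """Return the path and version of the interpreter running this code."""
    return {
        "executable": sys.executable,          # full path to the running Python
        "version": sys.version.split()[0],     # version number only
    }


print(interpreter_info())
```

If the two runs report different paths or versions, the Shell and the script are using different Python installations.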