Python Forum
NameError: name 'bsObj' is not defined
#1
I am not sure what this error means.
Here is the line of code it is talking about:

for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):

Here is the whole error:

Traceback (most recent call last):
  File "C:\Users\renny and kite\Desktop\web scraping the book\example_11\example_11\example_11.py", line 15, in <module>
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
NameError: name 'bsObj' is not defined
Press any key to continue . . .

How is the name not defined? Huh
#2
Please post the whole code, exactly as you are trying to run it.
The error is clear: bsObj is not defined at the time you try to use it on that line.
#3
We can't tell, because the rest of the code, where bsObj should have been defined, is not shown.
#4
(Oct-22-2016, 07:24 PM)Blue Dog Wrote: How is the name not defined?

That's what you would have to go back through your code and figure out. The error is telling you that when Python gets to that line of code, it has not seen the name bsObj before, at least in that scope. When I get that error, one of three things has happened: I mistyped the variable name when it was first mentioned, some if/elif logic skipped over the first mention of that name, or I didn't scope it correctly (it should be self.bsObj or something).
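
The three causes listed above can be reproduced in a few lines. This is only a sketch: the function names, the Scraper class, and the raises_name_error helper are made up for illustration and are not from the original script.

```python
def mistyped_name():
    bsobj = "soup"   # bound with a lowercase "o"
    return bsObj     # different spelling, so Python has never seen this name


def skipped_branch(flag):
    if flag:
        bsObj = "soup"   # only runs when flag is true
    return bsObj         # unbound when the branch above was skipped


class Scraper:
    def __init__(self):
        self.bsObj = "soup"   # stored on the instance

    def broken(self):
        return bsObj          # bare name: not a local and not a global

    def fixed(self):
        return self.bsObj     # scoped correctly through self


def raises_name_error(func, *args):
    """Return True if calling func(*args) raises NameError."""
    try:
        func(*args)
    except NameError:  # also catches UnboundLocalError, which is a subclass
        return True
    return False


print(raises_name_error(mistyped_name))
print(raises_name_error(skipped_branch, False))
print(raises_name_error(Scraper().broken))
```

In the skipped-branch case Python actually raises UnboundLocalError, which is a subclass of NameError, so the same except clause catches it.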
#5
Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
pages = set()
def getLinks(pageUrl):
 global pages
 html = urlopen("http://en.wikipedia.org"+pageUrl)
 try:
    print(bsObj.h1.get_text())
    print(bsObj.find(id ="mw-content-text").findAll("p")[0])
    print(bsObj.find(id="ca-edit").find("span").find("a").attrs['href'])
 except AttributeError:
   
    print("This page is missing something! No worries though!")
for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
   if 'href' in link.attrs:
       if link.attrs['href'] not in pages:
#We have encountered a new page
         newPage = link.attrs['href']
         print("----------------\n"+newPage)
         pages.add(newPage)
         getLinks(newPage)
getLinks("")
That's it. I found this on the web and changed some of it. I think findAll is defined here: print(bsObj.find(id ="mw-content-text").findAll("p")[0])

Thank you.
#6
Go back to the code you copied and find where bsObj was defined. You removed that definition when you changed the code.
#7
Fix the indentation; the code is a mess.
The first error occurs because no soup object is defined. The function needs this line once html has been assigned:
bsObj = BeautifulSoup(html, 'html.parser')
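
For reference, here is one way the whole script might look with that line restored and the indentation repaired. Treat it as a sketch rather than the book's exact listing: the findWikiLinks helper, the limit parameter (added so the recursion stops), and the extra IndexError in the except clause are assumptions made for this example. find_all is the current Beautiful Soup name for findAll.

```python
import re
from urllib.request import urlopen

from bs4 import BeautifulSoup

pages = set()  # hrefs that have already been visited


def findWikiLinks(bsObj):
    """Hypothetical helper: return every href in bsObj that starts with /wiki/."""
    anchors = bsObj.find_all("a", href=re.compile("^(/wiki/)"))
    return [a.attrs["href"] for a in anchors]


def getLinks(pageUrl, limit=5):
    """Print details of one page, then follow unseen links until `limit` pages are stored."""
    html = urlopen("http://en.wikipedia.org" + pageUrl)
    # The missing line: bsObj must be bound before anything below uses it.
    bsObj = BeautifulSoup(html, "html.parser")
    try:
        print(bsObj.h1.get_text())
        print(bsObj.find(id="mw-content-text").find_all("p")[0])
        print(bsObj.find(id="ca-edit").find("span").find("a").attrs["href"])
    except (AttributeError, IndexError):  # IndexError added as an assumption
        print("This page is missing something! No worries though!")
    # The loop is inside the function but outside the try/except block.
    for newPage in findWikiLinks(bsObj):
        if newPage not in pages and len(pages) < limit:
            # We have encountered a new page
            print("----------------\n" + newPage)
            pages.add(newPage)
            getLinks(newPage, limit)


# getLinks("")  # uncomment to start crawling; needs network access
```

Because pages is only mutated and never reassigned, the global statement from the original is not needed.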
#8
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
pages = set()
def getLinks(pageUrl):
 global pages
 html = urlopen("http://en.wikipedia.org"+pageUrl)
 try:
    print(bsObj.h1.get_text())
    print(bsObj.find(id ="mw-content-text").findAll("p")[0])
    print(bsObj.find(id="ca-edit").find("span").find("a").attrs['href'])
 except AttributeError:
    
    print("This page is missing something! No worries though!")
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
       if 'href' in link.attrs:
           if link.attrs['href'] not in pages:
#We have encountered a new page
         newPage = link.attrs['href']
         print("----------------\n"+newPage)
         pages.add(newPage)
         getLinks(newPage)
getLinks("")
Is that better? It still will not work, but I am looking for the problem.
#9
It looks like all you have done is change the indentation. You have still left out the definition of bsObj.
#10
The indentation is still wrong (hint: line 17). I presume you are nesting the 'if' statement.