Posts: 126
Threads: 27
Joined: Feb 2018
I'm very new to Python. I've studied a lot of books but putting all of it into practice is taxing to me. Any help you can provide is much appreciated!
When I run this...
from bs4 import BeautifulSoup
import requests
import re
import urllib
import zipfile
import os
apis = ["49025219260000",
        "49025059260000",
        "49025206640000",
        "49025203350000",
        "49025213300000",
        "49025061090000",
        "49025062840000",
        ]
def wogcc_completions_scraper():
    x = 0
    while x < len(apis):
        wogcc_url = """http://wogcc.state.wy.us/wyocompletions.cfm?nApino=""" + str(apis[x][3:10])
        print (apis[x])
        print str(apis[x][3:10])
        las_only = []
        wogcc_request = requests.get(wogcc_url)
        soup = BeautifulSoup(wogcc_request.content, "html.parser")
        href_tags = soup.find_all('a')
        ### This section of code will data scrape the WOGCC for the completion reports
        completion_regex = """<td align="left" bgcolor="white" nowrap="" valign="Middle" width="450"><strong><font size="1">(.+?)</font></strong></td>"""
        completion_pattern = re.compile(completion_name_regex)
        completion_file = re.findall(completion_name_pattern, str(soup))
        CNF = completion_name_file

I get this...
Traceback (most recent call last):
  File "C:\WOGCC Well Completions Scraper Lil Scraper 1.2b.py", line 85, in <module>
    wogcc_completions_scraper()
  File "C:\WOGCC Well Completions Scraper Lil Scraper 1.2b.py", line 42, in wogcc_completions_scraper
    completion_pattern = re.compile(completion_name_regex)
NameError: global name 'completion_name_regex' is not defined
Posts: 3,458
Threads: 101
Joined: Sep 2016
Feb-21-2018, 04:59 PM
(Feb-21-2018, 04:36 PM)tjnichols Wrote: Error: NameError: global name 'completion_name_regex' is not defined
The error is very descriptive. You're trying to use a variable that doesn't exist.
Here's your code, with the string removed. Hopefully you see the issue :)
completion_regex = ""
completion_pattern = re.compile(completion_name_regex)

Unrelated, but for your sanity, you should avoid having spaces in your file names. As is, how would you import your file into a different script?
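On the file name point, a rough way to see the problem: the `import` statement needs a name that is a valid Python identifier, and a name with spaces in it is not one. The sketch below assumes Python 3 (for `str.isidentifier`). The helper `is_importable_name` and the renamed file `wogcc_completions_scraper` are made up for illustration, not something from the original script.

```python
import keyword


def is_importable_name(stem):
    """Return True if `stem` (a file name without .py) could follow `import`."""
    # A module name used in a plain import statement must be a valid
    # identifier and must not be a reserved keyword.
    return stem.isidentifier() and not keyword.iskeyword(stem)


# The file name from the traceback, without the .py extension.
current_name = "WOGCC Well Completions Scraper Lil Scraper 1.2b"

# Hypothetical replacement name: lowercase, underscores, no spaces or dots.
renamed = "wogcc_completions_scraper"

print(is_importable_name(current_name))
print(is_importable_name(renamed))
```

With a name like the second one, another script in the same folder could presumably use `import wogcc_completions_scraper` and then call the function inside it.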
Posts: 126
Threads: 27
Joined: Feb 2018
Thank you for your help! No nilamo - I don't see my error. I tried copying what you had and it still gave me the error I had before. Will the name cause issues for me? How? Please excuse my ignorance but I thought I was working with a 'module' not a 'script'. How can I import it into a different script? Is this better than a 'module'?
I would like to upgrade this to Python 3.* and Beautiful Soup 4.* but I need to understand my issues with this first.
Posts: 7
Threads: 0
Joined: Jan 2018
(Feb-21-2018, 04:59 PM)nilamo Wrote: Unrelated, but for your sanity, you should avoid having spaces in your file names. As is, how would you import your file into a different script?
I agree here, haha
Posts: 126
Threads: 27
Joined: Feb 2018
I saw two responses from SeabassG33 (I agree here, haha). Nilamo I couldn't see your message?
Posts: 3,458
Threads: 101
Joined: Sep 2016
You're defining the variable completion_regex, but trying to use the variable completion_name_regex.
You're getting the variable not found error, because you're trying to use a variable that doesn't exist (completion_name_regex). So, either define that, or change it to use the variable you already have but never use (completion_regex).
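To make that concrete, here is a minimal sketch of the same mismatch and the fix. It is not the original script: `sample_html` is a made-up, shortened stand-in for the real page, and the regex is simplified to match it.

```python
import re

# Hypothetical, shortened HTML standing in for the real WOGCC page.
sample_html = '<td><strong><font size="1">Report A</font></strong></td>'

# The name defined here...
completion_regex = r'<font size="1">(.+?)</font>'

# ...must be the exact name used later. Using a different name fails:
saw_name_error = False
try:
    re.compile(completion_name_regex)  # this name was never defined
except NameError as err:
    saw_name_error = True
    print("NameError:", err)

# The fix: use the name that actually exists, consistently.
completion_pattern = re.compile(completion_regex)
completion_file = completion_pattern.findall(sample_html)
print(completion_file)
```

The same check applies to every name on the following lines: whatever is on the left of an `=` earlier in the function is the only spelling that can be used later.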
Posts: 126
Threads: 27
Joined: Feb 2018
Ok Nilamo - that was awesome! I think that fixed a lot of things because it came up with several errors after that where I used your method to fix - THANK YOU!!
Now - it doesn't work at all though. Here is the entire thing...
from bs4 import BeautifulSoup
import requests
import re
import urllib
import zipfile
import os
apis = ["49025219260000",
        "49025059260000",
        "49025206640000",
        "49025203350000",
        "49025213300000",
        "49025061090000",
        "49025062840000",
        ]
def wogcc_completions_scraper():
    x = 0
    while x < len(apis):
        wogcc_url = """http://wogcc.state.wy.us/wyocompletions.cfm?nApino=""" + str(apis[x][3:10])
        print (apis[x])
        print str(apis[x][3:10])
        las_only = []
        wogcc_request = requests.get(wogcc_url)
        soup = BeautifulSoup(wogcc_request.content, "html.parser")
        href_tags = soup.find_all('a')
        ### This section of code will data scrape the WOGCC for the completion report
        completion_name_regex = ""
        completion_pattern = ""
        completion_name_regex = re.compile(completion_name_regex)
        completion_file = "" = re.findall(completion_name_pattern, str(soup))
        CNF = completion_name_file
        CNF1 = []
        for q in CNF:
            q1 = str(q)
            q2 = q1.decode('unicode_escape').encode('ascii','ignore')
            q3 = q2.replace(" ","")
            q4 = q3.replace('"',"")
            q5 = q4.replace("#","")
            CNF1.append(q5)
        b = 0 ### This is a new counter for the CFN1 and CFEL lists. It will keep consistency for when we loop and download logs using this list
        completion_link_regex = """<a href="http://wugiwus.state.wy.us/(.+?)"><img border="0" height="14" src="search.gif" width="15"/></a>"""
        completion_link_pattern = re.compile(completion_link_regex)
        completion_file_end_link = re.findall(completion_link_pattern, str(href_tags))
        CFEL = completion_file_end_link
        final_completion_link = []
        for p in CFEL:
            final_completion_link.append("http://wugiwus.state.wy.us/" + str(p))
        print final_completion_link
        if final_completion_link == 0:
            pass
        else:
            while b < len(RFEL): ### This loop will loop through everything and download all of the completion reports from the well and name them
                download1 = requests.get(final_completion_link[b])
                with open((str(apis[x]) + "_" + str(CNF1[b].replace("/","")) + ".pdf"), "wb") as code:
                    code.write(download1.content)
                b += 1
        x +=1
wogcc_completions_scraper()

This is the error I get -
"There's an error in your program: *** can't assign to literal (C:/WOGCC_Well_Completions_Scraper_Lil_Scraper_12b.py, line 34)"
I truly appreciate your help!
Thank you!
Posts: 3,458
Threads: 101
Joined: Sep 2016
I promise the error messages are trying to help you :)
Quote:There's an error in your program: *** can't assign to literal (C:/WOGCC_Well_Completions_Scraper_Lil_Scraper_12b.py, line 34)
Here's your line 34:
Quote: completion_file = "" = re.findall(completion_name_pattern, str(soup))
You've got two equal signs there, which means you're trying to do double-assignment. Which is sometimes useful, but normally more confusing than it's worth. For example, you could use it like so:
>>> x = y = 5
>>> x
5
>>> y
5
But one of the things you're assigning to is an empty string.
>>> "" = 4
  File "<stdin>", line 1
SyntaxError: can't assign to literal
That's a strict no-no. I'm not sure what your intention is, probably that was just old code that got left over.
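The same thing can be checked from a script rather than the interactive prompt. This is only a sketch: `bad_line` and `good_line` are shortened stand-ins for line 34 (the `re.findall` call is swapped for a plain list), and the exact wording of the SyntaxError message varies between Python versions, so it is printed rather than assumed.

```python
# Chained assignment is legal when every target is a name.
x = y = 5

# Stand-in for the failing line: a string literal sits where a name is required.
bad_line = 'completion_file = "" = ["a"]'
# Stand-in for the fix: the stray  "" =  is removed.
good_line = 'completion_file = ["a"]'

compiled_bad = True
try:
    # compile() parses the source without running it, so the
    # syntax error surfaces here as a catchable exception.
    compile(bad_line, "<example>", "exec")
except SyntaxError as err:
    compiled_bad = False
    print("SyntaxError:", err.msg)

# The corrected line parses and runs without complaint.
namespace = {}
exec(compile(good_line, "<example>", "exec"), namespace)
print(namespace["completion_file"])
```

Because this is a syntax error, Python rejects the whole file before running any of it, which is likely why the script appeared not to work at all.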
Posts: 126
Threads: 27
Joined: Feb 2018
I really appreciate your help! IT WORKS!!! THANK YOU!!!
Posts: 7
Threads: 0
Joined: Jan 2018
(Feb-21-2018, 06:56 PM)tjnichols Wrote: I saw two responses from SeabassG33 (I agree here, haha). Nilamo I couldn't see your message?
Really? I only see one.  Sorry for the accidental spam.