Python Forum
"EOL While Scanning String Literal"
#1
from bs4 import BeautifulSoup
import re
import urlib3

"Define the name of my web scraper. We want the scraper to continue to look for completions until it reaches a zero value."

apis = ["49009229900000”,“49009226390000”,“49009278600000”,“49009226340000”,“49009200210000”,“49009065760000”,“49009201380000”,“49009230130000”,“49009278800000”,“49009222250000”,“49009225900000”,“49009219970000”,“49009225890000”,“49009225140000”,“49009225760000”,“49009212630000”,“49009205440000”,“49009211590000”,“49009203660000”,“49009203940000”,“49009204340000”,“49009226780000”,“49009220310000”,“49009229730000”,“49009212240000”,“49009214450000”,“49009213790000”,“49009222660000”,“49009227960000”,“49009222100000”,“49009228020000”,“49009228260000”,“49009228290000”,“49009229090000”,“49009228250000”,“49009229340000”,“49009229360000”,“49009227890000”,“49009228010000”,“49009228030000”,“49009228450000”,“49009224160000”,“49009221890000”,“49009222760000”,“49009214980000”,“49009214620000”,“49009213800000”,“49009214380000”,“49009214730000”,“49009228150000”,“49009228190000”,“49009227710000”,“49009215280000”,“49009228940000”,“49009227920000”,“49009227980000”,“49009228170000”,“49009219540000”,“49009227870000”,“49009228370000”,“49009204330000”,“49009205120000”,“49009227860000”,“49009228360000”,“49009228160000”,“49009216100000”,“49009229000000”,“49009229150000”,“49009229490000”,“49009215680000”,“49009229350000”,“49009215210000”,“49009217070000”,“49009216610000”,“49009206800000”,“49009205590000”,“49009206310000”,“49009217960000”,“49009223190000”,“49009210640000”,“49009209260000”,“49009213710000”,“49009212360000”,“49009212740000”,“49009218680000”,“49009210130000”,“49009211420000”,“49009224280000”,“49009213750000”,“49009220880000”,“49009225300000”,“49009218090000”,“49009227720000”,“49009225830000”,“49009223170000”,“49009209370000”,“49009214990000”,“49009207260000”,“49009211540000”,“49009227380000”]

def completions_scraper():


	x = 0
	
	while x < len(apis):
	
		##When you put your mouse over the completions link the wogcc_url is what you find. The str(api[x][3:10] strips the first 3 digits from the API (UWI) and the last 4 as well.##
		
		wogcc_url = "http://wogcc.state.wy.us/wyocompletions.cfm?nApino=" + str(api[x][3:10])
		
		##We are calling on Beautiful Soup to do its magic.##
		
		soup = BeautifulSoup(html_doc, 'html.parser')

		href_tags = soup.find_all('a')
		
		
		print (apis[x])
		
		print str(apis[x][3:10])

                wogcc_request = requests.get(wogcc_url)
		
				
                b = 0 ## This is the counter to keep consistency for our loops.##
        
                CNF = completion_name_file

		completion_name = ""
		
		completion_pattern = re.compile(completion_name_1)
		
		completion_file = re.findall(completion_name, str(href_tags))
		
                CFEL = completion_file_end_link

		final_completion_link = []

                if final_completion_link == 0:
                    pass
                else:
                    while b < len(CFEL): ##This will download the reports.##
                        download1 = requests.get(final_raster_link[b])
                        with open((str(apis[x]) + "_" str(RNF[b].replace("/","")), "IDK" as code:
                        b +=1
              
        x +=1
This is what I get...

Error:
Syntax Error EOL While Scanning String Literal
The error seems to happen at the end of my apis.

Ideas? Any help / ideas are most appreciated!
#2
Looking at the code, I think you are making the construction of the URLs and filenames more complicated than it needs to be.
Could you please describe simply how you:
  • create each URL from apis
  • create the file name from the same entry
Just a before and after is sufficient (and desired).
Also, when making entries in a list, please use standard single or double quotes.
#3
Post the entire error.
Where did you get the 'api' variable on line 18?

There is something wrong with the quotes on line 53. Make sure that you use " and not some other symbol that just looks like it.
The number of opening and closing brackets on the same line doesn't match.
This doesn't make sense: "_" str(RNF[b].replace("/","")
Perhaps you missed the '+' symbol. Or... ?
Line 54 must be indented.
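On the quote problem: the apis list in the first post opens with a plain " but every other quote in it is a typographic (curly) quote, so Python sees one string that never closes before the end of the line. The sketch below is one possible way to locate and replace such characters. The names SMART_QUOTES, find_smart_quotes and straighten_quotes are made up for this illustration, and the two-entry list is a shortened stand-in for the real one.

```python
# Typographic ("smart") quotes that editors sometimes substitute for ASCII quotes.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_smart_quotes(source):
    """Return (line_number, column, character) for each smart quote found."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for column, character in enumerate(line, start=1):
            if character in SMART_QUOTES:
                hits.append((line_number, column, character))
    return hits


def straighten_quotes(source):
    """Replace every smart quote with its plain ASCII equivalent."""
    return source.translate(str.maketrans(SMART_QUOTES))


# Shortened stand-in for the list in the first post: it opens with a plain
# double quote, but the remaining quotes are curly.
broken = 'apis = ["49009229900000\u201d,\u201c49009226390000\u201d]'
fixed = straighten_quotes(broken)

print(find_smart_quotes(broken))
print(fixed)
```

Note that straighten_quotes replaces curly quotes everywhere, including inside strings where they might be intentional, so it is worth reviewing the result rather than applying it blindly.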
#4
Rather than using:
while x < len(apis):
use:
for entry in apis:
Example (this just prints a list of all URLs):
apis = ['49009229900000','49009226390000','49009278600000','49009226340000','49009200210000','49009065760000',
        '49009201380000','49009230130000','49009278800000','49009222250000','49009225900000','49009219970000',
        '49009225890000','49009225140000','49009225760000','49009212630000','49009205440000','49009211590000',
        '49009203660000','49009203940000','49009204340000','49009226780000','49009220310000','49009229730000',
        '49009212240000','49009214450000','49009213790000','49009222660000','49009227960000','49009222100000',
        '49009228020000','49009228260000','49009228290000','49009229090000','49009228250000','49009229340000',
        '49009229360000','49009227890000','49009228010000','49009228030000','49009228450000','49009224160000',
        '49009221890000','49009222760000','49009214980000','49009214620000','49009213800000','49009214380000',
        '49009214730000','49009228150000','49009228190000','49009227710000','49009215280000','49009228940000',
        '49009227920000','49009227980000','49009228170000','49009219540000','49009227870000','49009228370000',
        '49009204330000','49009205120000','49009227860000','49009228360000','49009228160000','49009216100000',
        '49009229000000','49009229150000','49009229490000','49009215680000','49009229350000','49009215210000',
        '49009217070000','49009216610000','49009206800000','49009205590000','49009206310000','49009217960000',
        '49009223190000','49009210640000','49009209260000','49009213710000','49009212360000','49009212740000',
        '49009218680000','49009210130000','49009211420000','49009224280000','49009213750000','49009220880000',
        '49009225300000','49009218090000','49009227720000','49009225830000','49009223170000','49009209370000',
        '49009214990000','49009207260000','49009211540000','49009227380000']

def get_url():
    for entry in apis:
        yield("http://wogcc.state.wy.us/wyocompletions.cfm?nApino={}".format(entry[3:10]))

def get_all_urls():
    for url in get_url():
        print(url)

get_all_urls()
#5
Ok - I will test this ASAP! It looks so simple! Thank you!
#6
Oh LARZ60 - that is sooo cool! Thank you!

I tried it and it worked beautifully. My solution for saving the files is a totally different story. My clunky code referenced things your code removed. As such, I still have clunky code to save the completion files to my computer.

I've looked up the "yield" keyword and I understand (I think) why it's effective here. Now I'm just at a loss as to how to save the files.

Something else I noticed, you don't import anything. I was under the impression you had to use BeautifulSoup among others to scrape data. re? requests? Everything I've read has said I needed these things. How do you get around it? Are these things not required?

Thank you for your time!
#7
My code was an example of how to handle the list, nothing more; it was not meant to replace your entire script.
As far as saving goes, you still have to fetch the files.
I tried one of the URLs and it doesn't find the file. Are you sure this is the proper format, and that the files are where you say they are?
You may be able to simplify the download as well. To help with this I need one (just one) working URL for one file only.
I tried: http://wogcc.state.wy.us/wyocompletions....o='0922276'
and got a 500 error: "There is a problem with the resource you are looking for, and it cannot be displayed."
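On the saving and imports questions: building URLs is plain string formatting and needs no imports, but fetching a file does need an HTTP library. Below is a minimal sketch of the fetch-then-write step. It uses the standard library's urllib.request in place of requests so it runs without extra packages. The helper names, the ".pdf" extension, and the example.invalid link are assumptions for illustration only, since no working report URL has been confirmed in this thread.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url, timeout=30):
    """Download the raw bytes at `url` (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def report_filename(api, index, extension=".pdf"):
    """Build a file name such as '49009229360000_0.pdf'.

    The '.pdf' extension is an assumption, not something the site confirms.
    """
    return "{}_{}{}".format(api, index, extension)


def save_report(api, link, out_dir, index=0, fetch=fetch_bytes):
    """Fetch one report and write it into `out_dir`; return the file path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / report_filename(api, index)
    target.write_bytes(fetch(link))
    return target


# Offline demonstration: a stand-in fetcher replaces the real download,
# and the link is a placeholder, not a real report URL.
def fake_fetch(url):
    return b"%PDF-placeholder"


with tempfile.TemporaryDirectory() as folder:
    saved = save_report("49009229360000", "http://example.invalid/report",
                        folder, fetch=fake_fetch)
    saved_name = saved.name
    saved_bytes = saved.read_bytes()

print(saved_name)
```

Passing the fetcher in as a parameter keeps the file-naming and writing logic testable without touching the network; for real use, leave fetch at its default and supply an actual report link.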
#8
The code I gave you is what I see when I hover over the completions link. Here are the steps I go through when doing it manually:

http://wogcc.state.wy.us/legacywogcce.cfm - click on "well" in the lower right corner.
Click on the horse by API number.
That takes you to a screen to enter the API number (here's where the api[3:10] comes in). I entered "0922936" and hit Enter.
If you hover your mouse over the "Completions" link (top right center), you see WOGCC.state.wy.us/wyocomp.cfm?nAPI=922936. If you click on "Completions", it takes you to the file I need to download.
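The steps above suggest that the hover link uses the seven-character slice with its leading zero dropped ("0922936" becomes 922936). A small sketch of that mapping follows. The function names are invented for this example, and both the http:// prefix and the leading-zero handling are assumptions based only on the link quoted above, not on any documentation for the site.

```python
def well_number(api):
    """Return characters 3 through 9 of a 14-digit API number, e.g. '0922936'."""
    return api[3:10]


def completions_url(api):
    """Build the link seen when hovering over 'Completions'."""
    # Assumption: the site takes the number without its leading zero,
    # as in the quoted link (nAPI=922936 for '0922936').
    number = int(well_number(api))
    return "http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}".format(number)


print(well_number("49009229360000"))
print(completions_url("49009229360000"))
```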

I really admire the way you do this! Can you help me understand why it works? Or better yet, point me in the direction of some learning material that will show me the right way to do it? I've been using a colleague's example because I couldn't find another way.

I appreciate your help!
#9
Ok, I'll take a look and get back.

I'm finding info, but think I can give links for direct PDF download.
This may take a while. If I don't get back within the next few hours, I will in the AM.
#10
You can watch this 'oldie but goodie' video on generators, which is what I used in the code above.

https://www.youtube.com/watch?v=5-qadlG7tWo

It still works the same way.
David Beazley is the author of several Python books, including the Python Essential Reference and the Python Cookbook.
His videos are fun to watch.
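To make the generator idea concrete, here is a short sketch of what yield does with the URL builder from earlier in the thread. Unlike the earlier version, this one takes the list as a parameter, and the two-entry sample list is just a shortened stand-in for the full one.

```python
def get_url(apis):
    """Yield one completions URL per API number, computed on demand."""
    for entry in apis:
        yield "http://wogcc.state.wy.us/wyocompletions.cfm?nApino={}".format(entry[3:10])


sample = ["49009229900000", "49009226390000"]

urls = get_url(sample)   # nothing has run yet; this is just a generator object
first = next(urls)       # runs the function body up to the first yield
remaining = list(urls)   # resumes after that yield and collects the rest

print(first)
print(remaining)
```

Each call to next() resumes the function where it last paused, so the URLs are produced one at a time instead of being built into a full list up front.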