Python Forum
Unexpected indent / invalid syntax
#21
What's the current error message, in its entirety? Line 52 looks totally fine (unless url isn't an iterable, which it might not be if it's None).
Reply
#22
nilamo - Thanks for responding! Ok - we are now at line 54 (UGH!!)
Error:
C:\Users\toliver\AppData\Local\Programs\Python\Python36\WOGCC_File_Downloads.py
  File "WOGCC_File_Downloads.py", line 54
    except KeyError
(there is a caret under the r)
SyntaxError: invalid syntax
Thanks again!

(May-15-2018, 09:55 PM)nilamo Wrote: What's the current error message, in its entirety? Line 52 looks totally fine (unless url isn't an iterable, which it might not be if it's None).

I've gotten past this one - I had too many spaces on one and no + sign on another. Thanks though! I appreciate your time!
Reply
#23
Error tracebacks are rarely wrong.
In case you're not sure how they work: the last line displayed is the culprit most of the time.
The lines shown before it are the calls that were executed, in order, before the error occurred, so you can follow the logic backwards.
Learn to rely on this error display; it's your friend.

By the way, on line 54:
except KeyError
is missing a ':' at the end.
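
For anyone who wants to see this without running the whole script, here is a minimal sketch. The two source strings and the helper name syntax_error_line are made up for illustration (they are not from the posted script); the only real API used is the built-in compile(), which raises SyntaxError with a lineno attribute.

```python
# Hypothetical snippets: the same try/except, with and without the colon.
source_missing_colon = (
    "try:\n"
    "    value = {}['x']\n"
    "except KeyError\n"          # no ':' at the end, as on line 54 of the script
    "    value = None\n"
)
source_fixed = (
    "try:\n"
    "    value = {}['x']\n"
    "except KeyError:\n"
    "    value = None\n"
)


def syntax_error_line(source):
    """Return the line number Python reports for a SyntaxError, or None if the source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno
    return None


print(syntax_error_line(source_missing_colon))  # line of the 'except' without a colon
print(syntax_error_line(source_fixed))          # None: compiles cleanly
```

The reported line number points at the except line itself, which matches what the caret in the error message was showing.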
Reply
#24
Ok - I've tried everything I could think of. In line 65 "for td in tds:" I've tried 'tds():', 'td.txt', and a couple more that I should have kept track of. All of which didn't work.

Any ideas?

Thank you all - I appreciate it!

Error:
c:\Users\toliver\appdata\local\programs\python\python36>python WOGCC_File_Downloads.py
  File "WOGCC_File_Downloads_.py", line 66
    if td.txt
SyntaxError: invalid syntax
(the caret is above the last t in txt)
import requests
from bs4 import BeautifulSoup
from pathlib import Path

class GetCompletions:
    def __init__(self, infile):
        """Above will create a folder called comppdf, and wsgeo wherever the WOGCC
           File Downloads file is run from as well as a text file for my api file to
           reside.
        """
        self.homepath = Path('.')
        self.log.pdfpath = self.homepath / 'comppdf'
        self.log.pdfpath.mkdir(exist_ok=True)
        self.log.pdfpath = self.homepath / 'geocorepdf'
        self.log.pdfpath.mkdir(exist_ok=True)
        self.textpath = self.homepath / 'text'
        self.text.mkdir(exist_ok=True)

        self.infile = self.textpath / infile
        self.api = []

        self.parse_and_save(getpdfs=True)



    def get_url(self):
        for entry in self.apis:
            yield (entry, "http://wogcc.state.wy.us/wyocomp.cfm?nAPI=[]".format(entry[3:10]))
            yield (entry, "http://wogcc.state.wy.us/whatupcores.cfm?autonum=[]".format(entry[3:10]))

        """Above will get the URL that matches my API numbers."""

    def parse_and_save(self, getpdfs=False):
        for file in filelist:
            with file.open('r') as f:
                soup = BeautifulSoup(f.read(), 'lxml')
            if getpdfs:
                links = soup.find_all('a')
                for link in links:
                    url in link['href']
                    if 'www' in url:
                        continue
                    print('downloading pdf at: {}'.format(url))
                    p = url.index('=')
                    response = requests.get(url, stream=True, allow_redirects=False)
                    if response.status_code == 200:
                        try:
                            header_info = response.headers['Content-Disposition']
                            idx = header_info.index('filename')
                            filename = self.log_pdfpath / header[idx+9:]
                        except ValueError:
                            filename = self.log_pdfpath / 'comp{}'.format(url[p+1:])
                            print("couldn't locate filename for {} will use: {}".format(file, filename))
                        except KeyError:
                            filename = self.log_pdfpath / 'comp{}.pdf'.format(url[p+1:])
                            print('got KeyError on {}, respnse.headers = {}'.format(file, response.headers))
                            print('will use name: {}'.format(filename))
                            print(repsonse.headers)
                        with filename.open('wb') as f:
                            f.write(respnse.content)

            sfname = self.textpath / 'summary_{}.txt'.format((file.name.split('_'))[1].split('.')[0][3:10])
            tds = soup.find_all('td')
            with sfname.open('w') as f:
                for td in tds:
                    if td.txt
                        if any(field in td.text for field in self.fields):
                            f.write('{}\n'.format(td.text)

if __name__ == '__main__':
    GetCompletions('api.txt')
Reply
#25
Missing colons on line 66.
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#26
Awesome wavic! I've made that change. Rather than re-post all of the code, I will just post my current issue.

Error:
The caret is on the ':'.
As always - I appreciate your time and help!

if __name__ == '__main__':
    GetCompletions('api.txt')
Reply
#27
                for td in tds:
                    if td.txt
should be (as in original code):
                for td in tds:
                    if td.text:
Reply
#28
Count the parentheses on line 68. If there is no mistake on the line the error points to, look for errors on the lines before it.
Post the whole code. It's not that big.
Reply
#29
wavic is correct: a closing parenthesis is missing on line 68.
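
A small sketch of why the caret ended up on the line after the real mistake. The two-line source string and the helper name reported_line are invented for illustration; the behaviour described is an observation about CPython and the exact line reported depends on the Python version (older versions such as 3.6 tend to report the following line, newer ones point back at the unclosed parenthesis).

```python
# Hypothetical two-line program: line 1 is missing its closing parenthesis.
unbalanced = (
    "total = sum((1, 2)\n"   # one ')' short, like line 68 in the script
    "print(total)\n"         # perfectly fine, but older Pythons blame this line
)
balanced = (
    "total = sum((1, 2))\n"
    "print(total)\n"
)


def reported_line(source):
    """Return the line number attached to the SyntaxError, or None if the source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno
    return None


print(reported_line(unbalanced))  # 1 or 2, depending on the Python version
print(reported_line(balanced))    # None
```

So when the flagged line looks correct, counting brackets on the line just above it is usually the quickest check.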
Reply
#30
All right! Lars60+ - I've made your change - awesome!

wavic - I've made your change - awesome!

Now I've got a traceback error.

Error:
RESTART: C:\Users\toliver\AppData\Local\Programs\Python\Python36\WOGCC_File_Downloads.py
Traceback (most recent call last):
  File "C:\Users\toliver\AppData\Local\Programs\Python\Python36\WOGCC_File_Downloads.py", line 71, in <module>
    GetCompletions('api.txt')
  File "C:\Users\toliver\AppData\Local\Programs\Python\Python36\WOGCC_File_Downloads.py", line 12, in __init__
    self.log.pdfpath = self.homepath / 'comppdf'
AttributeError: 'GetCompletions' object has no attribute 'log'
Does this mean I need to make the folders? When I installed Python, I selected the 'Add Path' feature. Does this have something to do with this error?

Here is the completely changed code -

import requests
from bs4 import BeautifulSoup
from pathlib import Path

class GetCompletions:
    def __init__(self, infile):
        """Above will create a folder called comppdf, and wsgeo wherever the WOGCC
           File Downloads file is run from as well as a text file for my api file to
           reside.
        """
        self.homepath = Path('.')
        self.log.pdfpath = self.homepath / 'comppdf'
        self.log.pdfpath.mkdir(exist_ok=True)
        self.log.pdfpath = self.homepath / 'geocorepdf'
        self.log.pdfpath.mkdir(exist_ok=True)
        self.textpath = self.homepath / 'text'
        self.text.mkdir(exist_ok=True)

        self.infile = self.textpath / infile
        self.api = []

        self.parse_and_save(getpdfs=True)



    def get_url(self):
        for entry in self.apis:
            yield (entry, "http://wogcc.state.wy.us/wyocomp.cfm?nAPI=[]".format(entry[3:10]))
            yield (entry, "http://wogcc.state.wy.us/whatupcores.cfm?autonum=[]".format(entry[3:10]))

        """Above will get the URL that matches my API numbers."""

    def parse_and_save(self, getpdfs=False):
        for file in filelist:
            with file.open('r') as f:
                soup = BeautifulSoup(f.read(), 'lxml')
            if getpdfs:
                links = soup.find_all('a')
                for link in links:
                    url in link['href']
                    if 'www' in url:
                        continue
                    print('downloading pdf at: {}'.format(url))
                    p = url.index('=')
                    response = requests.get(url, stream=True, allow_redirects=False)
                    if response.status_code == 200:
                        try:
                            header_info = response.headers['Content-Disposition']
                            idx = header_info.index('filename')
                            filename = self.log_pdfpath / header[idx+9:]
                        except ValueError:
                            filename = self.log_pdfpath / 'comp{}'.format(url[p+1:])
                            print("couldn't locate filename for {} will use: {}".format(file, filename))
                        except KeyError:
                            filename = self.log_pdfpath / 'comp{}.pdf'.format(url[p+1:])
                            print('got KeyError on {}, respnse.headers = {}'.format(file, response.headers))
                            print('will use name: {}'.format(filename))
                            print(repsonse.headers)
                        with filename.open('wb') as f:
                            f.write(respnse.content)

            sfname = self.textpath / 'summary_{}.txt'.format((file.name.split('_'))[1].split('.')[0][3:10])
            tds = soup.find_all('td')
            with sfname.open('w') as f:
                for td in tds:
                    if td.text:
                        if any(field in td.text for field in self.fields):
                            f.write('{}\n'.format(td.text))

if __name__ == '__main__':
    GetCompletions('api.txt')
Reply
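
A hedged reading of the AttributeError above: it does not appear to be about missing folders or the 'Add Path' installer option. Python evaluates self.log first, and the object has no attribute called log. Since the rest of the posted code refers to self.log_pdfpath (with an underscore), the dot in self.log.pdfpath looks like a typo. The sketch below uses two made-up classes, Broken and Fixed, to show the difference; it is an illustration under that assumption, not the original author's code.

```python
from pathlib import Path


class Broken:
    def __init__(self):
        self.homepath = Path('.')
        # self.log has never been assigned, so looking it up fails
        # before the assignment to .pdfpath can even be attempted.
        self.log.pdfpath = self.homepath / 'comppdf'


class Fixed:
    def __init__(self):
        self.homepath = Path('.')
        # One attribute name with an underscore, matching its later uses.
        self.log_pdfpath = self.homepath / 'comppdf'
        # Calling self.log_pdfpath.mkdir(exist_ok=True) here would create
        # the folder if needed; it is left out so the sketch has no side effects.


try:
    Broken()
except AttributeError as err:
    print(err)  # message naming the missing 'log' attribute

print(Fixed().log_pdfpath)
```

The same pattern probably applies to self.text.mkdir(exist_ok=True) a few lines later, where the attribute that was actually assigned is self.textpath.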