Python Forum
Trying to Tabulate Information from an Aircraft Website Link(s)
#21
Thank you so much for all your help, snippsat.

I will work on the months element of the code you supplied in your last reply.

Is it possible to carry the code shown below over onto another line?

I have typed the following code so that only Displays are shown, and only those with Dakota, Spitfire and Hurricane, or Dakota and Spitfire, or Dakota and two Spitfires. Here is the code:

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H') | (df[df['Location'].str.contains('- Display') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') | df[df['Location'].str.contains('- Display') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'SS')] 
Southport     
I have the same "Invalid Syntax" error occurring, and I think this is because the line is very long. Can the same code I have typed be carried over onto another line? If that is the cause of the error, how do I do this?

Eddie
Reply
#22
That's way too long.
Split it up: in step 1, assign the filtered dataframe to a variable that then gets used in step 2.
Then in steps 2 and 3 you can continue to work on that result.
# Step 1
display = df[(df['Location'].str.contains('- Display')) & (df['Spitfire'].str.contains('S'))]
display ## look at the result, then comment this out and continue to step 2

# Step 2
#new_display = display[(display['Lancaster'].str.contains('L')) & (display['Hurricane'].str.contains('H'))]
#new_display ## look at result

# Step 3 eg clean up.
#display_clean = new_display.dropna(axis='columns')
#display_clean ## look at result
Reply
#23
Hi snippsat,

Is there a way I can shorten the line :-

df[df['Location'].str.contains
Also, how do you type "does not contain"?

If I get that written shorter, I think the code will run. When I removed the second df[df['Location'].str.contains('- Display') part of the code and shortened another part, it ran. My code is now:

Southport = df[df['Location'].str.contains('- Display') & (df['Lancaster'] == '') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H') | df[df['Location'].str.contains('- Display') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') | (df['Spitfire'] == 'SS')] 
Southport
Eddie
Reply
#24
(Jun-20-2019, 01:24 PM)eddywinch82 Wrote: Also how do you type does not Contain ?
~ for "not" in an expression with multiple conditions.
!= for "not equal" on a Series.
Selecting Subsets of Data in Pandas.
Quote:If I get that written shorter, the code will run I think. As When I removed the second df[df['Location'].str.contains('- Display') part of the code, and shortened another part, it ran. My Code is now :-
It's still way too long: the line length is 282 characters, it is hard to read, and I get a syntax error if I test it.
PEP 8 advises a limit of 79 characters, but most people think that is too short, so once a line gets over 100 characters, think about splitting it up.
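A minimal sketch of both points, assuming a small made-up dataframe (the column names follow the thread, the rows are hypothetical). Wrapping the whole condition in one pair of parentheses lets it run over several lines, `~` negates a boolean Series, and `!=` compares element by element:

```python
import pandas as pd

# Hypothetical sample rows; column names follow the thread, values are made up.
df = pd.DataFrame({
    "Location": ["Southport - Display", "Duxford - Flypast", "Cosford - Display"],
    "Lancaster": ["", "L", "L"],
    "Dakota": ["D", "", "D"],
    "Spitfire": ["S", "S", "SS"],
})

# One outer pair of parentheses allows the expression to span several lines.
# Each == / != comparison gets its own parentheses because & binds tighter.
mask = (
    df["Location"].str.contains("- Display")
    & ~df["Lancaster"].str.contains("L")   # ~ means "does not contain"
    & (df["Dakota"] == "D")
    & (df["Spitfire"] != "SS")             # != means "not equal"
)
result = df[mask]
print(result)
```

Building the mask in a named variable first also makes it easy to print the mask on its own and check each condition before filtering.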
Reply
#25
Hi snippsat,

I have had help with this Python code on www.stackoverflow.com, with the layout of the very long code line and some other amendments.

My Code now looks like the Following :-

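from urllib import request  # urlopen lives in urllib.request, not in requests

from bs4 import BeautifulSoup
import pandas as pd  # pandas is imported as pd, not "from pandas import pd"

res = request.urlopen("http://web.archive.org/web/20070826230746/http://www.bbmf.co.uk/july07.html")
soup = BeautifulSoup(res)
print(soup)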


#Your work from here
table = soup.find_all('table')[0]
df = pd.read_html(str(table))
df = df[1]
df = df.rename(columns=df.iloc[0])
df = df.iloc[2:]
df.head(15)

Southport = df[
        (
            ((df['Location'].str.contains('- Display') & 
            df['Lancaster'] != 'L' & 
            df['Dakota'] == 'D' & 
            df['Spitfire'] == 'S' & 
            df['Hurricane'] == 'H'))
        )
    ] | df[
        (
            ((df['Location'].str.contains('- Display') & 
            df['Lancaster'] != 'L' & 
            df['Dakota'] == 'D' & 
            df['Spitfire'] == 'S' &
            df['Hurricane'] != 'H'))
        )
    ] | df[
        (
            ((df['Location'].str.contains('- Display') & 
            df['Lancaster'] != 'L' & 
            df['Dakota'] == 'D' & 
            df['Spitfire'] == 'SS' &
            df['Hurricane'] != 'H'))
        )
    ]
But when I run the Code, I get the following Traceback Error :-

Error:
KeyError                                  Traceback (most recent call last)
c:\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Location'
During handling of the above exception, another exception occurred:
KeyError                                  Traceback (most recent call last)
<ipython-input-6-660b90cc0ca2> in <module>
     32             df['Hurricane'] == ''))
     33         )
---> 34     ] | df[
     35         (
     36             ((df['Location'].str.contains('- Display') &
c:\python37\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]
c:\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Location'
I am not sure what has gone wrong here, any ideas ?

Regards

Eddie
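A KeyError: 'Location' says that the dataframe has no column labelled 'Location' at the moment the filter runs. One way to check this, sketched here on a hypothetical stand-in table rather than the real page (the rows and the `raw` name are assumptions), is to look at the column labels before filtering and to promote the header row only once it is confirmed to be in row 0:

```python
import pandas as pd

# Hypothetical stand-in for one table from pd.read_html: integer column
# labels, with the real header text sitting in the first row.
raw = pd.DataFrame([
    ["Date", "Location", "Spitfire"],
    ["1 July", "Southport - Display", "S"],
    ["2 July", "Duxford - Flypast", "SS"],
])

print(list(raw.columns))  # inspect the labels before filtering
has_location_before = "Location" in raw.columns

# Promote the first row to column labels, then drop that row.
df = raw.rename(columns=raw.iloc[0]).iloc[1:].reset_index(drop=True)
has_location_after = "Location" in df.columns

# Filtering on 'Location' only works once the label really exists.
display = df[df["Location"].str.contains("- Display")]
print(display)
```

If the printed labels of the real table do not include 'Location', the header is probably in a different row, or a different table from the list returned by pd.read_html is needed.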
Reply
#26
I am not going to fix that; I still think you should take it in steps.
It's like asking 10 questions in one sentence.
Reply
#27
What does this error mean ? :-
Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-20-80c674c582c8> in <module>
     29             df['Hurricane'] != 'H'))
     30         )
---> 31     ] | df[
     32         (
     33             ((df['Location'].str.contains('- Display') &
TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
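A likely cause of an error like this, offered as a sketch rather than a confirmed diagnosis, is operator precedence: in Python `&` binds tighter than `==` and `!=`, so a condition written as `df['Lancaster'] != 'L' & df['Dakota'] == 'D'` first tries to evaluate `'L' & df['Dakota']`. The small dataframe below is hypothetical, and the exact exception type and message depend on the pandas version:

```python
import pandas as pd

# Hypothetical two-row sample using column names from the thread.
df = pd.DataFrame({"Lancaster": ["", "L"], "Dakota": ["D", "D"]})

# Without parentheses, 'L' & df['Dakota'] is evaluated before the comparisons,
# so this is not the pair of comparisons it appears to be.
try:
    bad = df[df["Lancaster"] != "L" & df["Dakota"] == "D"]
    unparenthesised_failed = False
except Exception as exc:  # type and message vary between pandas versions
    unparenthesised_failed = True
    print(type(exc).__name__, exc)

# With each comparison in its own parentheses, & combines two boolean Series.
good = df[(df["Lancaster"] != "L") & (df["Dakota"] == "D")]
print(good)
```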
Reply
#28
Hi snippsat,

Could you describe how I should take it in steps? Could you give me
a method showing how I would go about this?
Reply
#29
(Jun-23-2019, 01:14 PM)eddywinch82 Wrote: Could you describe, how I should take it in steps ?
I have already shown you that in post #22.
Reply
#30
Hi snippsat, yes, sorry, I forgot about that. I am getting decent results now, many thanks for your help.

How do you type "something or something"? I.e.
(df['Spitfire'].str.contains('S' | '') 
I know what I have typed there won't work, but how would I type what I am getting at?

Eddie
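A short sketch of three common ways to express "this or that" on a column, assuming a made-up 'Spitfire' column: `|` between two parenthesised comparisons, `isin` with a list of allowed values, and `|` inside the `str.contains` pattern, where it acts as regex alternation. Note that `str.contains('')` on its own matches every string, so an empty cell is better tested with `== ''`:

```python
import pandas as pd

# Hypothetical values for the 'Spitfire' column.
df = pd.DataFrame({"Spitfire": ["S", "", "SS", "H"]})

# Exactly 'S' or exactly empty: | between two parenthesised comparisons.
either = df[(df["Spitfire"] == "S") | (df["Spitfire"] == "")]

# The same selection written with isin and a list of allowed values.
same = df[df["Spitfire"].isin(["S", ""])]

# Contains 'S' or contains 'H': | inside the pattern is regex alternation.
pattern = df[df["Spitfire"].str.contains("S|H")]

print(either, same, pattern, sep="\n\n")
```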
Reply

