Trying to Tabulate Information from an Aircraft Website Link(s)
Python Forum (https://python-forum.io), Python Coding, General Coding Help (https://python-forum.io/forum-8.html), Thread: /thread-19158.html
RE: Trying to Tabulate Information from an Aircraft Website Link(s) - snippsat - Jun-17-2019
You have to click and drag the mouse over the code in the cell; when it is all highlighted blue, copy it. There is also a View as Code button at the top, which shows the code as plain text.

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-17-2019
Quote: You have to mouse and drag over code in cell,when all is blue copy it.
Yes, that is what I did, and it doesn't work. I suppose I may have to type all the code out?

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - snippsat - Jun-17-2019
Quote: Yes that is what I did, and it doesn't work. I suppose I may have to type, all the Code out ?
What doesn't work? This is just a standard copy of text, and no, you should not type any code at all. Mark the code so it is blue, then press Ctrl+C, or right-click over the text and choose Copy. Then press Ctrl+V, or right-click and choose Paste. These are basic commands that work everywhere.
If you want only the text in the browser, click the View as Code button at the top. Standard copy and paste of text/code also works there.

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-17-2019
Hi snippsat,
Copying the text, then holding down Ctrl and V, worked. Many thanks.
Eddie

Hi snippsat,
I have got the gist of what I needed to do:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    #from tabulate import tabulate

    res = requests.get("http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/june07.html")
    soup = BeautifulSoup(res.content,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    #print( tabulate(df[0], headers='keys', tablefmt='psql') )

    # Clean up, put index (Date, Location, ...) at top, delete the first 2 rows
    df = df[1]
    df = df.rename(columns=df.iloc[0])
    df = df.iloc[2:]
    df.head(15)

    # Lydd - Display. And that only, had the Spitfire Hurricane and Dakota booked
    # Here Lydd -- Spitfire Hurricane, there were none where all were booked
    Southport = df[(df['Dakota'] == "D") & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H')]
    Southport

So I got the data for the month of June, for example, showing only the events where the Spitfire, Hurricane and Dakota were all booked, with no other combinations of booked appearances.
Just wondering, how do I show only the events with "- Display" next to them? And how do I get the dates to show for all the events? Most say NaN next to them.
Eddie

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-18-2019
I tried adding this on to the end:

    # Lydd - Display. And that only, had the Spitfire Hurricane and Dakota booked
    # Here Lydd -- Spitfire Hurricane, there were none where all were booked
    Southport = df[(df['Dakota'] == "D") & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H'].str.contains("- Display")]
    Southport

so that only Locations with "- Display" next to them show. It didn't work when I ran the code, though: "Invalid Syntax".

I have sorted part of the code so that it only shows the Displays. Here is the end part of the code:

    Southport = df[df['Location'].str.contains('- Display') & (df['Dakota'] == "D") & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H')]
    Southport

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-18-2019
I want the table not to show LSHD, i.e. Lancaster, Spitfire, Hurricane and Dakota, that is, Locations that have all 4 booked, so that only SHD Displays are shown in the table. Here is the end part of the code, meant to match rows where the Lancaster column value is NaN, i.e. nothing:

    Southport = df[df['Location'].str.contains('- Display') & df[df['Lancaster'].str.contains('') & (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H')]
    Southport

But I get the following Traceback Error :-
Where have I gone wrong?
Eddie

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-18-2019
I found the following piece of code:

    urlbase = "https://www.olx.in/coimbatore/?&page="
    for x in range (4)[1:]:
        res = requests.get(urlbase + str(x))

How can I adapt that code so that requests goes through all the links to produce the necessary data, i.e. display only the SHD bookings for the whole year?
I have:

    res = requests.get("http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/september07.html")

and the end bit is the only part that is different in each URL. There are 7 months, i.e. March to September, and the URL differs only by the ending, i.e. march07.html, april07.html, may07.html etc.
Eddie

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-18-2019
I also found that the following code works to delete the Lancaster column:

    del df['Lancaster']

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - eddywinch82 - Jun-19-2019
Hi, can anyone help me?

RE: Trying to Tabulate Information from an Aircraft Website Link(s) - snippsat - Jun-19-2019
Hi, I have been busy lately and don't have much time for answers. I can take a little now.

(Jun-18-2019, 01:14 PM) eddywinch82 Wrote: But I get the following Traceback Error :-
You have to be careful with the () and [] count when a line is this long.

    Southport = df[(df['Location'].str.contains('- Display')) & (df['Lancaster'].str.contains('L')) | (df['Dakota'] == 'D') & (df['Spitfire'] == 'S') & (df['Hurricane'] == 'H')]
    Southport

I threw in a | (means "or") to get some matches; & means "and".

Quote: There are 7 months i.e. March to September, and the Url differs only by the following ending, i.e. march07.html april07.html may07.html etc.
Make the months:

    >>> import calendar
    >>>
    >>> months = filter(None, calendar.month_name)
    >>> months = list(months)
    >>> months
    ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
    >>> months[2:5]
    ['March', 'April', 'May']

Use them in the URL:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    import calendar

    months = filter(None, calendar.month_name)
    months = list(months)
    new_table = []
    for month in months[2:5]:
        res = requests.get(f"http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/{month}07.html")
        soup = BeautifulSoup(res.content,'lxml')
        table = soup.find_all('table')[0]
        new_table.append(soup.find_all('table')[0])
    print(new_table)

In Pandas, e.g. the month of May:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    import calendar

    months = filter(None, calendar.month_name)
    months = list(months)
    new_table = []
    for month in months[2:5]:
        res = requests.get(f"http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/{month}07.html")
        soup = BeautifulSoup(res.content,'lxml')
        table = soup.find_all('table')[0]
        new_table.append(soup.find_all('table')[0])
    df = pd.read_html(str(new_table[2]))

Quote: How can I adapt that Code, so requests will go through all links, to produce the necessary Data, i.e. display all SHD Bookings only, for the Whole year ?
The same way as I showed you above with that other URL, and as this is a new task you should try it on your own or make a new thread. With these tasks (not the easiest) you will get stuck a lot when you are missing basic Python knowledge.
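
A minimal sketch of the URL-building step discussed above, assuming every archived page follows the lowercase pattern quoted in the thread (march07.html through september07.html). calendar.month_name gives capitalized names such as 'March', so the sketch lowercases them, and it covers all seven months rather than a three-month slice. The names BASE_URL and month_urls are made up for illustration, and the URLs have not been checked against the archive itself.

```python
import calendar

# Assumption: all monthly pages share this archive prefix, as in the
# june07.html and september07.html links quoted in the thread.
BASE_URL = "http://web.archive.org/web/20070701133815/http://www.bbmf.co.uk/"


def month_urls(first=3, last=9, year_suffix="07"):
    """Return one URL per month from `first` to `last` (1-based, inclusive).

    Hypothetical helper: the defaults cover March to September.
    """
    urls = []
    for number in range(first, last + 1):
        # calendar.month_name[3] is 'March' in the default English locale;
        # the URLs in the thread use the lowercase form 'march'.
        name = calendar.month_name[number].lower()
        urls.append(f"{BASE_URL}{name}{year_suffix}.html")
    return urls


if __name__ == "__main__":
    for url in month_urls():
        print(url)
```

Each URL could then be passed to requests.get inside the loop in place of the f-string. For the separate question of keeping only rows where the Lancaster column is empty, Series.isna(), e.g. df['Lancaster'].isna(), is the usual pandas test for NaN, though it has not been tried on this table.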