Posts: 10
Threads: 1
Joined: Mar 2019
import requests
import urllib.request
from bs4 import BeautifulSoup
from flask import Flask

with requests.session() as c:
    url = "https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
    vesselName = 'WAN+HAI+102'
    ETBFrom = '29-03-2019'
    ETBTo = '11-04-2019'
    query = '%E6%90%9C%E7%B4%A2'
    c.get(url)
    new_url = url + "?vesselName=" + vesselName + "&ETBFrom=" + ETBFrom + "&ETBTo=" + ETBTo + "&query=" + query
    print(new_url)
    r = urllib.request.urlopen(new_url)
    soup = BeautifulSoup(r, "html.parser")
I am a beginner, brother. Can you help? What is the next step to get the data from the web page below?
https://cplus.hit.com.hk/enquiry/vesselS...C%E7%B4%A2
Posts: 3,458
Threads: 101
Joined: Sep 2016
Looks like you're already getting data from that url.
Are you getting errors?
Posts: 7,324
Threads: 123
Joined: Sep 2016
Mar-29-2019, 04:29 PM
(This post was last modified: Mar-29-2019, 04:29 PM by snippsat.)
You are not using requests.session, and when you have Requests, don't use urllib.
Here it is with the unused imports removed, and, as an example, getting the first row of data from the URL.
import requests
from bs4 import BeautifulSoup
url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-04-2019'
query='%E6%90%9C%E7%B4%A2'
#new_url = url+"?vesselName="+vesselName+"&ETBFrom="+ETBFrom+"&ETBTo="+ETBTo+"&query="+query
# Better
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find('tr', class_="nobgcolorvalue")
for item in vessel.find_all('td'):
    print(item.text.strip())
Output:
1
WAN HAI 102
S174
HIT4
2019-04-06
2019-04-07
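As a side note on the session remark: instead of hand-building the query string, the parameters can also be passed to Requests as a params dict, and the requests.Session can then be used for the actual fetch. A minimal sketch of that variant, reusing the URL and parameter names from above (the decoded query value 搜索 is just the unescaped form of %E6%90%9C%E7%B4%A2; nothing beyond these parameters has been verified against the site):
import requests
from bs4 import BeautifulSoup

url = "https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
# Let Requests build and percent-encode the query string instead of concatenating it by hand.
params = {
    'vesselName': 'WAN HAI 102',
    'ETBFrom': '29-03-2019',
    'ETBTo': '11-04-2019',
    'query': '搜索',  # the value that was percent-encoded as %E6%90%9C%E7%B4%A2
}

# A Session reuses the underlying connection and keeps any cookies between requests.
with requests.Session() as session:
    response = session.get(url, params=params)
    soup = BeautifulSoup(response.content, 'html.parser')
    vessel = soup.find('tr', class_="nobgcolorvalue")
    if vessel is not None:
        for item in vessel.find_all('td'):
            print(item.text.strip())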
Posts: 10
Threads: 1
Joined: Mar 2019
Dear Brother,
Thanks! How about record 2 to the last record?
Many, many thanks.
Posts: 7,324
Threads: 123
Joined: Sep 2016
(Mar-29-2019, 04:47 PM)yimchiwai Wrote: Thanks! How about record 2 to the last record?
You have to inspect the page; the normal approach is to use the Chrome/Firefox developer tools to look for the values needed.
E.g. one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
Web-Scraping part-1
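Related to the developer-tools tip: the browser can copy a CSS selector for a row, and BeautifulSoup accepts CSS selectors through select_one/select. A small sketch, assuming soup is the BeautifulSoup object built in the earlier snippet and that the second record row really carries the colorvalue class as shown:
# Assumes `soup` is the BeautifulSoup object built from the enquiry page above.
# select_one() takes a CSS selector (e.g. one copied from the browser's
# developer tools) and returns the first matching element, or None.
second_row = soup.select_one('tr.colorvalue')
if second_row is not None:
    print([td.text.strip() for td in second_row.find_all('td')])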
Posts: 10
Threads: 1
Joined: Mar 2019
(Mar-29-2019, 06:07 PM)snippsat Wrote: E.g. one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
soup.find just gets the first row; do I need to use soup.find_all?
Posts: 7,324
Threads: 123
Joined: Sep 2016
Mar-29-2019, 11:10 PM
(This post was last modified: Mar-29-2019, 11:11 PM by snippsat.)
(Mar-29-2019, 06:29 PM)yimchiwai Wrote: soup.find just gets the first row; do I need to use soup.find_all?
You have to try it yourself.
E.g. to get both:
vessel = soup.find_all('td', class_="body")[1]
for item in vessel.find_all('td')[6:-3]:
    print(item.text.strip())
Output:
1
WAN HAI 102
S174
HIT4
2019-04-06
2019-04-07
2
WAN HAI 102
W174
HIT4
2019-04-06
2019-04-07
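For completeness, one way to avoid slicing the flat td list is to collect each result row directly: find_all accepts a list of classes and returns the matching tr elements in page order. A short sketch, assuming soup is the BeautifulSoup object built in the snippets above and that the result rows only use the two classes seen so far:
# Assumes `soup` is the BeautifulSoup object built from the enquiry page above.
# Passing a list to class_ matches <tr> elements that have either class, so
# each result row comes back as its own element, already in page order.
rows = soup.find_all('tr', class_=['nobgcolorvalue', 'colorvalue'])
for row in rows:
    print([td.text.strip() for td in row.find_all('td')])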
Posts: 10
Threads: 1
Joined: Mar 2019
import requests
from bs4 import BeautifulSoup
url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-05-2019'
query='%E6%90%9C%E7%B4%A2'
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find_all('td', class_="body")[1]
record=[]
for item in vessel.find_all('td')[6:-3]:
    record.append(item.text.strip())
for i in range(0, len(record), 6):
    print(record[i:i+6])
Output:
['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07']
['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07']
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21']
['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05']
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']
Thanks so much, brother!!
Any suggestions?
Posts: 7,324
Threads: 123
Joined: Sep 2016
(Mar-30-2019, 05:06 AM)yimchiwai Wrote: Any suggestions?
You can make a new list to get some structure; then you can call an individual record in the list.
new_record = []
for i in range(0, len(record), 6):
    new_record.append(record[i:i+6])
Test:
>>> new_record
[['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07'],
['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07'],
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21'],
['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21'],
['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05'],
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']]
>>> new_record[2]
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
>>> new_record[5]
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']
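Since each record is itself a list, a second index picks out a single field; continuing in the same interpreter session with new_record from the test above:
>>> new_record[0]
['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07']
>>> new_record[0][2]
'S174'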
Posts: 10
Threads: 1
Joined: Mar 2019
(Mar-30-2019, 07:37 AM)snippsat Wrote: You can make a new list to get some structure; then you can call an individual record in the list.
Brother, how do I call the third field of record 1, 'S174'?