Python Forum
How to get data from a web page
#1
import requests
import urllib.request
from bs4 import BeautifulSoup
from flask import Flask

with requests.session()as c:
    url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
    vesselName='WAN+HAI+102'
    ETBFrom='29-03-2019'
    ETBTo='11-04-2019'
    query='%E6%90%9C%E7%B4%A2'
    c.get(url)


    new_url = url+"?vesselName="+vesselName+"&ETBFrom="+ETBFrom+"&ETBTo="+ETBTo+"&query="+query

    print(new_url)

    r = urllib.request.urlopen(new_url)
    soup = BeautifulSoup(r,"html.parser")
I am a beginner. Can you help? What is the next step to get the data from the web page below?
https://cplus.hit.com.hk/enquiry/vesselS...C%E7%B4%A2
Reply
#2
Looks like you're already getting data from that url.
Are you getting errors?
Reply
#3
You are not actually using the requests session, and when you have Requests there is no need for urllib.
Here is a version with the unused imports removed that, as an example, gets the first row of data from the URL.
import requests
from bs4 import BeautifulSoup

url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-04-2019'
query='%E6%90%9C%E7%B4%A2'

#new_url = url+"?vesselName="+vesselName+"&ETBFrom="+ETBFrom+"&ETBTo="+ETBTo+"&query="+query
# Better
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find('tr', class_="nobgcolorvalue")
for item in vessel.find_all('td'):
    print(item.text.strip())
Output:
1 WAN HAI 102 S174 HIT4 2019-04-06 2019-04-07
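As a side note, the query string does not have to be escaped by hand. Below is a minimal sketch using only the standard library; the params dictionary is my own naming, and the plain-text values are simply the decoded forms of the strings used above. urlencode() turns spaces into + and percent-encodes non-ASCII text as UTF-8, so it should rebuild the same URL.

```python
from urllib.parse import urlencode, unquote_plus

url = "https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"

# Human-readable values; urlencode() takes care of the escaping.
params = {
    "vesselName": "WAN HAI 102",
    "ETBFrom": "29-03-2019",
    "ETBTo": "11-04-2019",
    # Decoded form of the '%E6%90%9C%E7%B4%A2' value used in the thread.
    "query": unquote_plus("%E6%90%9C%E7%B4%A2"),
}

# Join the base URL and the encoded query string.
new_url = f"{url}?{urlencode(params)}"
print(new_url)
```

With Requests, the same dictionary can be passed directly as requests.get(url, params=params), which builds the query string itself.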
Reply
#4
Thanks. How do I get record 2 through the last record?

Many thanks.
Reply
#5
(Mar-29-2019, 04:47 PM)yimchiwai Wrote: Thanks. How do I get record 2 through the last record?
You have to inspect the page. It is normal to use the Chrome/Firefox developer tools to look for the values you need.
For example, one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
Web-Scraping part-1
Reply
#6
(Mar-29-2019, 06:07 PM)snippsat Wrote: For example, one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
soup.find only gets the first row. Do I need to use soup.find_all?
Reply
#7
(Mar-29-2019, 06:29 PM)yimchiwai Wrote: soup.find only gets the first row. Do I need to use soup.find_all?
You have to try it yourself.
For example, to get both:
vessel = soup.find_all('td', class_="body")[1]
for item in vessel.find_all('td')[6:-3]:
    print(item.text.strip())
Output:
1 WAN HAI 102 S174 HIT4 2019-04-06 2019-04-07 2 WAN HAI 102 W174 HIT4 2019-04-06 2019-04-07
Reply
#8
import requests
from bs4 import BeautifulSoup
 
url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-05-2019'
query='%E6%90%9C%E7%B4%A2'
 
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find_all('td', class_="body")[1]
record=[]
for item in vessel.find_all('td')[6:-3]:
    record.append(item.text.strip())

for i in range(0,len(record),6):
    print(record[i:i+6])
Output:
['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07']
['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07']
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21']
['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05']
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']

Thanks so much!

Any suggestions?
Reply
#9
(Mar-30-2019, 05:06 AM)yimchiwai Wrote: any suggestion?
You can build a new list of lists to give the data some structure; then individual records can be accessed by index.
new_record = []
for i in range(0,len(record),6):
    new_record.append(record[i:i+6])
Test:
>>> new_record
[['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07'],
 ['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07'],
 ['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21'],
 ['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21'],
 ['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05'],
 ['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']]
>>> new_record[2]
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
>>> new_record[5]
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']
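If more structure is wanted, one option is to give the six fields names. This is only a sketch: the field names below (no, vessel, voyage, berth, start, end) are my own guesses for illustration, since the thread never says what each column means, and the short record list stands in for the one scraped earlier.

```python
from collections import namedtuple

# Field names are assumptions for illustration; the thread does not name the columns.
Schedule = namedtuple("Schedule", ["no", "vessel", "voyage", "berth", "start", "end"])

# Stand-in for the flat list built from the scraped <td> cells.
record = [
    '1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07',
    '2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07',
]

# Take six cells at a time and unpack them into one named record.
schedules = [Schedule(*record[i:i + 6]) for i in range(0, len(record), 6)]

# Fields can now be read by name instead of by position.
print(schedules[1].voyage)
```

A namedtuple still supports positional indexing, so existing code that uses numeric indexes keeps working.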
Reply
#10
(Mar-30-2019, 07:37 AM)snippsat Wrote: You can build a new list of lists to give the data some structure; then individual records can be accessed by index.

How do I access the third field of record 1 (S174)?
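For reference, a minimal sketch of one way to do this, assuming new_record is the list of lists built in post #9 (shortened to two records here): index once to pick the record, then again to pick the field. Python indexes start at 0, so record 1 is index 0 and the third field is index 2.

```python
# Shortened copy of the nested list from post #9.
new_record = [
    ['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07'],
    ['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07'],
]

# Step by step: first pick the record, then the field inside it.
first = new_record[0]
print(first[2])

# The same lookup in a single expression.
print(new_record[0][2])
```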
Reply

