Python Forum
web scrap currency
#1
Good day,
Please help me.
I want to get currency rate data from our local site: http://nationalbank.kz/?furl=cursFull&switch=rus
I can't get the data inside the "td" tags: the currency name, the currency pair, and the rate.
I can't filter the cells by class or by any other tag.
Can anyone tell me how to extract that data, preferably as separate fields?
#2
We are glad to help, but no one will write the code for you.
Post what you have tried in Python code tags (you can find help here). Also include the actual result you get vs. desired result.
#3
Nobody has to write it for me; that is my task.
Maybe someone can point me in the right direction?
I tried a lambda function, class2_2 = class2[0].get_text(), and a simple loop like for ... in ...: print

Part of the actual output:
<td align="center" class="gen7"></td>
<td align="left" class="gen7">

1 ДОЛЛАР США

</td>
<td align="center" class="gen7">USD / KZT</td>
<td align="center" class="gen7">330.78</td>
<td align="left" class="gen7" valign="middle" width="10">
</td>
<td align="center" class="gen7"></td>
<td align="left" class="gen7">

1 ЕВРО

</td>
<td align="center" class="gen7">EUR / KZT</td>
<td align="center" class="gen7">387.31</td>
<td align="left" class="gen7" valign="middle" width="10">
</td>

Desired result:
1 ДОЛЛАР США USD / KZT 330.78
#4
(Jun-04-2018, 08:07 AM)ikeen Wrote: Nobody has to write it for me; that is my task.
Maybe someone can point me in the right direction?
I tried a lambda function, class2_2 = class2[0].get_text(), and a simple loop like for ... in ...: print

As @j.crater said, we need to see your code in Python tags.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#5
from urllib.request import urlopen
from urllib.error import HTTPError  # needed by the except clause in getTitle
from bs4 import BeautifulSoup


def getTitle(url):
    """Return the first <h1> in the page body, or None if it cannot be read."""
    try:
        html = urlopen(url)
    except HTTPError as e:
        print(e)
        return None
    try:
        obj = BeautifulSoup(html.read(), "html.parser")
        title = obj.body.h1
    except AttributeError:
        return None
    return title


html = urlopen("http://nationalbank.kz/?furl=cursFull&switch=rus")
bsobj = BeautifulSoup(html, "html.parser")

class1 = bsobj.select("div h1")
doc = class1[0].get_text()  # date of the rates
print(doc.replace('\t', ''))

# toc = table of courses (rates)
cur_search = {'USD / KZT', 'EUR / KZT'}
class2 = bsobj.find_all("td", {"class": "gen7"})
for toc in class2:
    # Compare the text of each cell, not the whole result list.
    if toc.get_text().strip() in cur_search:
        print(toc)
#6
import requests
from bs4 import BeautifulSoup
url = 'http://nationalbank.kz/?furl=cursFull&switch=rus'

resp = requests.get(url)
soup = BeautifulSoup(resp.text, 'html.parser')
tbl = soup.find('table', {'class':'gen4'})
for tr in tbl.find_all('tr'):
    my_data = tr.find_all('td', {'class':'gen7'})
    print(','.join(td.text.strip() for td in my_data))
Output:
1 АВСТРАЛИЙСКИЙ ДОЛЛАР,AUD / KZT,249.54,,
1 АЗЕРБАЙДЖАНСКИЙ МАНАТ,AZN / KZT,195.38,,
10 АРМЯНСКИЙ ДРАМ,AMD / KZT,6.86,,
1 БЕЛОРУССКИЙ РУБЛЬ,BYN / KZT,165.14,,
1 БРАЗИЛЬСКИЙ РЕАЛ,BRL / KZT,88.86,,
10 ВЕНГЕРСКИХ ФОРИНТОВ,HUF / KZT,12.12,,
1 ГОНКОНГСКИЙ ДОЛЛАР,HKD / KZT,42.16,,
1 ГРУЗИНСКИЙ ЛАРИ,GEL / KZT,133.92,,
1 ДАТСКАЯ КРОНА,DKK / KZT,52.05,,
1 ДИРХАМ ОАЭ,AED / KZT,90.06,,
1 ДОЛЛАР США,USD / KZT,330.78,,
1 ЕВРО,EUR / KZT,387.31,,
1 ИНДИЙСКАЯ РУПИЯ,INR / KZT,4.93,,
1000 ИРАНСКИЙ РИАЛ,IRR / KZT,7.85,,
1 КАНАДСКИЙ ДОЛЛАР,CAD / KZT,255.63,,
1 КИТАЙСКИЙ ЮАНЬ,CNY / KZT,51.56,,
1 КУВЕЙТСКИЙ ДИНАР,KWD / KZT,1094.94,,
1 КЫРГЫЗСКИЙ СОМ,KGS / KZT,4.84,,
1 МАЛАЗИЙСКИЙ РИНГГИТ,MYR / KZT,83.17,,
1 МЕКСИКАНСКИЙ ПЕСО,MXN / KZT,16.64,,
1 МОЛДАВСКИЙ ЛЕЙ,MDL / KZT,19.71,,
1 НОРВЕЖСКАЯ КРОНА,NOK / KZT,40.64,,
1 ПОЛЬСКИЙ ЗЛОТЫЙ,PLN / KZT,89.8,,
1 РИЯЛ САУДОВСКОЙ АРАВИИ,SAR / KZT,88.2,,
1 РОССИЙСКИЙ РУБЛЬ,RUB / KZT,5.32,,
1 СДР,XDR / KZT,468.62,,
1 СИНГАПУРСКИЙ ДОЛЛАР,SGD / KZT,247.37,,
1 ТАДЖИКСКИЙ СОМОНИ,TJS / KZT,36.7,,
1 ТАЙСКИЙ БАТ,THB / KZT,10.34,,
1 ТУРЕЦКАЯ ЛИРА,TRY / KZT,71.8,,
100 УЗБЕКСКИХ СУМОВ,UZS / KZT,4.14,,
1 УКРАИНСКАЯ ГРИВНА,UAH / KZT,12.67,,
1 ФУНТ СТЕРЛИНГОВ СОЕДИНЕННОГО КОРОЛЕВСТВА,GBP / KZT,440.24,,
1 ЧЕШСКАЯ КРОНА,CZK / KZT,15.01,,
1 ШВЕДСКАЯ КРОНА,SEK / KZT,37.68,,
1 ШВЕЙЦАРСКИЙ ФРАНК,CHF / KZT,335.99,,
1 ЮЖНО-АФРИКАНСКИЙ РАНД,ZAR / KZT,26.13,,
100 ЮЖНО-КОРЕЙСКИХ ВОН,KRW / KZT,30.8,,
1 ЯПОНСКАЯ ЙЕНА,JPY / KZT,3.03,,
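
Since the question asked for the data "separately by fields", here is a minimal sketch of one way to go from the comma-joined rows above to named fields. It assumes each rate row has exactly three non-empty gen7 cells (name, pair, rate), which is what the output above suggests but is not confirmed against the live page. The names Rate, parse_row and extract_rates, and the SAMPLE markup (including its tr and table wrappers), are made up for illustration and are not part of the site or of bs4.

```python
from typing import List, NamedTuple, Optional


class Rate(NamedTuple):
    name: str    # e.g. "1 ДОЛЛАР США"
    pair: str    # e.g. "USD / KZT"
    rate: float  # e.g. 330.78


def parse_row(cells: List[str]) -> Optional[Rate]:
    """Turn the cell texts of one table row into a Rate.

    Returns None for rows that do not hold exactly three non-empty
    cells or whose last cell is not a number (assumed header rows).
    """
    values = [c.strip() for c in cells if c.strip()]
    if len(values) != 3:
        return None
    name, pair, rate = values
    try:
        return Rate(name, pair, float(rate))
    except ValueError:
        return None


def extract_rates(html: str) -> List[Rate]:
    """Collect a Rate for every row that parse_row accepts."""
    # Imported here so parse_row can be used without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    rates = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text() for td in tr.find_all("td", class_="gen7")]
        rate = parse_row(cells)
        if rate is not None:
            rates.append(rate)
    return rates


if __name__ == "__main__":
    # Hypothetical sample shaped like the cells quoted in post #3;
    # the <table> and <tr> wrappers are an assumption.
    SAMPLE = """
    <table class="gen4">
      <tr><td class="gen7"></td><td class="gen7">
        1 ДОЛЛАР США
      </td><td class="gen7">USD / KZT</td>
      <td class="gen7">330.78</td><td class="gen7"></td></tr>
      <tr><td class="gen7"></td><td class="gen7">
        1 ЕВРО
      </td><td class="gen7">EUR / KZT</td>
      <td class="gen7">387.31</td><td class="gen7"></td></tr>
    </table>
    """
    wanted = {"USD / KZT", "EUR / KZT"}
    for r in extract_rates(SAMPLE):
        if r.pair in wanted:
            print(r.name, r.pair, r.rate)
```

Dropping the empty cells before unpacking means the sketch does not care whether the blank td sits at the start or the end of a row. For real money calculations, decimal.Decimal would probably be a safer choice than float.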
#7
Thank you very much!
Your solution is much better than my attempts, and now I can move forward.