Python Forum

Extracting data from a website
Hello,

I have a very basic problem and would be happy if someone could help me out.

I would like to extract price data of a stock from a website. So far I have used the following code to retrieve the data from the website.

import pandas as pd

url = 'https://www.ariva.de/varta-aktie/kurs'
df = pd.read_html(url)
print(df)
Output:
[    0    1                                                2    3    4
0  NaN  NaN  X-DAX 14.064 +0,47% Dow Jones 31.056 +0,05% ...  NaN  NaN,
          0       1       2           3         4       5
0     X-DAX  14.064  +0,47%   Dow Jones    31.056  +0,05%
1  L-TecDAX   3.281  +0,79%  Dollarkurs  1222.000  -0,35%,
          0    1       2        3
0       NaN  NaN     NaN      NaN
1  124,20 €  NaN  -1,51%  -1,90 €,
        Handelsplatz   Letzter  Unnamed: 2  Änderung  Änderung.1    Vortag  \
 0         Tradegate  124,20 €         NaN    -1,51%         NaN  126,10 €
 1            Gettex  124,50 €         NaN    -1,66%         NaN  126,60 €
 2           Quotrix  124,40 €         NaN    -1,58%         NaN  126,40 €
 3            L&S RT  124,45 €         NaN    -1,27%         NaN  126,05 €
 4   HypoVereinsbank  122,05 €         NaN    -3,21%         NaN  126,10 €
 5             Xetra  126,90 €         NaN    -0,08%         NaN  127,00 €
 6         Stuttgart  124,30 €         NaN    -1,43%         NaN  126,10 €
 7         Frankfurt  121,00 €         NaN    -4,20%         NaN  126,30 €
 8           Hamburg  121,00 €         NaN    -4,35%         NaN  126,50 €
 9           München  125,80 €         NaN    -0,32%         NaN  126,20 €
10            Berlin  125,80 €         NaN    -0,16%         NaN  126,00 €
11        Düsseldorf  125,40 €         NaN    -0,24%         NaN  125,70 €
12  Nasdaq OTC Other  146,34 $         NaN    +8,40%         NaN  135,00 $
13          Hannover  128,50 €         NaN    +2,23%         NaN  125,70 €
14              Wien  126,50 €         NaN    +1,36%         NaN  124,80 €
Now I want to create a variable that holds only the first price for "Xetra" (= 126,90 €) and nothing else, but I do not know how to reference that value in the table.

Any help would be highly appreciated!

Many thanks,

Tim
Something like this.
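>>> import pandas as pd
>>>
>>> url = 'https://www.ariva.de/varta-aktie/kurs'
>>> # read_html returns a list with one DataFrame per table found on the page
>>> tables = pd.read_html(url)
>>> # Get right table from html (the exchange table is at index 3)
>>> df = tables[3]
>>> # Row 5 is the Xetra row; the double brackets keep the result a DataFrame
>>> df_new = df.iloc[[5]]
>>> df_new
  Handelsplatz   Letzter  Unnamed: 2  ...      Zeit  Unnamed: 10  Unnamed: 11
5        Xetra  126,90 €         NaN  ...  08.01.21          NaN  Hist. Kurse

[1 rows x 12 columns]

>>> # Take the single cell in the 'Letzter' (last price) column
>>> value = df_new['Letzter'].values[0]
>>> value
'126,90\xa0€'
>>> print(value)
126,90 €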
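Picking row 5 by position only works as long as the site keeps the exchanges in the same order. A possible alternative, sketched here under assumptions, is to select the row by its name in the "Handelsplatz" column and then turn the text into a number. The small DataFrame below is a stand-in built from the values in the output above so the example runs without network access, and `last_price` and `to_float` are hypothetical helper names, not part of pandas.

```python
import pandas as pd

# Stand-in for pd.read_html(url)[3]; values copied from the output above.
# The price cells contain a non-breaking space (\xa0) before the currency sign.
df = pd.DataFrame({
    "Handelsplatz": ["Tradegate", "Gettex", "Xetra"],
    "Letzter": ["124,20\xa0€", "124,50\xa0€", "126,90\xa0€"],
})


def last_price(table, exchange):
    """Return the 'Letzter' cell of the first row whose 'Handelsplatz' equals exchange."""
    matches = table.loc[table["Handelsplatz"] == exchange, "Letzter"]
    if matches.empty:
        raise KeyError(f"no row for exchange {exchange!r}")
    return matches.iloc[0]


def to_float(price_text):
    """Convert text such as '126,90\xa0€' (decimal comma, dot as thousands separator) to a float."""
    # Normalise the non-breaking space, then keep only the numeric part.
    number = price_text.replace("\xa0", " ").split()[0]
    return float(number.replace(".", "").replace(",", "."))


xetra_text = last_price(df, "Xetra")
xetra_price = to_float(xetra_text)
print(xetra_text, xetra_price)
```

With the live page you would pass the table from `pd.read_html(url)[3]` instead of the stand-in, assuming the table index and the column names are still the same as in the output above.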
(Jan-08-2021, 11:14 PM) snippsat wrote: Something like this.

Many thanks snippsat, really helpful to know!