Sep-18-2021, 10:25 AM
Hi all,
I'm not an expert at coding in Python, but I have used it a few times to write things like the following example code, which is meant to extract some information from Yahoo Finance via scraping.
The code is:
    from bs4 import BeautifulSoup
    import requests as req
    import re
    import urllib3
    import sys

    # Parse input command line parameters.
    if (len(sys.argv) > 1):
        sStockTicker = sys.argv[1]
    else:
        # TODO: Change this to uncomment the exit when debug is finished
        print('Invalid usage. A stock ticker must be passed as parameter.')
        sStockTicker = 'FMG.AX'
        #sys.exit(2)

    url = 'https://finance.yahoo.com/quote/' + sStockTicker
    req = urllib3.PoolManager()
    res = req.request("GET", url)
    soup3 = BeautifulSoup(res.data, 'lxml')
    print(soup3.find(id="quote-header-info").contents[2].contents[0].contents[0].contents[0].text)

Depending on which stock ticker I pass as the parameter, the code fetches data with the same content as the web page I see when I open the same URL in a web browser such as Chrome or Firefox, and then extracts the stock price. It does not always do this, though. If I run the script for "IBM", it works fine. If I run it for "FMG.AX", it works fine. However, if I run it for "IOZ.AX", it fails. If I paste the same URL into the web browser, it loads perfectly fine and shows the expected results.
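To narrow down where the chained `.contents` lookup in the script breaks for a failing ticker, a small helper along these lines might help. It is only a sketch: `descend` is a hypothetical name, and the stand-in nodes below replace real BeautifulSoup tags so that it runs without a network call. It assumes only that each node exposes a list-like `.contents`, as BeautifulSoup tags do.

```python
from types import SimpleNamespace


def descend(node, path):
    """Follow a chain of .contents indexes and report where it breaks.

    Returns (node, None) on success or (None, message) on failure.
    """
    if node is None:
        return None, "starting element not found (find() returned None)"
    for depth, index in enumerate(path):
        children = getattr(node, "contents", None)
        if children is None or index >= len(children):
            return None, f"no child at index {index} (step {depth} of the path)"
        node = children[index]
    return node, None


# Stand-in nodes for illustration only. With BeautifulSoup, `root` would be
# soup3.find(id="quote-header-info") and the path would be [2, 0, 0, 0].
price = SimpleNamespace(text="123.45", contents=[])
root = SimpleNamespace(contents=[SimpleNamespace(contents=[price])])

found, error = descend(root, [0, 0])
print(found.text if found else error)

missing, error = descend(root, [2, 0])
print(missing.text if missing else error)
```

If the helper reports that the starting element is missing, the page returned to the script probably differs from the one the browser shows. If it fails deeper in the path, the page layout for that ticker may simply be different.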
My interpretation of the issue is that the code is pulling back a different result via urllib3 than I get from the browser, and I'm a little baffled as to why. I assume the web server somehow notices, based on the values in the request, that it is being called from Python, and then returns a different response.
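One way I could test this assumption is to send the same request twice, once with the default headers and once with a browser-like User-Agent header, and compare what comes back. The sketch below uses the standard library's `urllib.request` instead of urllib3 so that it is self-contained. `build_request`, `fetch_summary` and the `BROWSER_UA` value are hypothetical names and values chosen for illustration, and whether the server actually keys on this header is unconfirmed.

```python
import urllib.error
import urllib.request

# Assumed browser-like value, for illustration only; the User-Agent string
# of any real browser could be substituted.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/93.0 Safari/537.36"
)


def build_request(ticker, user_agent=None):
    """Build (but do not send) a GET request for a ticker's quote page."""
    url = "https://finance.yahoo.com/quote/" + ticker
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, headers=headers)


def fetch_summary(request):
    """Send the request and return (status, body length). Needs network."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses; report them instead of crashing.
        return err.code, len(err.read())


plain = build_request("IOZ.AX")
browser_like = build_request("IOZ.AX", BROWSER_UA)
print(plain.get_full_url())
# Request stores header names capitalized as "User-agent".
print(browser_like.get_header("User-agent"))
```

Calling `fetch_summary(plain)` and `fetch_summary(browser_like)` and comparing the status codes and body lengths should show whether the header makes a difference. With urllib3, the same header dictionary can be passed through the `headers` argument of `PoolManager.request`.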
Does anyone know why this is occurring and how I can work around it? Is it actually caused by what I described above? I am very interested in this intellectually, as well as in resolving the issue in the script. I find it very odd that a server would purposefully respond in different ways like this (if that is indeed the cause).
TIA!
Steve.