Thread: Scraping Data from Singapore Turf Club (/thread-31497.html), Python Forum (https://python-forum.io), Web Scraping & Web Development
Scraping Data from Singapore Turf Club - singaporeman - Dec-15-2020

Hi guys,

I am trying to acquire Singapore racing data. At the moment I am entering it manually, which is time consuming to say the least, so I am exploring every avenue for automating the data acquisition and making this as convenient as possible.

Most basic race meeting data is available as Excel spreadsheets, and I do not mind copying and pasting that data into my CSV file. However, the data I am interested in is in the "T-Chart" section. The website is dynamic, and with my very limited coding knowledge I am struggling to automate the scraping of this data.

I am interested in the early sectional data, the finishing sectional times (in brackets), the peak speed in km/h and the average speed in km/h. To view the peak and average speed you need to click the dropdown menu and pick the highest value in the menu.

Here is a link to an example of the results: https://racing.turfclub.com.sg/en/race-results/?date=2011-09-09

I am trying to acquire data from the current date back to September 9, 2011.

Any tips or advice on how to automate this would be greatly appreciated.

Thanks guys

RE: Scraping Data from Singapore Turf Club - MrBitPythoner - Dec-15-2020

When you download the data, it comes as XLS. Here is a good module for parsing XLS: https://pypi.org/project/xlrd/

RE: Scraping Data from Singapore Turf Club - MrBitPythoner - Dec-15-2020

Example code:

    import xlrd

    # Open the downloaded workbook (replace with your file's path)
    book = xlrd.open_workbook("myfile.xls")
    print("The number of worksheets is {0}".format(book.nsheets))
    print("Worksheet name(s): {0}".format(book.sheet_names()))

    # Work with the first sheet
    sh = book.sheet_by_index(0)
    print("{0} {1} {2}".format(sh.name, sh.nrows, sh.ncols))
    # rowx and colx are zero-based, so cell D30 is rowx=29, colx=3
    print("Cell D30 is {0}".format(sh.cell_value(rowx=29, colx=3)))

    # Print every row of the sheet
    for rx in range(sh.nrows):
        print(sh.row(rx))
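
For the date range mentioned in the original post (the current date back to September 9, 2011), here is a minimal sketch of how the per-date results URLs could be generated. It assumes every results page follows the same `?date=YYYY-MM-DD` pattern as the example link, which the thread does not confirm. The name `result_urls` is made up for illustration. It only builds the list of candidate pages. It does not download anything and does not handle the dynamic "T-Chart" section or its dropdown, which would probably need a browser automation tool.

```python
from datetime import date, timedelta

# URL pattern copied from the example link in the original post.
# Assumption: every race day uses this same pattern.
BASE_URL = "https://racing.turfclub.com.sg/en/race-results/?date={}"


def result_urls(start, end):
    """Yield one results-page URL per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        # isoformat() gives YYYY-MM-DD, matching the example link
        yield BASE_URL.format(day.isoformat())
        day += timedelta(days=1)


if __name__ == "__main__":
    urls = list(result_urls(date(2011, 9, 9), date.today()))
    print(len(urls), "candidate pages")
    print("First:", urls[0])
    print("Last:", urls[-1])
```

Most calendar days will not have a race meeting, so a real script would need to skip dates where the page has no results.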