Python Forum
Scraping Data from Singapore Turf Club


Scraping Data from Singapore Turf Club - singaporeman - Dec-15-2020

Hi guys, I am trying to acquire Singapore racing data and am currently entering it manually, which is time-consuming to say the least. I am exploring every avenue for automating the data acquisition to make this as convenient as possible. Most basic race meeting data is available as Excel spreadsheets, and I do not mind copying and pasting that data into my CSV file. However, the data I am interested in is in the "T-Chart" section. The website is dynamic, and with my very limited coding knowledge I am struggling to automate the scraping of this data.

I am interested in the early sectional data, the finishing sectional times (in brackets), the peak speed in km/h and the average speed in km/h. To view the peak and average speed you need to click the dropdown menu and pick the highest value in the menu. Here is a link to an example of the results: https://racing.turfclub.com.sg/en/race-results/?date=2011-09-09

I am trying to acquire data from the current date back to September 9, 2011. Any tips or advice on how to automate this would be greatly appreciated.

Thanks guys


RE: Scraping Data from Singapore Turf Club - MrBitPythoner - Dec-15-2020

When you download the data, it comes as an XLS file.
Here is a good module for parsing XLS:

https://pypi.org/project/xlrd/


RE: Scraping Data from Singapore Turf Club - MrBitPythoner - Dec-15-2020

Example Code:

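import xlrd

# Open the downloaded workbook (xlrd reads the legacy .xls format)
book = xlrd.open_workbook("myfile.xls")
print("The number of worksheets is {0}".format(book.nsheets))
print("Worksheet name(s): {0}".format(book.sheet_names()))

# Work with the first worksheet
sh = book.sheet_by_index(0)
print("{0} {1} {2}".format(sh.name, sh.nrows, sh.ncols))

# Rows and columns are zero-indexed, so cell D30 is row 29, column 3
print("Cell D30 is {0}".format(sh.cell_value(rowx=29, colx=3)))

# Print every row as a list of Cell objects
for rx in range(sh.nrows):
    print(sh.row(rx))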
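
For the date range in the original question (today back to September 9, 2011), a rough sketch of how the results-page URLs could be generated is below. It assumes every page follows the same `?date=YYYY-MM-DD` pattern as the example link; the `result_urls` helper and `BASE_URL` constant are hypothetical names made up for this illustration. Many dates will presumably have no race meeting, so whatever fetches these pages would need to cope with empty results. Since the T-Chart section is loaded dynamically, this only covers building the list of pages to visit, not extracting the sectional or speed data.

```python
from datetime import date, timedelta

# Assumption: all results pages share the URL pattern of the example link
# in the original post (?date=YYYY-MM-DD).
BASE_URL = "https://racing.turfclub.com.sg/en/race-results/"


def result_urls(start, end):
    """Yield one results-page URL per calendar day, newest first.

    Hypothetical helper: it does not check whether a meeting was
    actually held on a given day.
    """
    day = end
    while day >= start:
        yield "{0}?date={1}".format(BASE_URL, day.isoformat())
        day -= timedelta(days=1)


if __name__ == "__main__":
    # From today back to September 9, 2011, as asked in the thread.
    urls = list(result_urls(date(2011, 9, 9), date.today()))
    print("{0} candidate pages".format(len(urls)))
    print("Oldest:", urls[-1])
```

Each URL can then be opened one at a time by whatever tool ends up handling the dynamic page.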