Scraping number in % from website

santax · (This post was last modified: Mar-19-2017, 10:49 AM by santax.)

Hi all, for my chess club I'm trying to automate collecting timeout percentages.

The value is hidden in this HTML, in the element <aside>7.69%</aside>:

 <ul class="stats-list no-border">
     <li>
       Winning Streak        <aside>19</aside>
     </li>
             <li>
         Time per Move          <aside>14 hours 15 minutes</aside>
       </li>
                   <li>
         Timeouts          <span class="stats-list-info" tip="Last 3 Months" tip-popup-delay="0"><i class="icon-circle-question"
     
     
         ></i></span>
         <aside>7.69%</aside>
</li>
           <li>
       Glicko RD        <aside>
          73         </aside>
     </li>
             <li>
         Top Opponent          <aside>N/A</aside>
       </li>
         </ul>
 </div>

 <div class="col-md-6">

   <div class="chart-box live">
     <span class="ui-select-search-container">
       <ui-select class="chess-select"
           ng-model="model.selectedOpponent"
           on-select="selectOpponent($item)" ng-cloak>
         <ui-select-match
           placeholder="vs. All Opponents"
           allow-clear="true">
           [[ $select.selected.id ]]
         </ui-select-match>
         <ui-select-choices repeat="opponent in UI.opponents"
           refresh="findOpponents($select.search)"
           refresh-delay="0">
           [[ opponent.id ]]
         </ui-select-choices>
       </ui-select>
     </span>

*******************************************************************************

It's not always a number with decimals, but when it is, I can only collect the two digits after the decimal point, which is a problem.
I need the digits before the decimal point, or the complete number, and it also has to work when the number is 0%, 10% or 100% instead of 24.76%.
The code I have is here:

import sys
import fileinput
import requests
from bs4 import BeautifulSoup
import pandas as dataset
import string
import re
from decimal import *

static_profile_url= REMOVED DUE TO ANTISPAM MEASURES
namen = []
timeouts = []


# Search between string patterns and return the value as a string.
# This extracts the timeout percentage, without the %, from the HTML.
def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start)
        timeout = re.compile(r'(\d+)$').search(s[start:end]).group(1)
        #timeout = (s.split(first))[1].split(last)[0]
        print (timeout)
        return (timeout)
    except ValueError:
        return "error parsing"



def retrieve_timeouts(speler_stats_url):
    try:
        r = requests.get(speler_stats_url)
        soup = BeautifulSoup(r.text, 'lxml')
        #  stats = stat_soup.findAll(class_='stats-list no-border')
        stats = soup.findAll('ul', class_='stats-list no-border')
        timeout_percentage = find_between( str(stats), '<aside>', '%</aside>' )
        print (timeout_percentage)
        return int(timeout_percentage)
    except ValueError:
        return "error parsing"


print('processing, please wait... this may take a long time!')
fnamen = open('namen.txt', 'r')
tnamen = fnamen.read().splitlines()
for naam in tnamen:
    print (naam)
    namen.append(naam)
    # Fetch the page once and reuse the result instead of requesting it twice.
    timeout = retrieve_timeouts(static_profile_url + str(naam))
    timeouts.append(timeout)
    print (timeout)

spelersdata = { 'naam': namen, 'timeout': timeouts }
ds = dataset.DataFrame(spelersdata)
f = open('timouts.csv', 'w')
f.writelines(ds.to_csv())
f.close()

I don't know why it's not working. I'm not used to coding in Python, let alone building scrapers,
so my code is made up of a lot of copy pasta...

Could someone please help me out with this one or point me in the right direction?
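
One possible direction, as a minimal sketch rather than a definitive fix: the pattern (\d+)$ can only match the final run of digits, so for 7.69 it stops at the decimal point and returns 69. In addition, the first <aside> in the list belongs to Winning Streak, not Timeouts. The sketch below uses only the standard re module, anchors on the "Timeouts" label, and allows an optional decimal part. The names TIMEOUT_RE and parse_timeout_percentage are hypothetical, and the sample HTML is a shortened copy of the snippet above.

```python
import re

# Hypothetical pattern, not part of the original script.
# It finds the label "Timeouts", lazily skips ahead to the first
# <aside>...%</aside> that follows, and captures an integer or decimal number.
TIMEOUT_RE = re.compile(
    r"Timeouts.*?<aside>\s*(\d+(?:\.\d+)?)\s*%\s*</aside>",
    re.DOTALL,  # let "." also match newlines, since the HTML spans several lines
)


def parse_timeout_percentage(html):
    """Return the timeout percentage as a float, or None if it is not found."""
    match = TIMEOUT_RE.search(html)
    if match is None:
        return None
    # float() accepts both "10" and "7.69"; int() would reject "7.69".
    return float(match.group(1))


# Shortened version of the HTML from the post, for illustration only.
sample = """
<li>Winning Streak <aside>19</aside></li>
<li>Timeouts <span class="stats-list-info" tip="Last 3 Months"><i class="icon-circle-question"></i></span>
<aside>7.69%</aside></li>
"""

# The original pattern only sees the digits after the decimal point.
print(re.search(r"(\d+)$", "19</aside><aside>7.69").group(1))  # 69

print(parse_timeout_percentage(sample))  # 7.69
```

In retrieve_timeouts this would presumably replace the find_between call, for example parse_timeout_percentage(str(stats)), with the result returned directly instead of being passed through int(). Selecting the Timeouts <li> with BeautifulSoup and reading its <aside> text would likely be more robust than a regex over the whole list, but that variant is not shown here.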
