Python Forum
html data cell attribute issue
#1
Hi all,

An attribute that I need to use to identify a <td> contains the keyword 'data'...

for cell in row.find_all('td',data-ending_ = 'RPR'):
SyntaxError: keyword can't be an expression

Is there a way around this?
#2
You need to provide more information.
The URL, and either an XPath or a CSS selector, would be handy.
#3
(May-29-2020, 02:17 AM)Larz60+ Wrote: You need to provide more information.
The URL, and either an XPath or a CSS selector, would be handy.

Thanks for your time.

Here's my table data cell

<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>

Unsurprisingly there are other data cells with this class name. But the other attributes(?), which would uniquely identify this cell, can't seem to be referenced in Python because they have the keyword(?) data in their names...
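
A note for anyone hitting the same error: the word data itself is most likely not the culprit. The hyphen is parsed as a minus sign, so data-ending_ = 'RPR' reads as an expression (data minus ending_) where a keyword name is expected. A minimal sketch that checks this with the built-in compile(), without needing BeautifulSoup; the helper name compiles and the underscore variant data_ending are made up purely for illustration:

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python, else False."""
    try:
        # compile() only checks syntax here; names such as `row` are never looked up.
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Hyphenated keyword: parsed as `data - ending_`, so the call is rejected.
hyphenated = "row.find_all('td', data-ending_='RPR')"

# Same word 'data', but with an underscore: a valid keyword name
# (illustrative only, it would not match the HTML attribute data-ending).
underscored = "row.find_all('td', data_ending='RPR')"

print(compiles(hyphenated))           # False
print(compiles(underscored))          # True
print("data-ending".isidentifier())   # False
print("class_".isidentifier())        # True
```

So any attribute name containing a hyphen runs into this, whether or not it starts with data.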
#4
What is the URL?
#5
delahug Wrote:for cell in row.find_all('td',data-ending_ = 'RPR'):
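You cannot pass data-ending_='RPR' as a keyword argument: the hyphen makes it an expression rather than a valid Python name, so you have to pass a dictionary of attributes to the search instead.
The keyword form does work for the class attribute, as class_="rp-horseTable__spanNarrow".
Quick test: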
from bs4 import BeautifulSoup

html = '''\
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>'''

soup = BeautifulSoup(html, 'lxml')
Usage test:
>>> td_tag = soup.find('td')
>>> td_tag.attrs
{'class': ['rp-horseTable__spanNarrow'],
 'data-ending': 'RPR',
 'data-test-selector': 'full-result-rpr'}

# Search with data-ending
>>> td_tag = soup.find('td', {'data-ending': 'RPR'})
>>> td_tag
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

# Search with class
>>> td_tag = soup.find('td', class_="rp-horseTable__spanNarrow")
>>> td_tag
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

# Get text and attributes
>>> td_tag.text
'85'
>>> td_tag.get('data-ending')
'RPR'
>>> td_tag.get('class')
['rp-horseTable__spanNarrow']
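
Applied to the original loop over rows, the same dictionary search might look like the sketch below. The two-row table, the data-ending="TS" cells and their values are invented for illustration (the real page was never posted), and 'html.parser' is used here instead of 'lxml' only so the example needs no extra parser installed. A CSS attribute selector via select() is shown as an alternative.

```python
from bs4 import BeautifulSoup

# Hypothetical table modelled on the single cell from the thread.
html = """
<table>
  <tr>
    <td class="rp-horseTable__spanNarrow" data-ending="TS">70</td>
    <td class="rp-horseTable__spanNarrow" data-ending="RPR">85</td>
  </tr>
  <tr>
    <td class="rp-horseTable__spanNarrow" data-ending="TS">64</td>
    <td class="rp-horseTable__spanNarrow" data-ending="RPR">79</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Row-by-row search: the hyphenated attribute name goes in the attrs dict.
rpr_values = []
for row in soup.find_all("tr"):
    for cell in row.find_all("td", attrs={"data-ending": "RPR"}):
        rpr_values.append(cell.get_text(strip=True))

# Alternative: a CSS attribute selector matches the same cells in one call.
css_values = [td.get_text(strip=True) for td in soup.select('td[data-ending="RPR"]')]

print(rpr_values)  # ['85', '79']
print(css_values)  # ['85', '79']
```

Both forms select only the cells whose data-ending is RPR, even though every cell shares the same class.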
#6
(May-30-2020, 04:01 PM)snippsat Wrote:

Thumbs Up