Python Forum
Extracting html data using attributes
#11
Check html again and use the correct class
Reply
#12
sorry not sure what you mean - the html I posted was a cut and paste from the raw html. I checked again and the 'class' is correct.
Reply
#13
This looks like the same problem as in your other thread.
There, the text was not found because of the somewhat strange CSS class naming.
So a CSS selector was used there to find the text; BS can also use CSS selectors, with select() and select_one().
from bs4 import BeautifulSoup

html = '''\
<tbody class="explorer_tradeslist__tbody">
  <tr id="trade_349236564" data-ticket="349236564" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed" style="width: 63px; min-width: 63px;">
      <a id="snap_180400_trade_349236564" class="explorer__anchor explorer__anchor--trade"></a>
      NZD/CAD
    </td>
    <td style="width: 20px; min-width: 20px;"></td>
    <td style="width: 103px; min-width: 103px;">
'''

soup = BeautifulSoup(html, 'lxml')
print(soup.select_one('td.slidetable__cell').text.strip()) 
Output:
NZD/CAD
So here I am writing the CSS selector by looking at the tags; as mentioned in the other thread, you can also copy a selector from the browser.
It may be longer, but as long as it works it should be okay.
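As a rough sketch of the difference between the two methods: select_one() returns only the first match, while select() returns a list of all matches. The second row below is made up for illustration, and 'html.parser' is used only so the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second row (EUR/USD) is invented for illustration
html = '''\
<table>
<tbody class="explorer_tradeslist__tbody">
  <tr id="trade_349236564" data-ticket="349236564" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed">NZD/CAD</td>
  </tr>
  <tr id="trade_1" data-ticket="1" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed">EUR/USD</td>
  </tr>
</tbody>
</table>
'''

soup = BeautifulSoup(html, 'html.parser')

# select_one() gives the first matching tag (or None if nothing matches)
first = soup.select_one('td.slidetable__cell').text.strip()

# select() gives a list of every matching tag
all_cells = [td.text.strip() for td in soup.select('td.slidetable__cell')]

print(first)
print(all_cells)
```

If the page has many rows, looping over select() is usually the easier way to collect every cell.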
Reply
#14
Actually, I have just realised what is going on: the html content I am looking at is .php content and has a load of '\\' stuff in it! If I strip all that out, the original code seems to work.
Interestingly, if I use your code on the raw html, that also produces 'Develop' as the output!
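For anyone hitting the same thing, here is a minimal sketch of the stripping step. It assumes the '\\' stuff is backslash-escaped quotes and slashes, which is only a guess about the raw content; the sample string is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical raw content: quotes and slashes arrive backslash-escaped
raw = r'<td class=\"slidetable__cell slidetable__cell--fixed\">NZD\/CAD<\/td>'

# Drop the escaping backslashes before handing the text to the parser
cleaned = raw.replace('\\"', '"').replace('\\/', '/')

soup = BeautifulSoup(cleaned, 'html.parser')
cell = soup.select_one('td.slidetable__cell')

# Guard against no match so a failed lookup does not raise AttributeError
result = cell.text.strip() if cell else None
print(result)
```

If the real content uses other escape sequences, the replace() calls would need adjusting to match.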
Reply
#15
(May-04-2020, 12:12 PM)WiPi Wrote: Interestingly, if I use your code on the raw html, that also produces 'Develop' as the output!
Clearly the problem is that we don't have access to all of the source data, just the data you posted.
It can of course fail on larger source code, where you may need to give more specific search parameters to get the correct data.
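One possible way to make the search more specific is to scope the selector by an attribute of the row, for example data-ticket. This is only a sketch: the second row and its ticket number are invented, and 'html.parser' is used to keep it free of the lxml dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second row and ticket 349236565 are invented
html = '''\
<table>
  <tr id="trade_349236564" data-ticket="349236564">
    <td class="slidetable__cell slidetable__cell--fixed">NZD/CAD</td>
  </tr>
  <tr id="trade_349236565" data-ticket="349236565">
    <td class="slidetable__cell slidetable__cell--fixed">EUR/USD</td>
  </tr>
</table>
'''

soup = BeautifulSoup(html, 'html.parser')

# A broad selector just returns whichever cell comes first in the document
broad = soup.select_one('td.slidetable__cell').text.strip()

# An attribute selector on the row narrows the search to one trade
specific = soup.select_one(
    'tr[data-ticket="349236565"] td.slidetable__cell--fixed'
).text.strip()

print(broad)
print(specific)
```

The same idea works with the id attribute, e.g. a selector starting with tr#trade_349236565.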
Reply

