Python Forum
Extracting html data using attributes
#11
Check the HTML again and use the correct class.
Reply
#12
Sorry, I'm not sure what you mean. The HTML I posted was cut and pasted from the raw HTML. I checked again and the 'class' is correct.
Reply
#13
This looks like the same problem as in your other thread.
There, the text wasn't found because of the somewhat strange CSS class= naming.
So a CSS selector was used there to find the text; BS can also use CSS selectors, with select() and select_one().
from bs4 import BeautifulSoup

html = '''\
<tbody class="explorer_tradeslist__tbody">
  <tr id="trade_349236564" data-ticket="349236564" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed" style="width: 63px; min-width: 63px;">
      <a id="snap_180400_trade_349236564" class="explorer__anchor explorer__anchor--trade"></a>
      NZD/CAD
    </td>
    <td style="width: 20px; min-width: 20px;"></td>
    <td style="width: 103px; min-width: 103px;">
'''

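# lxml tolerates the truncated snippet and closes the open tags itself
soup = BeautifulSoup(html, 'lxml')
# First <td> with the slidetable__cell class; strip the surrounding whitespace
print(soup.select_one('td.slidetable__cell').text.strip())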
Output:
NZD/CAD
So here I am writing the CSS selector by looking at the tags; as mentioned in the other thread, you can also copy the selector from the browser.
It may be longer, but as long as it works it should be okay.
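To make the difference between select() and select_one() concrete, here is a minimal sketch. It reuses a trimmed copy of the HTML above, closed off with the missing end tags, and uses the built-in 'html.parser' instead of lxml only so that it has one less dependency; the class name no_such_class is made up for the demonstration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from this post, with the closing tags added
html = '''\
<table>
<tbody class="explorer_tradeslist__tbody">
  <tr id="trade_349236564" data-ticket="349236564" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed" style="width: 63px; min-width: 63px;">
      <a id="snap_180400_trade_349236564" class="explorer__anchor explorer__anchor--trade"></a>
      NZD/CAD
    </td>
    <td style="width: 20px; min-width: 20px;"></td>
  </tr>
</tbody>
</table>
'''

# Assumption: 'html.parser' (standard library) is used here instead of lxml
soup = BeautifulSoup(html, 'html.parser')

# select() returns a list of every matching tag (the list may be empty)
cells = soup.select('tr.explorer_tradeslist__row td')
print(len(cells))

# select_one() returns only the first match
first = soup.select_one('td.slidetable__cell')
print(first.get_text(strip=True))

# select_one() returns None when nothing matches (no_such_class is made up)
missing = soup.select_one('td.no_such_class')
print(missing)
```

Because select_one() can return None, it is worth checking the result before calling .text on it when the class name is not certain.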
Reply
#14
Actually, I have just realised what is going on: the HTML content I am looking at is .php content and has a load of '\\' stuff in it! If I strip all that out, the original code seems to work.
Interestingly, if I use your code on the raw HTML, that also produces 'Develop' as the output!
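For anyone hitting the same thing, a rough sketch of the clean-up step, not the exact code used here. It assumes the backslashes are JSON-style escapes in front of quotes and slashes (\" and \/); the real file may be escaped differently. The sample string and the use of 'html.parser' are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample of escaped HTML, as it might appear inside a response body
raw = r'<td class=\"slidetable__cell\">NZD\/CAD<\/td>'

# Strip the backslash in front of quotes and slashes (assumed escape style)
cleaned = raw.replace('\\"', '"').replace('\\/', '/')
print(cleaned)

# With the escapes removed, the class selector can match again
soup = BeautifulSoup(cleaned, 'html.parser')
cell = soup.select_one('td.slidetable__cell')
print(cell.get_text(strip=True))
```

If the escaped HTML turns out to sit inside a JSON string, decoding it with json.loads() would probably be cleaner than replacing characters by hand.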
Reply
#15
(May-04-2020, 12:12 PM)WiPi Wrote: Interestingly, if I use your code on the raw HTML, that also produces 'Develop' as the output!
Clearly the problem is that we don't have access to all of the source data, just the data you posted.
It can of course fail on larger source code, where you may need to give more specific search parameters to get the correct data.
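As a sketch of what a more specific search parameter can look like: with several rows in the table, a loose selector returns whichever cell comes first, while an attribute selector on data-ticket pins down one row. The second row (ticket 349236565, EUR/USD) is invented for the example, and 'html.parser' is used only to keep it dependency-light.

```python
from bs4 import BeautifulSoup

# Two rows; the second one is hypothetical and not from the posted HTML
html = '''\
<table>
<tbody class="explorer_tradeslist__tbody">
  <tr id="trade_349236564" data-ticket="349236564" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed">NZD/CAD</td>
  </tr>
  <tr id="trade_349236565" data-ticket="349236565" class="explorer_tradeslist__row ">
    <td class="slidetable__cell slidetable__cell--fixed">EUR/USD</td>
  </tr>
</tbody>
</table>
'''

soup = BeautifulSoup(html, 'html.parser')

# Loose selector: the first matching cell in the document wins
loose = soup.select_one('td.slidetable__cell')
print(loose.get_text(strip=True))

# Specific selector: restrict the search to the row with a given data-ticket
specific = soup.select_one('tr[data-ticket="349236565"] td.slidetable__cell--fixed')
print(specific.get_text(strip=True))
```

The same idea works with an id selector such as tr#trade_349236564 when the row id is known.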
Reply