Python Forum
How can I get the Middle English and Modern English from this page?
#2
(Feb-02-2022, 04:45 AM)Pedroski55 Wrote: That is, all the content of p html tags is missing!
No, look carefully: not all of them are empty. I don't know what the empty P tags are for, but this one contains the data you need. Here is a fragment.
Output:
<p><span style="font-family:'book antiqua', palatino">859 <strong>Whilom, as olde stories tellen us,</strong></span><br /><span style="font-family:'book antiqua', palatino"> Once, as old histories tell us,</span><br /><span style="font-family:'book antiqua', palatino"> 860 <strong>Ther was a duc that highte Theseus;</strong></span><br /><span style="font-family:'book antiqua', palatino"> There was a duke who was called Theseus;</span><br /> ... <span style="font-family:'book antiqua', palatino"> 1000 <strong>But shortly for to telle is myn entente.</strong></span><br /><span style="font-family:'book antiqua', palatino"> But briefly to tell is my intent.</span></p>
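You need to look inside the P tags for SPAN tags with the attribute style="font-family:'book antiqua', palatino". In the fragment, the Middle English line (the "olde storie") is inside the STRONG tag of one span, preceded by its line number, and the Modern English translation is in the next SPAN, which has no STRONG tag. I believe you wanted to skip this.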
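As a rough illustration of that idea, here is a minimal sketch using only the standard library's html.parser. It assumes the page keeps the structure of the fragment above: spans are not nested, and a span either contains a STRONG tag (Middle English) or plain text (Modern English). The names VerseParser, extract_lines, STYLE and SAMPLE are made up for this example, and SAMPLE is just the first two line pairs of the fragment, not the whole page.

```python
from html.parser import HTMLParser

# Style value copied from the fragment; SAMPLE is a shortened copy of it.
STYLE = "font-family:'book antiqua', palatino"
SAMPLE = (
    f'<p><span style="{STYLE}">859 <strong>Whilom, as olde stories tellen us,'
    f'</strong></span><br /><span style="{STYLE}"> Once, as old histories tell us,'
    f'</span><br /><span style="{STYLE}"> 860 <strong>Ther was a duc that highte '
    f'Theseus;</strong></span><br /><span style="{STYLE}"> There was a duke who '
    f'was called Theseus;</span></p>'
)


class VerseParser(HTMLParser):
    """Collect STRONG text (Middle English) and plain span text (Modern English)."""

    def __init__(self):
        super().__init__()
        self.in_span = False      # inside a 'book antiqua' span
        self.in_strong = False    # inside a STRONG tag within such a span
        self.has_strong = False   # current span contained a STRONG tag
        self.text = []            # span text outside STRONG
        self.strong_text = []     # span text inside STRONG
        self.middle = []
        self.modern = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        if tag == "span" and "book antiqua" in style:
            self.in_span = True
            self.has_strong = False
            self.text = []
            self.strong_text = []
        elif tag == "strong" and self.in_span:
            self.in_strong = True
            self.has_strong = True

    def handle_endtag(self, tag):
        if tag == "strong":
            self.in_strong = False
        elif tag == "span" and self.in_span:
            self.in_span = False
            if self.has_strong:
                # The line number sits outside STRONG, so it is dropped here.
                self.middle.append("".join(self.strong_text).strip())
            else:
                line = "".join(self.text).strip()
                if line:
                    self.modern.append(line)

    def handle_data(self, data):
        if self.in_strong:
            self.strong_text.append(data)
        elif self.in_span:
            self.text.append(data)


def extract_lines(html):
    """Return (middle_english_lines, modern_english_lines) from an HTML string."""
    parser = VerseParser()
    parser.feed(html)
    parser.close()
    return parser.middle, parser.modern


middle, modern = extract_lines(SAMPLE)
for old, new in zip(middle, modern):
    print(old, "|", new)
```

On the real page you would pass the downloaded HTML text to extract_lines instead of SAMPLE. If you already use BeautifulSoup, the same test (does the span contain a STRONG tag or not) should carry over; I have not checked this against the full page.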


Messages In This Thread
RE: How can I get the Middle English and Modern English from this page? - by ibreeden - Feb-03-2022, 08:33 AM