 Extract text between bold headlines from HTML
I need to extract text from company transcripts. The files are in HTML format, saved locally on my PC. What I need is each executive's text. To get it, I would like code that extracts the text following each executive's name (which is in bold). Each executive appears many times in the file, so I would like the text for each executive grouped together.
I have found a solution to a similar problem, but I do not know how to adapt it to my case as I am really new to Python:

A sample file can be found here:

If anyone could help with this, I would greatly appreciate it.
You can start by looking at Web-Scraping part-1.
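For the grouping itself, here is a minimal sketch of one possible approach. The sample file is not visible here, so it assumes that each executive's name is wrapped in a `<b>` or `<strong>` tag and that everything up to the next bold name belongs to that executive. It uses only the standard library's `html.parser`; the names `SpeakerParser`, `extract_by_speaker`, and the sample transcript are made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SpeakerParser(HTMLParser):
    """Collect the text that follows each bold name, grouped by name.

    Assumption: every <b>/<strong> element is a speaker name.
    """

    BOLD_TAGS = {"b", "strong"}

    def __init__(self):
        super().__init__()
        self.in_bold = False      # True while inside a bold element
        self.name_parts = []      # pieces of the name currently being read
        self.current = None       # speaker whose text is being collected
        self.by_speaker = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self.in_bold = True
            self.name_parts = []

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS and self.in_bold:
            self.in_bold = False
            # Normalise whitespace so "Jane  Doe" and "Jane Doe" match.
            name = " ".join("".join(self.name_parts).split())
            if name:
                self.current = name

    def handle_data(self, data):
        if self.in_bold:
            self.name_parts.append(data)
            return
        text = " ".join(data.split())
        # Text before the first bold name has no speaker and is skipped.
        if text and self.current is not None:
            self.by_speaker[self.current].append(text)


def extract_by_speaker(html):
    """Return a dict mapping each bold name to all of its text, joined."""
    parser = SpeakerParser()
    parser.feed(html)
    parser.close()
    return {name: " ".join(chunks) for name, chunks in parser.by_speaker.items()}


# Made-up transcript fragment, only to show the grouping.
sample = """
<p><strong>Jane Doe</strong></p>
<p>Revenue grew this quarter.</p>
<p><strong>John Roe</strong></p>
<p>Costs were flat.</p>
<p><strong>Jane Doe</strong></p>
<p>We expect that to continue.</p>
"""

grouped = extract_by_speaker(sample)
for name, text in grouped.items():
    print(f"{name}: {text}")
```

For a local file, read it into a string first (for example with `pathlib.Path(...).read_text(encoding="utf-8")`) and pass that string to `extract_by_speaker`. If the transcripts also use bold for section headings, those will be picked up as speakers too, so you would need to filter them out, for instance against a list of known executive names.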
