 Extract text between bold headlines from HTML
I need to extract text from company transcripts. The files are in HTML format, saved locally on my PC. I need to extract each executive's text. To do this, I would like code that extracts the text following each executive's name (which is in bold). Each executive appears many times in a file, so I would like the text for each executive grouped together.
I have found a solution to a similar problem, but I do not know how to adapt it to my case, as I am really new to Python:

A sample file can be found here:

If anyone could help with this, I would greatly appreciate it.
You can start by looking at the Web-Scraping part-1 tutorial.
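As a rough starting point for the grouping described in the question, here is a minimal sketch that uses only the standard library's html.parser. It assumes each executive's name is wrapped in a <b> or <strong> tag and that everything up to the next bold name belongs to that speaker. The sample file was not available, so the markup, the names (Jane Doe, John Roe), and the helper names SpeakerTextParser and group_text_by_speaker are all made up for illustration. A real transcript will likely need extra filtering, for example to skip bold text that is not a name.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SpeakerTextParser(HTMLParser):
    """Collect the text that follows each bold name, grouped by name."""

    BOLD_TAGS = {"b", "strong"}  # assumption: names use one of these tags

    def __init__(self):
        super().__init__()
        self.in_bold = 0            # nesting depth of bold tags
        self.bold_parts = []        # text pieces of the bold name being read
        self.current = None         # speaker whose text is being collected
        self.by_speaker = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self.in_bold += 1
            if self.in_bold == 1:
                self.bold_parts = []

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS and self.in_bold:
            self.in_bold -= 1
            if self.in_bold == 0:
                # Normalise whitespace and treat the bold text as a speaker name.
                name = " ".join("".join(self.bold_parts).split())
                if name:
                    self.current = name

    def handle_data(self, data):
        if self.in_bold:
            self.bold_parts.append(data)
        elif self.current is not None:
            text = " ".join(data.split())
            if text:
                self.by_speaker[self.current].append(text)


def group_text_by_speaker(html):
    """Return a dict mapping each bold name to all text that followed it."""
    parser = SpeakerTextParser()
    parser.feed(html)
    parser.close()
    return {name: " ".join(parts) for name, parts in parser.by_speaker.items()}


# Made-up sample markup standing in for a transcript file.
sample = """
<p><b>Jane Doe</b></p><p>Welcome to the call.</p>
<p><b>John Roe</b></p><p>Revenue grew.</p><p>Margins held.</p>
<p><b>Jane Doe</b></p><p>Thank you, John.</p>
"""
grouped = group_text_by_speaker(sample)
for name, text in grouped.items():
    print(name, "->", text)
```

For a file saved locally, read it into a string first (for example with open(path, encoding="utf-8").read(), where path is your own file) and pass that string to group_text_by_speaker. BeautifulSoup, which the linked tutorial covers, can do the same job with less code once it is installed.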
