 How do I extract specific lines from HTML files before and after a word?
#1
I am trying to extract the 10 lines before and after the word "apple" from a directory (with subdirectories) full of HTML files. I want to write those lines to a CSV file. Ideally, the CSV file will contain two columns: 1) the HTML filename and 2) the 10 lines before and after the word "apple".

I have done the following:

import glob
import collections
import itertools
import sys
import csv

for filepath in glob.glob('**/*.html', recursive=True):
    with open(filepath) as f:
        before = collections.deque(maxlen=10)
        for line in f:
            if 'apple' in line:
                sys.stdout.writelines(before)
                sys.stdout.write(line)
                sys.stdout.writelines(itertools.islice(f, 10))
            break
        results = before.append(line)
        print(results)
I am currently getting a bunch of rows that say "None" in my terminal when I print the results. What is the issue here?
#2
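Why do you expect the "append" method to return a value?
https://docs.python.org/2/library/collec...que.append
The documentation says nothing about a return value. When a function does not explicitly return a result, Python returns None. The "append" method modifies the deque in place, so "results" is always None and that is what gets printed.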
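For what it's worth, here is a minimal sketch of how the loop could be restructured so that the deque is updated in place and the matches end up in a CSV file. The function names `context_blocks` and `write_matches`, the output filename `matches.csv`, the column headers, and the UTF-8 encoding are all assumptions for illustration, not anything from the original post. Note also that, as posted, the `break` sits directly inside the `for` loop, so the loop stops after the first line of each file; the sketch drops it.

```python
import collections
import csv
import glob
import itertools


def context_blocks(lines, word="apple", n=10):
    """Yield one string per match: n lines before, the matching line, n lines after."""
    lines = iter(lines)
    before = collections.deque(maxlen=n)  # keeps only the last n lines seen
    for line in lines:
        if word in line:
            # islice consumes the next n lines from the same iterator,
            # so a second match inside those lines is not reported separately.
            after = list(itertools.islice(lines, n))
            yield "".join([*before, line, *after])
            before.append(line)
            before.extend(after)
        else:
            # append() changes the deque in place and returns None,
            # so its result is deliberately not assigned to anything.
            before.append(line)


def write_matches(pattern="**/*.html", out_path="matches.csv", word="apple", n=10):
    """Write (filename, context) rows for every match under the current directory."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["filename", "context"])  # hypothetical column names
        for filepath in glob.glob(pattern, recursive=True):
            # Assumes the HTML files are UTF-8; undecodable bytes are replaced.
            with open(filepath, encoding="utf-8", errors="replace") as f:
                for block in context_blocks(f, word, n):
                    writer.writerow([filepath, block])
```

Calling `write_matches()` from the top-level directory should produce one CSV row per match, with the multi-line context quoted inside a single cell. If overlapping matches need to be reported separately, the file would have to be read into a list and sliced by index instead.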