Feb-27-2018, 03:45 AM
(This post was last modified: Feb-27-2018, 03:45 AM by digitalmatic7.)
(Feb-27-2018, 03:16 AM)snippsat Wrote: Why are you trying to parse CSS comments?
Can do it like this: first find the <style> tag, then write a regex for the comments.
import requests
from bs4 import BeautifulSoup
import re
from pprint import pprint

scrape = requests.get('http://www.seacoastonline.com/news/20171113/lets-not-let-politics-divide-us')
html = scrape.content
soup = BeautifulSoup(html, 'lxml')
style = soup.find('style')
css_comments = re.findall(r'\/\*(.*)\*\/', str(style))
pprint(css_comments)
Output: ['houzz page', 'legacy-header', '==== ARTICLE ======', 'story strip article ad', ' cssUpdates branch', ' cssUpdates branch', ' Buzz widget ', ' TERMS OF SERVICE LINK - under viafoura comments submit button ', ' TOUT MID ARTICLE PLAYER ', ' MOBILE article story stack ', ' margin: 0 3vw 0 0; ']
Thanks! This method works for me.
I should have done a better job of explaining what I'm trying to do in my OP (my bad). I just need to scan the entire page source, including CSS comments, for a keyword (in this case "houzz") and take an action if it's found.
I had a script that was working for lots of keywords, but since this specific keyword is located in CSS comments, it didn't work.
Here's the working code if anyone comes across this thread and needs it:
import requests
from bs4 import BeautifulSoup
import re

# Fetch the page with a browser-like user-agent header
scrape = requests.get(
    'http://www.seacoastonline.com/news/20171113/lets-not-let-politics-divide-us',
    headers={"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"})
html = scrape.content
soup = BeautifulSoup(html, 'lxml')

# Search the full markup (CSS comments included) for the keyword
css_comments = re.findall("houzz", str(soup))

if len(css_comments) > 0:
    print("houzz keyword found")
else:
    print("houzz keyword not found")
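Side note: since re.findall is only being used here as a yes/no check, a plain substring test on the raw response text should give the same result, and BeautifulSoup isn't strictly needed for it. A minimal sketch of that variant, reusing the same URL and user-agent from the code above:

import requests

url = 'http://www.seacoastonline.com/news/20171113/lets-not-let-politics-divide-us'
headers = {"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"}

# The raw response text contains everything in the markup, CSS comments
# included, so a simple membership test is enough for a presence check.
scrape = requests.get(url, headers=headers)

if "houzz" in scrape.text:
    print("houzz keyword found")
else:
    print("houzz keyword not found")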