Scrape for html based on url string and output into csv


snippsat


Jan-12-2021, 11:37 AM

Look at the content you get back, e.g. print(soup).

<noscript>This site requires Javascript to work,.....
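A quick way to spot this kind of response is to look for a `<noscript>` notice before trying any selectors. This is only a minimal sketch: the `sample` HTML and the helper name `needs_javascript` are made up for illustration, and the real check page may look different.

```python
from bs4 import BeautifulSoup


def needs_javascript(html):
    """Return True if the page has a <noscript> notice that mentions Javascript."""
    soup = BeautifulSoup(html, "html.parser")
    notice = soup.find("noscript")
    return notice is not None and "javascript" in notice.get_text().lower()


# Hypothetical stand-in for the response body; the real page will differ.
sample = (
    "<html><body>"
    "<noscript>This site requires Javascript to work</noscript>"
    "</body></html>"
)
print(needs_javascript(sample))
print(needs_javascript("<html><body><p>Hello</p></body></html>"))
```

If the check comes back true, the selectors further down will most likely find nothing, because the real page content was never delivered.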

So I don't know whether you are just testing this on a server (one that makes the task more difficult) that may not be needed for this task.
Usually, when a site uses a lot of Javascript, you can use Selenium.
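For reference, the Selenium route could look roughly like this. It is a sketch under assumptions: Selenium 4 and Chrome are installed, and the function name `fetch_rendered_html` is made up here. A page that fills in content late may also need an explicit wait, which this sketch leaves out.

```python
def fetch_rendered_html(url):
    """Load url in headless Chrome and return the HTML after scripts have run."""
    # Imported here so Selenium is only required when the function is called.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

The returned string can be passed to BeautifulSoup in the same way as the content from requests, for example `BeautifulSoup(fetch_rendered_html(page), "lxml")`.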

As this is just a simple test on a server that may not be needed for this task, you can bypass the Javascript check by passing in the cookie.

from bs4 import BeautifulSoup
from requests import get

page = "http://py123.epizy.com/index.html"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
# Cookie passed in to get past the Javascript check
cookies = {'__test': '1bb6e881021f013463740eeb74840b18'}
content = get(page, headers=headers, cookies=cookies).content
soup = BeautifulSoup(content, "lxml")

# Find the link inside the .col-2 element of the .table-info block
table_info = soup.select_one('.table-info')
mail = table_info.select_one('.col-2 a')
# The href is expected to look like "mailto:<address>", so keep the part after the colon
mail = mail.get('href')
mail_clean = mail.split(':')[1]
print(mail_clean)

Output:
[email protected]
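To get from here to the CSV output in the thread title, the extracted value can be written with the standard `csv` module. This is a minimal sketch, not the only way to do it: the file name `mails.csv`, the column names, the placeholder address and the helper `write_rows` are all assumptions.

```python
import csv


def write_rows(path, rows, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" stops the csv module from adding blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical data: in the script above the mail value would be mail_clean.
rows = [{"url": "http://py123.epizy.com/index.html", "mail": "someone@example.com"}]
write_rows("mails.csv", rows, fieldnames=["url", "mail"])
```

When scraping several URLs, append one dict per page to `rows` and call `write_rows` once at the end.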
