The "FindAll" Error

***snippsat*** · (This post was last modified: Apr-11-2020, 08:07 AM by snippsat.)

(Apr-11-2020, 12:36 AM)BadWhite Wrote: but have you tried to run the code?

Yes.

import requests
from bs4 import BeautifulSoup
#from Data import row

# Collect and parse first page
headers = {'User-agent': 'Mozilla/5.0'}
page = requests.get('https://web.archive.org/web/20121007172955/https://www.nga.gov/collection/anZ1.htm', headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')

# Pull all text from the BodyText div
artist_name_list = soup.find(class_='BodyText')

# Pull text from all instances of <a> tag within BodyText div
artist_name_list_items = artist_name_list.find_all('a')

# Create for loop to print out all artists' names
for artist_name in artist_name_list_items:
    print(artist_name.text)

Output:
Zabaglia, Niccola
Zaccone, Fabian
Zadkine, Ossip
Zaech, Bernhard
Zagar, Jacob
Zagroba, Idalia
Zaidenberg, A.
Zaidenberg, Arthur
Zaisinger, Matthäus
Zajac, Jack
Zak, Eugène
Zakharov, Gurii Fillipovich
Zakowortny, Igor
Zalce, Alfredo
Zalopany, Michele
Zammiello, Craig
Zammitt, Norman
Zampieri, Domenico
Zampieri, called Domenichino, Domenico
Zanartú, Enrique Antunez
Zanchi, Antonio
Zanetti, Anton Maria
Zanetti Borzino, Leopoldina
Zanetti I, Antonio Maria, conte
Zanguidi, Jacopo
Zanini, Giuseppe
Zanini-Viola, Giuseppe
Zanotti, Giampietro
Zao Wou-Ki
Zas-Zie
Zie-Zor
nextpage

BadWhite Wrote:why you have added "headers" variable?

That is what I explained first: without a user agent the site rejects the request and returns status 445.

import requests
from bs4 import BeautifulSoup
#from Data import row

# Collect and parse first page
page = requests.get('https://web.archive.org/web/20121007172955/https://www.nga.gov/collection/anZ1.htm')
print(page.status_code)

Output:
445

So when you get this status, no more scraping is possible. By sending a user agent we identify as a browser (here with the generic Mozilla/5.0 string).
Then we get 200 OK and can continue to scrape.
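
To make this kind of failure show up early, the status check can be done before any parsing. This is only a minimal sketch: `fetch_html` and its `get` parameter are hypothetical names that are not part of the code above, and the `get` parameter exists only so the helper can be tried with a stand-in instead of a real network call.

```python
def fetch_html(url, get=None, user_agent='Mozilla/5.0'):
    """Return the page body as bytes, or raise if the status is not 200.

    fetch_html and the get parameter are illustrative names, not part
    of requests or BeautifulSoup.
    """
    if get is None:
        # Imported here so the helper can be exercised with a stand-in getter.
        import requests
        get = requests.get

    # Send the same kind of User-agent header as in the working script.
    response = get(url, headers={'User-agent': user_agent})

    # Stop here on any non-200 status instead of parsing an error page.
    if response.status_code != 200:
        raise RuntimeError(f'{url} returned status {response.status_code}')
    return response.content
```

The returned bytes can then be passed to `BeautifulSoup(..., 'html.parser')` as before. If the request is rejected, the script stops with a message that names the status code, rather than failing later when `find()` does not locate the expected div.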

The problem must be something on your side. Here is a run in another environment, Colab.
As you can see, it works fine there too.
