Python Forum
TypeError: must be str, not ResultSet
#1
I have tried freaking everything, and for some reason Google searches aren't providing much.

This is code to parse multiple pages of the same URL, but for some reason I keep getting this error.

Here is the fragment of the code that keeps giving the error. It is part of a larger program, but I (or rather Python) have narrowed the error down to this segment:
import bs4 as bs
import urllib.request
import time  # needed for time.sleep() in the retry loop below
while True:
	list = []
	listalmostfinal = []
	listfinal = []
	part_1 = 'https://www.businesswire.com/portal/site/home/template.PAGE/news/?javax.portlet.tpst=ccf123a93466ea4c882a06a9149550fd&javax.portlet.prp_ccf123a93466ea4c882a06a9149550fd_viewID=MY_PORTAL_VIEW&javax.portlet.prp_ccf123a93466ea4c882a06a9149550fd_ndmHsc=v2*A1515934800000*B1518565773469*DgroupByDate*G'
	part_2 = '*N1000003&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken'
	page_counter = 1
	pi = 0
	url = ';-;'
	while True:
		page_flip = 'yes'
		for i in list:
			for j in worksheet_list:
				if i == j:
					page_flip = 'no'
		if page_flip == 'no':
			for i in list:
				if i not in listalmostfinal:
					listalmostfinal.append(i)
			if len(listalmostfinal) == 0:
				print('Error New Part')
			break
		else:
			page_counter += 1
			url = (part_1 + str(page_counter) + part_2)
			while True:
				try:
					sauce = urllib.request.urlopen(url).read()
					break
				except:
					time.sleep(30)
					pi += 1
					if pi >= 5:
						print('BW Search Access Failure')
						print(pi)
			soup = bs.BeautifulSoup(sauce, 'lxml')
			for a in soup.find_all('a', class_='bwTitleLink', limit=25):
				initialbusinesswireurls = ('https://www.businesswire.com' + a['href'])
				list.append(initialbusinesswireurls)
			while True:
				if page_counter > 5:
					print('Flipping Pages to Page ' + str(page_counter))
				break
Here is the output:
Flipping Pages to Page 6
Flipping Pages to Page 7
Flipping Pages to Page 8
Flipping Pages to Page 9
Flipping Pages to Page 10
Traceback (most recent call last):
  File "<pyshell#3>", line 74, in <module>
    url = (part_1 + str(page_counter) + part_2)
TypeError: must be str, not ResultSet
I'm sure someone experienced with parsing knows what this is, but I've got no clue. The "url = ';-;'" was my attempt to reset the variable I thought was causing the problem, since the code works the first time through, just not the second or third. If anyone knows what this is and/or how to fix it, please help. Thank you!
#EDIT: Forgot imports in code
#ANOTHER EDIT:
>>> str(page_counter)
[]
This should output the page number as a string, but for some reason it is just giving empty brackets. Is this a ResultSet?
##FINAL EDIT:
Nvm I got it. Turns out I had overwritten 'str' as a variable. I got rid of that and imported 'builtins' just to be safe.
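In case anyone else hits this, here is a minimal sketch of the failure mode. The `ResultSet` class and `find_all` function below are illustrative stand-ins for the bs4 ones, not the real library:

```python
class ResultSet(list):
    """Stand-in for bs4's ResultSet, which is also a list subclass."""
    pass

def find_all(tag):
    """Stand-in for soup.find_all; returns an empty ResultSet."""
    return ResultSet()

str = find_all                # the bug: 'str' no longer names the builtin
page_counter = 6
print(str(page_counter))      # prints [] instead of '6'

try:
    url = 'part_1' + str(page_counter) + 'part_2'
except TypeError as e:
    print(e)                  # complains it got a ResultSet, not a str

del str                       # remove the shadow; the builtin is visible again
print(str(page_counter))      # now prints 6, as the string '6'
```

The exact wording of the TypeError varies by Python version (3.6 says "must be str, not ResultSet"; newer versions say "can only concatenate str"), but the cause is the same.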
#2
Use this url: https://www.businesswire.com/portal/site...PAGE/news/

It would really be helpful if you could show the error verbatim.
Error messages contain some very useful information.
#3
(Feb-15-2018, 06:35 AM)Larz60+ Wrote: Use this url: https://www.businesswire.com/portal/site...PAGE/news/

It would really be helpful if you could show the error verbatim.
Error messages contain some very useful information.

Nvm I got it. Turns out I had overwritten 'str' as a variable. I got rid of that and imported 'builtins' just to be safe.

Thank you for the fast reply though, and have a nice night.
#4
(Feb-15-2018, 06:44 AM)HiImNew Wrote: Turns out I had overwritten 'str' as a variable.
You do the same with list().
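A quick sketch of what that does to the next call of list() (illustrative, not taken from the code above):

```python
list = []                      # shadows the builtin, like the loop variable above

try:
    list('abc')                # the name now refers to an empty list object
except TypeError as e:
    print(e)                   # 'list' object is not callable

del list                       # or just use a different name, e.g. links = []
print(list('abc'))             # ['a', 'b', 'c'] from the real builtin
```

Renaming the variable (e.g. `links` or `urls`) is the cleaner fix; `del` only works if you never needed the shadowing variable in the first place.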