Aug-25-2022, 08:42 PM
(Aug-25-2022, 07:47 PM)snippsat Wrote: Change to reqs.content.
This means that Bs4 is given bytes and will deal with the Unicode conversion itself. Using reqs.text can cause a mix-up between Requests and Bs4, because the text has already been decoded by Requests.
Encodings
Quote:Any HTML or XML document is written in a specific encoding like ASCII or UTF-8.
But when you load that document into Beautiful Soup, you’ll discover it’s been converted to Unicode:
Unicode, Dammit guesses correctly most of the time.
# importing the modules
import requests
from bs4 import BeautifulSoup

# target url
url = "https://www.boshisw.com/boshi/14_14309/"

# making requests instance
reqs = requests.get(url)

# using the BeautifulSoup module, passing bytes so Bs4 detects the encoding
soup = BeautifulSoup(reqs.content, 'html.parser')
print(type(soup))

# displaying the title
print("Title of the website is : ")
for title in soup.find_all('title'):
    print(title.get_text())
Output: 我在原始社会当村长最新章节列表_我在原始社会当村长最新章节目录_博仕书屋
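The mix-up between bytes and already-decoded text can be sketched without a network call. This is only a rough illustration using the standard library, not how Requests or Bs4 are actually implemented. The sample page, its GBK encoding, and the helper extract_title are assumptions made up for the example; only the title fragment comes from the output above.

```python
import re

# Hypothetical page: the GBK encoding is an assumption for this illustration,
# the title fragment is borrowed from the output shown in the thread.
html = '<html><head><meta charset="gbk"><title>博仕书屋</title></head></html>'
raw_bytes = html.encode("gbk")  # stands in for what reqs.content would hold

# Roughly what can happen with reqs.text when the HTTP header names no charset:
# the bytes get decoded with an ISO-8859-1 fallback, which garbles the text.
wrong_text = raw_bytes.decode("iso-8859-1")

# Roughly what a parser can do when it is handed bytes: read the charset
# declared inside the document and decode with that instead.
match = re.search(rb'charset="?([\w-]+)', raw_bytes)
declared = match.group(1).decode("ascii")
right_text = raw_bytes.decode(declared)


def extract_title(text):
    """Hypothetical helper: pull the <title> contents out of decoded HTML."""
    return re.search(r"<title>(.*?)</title>", text).group(1)


print(extract_title(wrong_text))  # garbled characters
print(extract_title(right_text))  # the readable Chinese title
```

Passing reqs.content keeps the decoding decision in one place (Bs4), which is why the title prints correctly in the code above.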
Thank you, I had been looking for this for hours.
I gave you a reputation point.