Use Requests; it gives you the encoding the website reports, so you don't have to guess.

    >>> import requests
    >>>
    >>> url = 'http://comment.bilibili.com/182148299.xml'
    >>> website = requests.get(url)
    >>> website.encoding
    'ISO-8859-1'

Together with BeautifulSoup it looks like this:

    import requests
    from bs4 import BeautifulSoup
    import lxml  # not used directly; the 'xml' parser needs lxml installed

    url = 'http://comment.bilibili.com/182148299.xml'
    website = requests.get(url)
    dataset = website.content  # raw bytes, not the decoded .text
    soup = BeautifulSoup(dataset, features='xml')
    print(soup)

Output:

    <?xml version="1.0" encoding="utf-8"?>
    <i><chatserver>chat.bilibili.com</chatserver><chatid>182148299</chatid><mission>0</mission><maxlimit>1500</maxlimit><state>0</state><real_name>0</real_name><source>k-v</source><d p="10.16400,1,25,16646914,1588986375,0,5f22bd5c,32424172229492743">test</d><d p="4.47900,1,25,16777215,1588986351,0,5f22bd5c,32424159570558983">test</d></i>
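
Note that the example above hands BeautifulSoup `website.content` (raw bytes) rather than `website.text`. Requests reported ISO-8859-1, which is most likely its fallback for a `text/*` response without a charset, while the XML declaration says utf-8. The sketch below shows why that matters. It needs no network access: the sample bytes and the comment text are made up for illustration, and the standard library's `xml.etree.ElementTree` stands in for BeautifulSoup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: UTF-8 bytes shaped like the feed above,
# but with non-ASCII comment text (not real data from the site).
raw = '<?xml version="1.0" encoding="utf-8"?><i><d p="10.164,1,25">测试</d></i>'.encode("utf-8")

# Roughly what response.text would hold if Requests decoded as ISO-8859-1.
misdecoded = raw.decode("iso-8859-1")

# Parsing the raw bytes lets the parser honor the encoding
# named in the XML declaration.
root = ET.fromstring(raw)
comment = root.find("d").text

print(repr(comment))          # the original text survives
print("测试" in misdecoded)    # the wrongly decoded string has lost it
```

So if the comments come out garbled, check whether the text was decoded with the encoding from `website.encoding` before it reached the parser.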