Python Forum

Hi All,

I have some HTML data stored in a pandas Series.
For example, I am storing this data in a variable named html_series.

When I try to apply BeautifulSoup to it like this:

soup = BeautifulSoup(html_series, "html.parser")
print(soup.prettify())

I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can you please tell me what I am missing here?

Thanks!

What exactly is HTML data in the form of a pandas Series? This sounds like nonsense to me.

Let me explain:
1. Using the Shopify API, I fetched JSON from an e-commerce site.
2. I normalized this JSON data into a dataframe with:

df = json_normalize(result)

3. From this dataframe I extract the HTML content with:

html_data = df['body_html']

4. When I then run the code below, I get the error:

soup = BeautifulSoup(html_data, "html.parser")
print(soup.prettify())

I hope I have covered everything here.

It looks like you are trying to pass JSON data into BeautifulSoup.
What is the content of html_data?
For this to work, it has to be HTML.

from bs4 import BeautifulSoup

html_data = '''\
<!DOCTYPE html>
<html>
  <head>
    <title>Title of document</title>
  </head>
  <body>
    <p>Content of the document</p>
  </body>
</html>'''

soup = BeautifulSoup(html_data, 'lxml')
print(soup.select('head > title')[0].text)

Output:
Title of document
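
BeautifulSoup expects a single string (or file-like object) as markup, while df['body_html'] is a whole pandas Series. The ValueError above is what pandas raises whenever a Series is evaluated as a boolean, which presumably happens somewhere inside the parser's input checks. One way around it is to parse each element separately. This is only a minimal sketch: the sample strings below are made up as a stand-in for the real column, and html_to_text is a hypothetical helper name, not part of either library.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for df['body_html']; the real column
# contents are not shown in this thread.
html_series = pd.Series([
    "<p>First <b>product</b> description</p>",
    "<p>Second product description</p>",
    None,  # missing values are common in API data
])


def html_to_text(html):
    """Parse one HTML string and return its visible text."""
    if not isinstance(html, str):
        return ""  # skip missing or non-string entries
    soup = BeautifulSoup(html, "html.parser")
    # Join text fragments with a space and strip surrounding whitespace
    return soup.get_text(" ", strip=True)


# Apply the parser to each element instead of to the Series itself
text_series = html_series.apply(html_to_text)
print(text_series.tolist())
```

If only one row is of interest, passing a single element such as html_series.iloc[0] to BeautifulSoup should work the same way.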

I have resolved the above error; it was caused by the dataframe normalization.
I have now opened another thread at the URL below:
https://python-forum.io/Thread-How-to-cl...Python-3-6

Please take a look and let me know if you can help.
Thanks!

PrateekG

buran

PrateekG

snippsat

PrateekG