Hi all,
I'm new to coding and thought I would try and practice a little website scraping.
I've come across an instance where an element (?) I'm trying to retrieve is duplicated and I'm not sure how to extract only one instance of this value..
Here's the code I'm using (please note that "my practice url" does contain an actual url of a website in my code!):
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = "[my practice url]"
# Assumed: the post does not show how webpage is fetched; this is the usual urllib pattern.
webpage = urlopen(Request(url)).read()
soup = BeautifulSoup(webpage, 'html.parser')

prices = soup.findAll("div", {"class": "col price"})
for price in prices:
    income = price.span.text
    print(income)

To illustrate what's happening: if I comment out the loop and print(prices) instead, here's an extract of what I get:
Output:
</div>, <div class="col price"><span>$139,501</span></div>, <div class="col price">
<span>$139,501</span>
</div>, <div class="col price"><span>$137,349</span></div>, <div class="col price">
<span>$137,349</span>
</div>, <div class="col price"><span>$132,955</span></div>, <div class="col price">
<span>$132,955</span>
</div>, <div class="col price"><span>$129,000</span></div>, <div class="col price">
<span>$129,000</span>
As you can see, every price shows up twice: each div element appears once on a single line and again with the span on its own line. So if I run the code above with the loop enabled, I see:
Output:
$139,501
$139,501
$137,349
$137,349
$132,955
$132,955
$129,000
$129,000
What I'm trying to achieve is to obtain one instance of each number, but without any experience, I'm stuck. I read about pandas, installed it, and tried adding import pandas as pd to the top of my code, but after reading more about it and watching a few videos, I'm still not sure how to apply it to my code.
Could you please advise me on what needs to be done?
Thanking you

perfringo wrote May-22-2021, 05:16 AM:
Nobody expects the Spanish Inquisition! Our chief weapon is surprise! Surprise and fear. Fear and surprise. Let me tell you something: when you're looking at your thread tonight and manic silence meets you don't come cryin' to me. Instead do use respective tags while posting code, output and errors (refer to BBCode help). This empowers others to help you. And... Always Look on the Bright Side of Life: I added them this time but if in the future you do it all by yourself I feel happy.
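On the question itself: pandas is probably not needed for this. Below is a minimal sketch of two plain-Python ways to keep one copy of each price. It assumes the prices have already been collected into a list of strings, for example with [price.span.text for price in prices]; the sample list is copied from the output in the post, and the helper names unique_in_order and collapse_adjacent are made up for illustration.

```python
from itertools import groupby


def unique_in_order(values):
    """Drop every repeated value, keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(values))


def collapse_adjacent(values):
    """Collapse runs of identical neighbours into a single value."""
    # groupby starts a new group whenever the value changes.
    return [key for key, _group in groupby(values)]


# Sample data copied from the post; in the real script this list would
# come from the page, e.g. [price.span.text for price in prices].
incomes = [
    "$139,501", "$139,501",
    "$137,349", "$137,349",
    "$132,955", "$132,955",
    "$129,000", "$129,000",
]

for income in unique_in_order(incomes):
    print(income)
```

The two helpers differ when the same price legitimately occurs for two different items: unique_in_order would keep only the first one, while collapse_adjacent only merges copies that sit next to each other, so it may be the safer choice if the page really does list every item twice in a row.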