Python Forum
For Loop Works Fine But Append For Pandas Doesn't Work

For Loop Works Fine But Append For Pandas Doesn't Work - knight2000 - Dec-17-2021

Hi all,

I'm practicing a little web scraping and extracting various elements from a saved page. Everything is going as expected except for one component. For simplicity, I've omitted the other elements to keep the code short and sweet (it shouldn't make any difference to this issue).

When I extract this element and print it on screen, both the variable and the loop work perfectly.

Here's an example:

from bs4 import BeautifulSoup

net_profit = []

with open("C:/Users/websites_page.html", encoding="utf8") as fp:
    soup = BeautifulSoup(fp, 'lxml')

    full_list_top_half = soup.find_all('div', class_='col-lg-9 px-lg-3 d-flex flex-column justify-content-between')
    for item in full_list_top_half:
        # Get Website URL (other extraction code omitted)
        col = item.find('div', class_='col-lg-9')

        # Get Asset Type (other extraction code omitted)
        content_between = col.find('div', class_='d-flex flex-nowrap justify-content-between')

        # Get Net Profit
        text_truncate = content_between.find('div', class_='font-weight-bold text-truncate')
        nprofit = text_truncate.find('span', class_='ng-binding ng-scope').text
        print(nprofit)
         
I get:

Output:
$4,331 p/mo
$9,429 p/mo
$1,599 p/mo
$110,133 p/mo
$1,475 p/mo


Looks perfect and what I would expect.

Since I will eventually want to export this data using pandas, I added one more line to the code, as I normally would, to append each value to a list:

from bs4 import BeautifulSoup

net_profit = []

with open("C:/Users/websites_page.html", encoding="utf8") as fp:
    soup = BeautifulSoup(fp, 'lxml')

    full_list_top_half = soup.find_all('div', class_='col-lg-9 px-lg-3 d-flex flex-column justify-content-between')
    for item in full_list_top_half:
        # Get Website URL (other extraction code omitted)
        col = item.find('div', class_='col-lg-9')

        # Get Asset Type (other extraction code omitted)
        content_between = col.find('div', class_='d-flex flex-nowrap justify-content-between')

        # Get Net Profit
        text_truncate = content_between.find('div', class_='font-weight-bold text-truncate')
        nprofit = text_truncate.find('span', class_='ng-binding ng-scope').text
        #print(nprofit)
        net_profit.append(nprofit)
        print(net_profit)

So the second piece of code includes:
net_profit.append(nprofit)
and when I print that, I now get:

Output:
"C:\Users\anaconda3\python.exe" "C:/Users/test_file.py"
[' $4,331 p/mo']
[' $4,331 p/mo', ' $9,429 p/mo']
[' $4,331 p/mo', ' $9,429 p/mo', ' $1,599 p/mo']
[' $4,331 p/mo', ' $9,429 p/mo', ' $1,599 p/mo', ' $110,133 p/mo']
[' $4,331 p/mo', ' $9,429 p/mo', ' $1,599 p/mo', ' $110,133 p/mo', ' $1,475 p/mo']

Process finished with exit code 0

'nprofit' prints as expected, but net_profit is looping and adding each piece of data to it.

I've never seen this happen before and after spending a few hours mulling over it and trying a couple of things, I don't know why this is happening.

Could anyone please help enlighten me with this?

Thanks a lot.


RE: For Loop Works Fine But Append For Pandas Doesn't Work - deanhystad - Dec-17-2021

I am confused about your confusion. The code works exactly as I would expect. Here is an even simpler example that demonstrates the behavior.
numbers = []
for number in range(5):
    print(number)
    numbers.append(number)
    print(numbers)
Output:
0
[0]
1
[0, 1]
2
[0, 1, 2]
3
[0, 1, 2, 3]
4
[0, 1, 2, 3, 4]

Since a number is appended to numbers each time through the loop, I would expect numbers to be longer each time it is printed. This is just like your example, where you append nprofit to net_profit.

Do you only want to print net_profit after all the values are added?
numbers = []
for number in range(5):
    print(number)
    numbers.append(number)
print(numbers)
Output:
0
1
2
3
4
[0, 1, 2, 3, 4]
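
Applied to the scraped values, the same idea might look like the sketch below. The list `scraped` is a hypothetical stand-in for the `nprofit` strings found in the loop (values copied from the output earlier in the thread), so it runs without the HTML file. Calling `strip()` is an optional extra that removes the leading space visible in the printed list.

```python
# Hypothetical stand-in for the nprofit strings extracted in the loop.
scraped = [" $4,331 p/mo", " $9,429 p/mo", " $1,599 p/mo",
           " $110,133 p/mo", " $1,475 p/mo"]

net_profit = []
for nprofit in scraped:
    # Collect inside the loop; strip() drops surrounding whitespace.
    net_profit.append(nprofit.strip())

# Print once, after the loop has finished, to see only the final list.
print(net_profit)
```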



RE: For Loop Works Fine But Append For Pandas Doesn't Work - knight2000 - Dec-18-2021

Hi deanhystad,

Thanks very much for the time you invested in replying here.

Haha. I don't need to convince you that I'm still quite new and may not know what I'm doing! I've never tried to print a list inside a loop before, so I was expecting to see the same values whether I printed nprofit or net_profit.

The reason I was investigating is that when I tried to use pandas (with more variables than are included in the code here), I got:
Error:
ValueError: All arrays must be of the same length
After cutting down the variables to try to find the problem, I obviously went off track and mistook the behavior I posted for the cause. Clearly it's not!

There must be another problem with the data values themselves in my code, and I'll work through a process of elimination to uncover the actual problem.

Please pardon my ignorance!

But thank you for helping.
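
For anyone who lands here with the same `ValueError: All arrays must be of the same length`: pandas raises it when a DataFrame is built from a dict whose lists differ in length. In a scraping loop that often means one list received fewer appends than another. Below is a minimal sketch of one way to track this down, using only the standard library. The sample `records`, the `asset_type` column, and the helper `column_lengths` are hypothetical and not taken from the original code.

```python
# Hypothetical scraped records; the second one has no net profit value.
records = [
    {"asset_type": "Blog", "net_profit": " $4,331 p/mo"},
    {"asset_type": "Store"},
    {"asset_type": "App", "net_profit": " $1,599 p/mo"},
]

def column_lengths(columns):
    """Return the length of each list so mismatches are easy to spot."""
    return {name: len(values) for name, values in columns.items()}

# Pattern that goes wrong: a value is appended only when it is present.
broken = {"asset_type": [], "net_profit": []}
for record in records:
    broken["asset_type"].append(record["asset_type"])
    if "net_profit" in record:
        broken["net_profit"].append(record["net_profit"])

print(column_lengths(broken))  # the two lists differ in length

# Safer pattern: append exactly once per column per item,
# using None as a placeholder for a missing value.
fixed = {"asset_type": [], "net_profit": []}
for record in records:
    fixed["asset_type"].append(record["asset_type"])
    value = record.get("net_profit")
    fixed["net_profit"].append(value.strip() if value is not None else None)

print(column_lengths(fixed))  # both lists now have one entry per record
```

Once every list has the same length, passing the dict to `pd.DataFrame` should no longer raise this particular error. Printing the lengths before that call is a quick way to see which column is short.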
