Python Forum

scrap by defining 3 functions - zarize - Feb-18-2020

Hi guys,

I want to write a scraper using 3 functions, so I defined:

import requests
import pandas as pd
from bs4 import BeautifulSoup


def header_template():
    for i in range(1, 3):
        big_soup = []
        url = 'https://www.otodom.pl/wynajem/mieszkanie/warszawa/?search%5Bregion_id%5D=7&search%5Bsubregion_id%5D=197&search%5Bcity_id%5D=26&page=' + str(i)
        headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, sdch, br',
            'accept-language': 'en-GB,en;q=0.8,en-US;q=0.6,ml;q=0.4',
            'cache-control': 'max-age=0',
            'upgrade-insecure-requests': '1',
            'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        parser = response.text  # .content
        soup = BeautifulSoup(parser, "html.parser")

        big_element = soup.find_all('article')
        return big_element
        big_soup.append(big_element)


def create_objects():
    
    final_data = []
    for section in header_template():
        
        bedrooms = section.find('li', {'class': 'offer-item-rooms hidden-xs'}).text.split()
        bedrooms = int(bedrooms[0])
        return bedrooms
        
        data = {
            'Bedrooms':bedrooms
            }
            
        final_data.append(data)
        return final_data


def soup_to_excel():
    
    df = pd.DataFrame(final_data, columns=['Bedrooms'])
    return df
    df.to_excel(r'C:\Users\user\Desktop\learning.xlsx')

I wanted to create a script that loops over 2 pages and returns all the bedroom values.
Where did I go wrong? I don't know why, but I can't collect all the results together. All I get is a single result.
This is actually the first time I'm trying to do this by defining functions :P
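
For reference, the single-result symptom matches a common pattern: a `return` placed inside a `for` loop exits the function on the first iteration, so any `append` written after it never runs. Below is a minimal sketch of that behaviour with made-up data. The names `first_only` and `collect_all` and the sample `pages` list are hypothetical and not part of the script above, and this may not be the only issue in the script.

```python
def first_only(pages):
    """Mirrors the structure above: the return sits inside the loop."""
    collected = []
    for page in pages:
        return page             # leaves the function on the first iteration
        collected.append(page)  # unreachable, never executed


def collect_all(pages):
    """Accumulates inside the loop and returns once, after the loop ends."""
    collected = []
    for page in pages:
        collected.extend(page)  # add every item from this page
    return collected


# Hypothetical stand-in for the per-page results (two pages of bedroom counts)
pages = [[2, 3], [1, 4]]
print(first_only(pages))   # [2, 3]
print(collect_all(pages))  # [2, 3, 1, 4]
```

The only difference between the two functions is where `return` sits relative to the loop body.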

I would appreciate any advice or tips.

And I wanted to run it all with:
def finish_func():
    header_template()
    create_objects()
    soup_to_excel()

finish_func()
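
A related point about the wrapper: calling the three functions one after another discards whatever each one returns, and a name such as `final_data` that is created inside one function is not visible inside another. One way to wire such steps together, sketched here with hypothetical stand-ins (`fetch_items`, `parse_items`, `export` and `run` are illustrative names, and the hard-coded strings replace the real HTTP request and Excel export), is to capture each return value and pass it to the next step as an argument:

```python
def fetch_items():
    """Stand-in for the download step; returns raw text snippets."""
    return ["2 pokoje", "3 pokoje"]


def parse_items(items):
    """Turns each snippet into a row dict, taking the leading number."""
    return [{"Bedrooms": int(text.split()[0])} for text in items]


def export(rows):
    """Stand-in for the export step; here it just lists the values."""
    return [row["Bedrooms"] for row in rows]


def run():
    items = fetch_items()      # keep the result of step 1
    rows = parse_items(items)  # hand it to step 2
    return export(rows)        # hand step 2's result to step 3


print(run())  # [2, 3]
```

Because each function receives its input as a parameter, none of them depends on a variable defined inside another function.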