Python Forum
Scraping by defining 3 functions
#1
Hi guys,

I want to build a scraper using 3 functions, so I defined:

import requests
import pandas as pd
from bs4 import BeautifulSoup

def header_template():
    # Request each results page and parse out its <article> elements
    for i in range(1, 3):
        big_soup = []
        url = 'https://www.otodom.pl/wynajem/mieszkanie/warszawa/?search%5Bregion_id%5D=7&search%5Bsubregion_id%5D=197&search%5Bcity_id%5D=26&page=' + str(i)
        headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, sdch, br',
            'accept-language': 'en-GB,en;q=0.8,en-US;q=0.6,ml;q=0.4',
            'cache-control': 'max-age=0',
            'upgrade-insecure-requests': '1',
            'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        parser = response.text  # .content
        soup = BeautifulSoup(parser, "html.parser")

        big_element = soup.find_all('article')
        return big_element
        big_soup.append(big_element)

def create_objects():
    # Pull the number of rooms out of each listing
    final_data = []
    for section in header_template():
        bedrooms = section.find('li', {'class': 'offer-item-rooms hidden-xs'}).text.split()
        bedrooms = int(bedrooms[0])
        return bedrooms

        data = {
            'Bedrooms': bedrooms
        }

        final_data.append(data)
        return final_data

def soup_to_excel():
    # Write the collected values to an Excel file
    df = pd.DataFrame(final_data, columns=['Bedrooms'])
    return df
    df.to_excel(r'C:\Users\user\Desktop\learning.xlsx')

I wanted to create a script that loops over 2 pages and returns all the bedroom values.
Where did I go wrong? I don't know why, but I can't append all the results together. All I get is a single result.
This is actually the first time I am trying to do this by defining functions :P

I would appreciate any advice or tips.

And I wanted to run it with:
def finish_func():
    header_template()
    create_objects()
    soup_to_excel()

finish_func()
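
A likely cause of the single result: a `return` ends the function immediately, so a `return` placed inside a `for` loop stops the loop on its first pass and any line after it never runs. In addition, each function has to hand its result to the next one through a return value and an argument, because names such as `final_data` are local to the function that creates them. Below is a minimal sketch of that control flow, not a tested scraper. `FAKE_PAGES` and `collect_room_texts` are made-up stand-ins for the `requests.get` / `soup.find_all('article')` step, and the room texts are invented sample values so the example runs without network access.

```python
# Hypothetical stand-in for the scraped pages: each list mimics the text of
# the 'offer-item-rooms' elements found on one results page (invented values).
FAKE_PAGES = {
    1: ["2 pokoje", "3 pokoje"],
    2: ["1 pokój", "4 pokoje"],
}

def collect_room_texts(pages):
    """Gather the room texts from every page."""
    all_texts = []                      # created once, before the loop
    for page in pages:
        all_texts.extend(FAKE_PAGES[page])
    return all_texts                    # return only after the loop has finished

def create_objects(room_texts):
    """Turn each room text into a {'Bedrooms': n} record."""
    final_data = []
    for text in room_texts:
        bedrooms = int(text.split()[0])
        final_data.append({'Bedrooms': bedrooms})
    return final_data                   # again, outside the loop

def finish_func():
    """Chain the steps by passing each result on to the next function."""
    room_texts = collect_room_texts(range(1, 3))
    return create_objects(room_texts)

print(finish_func())
```

In the real script, the list returned by `create_objects` would be passed as an argument to `soup_to_excel`, which would build the `DataFrame` from it and call `df.to_excel(...)` before returning.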