May-29-2022, 07:06 PM
(This post was last modified: May-29-2022, 07:06 PM by eddywinch82.)
Hi there,
I have the Python code shown below. The issue is that when I run it, the data for the 15 entries from May 28th to May 29th is printed several times.
```python
import pandas as pd
import requests
import numpy as np
from bs4 import BeautifulSoup
import xlrd
import re

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

res3 = requests.get("https://web.archive.org/web/20220521203053/https://www.military-airshows.co.uk/press22/bbmfschedule2022.htm")
soup3 = BeautifulSoup(res3.content, 'lxml')

BBMF_2022 = []

#BBMF_elem = soup3.find_all('a', string=re.compile(r'between|Flypast'))

for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    print(li1)
    #print(li2)
    #BBMF_2022.append(li1)

#check if links are in dataframe
#df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])
#df
```
I am not sure why that is the case. Could someone suggest the reason, and tell me what I need to change in the code so that the data is printed only once rather than several times? I am trying to scrape data from a website, keeping only the entries that contain the word "between" or "Flypast".
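To show what I expect the filter itself to do, here is a small sketch of the same regular expression applied to plain strings. The sample strings are made up for illustration and are not taken from the real page. As far as I understand, BeautifulSoup applies a compiled pattern passed via `string=` with `search()`, so the match can be anywhere in the text and is case-sensitive.

```python
import re

# Same pattern as in my scraping code.
pattern = re.compile(r"between|Flypast")

# Hypothetical link texts, made up for illustration (not from the real page).
samples = [
    "Flypast over a village fete",
    "Display between 14:00 and 14:15",
    "Static display only",
    "flypast in lower case",
]

# search() matches anywhere in the string and is case-sensitive by default,
# so only the first two samples are kept.
matches = [s for s in samples if pattern.search(s)]
print(matches)
```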
When I use the following loop instead, appending each result to a list and building a DataFrame from it:
```python
for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    #print(li1)
    #print(li2)
    BBMF_2022.append(li1)

df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])

df
```
The first entry, for the 28th May, appears in the DataFrame 15 times, instead of the 15 separate entries I mentioned before.
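To make this easier to reproduce without hitting the website, here is a minimal sketch using made-up HTML. I have not confirmed the real page's markup, so the structure below (all matching links sharing one parent tag) is purely an assumption. If the real page is laid out like that, `find_parent().text` would return the whole parent's text for every link, which might explain the repetition, and the DataFrame rows might only look like the first entry because the display truncates long strings. Taking the text of each link itself gives one entry per link in this toy case.

```python
import re
from bs4 import BeautifulSoup

# Made-up HTML (NOT the real page): three links sharing a single parent tag.
html = """
<font>
<a href="#">28th May - Flypast A</a>
<a href="#">28th May - Display between 14:00 and 15:00</a>
<a href="#">29th May - Flypast B</a>
</font>
"""

# html.parser is used here only so the sketch needs no extra parser installed.
soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a", string=re.compile(r"between|Flypast"))

# Text of the shared parent: the same block, containing every entry, for each link.
parent_texts = [a.find_parent().text for a in links]

# Text of each link on its own: one entry per link.
link_texts = [a.get_text(strip=True) for a in links]

print(len(set(parent_texts)))  # number of distinct parent texts
for text in link_texts:
    print(text)
```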
Any help would be much appreciated.
Best Regards
Eddie Winch ))