May-29-2022, 07:06 PM
(This post was last modified: May-29-2022, 07:06 PM by eddywinch82.)
Hi there,
I have the following Python Code :-
import pandas as pd
import requests
import numpy as np
from bs4 import BeautifulSoup
import xlrd
import re

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

res3 = requests.get("https://web.archive.org/web/20220521203053/https://www.military-airshows.co.uk/press22/bbmfschedule2022.htm")
soup3 = BeautifulSoup(res3.content,'lxml')

BBMF_2022 = []

#BBMF_elem = soup3.find_all('a', string=re.compile(r'between|Flypast'))

for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    print(li1)
    #print(li2)
    #BBMF_2022.append(li1)

#check if links are in dataframe
#df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])
#df

The issue I have is that when I run the Code, the Data for the 15 Entries from May 28th to May 29th is printed several times.
I am not sure why that is the case. Could someone suggest the reason, and tell me what I need to change in the Code so that the Data is printed only once and not several times? I am trying to Scrape Data from a Website, keeping only the entries that contain the word "between" or "Flypast".
When I use the following piece of Code instead :-
for item in soup3.find_all('a', string=re.compile(r'between|Flypast')):
    li1 = item.find_parent().text
    #li2 = li1.find_previous().font
    #print(link)
    #print(li1)
    #print(li2)
    BBMF_2022.append(li1)

df = pd.DataFrame(BBMF_2022, columns=['BBMF_2022'])
df
The first entry, for the 28th May, is printed out in the DataFrame 15 times, instead of the 15 separate Entries I mentioned before.
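One possible explanation, sketched below as a small stand-alone reproduction: if all the matching links sit inside one shared parent element, then `find_parent().text` returns the text of that whole parent for every link, so the same block comes back once per match. The HTML here is made up for illustration and is only an assumption about how the real page is laid out; the names `html`, `links`, `parent_texts` and `link_texts` are likewise hypothetical.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup (NOT the real page): three links inside ONE shared
# parent <p>, separated by <br> tags.
html = """
<p>
<a href="#a">28th May - Flypast A</a> some details A<br>
<a href="#b">28th May - Flypast B</a> some details B<br>
<a href="#c">29th May - between X and Y</a> some details C<br>
</p>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a", string=re.compile(r"between|Flypast"))

# Text of the parent: the same <p> for every link, so the same full block
# of text is collected once per matching link.
parent_texts = [a.find_parent().text for a in links]

# Text of each link itself: one distinct entry per match.
link_texts = [a.get_text(strip=True) for a in links]

print(len(set(parent_texts)))
for entry in link_texts:
    print(entry)
```

If that assumption holds for the real page, appending the link's own text (or the link plus its following sibling text) instead of the parent's text should give one row per entry in the DataFrame.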
Any help would be much appreciated.
Best Regards
Eddie Winch ))