Jan-07-2021, 05:20 AM
I have a variable called "link_list" that prints out multiple game-id URLs, which I scraped from a sports website using Python. The terminal output looks something like this:
print(link_list)

https://www.pcb.com.pk/match_detail.php?match_id=21782
https://www.pcb.com.pk/match_detail.php?match_id=21790
https://www.pcb.com.pk/match_detail.php?match_id=21798
https://www.pcb.com.pk/match_detail.php?match_id=21812
https://www.pcb.com.pk/match_detail.php?match_id=21822

If I print using print(type(link_list)), it shows <class 'str'> for each line:
print(type(link_list))

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>

Now what I want is to merge these results (URLs) into a single list, so I can later run a Selenium snippet to get the relevant data from each one. How do I go about doing this? Any assistance on the matter would be appreciated.
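For what it's worth, the usual pattern here is to append each matching href to a list rather than reassigning the variable on every iteration. A minimal, self-contained sketch of that idea — the inline HTML string and the re-based extraction below are stand-ins for the live page source and the BeautifulSoup call in the actual script:

```python
import re

# Stand-in for driver.page_source; in the real script this comes from Selenium.
page = (
    '<a href="https://www.pcb.com.pk/match_detail.php?match_id=21782">Match 1</a>'
    '<a href="https://www.pcb.com.pk/match_detail.php?match_id=21790">Match 2</a>'
    '<a href="https://www.pcb.com.pk/scorecard.php">Other link</a>'
)

link_list = []
for href in re.findall(r'href="([^"]+)"', page):
    if "match_id" in href:
        link_list.append(href)  # append each URL instead of overwriting link_list

print(link_list)
```

With link_list built this way it is a real Python list of strings, so a later `for url in link_list: driver.get(url)` loop can visit each match page in turn.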
Also here's my code for how I am procuring these values [THIS IS JUST FOR REFERENCE ONLY]:
# Importing all the necessary Libraries
import PySimpleGUI as sg
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import urllib.request
from selenium.webdriver.support.ui import Select
import pandas as pd
import re
import itertools

# Developing Simple Front End
label = [sg.Text("Please Enter Link")]
link = [sg.Input(enable_events=True, key='link')]
scrape_btn = [sg.Button("Scrape")]
exit_btn = [sg.Button("Exit")]
select_discipline = ['batting', 'bowling']
dropdown_discipline = [sg.Combo(select_discipline, enable_events=True, key='selection')]

# Developing Layout and Window Pane
layout = [label, link, dropdown_discipline, scrape_btn, exit_btn]
window = sg.Window("PCB Domestic Numbers", layout)

while True:
    event, values = window.read()
    if event == "Scrape":
        combo = values['selection']
        link = values['link']
        if combo == 'batting':
            driver = webdriver.Chrome()
            driver.get(link)
            select = Select(driver.find_element_by_name('new_page_limit'))
            select.select_by_value('all')
            html = driver.page_source
            soup = BeautifulSoup(html, "html.parser")
            target_name = html[html.find("<h1 class="):html.find("</h1>")]
            name = target_name.split('by ')[1]
            link_list = []
            for link in soup.findAll('a'):
                if "match_id" in link.get('href'):
                    link_list = link
                    link_list = str(link_list)
                    start = '="'
                    end = '">'
                    link_list = link_list.split(start)[1].split(end)[0]
                    print(type(link_list))
            driver.close()
            window.close()
        elif combo == 'bowling':
            print(combo)
    if event == "Exit" or event == sg.WIN_CLOSED:
        break

window.close()