Python Forum

How to append multiple <class 'str'> into a single List
I have a variable called "link_list" that holds game-id URLs which I scraped from a sports website with Python. Printing it inside my loop gives terminal output like this:

print(link_list)


https://www.pcb.com.pk/match_detail.php?match_id=21782
https://www.pcb.com.pk/match_detail.php?match_id=21790
https://www.pcb.com.pk/match_detail.php?match_id=21798
https://www.pcb.com.pk/match_detail.php?match_id=21812
https://www.pcb.com.pk/match_detail.php?match_id=21822
If I print its type with print(type(link_list)), it shows <class 'str'> for each line:

print(type(link_list))

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
Now what I want is to merge these results (URLs) into a single list, so that I can later run a Selenium code snippet to get the relevant data from each of them. How do I go about doing this? Any assistance on the matter would be appreciated.


Also, here's the code I use to get these values (for reference only):

#Importing all the necessary Libraries
import PySimpleGUI as sg
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import urllib.request
from selenium.webdriver.support.ui import Select
import pandas as pd
import re
import itertools


# Developing Simple Front End
label = [sg.Text("Please Enter Link")]
link = [sg.Input(enable_events=True, key='link')]
scrape_btn = [sg.Button("Scrape")]
exit_btn = [sg.Button("Exit")]
select_discipline = ['batting', 'bowling']
dropdown_discipline = [sg.Combo(select_discipline, enable_events=True, key='selection')]

#Developing Layout and Window Pane
layout = [label, link, dropdown_discipline, scrape_btn, exit_btn]
window = sg.Window("PCB Domestic Numbers", layout)


while True:
	event, values = window.read()
	
	if event == "Scrape":
		combo = values['selection']
		link = values['link']
		if combo == 'batting':
			driver = webdriver.Chrome()
			driver.get(link)
			select = Select(driver.find_element_by_name('new_page_limit'))
			select.select_by_value('all')
			html = driver.page_source
			soup = BeautifulSoup(html, "html.parser")
			target_name = html[html.find("<h1 class="):html.find("</h1>")]
			name = target_name.split('by ')[1]
			link_list = []
			for link in soup.findAll('a'):
				if "match_id" in link.get('href'):
					link_list = link
					link_list = str(link_list)
					start = '="'
					end = '">'
					link_list = link_list.split(start)[1].split(end)[0]
					print(type(link_list))
			driver.close()
			window.close()

		elif combo == 'bowling':
			print(combo)
		

	if event == "Exit" or event == sg.WIN_CLOSED:
		break

window.close()
In your loop where you're setting link_list, append it to a list that is outside the loop.

You've created such a list on line 41. But line 41 is useless because the variable is overwritten later by line 44.
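To see why the type comes out as str, here is a minimal sketch of the two behaviours side by side. The names overwritten and collected and the sample strings are made up for illustration; they only stand in for the values your loop produces.

```python
# Hypothetical stand-ins for the strings produced inside the loop
urls = ["match_id=1", "match_id=2", "match_id=3"]

# Rebinding: each pass throws away the list and keeps a single string
overwritten = []
for url in urls:
    overwritten = url
print(type(overwritten), overwritten)  # <class 'str'> match_id=3

# Appending: the list created before the loop grows by one item per pass
collected = []
for url in urls:
    collected.append(url)
print(type(collected), collected)
```

The first loop ends with only the last string, which matches what you're seeing; the second ends with a list of all three.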

Perhaps a bit more like...
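            start = '="'
            end = '">'
            link_list = []  # create the list once, before the loop
            for link in soup.find_all('a'):
                # default to '' so anchors without an href don't raise a TypeError
                if "match_id" in link.get('href', ''):
                    link_list.append(str(link).split(start)[1].split(end)[0])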
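A possible variation, offered only as a rough sketch: since link.get('href') already returns the URL, you could append that value directly instead of slicing the tag's string form, and skip anchors that have no href at all (for those, link.get('href') returns None). The helper name collect_match_links and the sample values below are assumptions for illustration, not part of your code. With BeautifulSoup you would probably feed it something like (a.get('href') for a in soup.find_all('a')).

```python
def collect_match_links(hrefs, marker="match_id"):
    """Return a list of the hrefs containing `marker`, skipping missing ones."""
    link_list = []  # created once, before the loop
    for href in hrefs:
        # href is None for <a> tags that have no href attribute
        if href and marker in href:
            link_list.append(href)  # append keeps the earlier results
    return link_list


# Sample values standing in for what a.get('href') might return;
# the second URL path is hypothetical
sample_hrefs = [
    "https://www.pcb.com.pk/match_detail.php?match_id=21782",
    None,
    "https://www.pcb.com.pk/some_other_page.php",
    "https://www.pcb.com.pk/match_detail.php?match_id=21790",
]

links = collect_match_links(sample_hrefs)
print(type(links))
for url in links:
    print(url)
```

The resulting list can then be looped over later, for example to call driver.get(url) on each entry.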
(Jan-07-2021, 07:02 AM) bowlofred wrote: In your loop where you're setting link_list, append it to a list that is outside the loop.


Thank you so much, it solved the problem as intended. Much appreciated :)