Sep-20-2018, 08:17 AM
hi,
I am building a web scraper to collect some images. The problem is that the ids I am looking for all start with the same text and end with a different number.
The HTML is below.
<tr id="urltyp73645" name = "rowCount[]">
  <td class="label" align="center"> Project Multi-unit Image Url </td>
  <td class=label align="center">
    <a title="project multi-unit image" href="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg" name="urlDel">
      <img border="0" alt="Tirumala Grandevel" width="150" height="80" src="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg"/>
    </a>
  </td>
  <td class='label' align='center'>
    <input type='button' name='urltyp#73645' class='delSubmit' id='projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg##' value='Delete'/>
  </td>
</tr>

I want to find every element whose id starts with urltyp and then extract its a and td tags. How can I do this?
My code:

import csv
import re

import requests
from bs4 import BeautifulSoup

login_url = ""
url = ""
final_data = []
username = ""
password = ""
payload = {"action": url, "name": "", "username": username, "password": password, "submit2": ""}


def parsedata(url):
    r = requests.get(url)
    data = r.text
    return data


def returndata():
    global login_url, url, final_data
    session_requests = requests.session()
    # Log in first so the session cookie is sent with the next request
    result = session_requests.post(login_url, payload).text
    result2 = session_requests.get(url).text
    soup = BeautifulSoup(result2, "html.parser")
    get_details = soup.find_all(id="entries")
    for i in get_details:
        # find_all needs a compiled pattern here; re.match("^urltyp", i) tried to
        # match against the Tag object itself instead of filtering by id
        trs = i.find_all(id=re.compile("^urltyp"))
        print(trs)


returndata()
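
For reference, here is a minimal sketch of one way to pick out the rows and pull their a and td tags, assuming BeautifulSoup 4 with the built-in html.parser. It runs on a trimmed copy of the row from the post instead of the live page. The wrapping table with id="entries", the extra non-matching row, and the function name extract_urltyp_rows are assumptions made for illustration only.

```python
import re

from bs4 import BeautifulSoup

# Trimmed copy of the row from the post. The <table id="entries"> wrapper and
# the second row are assumed here so the filter has something to skip.
SAMPLE_HTML = """
<table id="entries">
<tr id="urltyp73645" name="rowCount[]">
  <td class="label" align="center">Project Multi-unit Image Url</td>
  <td class="label" align="center">
    <a title="project multi-unit image" href="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg" name="urlDel">
      <img border="0" alt="Tirumala Grandevel" width="150" height="80" src="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg"/>
    </a>
  </td>
</tr>
<tr id="other1"><td>ignored</td></tr>
</table>
"""


def extract_urltyp_rows(html):
    """Return id, td texts and link hrefs for every <tr> whose id is urltyp + digits."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # A compiled pattern passed as the id filter is tested against each tag's id
    for tr in soup.find_all("tr", id=re.compile(r"^urltyp\d+$")):
        rows.append({
            "id": tr["id"],
            "cells": [td.get_text(strip=True) for td in tr.find_all("td")],
            "links": [a["href"] for a in tr.find_all("a", href=True)],
        })
    return rows


rows = extract_urltyp_rows(SAMPLE_HTML)
for row in rows:
    print(row["id"], row["links"])
```

If a CSS selector reads better, soup.select('tr[id^="urltyp"]') should select the same rows by id prefix, although it does not check that the rest of the id is numeric.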