Python Forum
how to find a particular id using python and bs4
#1
hi,

I am building a web scraper to scrape some images. The problem is that the ids I am looking for all start with the same text (urltyp) but end with a different number.

Below is the HTML.

 <tr id="urltyp73645" name = "rowCount[]">
                <td class="label" align="center"> Project Multi-unit Image Url </td>
                <td  class=label align="center"><a title="project multi-unit image" href="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg" name="urlDel">
                <img border="0" alt="Tirumala Grandevel" width="150" height="80" src="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg"/></a>
                </td>
                <td class='label' align='center'><input type='button' name='urltyp#73645' class='delSubmit' id='projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg##' value='Delete'/></td>
        </tr>
                                                
I want to find the elements whose id starts with urltyp and then extract their a tags and td cells. How can I do this?

My code:

import csv
from bs4 import BeautifulSoup
import requests
import re

login_url = ""
url = ""

final_data = []
username = ""
password = ""
payload = {"action":url,
                "name":"",
                "username":username,
                "password":password,
                "submit2":""}

def parsedata(url):
    r = requests.get(url)
    data = r.text
    return data

def returndata():
    global login_url, url, final_data
    session_requests = requests.session()
    result = session_requests.post(login_url, payload).text
    result2 = session_requests.get(url).text
    soup = BeautifulSoup(result2, "html.parser")
    get_details = soup.find_all(id="entries")
    for i in get_details:
        trs = i.find_all(id=re.match("^urltyp", i))
        print(trs)
        
returndata()
#2
Try this (untested, so it may not work):
get_details = soup.find_all('tr', {'id': re.match("^urltyp", i)})
#3
I received this error:

Traceback (most recent call last):
  File "C:\Users\prince.bhatia\Desktop\projects\image_delete_insert\Backend.py", line 34, in <module>
    returndata()
  File "C:\Users\prince.bhatia\Desktop\projects\image_delete_insert\Backend.py", line 31, in returndata
    trs = i.find_all("tr",{"id":re.match("^urltyp", i)})
  File "C:\Users\prince.bhatia\AppData\Local\Programs\Python\Python36\lib\re.py", line 172, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or bytes-like object
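
A note on the traceback: re.match(pattern, string) runs a match straight away and needs a string (or bytes) as its second argument, while i in the loop is a bs4 Tag. re.compile only builds a pattern object, which can be handed to find_all as a filter. A minimal stdlib-only sketch of the difference follows; FakeTag is a hypothetical stand-in for a Tag, and the ids list is made up for illustration.

```python
import re


class FakeTag:
    """Hypothetical stand-in for a bs4 Tag; any non-string object behaves the same here."""


# re.match executes immediately and expects a str or bytes as its second argument.
try:
    re.match("^urltyp", FakeTag())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# re.compile does not match anything yet; it only returns a reusable pattern object.
pattern = re.compile("^urltyp")

# Illustrative id values (assumed, not taken from the real page).
ids = ["urltyp73645", "entries", "urltyp1"]
matching = [value for value in ids if pattern.search(value)]

print(error_message)
print(matching)
```

As far as I know, Beautiful Soup applies a compiled pattern to each attribute value in much the same way as the list comprehension above, which is why passing the pattern object works where the result of re.match does not.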

Hey, I changed re.match to re.compile and it is working now. Code:

def returndata():
    global login_url, url, final_data
    session_requests = requests.session()
    result = session_requests.post(login_url, payload).text
    result2 = session_requests.get(url).text
    soup = BeautifulSoup(result2, "html.parser")
    get_details = soup.find_all(id="entries")
    for i in get_details:
        trs = i.find_all("tr",{"id":re.compile("^urltyp")})
        print(trs)
        
returndata()
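
For the second half of the question (getting the a tags and td cells out of each matching row), here is a rough sketch run against a trimmed copy of the HTML from the first post. The wrapping table with id="entries" and the extra "other1" row are assumptions added so the snippet runs on its own; the real page may be structured differently.

```python
import re
from bs4 import BeautifulSoup

# Trimmed sample based on the first post; the <table id="entries"> wrapper
# and the "other1" row are assumed for illustration.
html = """
<table id="entries">
  <tr id="urltyp73645" name="rowCount[]">
    <td class="label" align="center"> Project Multi-unit Image Url </td>
    <td class="label" align="center">
      <a title="project multi-unit image" href="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg" name="urlDel">
        <img border="0" alt="Tirumala Grandevel" width="150" height="80" src="/projects/tirumala__constructions/tirumala_grandevel_/maps/multunit.jpg"/>
      </a>
    </td>
  </tr>
  <tr id="other1"><td>ignored</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
# Keep only <tr> elements whose id starts with "urltyp".
for tr in soup.find_all("tr", id=re.compile("^urltyp")):
    rows.append({
        "id": tr["id"],
        # Text of every cell in the row, with surrounding whitespace stripped.
        "cells": [td.get_text(strip=True) for td in tr.find_all("td")],
        # href of every link in the row that actually has one.
        "links": [a["href"] for a in tr.find_all("a", href=True)],
    })

# Alternative without a regex: a CSS "starts with" attribute selector.
css_ids = [tr["id"] for tr in soup.select('tr[id^="urltyp"]')]

print(rows)
print(css_ids)
```

The select() variant relies on the soupsieve package, which recent bs4 releases install as a dependency; if it is missing, the find_all version above should still work.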
#4
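There is no space after the colon in "id":re.compile(...). That is only a style point (PEP 8 suggests "id": re.compile(...)); Python treats both forms the same, so it does not change the result.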