Python Forum
Python the regex not getting any attributes
#1
Hello,

I'm trying to pull the image URLs from a document. I need all of them; sometimes there are 2 or 3, maybe 5.

Here is my document:

<div class="gallery "><picture class="gallery__item" style="z-index: 2; transform: translateX(0%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(-100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" class=" progressive-image--loaded   "></picture></div>
**Here is my Python code:**

all_scripts = soup.find_all('picture')
print(len(all_scripts))

for scripts in all_scripts:

    image = re.search('<source srcset="([^"]+)" type="image/jpeg"[^}]+>', scripts.text);

    print(image)
The code finds all the picture elements, but when I try to pull only the URLs with type "image/jpeg", it doesn't work.

There is no exception; every search just returns None, so the output is:

None None None
#2
What is the URL that you are trying to pull the images from? Regex does not play well with HTML.
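
As for why every search comes back as None: the .text of a BeautifulSoup tag holds only the text between the tags, and a picture element has none. The URLs live in attributes, so the regex is searching an empty string. Reading the attributes with a parser avoids the regex altogether. Below is a minimal sketch using only the standard library's html.parser so it runs without extra packages; the class name JpegSourceCollector and the shortened example.com URLs are made up for illustration and are not from the original page.

```python
from html.parser import HTMLParser


class JpegSourceCollector(HTMLParser):
    """Collect the srcset of every <source type="image/jpeg"> tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        attributes = dict(attrs)
        if tag == "source" and attributes.get("type") == "image/jpeg":
            srcset = attributes.get("srcset")
            if srcset:
                self.urls.append(srcset)


# Shortened stand-in for the gallery markup (hypothetical URLs)
html = (
    '<div class="gallery">'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/a.webp" type="image/webp">'
    '<source srcset="https://example.com/a.jpg" type="image/jpeg">'
    '<img src="https://example.com/a.jpg">'
    '</picture>'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/b.webp" type="image/webp">'
    '<source srcset="https://example.com/b.jpg" type="image/jpeg">'
    '<img src="https://example.com/b.jpg">'
    '</picture>'
    '</div>'
)

collector = JpegSourceCollector()
collector.feed(html)
collector.close()

jpeg_urls = collector.urls
print(jpeg_urls)
```

Since you already have a soup object, the same idea there would be soup.find_all('source', type='image/jpeg') and then reading source['srcset'] from each result. If you really want to keep the regex, search str(scripts) instead of scripts.text so the pattern sees the markup.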