Python Forum
Python the regex not getting any attributes
#1
Hello,

I'm trying to pull the image URLs from a document. I need to get all of the URLs; sometimes there are 2 or 3, sometimes maybe 5.

Here is my document:

<div class="gallery "><picture class="gallery__item" style="z-index: 2; transform: translateX(0%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(-100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" class=" progressive-image--loaded   "></picture></div>
Here is my Python code:

all_scripts = soup.find_all('picture')
print(len(all_scripts))

for scripts in all_scripts:

    image = re.search('<source srcset="([^"]+)" type="image/jpeg"[^}]+>', scripts.text);

    print(image)
I'm getting all the picture elements with this code, but when I try to pull only the image URLs with type "image/jpeg", it's not working.

Instead of the URLs, every search returns None, and I get output like this:

Error:
None None None
#2
What is the URL that you are trying to pull the images from? Regex does not play well with HTML.
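
A likely cause of the None results: `scripts.text` returns only the text nodes inside each picture element, and these elements contain nothing but tags, so `re.search` is probably being run against an empty string. Reading the attributes through a parser avoids the regex entirely. Below is a minimal sketch that uses only the standard library's `html.parser`, so it runs without any installs. The names `JpegSourceParser` and `extract_jpeg_urls` and the example.com sample markup are made up for illustration, and the sketch assumes each `srcset` holds a single URL, as in the document posted above.

```python
from html.parser import HTMLParser


class JpegSourceParser(HTMLParser):
    """Collect srcset values from <source type="image/jpeg"> tags.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        attributes = dict(attrs)
        if tag == "source" and attributes.get("type") == "image/jpeg":
            srcset = attributes.get("srcset")
            if srcset:
                # Assumption: srcset holds one URL with no width/density descriptor.
                self.urls.append(srcset)


def extract_jpeg_urls(html):
    """Return the srcset of every JPEG <source> tag, in document order."""
    parser = JpegSourceParser()
    parser.feed(html)
    parser.close()
    return parser.urls


# Shortened stand-in for the gallery markup (URLs are placeholders).
sample_html = (
    '<div class="gallery ">'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/a.webp" type="image/webp">'
    '<source srcset="https://example.com/a.jpg" type="image/jpeg">'
    '<img src="https://example.com/a.jpg">'
    '</picture>'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/b.webp" type="image/webp">'
    '<source srcset="https://example.com/b.jpg" type="image/jpeg">'
    '<img src="https://example.com/b.jpg">'
    '</picture>'
    '</div>'
)

for url in extract_jpeg_urls(sample_html):
    print(url)
```

If you would rather stay with BeautifulSoup, the same idea should work with `soup.select('source[type="image/jpeg"]')` and reading `tag["srcset"]` from each result. If you really want to keep the regex, it needs the markup itself, so `str(scripts)` rather than `scripts.text`.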