Python Forum

Full Version: Python the regex not getting any attributes
Hello,

I'm trying to pull the image URLs from a document. I need to get all of the URLs; sometimes there are 2 or 3, maybe 5.

Here is my document:

<div class="gallery "><picture class="gallery__item" style="z-index: 2; transform: translateX(0%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/e4c815cff704b8a502fdfc1c8a4b6cd0/668/452/MODE/86943b/7276805-7bb32o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/4ac824eac7510c981a471a926a4f1fe5/668/452/MODE/6e317a/7276805-b7735o.jpg" class=" progressive-image--loaded   "></picture><picture class="gallery__item" style="z-index: 2; transform: translateX(-100%); transition-duration: 0ms;"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.webp" type="image/webp"><source srcset="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" type="image/jpeg"><img src="https://m.propertyfinder.ae/property/95974697cb9c202b4713283ab7a5eb8c/668/452/MODE/789e34/7276805-861c8o.jpg" class=" progressive-image--loaded   "></picture></div>
**Here is my Python code:**

all_scripts = soup.find_all('picture')
print(len(all_scripts))

for scripts in all_scripts:

    image = re.search('<source srcset="([^"]+)" type="image/jpeg"[^}]+>', scripts.text);

    print(image)
I'm getting all the picture elements with this code, but when I try to pull only the images with type "image/jpeg", it's not working.

Instead of the URLs, I'm getting output like this:

Error:
None None None
What is the URL that you are trying to pull the images from? Regex does not play well with HTML.
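
A likely reason for the `None` results: `scripts.text` returns only the text between the tags, and a `<picture>` element that contains nothing but `<source>` and `<img>` tags has no text, so the regex is searching an empty string. Below is a minimal sketch of one way to get the JPEG URLs without a regex, by asking BeautifulSoup for the `<source>` tags directly. The `SAMPLE` markup is a shortened stand-in for the gallery above with placeholder `example.com` URLs, and `jpeg_urls` is a hypothetical helper name, not something from the original post.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the gallery markup in the question.
# The example.com URLs are placeholders, not the real image URLs.
SAMPLE = (
    '<div class="gallery ">'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/a.webp" type="image/webp">'
    '<source srcset="https://example.com/a.jpg" type="image/jpeg">'
    '<img src="https://example.com/a.jpg">'
    '</picture>'
    '<picture class="gallery__item">'
    '<source srcset="https://example.com/b.webp" type="image/webp">'
    '<source srcset="https://example.com/b.jpg" type="image/jpeg">'
    '<img src="https://example.com/b.jpg">'
    '</picture>'
    '</div>'
)


def jpeg_urls(html):
    """Return the srcset value of every <source type="image/jpeg"> tag."""
    soup = BeautifulSoup(html, "html.parser")
    # Filter on the tag name and the type attribute instead of using a regex.
    sources = soup.find_all("source", attrs={"type": "image/jpeg"})
    return [s["srcset"] for s in sources if s.has_attr("srcset")]


if __name__ == "__main__":
    soup = BeautifulSoup(SAMPLE, "html.parser")
    # The <picture> element holds only tags, so its .text is an empty string.
    print(repr(soup.find("picture").text))
    for url in jpeg_urls(SAMPLE):
        print(url)
```

If the regex has to stay, searching `str(scripts)` (the element's markup) rather than `scripts.text` would at least give it some HTML to match against, but the attribute lookup above does not depend on attribute order or spacing.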