Nov-06-2022, 10:17 PM
I'm new to scraping and am working on a scraper. I'm trying to figure out how to pull the URL from the src attribute of this line of HTML:
<img alt="Bube Dame König Gras [Import allemand]" src="https://m.media-amazon.com/images/I/81f+DecFsrL._SY445_.jpg" data-old-hires="https://m.media-amazon.com/images/I/81f+DecFsrL._SL1500_.jpg" onload="markFeatureRenderForImageBlock(); this.onload='';setCSMReq('af');if(typeof addlongPoleTag === 'function'){ addlongPoleTag('af','desktop-image-atf-marker');};setCSMReq('cf')" class="a-dynamic-image a-stretch-vertical" id="landingImage" data-a-dynamic-image="{&quot;https://m.media-amazon.com/images/I/81f+DecFsrL._SY679_.jpg&quot;:[679,480],&quot;https://m.media-amazon.com/images/I/81f+DecFsrL._SY550_.jpg&quot;:[550,389],&quot;https://m.media-amazon.com/images/I/81f+DecFsrL._SY445_.jpg&quot;:[445,315],&quot;https://m.media-amazon.com/images/I/81f+DecFsrL._SY500_.jpg&quot;:[500,353],&quot;https://m.media-amazon.com/images/I/81f+DecFsrL._SY606_.jpg&quot;:[606,428]}" style="max-width: 160.471px; max-height: 227px;"> </div>
I haven't had any luck targeting this element to get the src. I'd settle for copying the HTML, converting it to text, and searching for the URL that way; that might actually work better. I'm using requests-html, though BeautifulSoup might handle this better. Any help would be appreciated!
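For reference, here is a minimal sketch of the "treat the HTML as text and find the tag" idea, assuming the page source is already available as a string. It uses Python's standard-library html.parser rather than requests-html or BeautifulSoup, and it matches on the id="landingImage" attribute from the snippet above. The names ImgSrcFinder and find_img_src are made up for illustration, and the sample string is a shortened copy of the tag, not a real response.

```python
from html.parser import HTMLParser


class ImgSrcFinder(HTMLParser):
    """Record the src of the first <img> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.src = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "img" and self.src is None:
            attr_map = dict(attrs)
            if attr_map.get("id") == self.target_id:
                self.src = attr_map.get("src")


def find_img_src(html_text, target_id="landingImage"):
    """Return the matching image's src, or None if no such tag exists."""
    parser = ImgSrcFinder(target_id)
    parser.feed(html_text)
    return parser.src


# Shortened stand-in for the tag quoted in the post (hypothetical input).
sample = (
    '<div><img alt="cover" '
    'src="https://m.media-amazon.com/images/I/81f+DecFsrL._SY445_.jpg" '
    'class="a-dynamic-image a-stretch-vertical" id="landingImage"></div>'
)
print(find_img_src(sample))
```

If BeautifulSoup is installed, the equivalent lookup would probably be something like soup.find("img", id="landingImage")["src"]. Either way, this only works if the tag is present in the HTML that was actually downloaded, which may not be the case if the page builds it with JavaScript.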