Jun-07-2020, 12:33 PM
Hi everyone!
Please be warned: I am a doctoral candidate who has realized that there is no way around learning Python, and I have come across numerous roadblocks that I hope you can help me with.
I have managed to scrape/crawl the Twitter feeds of selected users, but now I am looking to extract all articles from WordPress pages (excluding images), primarily the following fields:
- Title
- Article Link
- Time & Date
- Text
Ideally the output would be a CSV or Excel file.
I have come across the following websites:
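To make the target output concrete, here is a minimal sketch of the "four fields into a CSV" step, using only the standard library. It is not a working scraper: it assumes each article is already available as a dictionary shaped like the JSON returned by the WordPress REST API (`/wp-json/wp/v2/posts`), and whether a given site exposes that endpoint is an assumption to verify. The helper names (`strip_tags`, `posts_to_csv`) and the sample post are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags, so <img> and similar are dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment with whitespace normalized.

    Text nodes are joined with a space, which keeps paragraphs apart but
    would also split a word that has inline markup in the middle of it.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def posts_to_csv(posts, out):
    """Write one CSV row per post: title, link, date, text."""
    writer = csv.writer(out)
    writer.writerow(["title", "link", "date", "text"])
    for post in posts:
        writer.writerow([
            strip_tags(post["title"]["rendered"]),
            post["link"],
            post["date"],
            strip_tags(post["content"]["rendered"]),
        ])


# Hypothetical sample, shaped like one item of a /wp-json/wp/v2/posts response.
sample_posts = [{
    "title": {"rendered": "Example <em>post</em>"},
    "link": "https://example.com/2020/06/07/example-post/",
    "date": "2020-06-07T12:33:00",
    "content": {"rendered": "<p>First paragraph.</p><img src='x.png'><p>Second.</p>"},
}]

buffer = io.StringIO()
posts_to_csv(sample_posts, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of an in-memory buffer, pass `open("articles.csv", "w", newline="", encoding="utf-8")` as `out`; Excel can open the resulting CSV directly.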
https://indianpythonista.wordpress.com/2...iful-soup/
https://www.digitalocean.com/community/t...d-python-3
https://zach-adams.com/2015/04/python-sc...wordpress/
But I am truly struggling to get any of this code, in any of its variants, to work. (Scrapy won't install in PyCharm, so I resorted to BeautifulSoup.)
A sample of the websites I want to scrape (subsections in particular may use infinite scrolling):
1) https://electrek.co/guides/tesla/
2) https://www.teslarati.com/tag/tesla/
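On the infinite-scrolling point: WordPress archive pages are often also reachable as numbered pages under `/page/N/`, which avoids having to simulate scrolling. Whether the two sites above follow that convention is an assumption that should be checked in the browser first. The sketch below only generates candidate URLs; the function name `archive_page_urls` is hypothetical, and the caller would need to stop once a page returns an error or contains no articles.

```python
from itertools import islice


def archive_page_urls(base_url):
    """Yield the archive URL, then its /page/2/, /page/3/, ... variants.

    Assumes the conventional WordPress pagination scheme; the generator is
    endless, so the caller decides when to stop.
    """
    base = base_url.rstrip("/") + "/"
    yield base
    page = 2
    while True:
        yield f"{base}page/{page}/"
        page += 1


# Take only the first three candidate URLs for illustration.
first_three = list(islice(archive_page_urls("https://www.teslarati.com/tag/tesla/"), 3))
```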
Would one of you be able to give me a hand adapting one of the BeautifulSoup scripts to the two sample pages above? I would take it from there and use it on other WordPress blogs, but I need a starting hand!
Appreciate your time! Have a great weekend and stay safe.