Apr-22-2020, 09:16 PM
What I want to do is feed in a site's sitemap (taken from robots.txt). It should then pull out all the URLs, get the response code for each one, and save this to a CSV file.
I can do part of this in a different way, but I can't save to CSV or get the response code for each URL individually.
Also, sitemaps generated by plugins like All in One SEO are structured differently from a manually created sitemap.
The code I use:
import requests
import pandas as pd
import xmltodict
from bs4 import BeautifulSoup

url = 'https://site.com/postsitemap.xml'
page = requests.get(url)
print('Sitemap response code: %s' % page.status_code)

# Parse the sitemap XML into a dict so loc/lastmod pairs can be collected
raw = xmltodict.parse(page.content)
data = [[r["loc"], r["lastmod"]] for r in raw["urlset"]["url"]]
print("Sitemap URL count:", len(data))
df = pd.DataFrame(data, columns=["links", "lastmod"])

# Extract every <loc> URL and append it to sitemap.txt
sitemap_index = BeautifulSoup(page.content, 'html.parser')
print('Created %s object' % type(sitemap_index))
urls = [element.text for element in sitemap_index.find_all('loc')]

with open("sitemap.txt", "a+") as d:
    for link in sorted(urls):
        d.write(link + "\n")

with open("sitemap.txt") as f:
    print(f.read())
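For the full goal (response code per URL, saved to CSV, and coping with plugin-generated sitemap indexes), here is a minimal sketch of one possible approach. The sitemap URL and the output file name are placeholders, and it assumes the plugin sitemap is a standard sitemap index that lists child sitemaps in <sitemap><loc> tags.

import csv

import requests
from bs4 import BeautifulSoup


def sitemap_urls(sitemap_url):
    """Return page URLs from a sitemap, following nested sitemap indexes."""
    page = requests.get(sitemap_url)
    soup = BeautifulSoup(page.content, "html.parser")
    # A sitemap index (as plugins like All in One SEO produce) lists child
    # sitemaps in <sitemap> tags; a normal sitemap lists pages in <url> tags.
    if soup.find("sitemapindex"):
        urls = []
        for child in soup.find_all("sitemap"):
            loc = child.find("loc")
            if loc:
                urls.extend(sitemap_urls(loc.text.strip()))
        return urls
    return [loc.text.strip() for loc in soup.find_all("loc")]


urls = sitemap_urls("https://site.com/sitemap.xml")  # placeholder URL

with open("sitemap_status.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "status_code"])
    for link in sorted(urls):
        response = requests.get(link)  # one request per URL to get its own code
        writer.writerow([link, response.status_code])

Each URL is requested individually, so the CSV ends up with one url,status_code row per page; for a large sitemap you would probably want to reuse a requests.Session() and add a timeout.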