Python Forum
Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to loop to next HTML/new CSV Row
Hello Python Web Scrapers,

This is what I am currently up against, and I was hoping somebody could point me in the right direction.

Python3 + BeautifulSoup4 + lxml (HTML -> CSV):

How do I loop to the next HTML URL and save its data as a new row in the existing .csv file that the current code writes to?

For instance, how would I do the above for this URL?

Next HTML URL: https://law.justia.com/cases/federal/app...66/308423/

Python3 Code:

import csv
from urllib.request import urlopen

from bs4 import BeautifulSoup

url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/"

# Fetch the page and parse it with the lxml parser.
html = urlopen(url)
bsObj = BeautifulSoup(html.read(), "lxml")
allOpinion = bsObj.find_all(id="opinion")
allTitle = bsObj.find_all("title")
allURL = url

print(allOpinion)
print(allTitle)
print(allURL)

# Append one row that keeps the HTML tags.
csvRow = [allOpinion, allTitle, allURL]
csvfile = "current_F2_opinion_with_tags_current.csv"
with open(csvfile, "a", newline="", encoding="utf-8") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)

print(allOpinion[0].get_text(), url)

# Append one row with the tags stripped.
csvRow = [allOpinion[0].get_text(), allTitle[0].get_text(), allURL]
csvfile = "current_F2_opinion_without_tags_current.csv"
with open(csvfile, "a", newline="", encoding="utf-8") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)
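
For reference, here is a rough sketch of the kind of loop I have in mind. It is not a tested solution against the live site. The helper names fetch_html, extract_row and append_rows are made up for illustration, and it assumes every page has an element with id="opinion" and a title tag like the first one does (missing elements are written as empty strings). It uses find() for the first match rather than findAll().

```python
import csv
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download one page and return its raw bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def extract_row(html, url, parser="lxml"):
    """Build one CSV row: opinion text, page title, source URL."""
    soup = BeautifulSoup(html, parser)
    opinion = soup.find(id="opinion")
    title = soup.find("title")
    # Assumption: a missing element becomes an empty cell instead of an error.
    opinion_text = opinion.get_text() if opinion is not None else ""
    title_text = title.get_text() if title is not None else ""
    return [opinion_text, title_text, url]


def append_rows(urls, csvfile, fetch=fetch_html, parser="lxml"):
    """Append one row per URL to csvfile, creating the file if needed."""
    # Mode "a" keeps the rows already in the file; newline="" avoids
    # blank lines between rows on Windows.
    with open(csvfile, "a", newline="", encoding="utf-8") as fp:
        writer = csv.writer(fp, dialect="excel")
        for url in urls:
            writer.writerow(extract_row(fetch(url), url, parser))
```

Calling append_rows(list_of_urls, "current_F2_opinion_without_tags_current.csv") with the full case URLs in list_of_urls would then add one new row per page to the existing file. The fetch argument is only there so the loop can be tried on saved HTML without hitting the site.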
Thank you!

Best Regards,

Brandon Kastning

P.S. - Everyone be safe!