 Python3 + BeautifulSoup4 + lxml (HTML -> CSV) - How to loop to next HTML/new CSV Row
Hello Python Web Scrapers,

This is what I am currently up against, and I was hoping somebody could point me in the right direction.

Python3 + BeautifulSoup4 + lxml (HTML -> CSV):

How do I loop to the next HTML URL and save its data as a new row in the existing .csv file that the current code writes to?

For instance, how would I make the code below do that for this URL?

Next HTML URL: https://law.justia.com/cases/federal/app...66/308423/

Python3 Code:

import csv
from urllib.request import urlopen

from bs4 import BeautifulSoup

url = "http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/"

# Download the page and parse it, naming the lxml parser explicitly.
html = urlopen(url)
bsObj = BeautifulSoup(html.read(), "lxml")

# Collect the element(s) with id="opinion" and the <title> tag(s).
allOpinion = bsObj.find_all(id="opinion")
allTitle = bsObj.find_all("title")
allURL = url

print(allOpinion)
print(allTitle)
print(allURL)

# CSV 1: append one row that keeps the HTML tags.
csvRow = [allOpinion, allTitle, allURL]
csvfile = "current_F2_opinion_with_tags_current.csv"
with open(csvfile, "a", newline="", encoding="utf-8") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)

print(allOpinion[0].get_text(), url)

# CSV 2: append one row of plain text with the tags stripped.
csvRow = [allOpinion[0].get_text(), allTitle[0].get_text(), allURL]
csvfile = "current_F2_opinion_without_tags_current.csv"
with open(csvfile, "a", newline="", encoding="utf-8") as fp:
    wr = csv.writer(fp, dialect='excel')
    wr.writerow(csvRow)
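
To make the question concrete, here is a rough sketch of the kind of loop I have in mind. It is only a sketch under assumptions, not a tested solution: the helper names fetch_html, extract_row and append_rows are made up for illustration, it uses Python's built-in "html.parser" so it runs without lxml (passing "lxml" should also work if it is installed), and it writes only the plain-text version of each row.

```python
import csv
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download one page and return its raw bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def extract_row(html, url, parser="html.parser"):
    """Build one CSV row: [opinion text, page title, url].

    Missing elements become empty strings, so one odd page
    does not stop the whole loop.
    """
    soup = BeautifulSoup(html, parser)
    opinion = soup.find(id="opinion")
    title = soup.find("title")
    opinion_text = opinion.get_text() if opinion else ""
    title_text = title.get_text() if title else ""
    return [opinion_text, title_text, url]


def append_rows(urls, csv_path, fetch=fetch_html):
    """Fetch each URL in turn and append one row per page to csv_path."""
    # Open the file once in append mode; newline="" avoids blank lines on Windows.
    with open(csv_path, "a", newline="", encoding="utf-8") as fp:
        writer = csv.writer(fp, dialect="excel")
        for url in urls:
            writer.writerow(extract_row(fetch(url), url))
```

It would be called with a list of full URLs, for example append_rows(["http://law.justia.com/cases/federal/appellate-courts/F2/999/663/308588/"], "current_F2_opinion_without_tags_current.csv"), adding the next case URL to the list. The fetch parameter is there only so the loop can be tried against saved HTML without hitting the site.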

Thank you!

Best Regards,

Brandon Kastning

P.S. - Everyone be safe!