Python Forum
Scraping Wikipedia Article (Name in 1 column & URL in 2nd column) ->CSV! Anyone?

Targeted Columns & All Links

Does anyone know how to accomplish this feat?

I followed this tutorial/blog:

https://www.kindacode.com/article/extrac...ul-soup-4/

This code:

import requests
from urllib.parse import urljoin

# The Beautiful Soup package is imported under its module name, bs4
import bs4

URL = 'https://en.wikipedia.org/wiki/List_of_counties_in_Washington'

# Fetch the HTML source from the URL
response = requests.get(URL)

# Parse the HTML and collect every <a> tag on the page
soup = bs4.BeautifulSoup(response.text, 'html.parser')
links = soup.select('a')

# Print out the result
for link in links:
  print(link.get_text())
  href = link.get('href')
  if href is not None:
    # Convert relative URLs to absolute URLs; absolute URLs pass through unchanged
    print(urljoin(URL, href))

  print('----------------------------') # Just a separator line

This prints each link's name with its URL on the line below it, such as:

Counties
----------------------------
Adams
https://en.wikipedia.org/wiki/Adams_County,_Washington
----------------------------
Asotin
https://en.wikipedia.org/wiki/Asotin_County,_Washington
----------------------------
Benton
https://en.wikipedia.org/wiki/Benton_County,_Washington
----------------------------
I would like to store the name in one column and the URL in a second column, and write the result to a CSV file.
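
One possible way to get there, as a minimal sketch rather than a tested solution: collect (name, URL) pairs instead of printing them, then write them out with the standard library's csv module. The helper names to_rows, write_csv and scrape_to_csv, the header row name/url, and the output file links.csv are all made up for illustration. Unlike the code above, this sketch skips links that have no text or no href, which is an assumption about what belongs in the CSV.

```python
import csv
from urllib.parse import urljoin

BASE_URL = "https://en.wikipedia.org/wiki/List_of_counties_in_Washington"


def to_rows(pairs, base_url=BASE_URL):
    """Turn (text, href) pairs into (name, absolute_url) rows."""
    rows = []
    for text, href in pairs:
        name = text.strip()
        if not name or not href:
            continue  # assumption: drop anchors with no visible text or no href
        # urljoin resolves relative hrefs against the page URL and
        # leaves absolute URLs unchanged
        rows.append((name, urljoin(base_url, href)))
    return rows


def write_csv(rows, path):
    """Write the rows to a two-column CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "url"])  # hypothetical column names
        writer.writerows(rows)


def scrape_to_csv(url=BASE_URL, path="links.csv"):
    """Fetch the page, collect every <a> tag, and save name/URL pairs."""
    # Third-party imports are kept local so the two helpers above
    # can be used without requests or bs4 installed.
    import requests
    import bs4

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = bs4.BeautifulSoup(response.text, "html.parser")
    pairs = [(a.get_text(), a.get("href")) for a in soup.select("a")]
    write_csv(to_rows(pairs, url), path)
```

Calling scrape_to_csv() should leave a links.csv file in the working directory with the name in the first column and the URL in the second. The csv module quotes fields automatically, which matters here because URLs such as .../Adams_County,_Washington contain commas.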

Thank you everyone for this forum! I will append this thread if I start to find answers, rather than replies as I was doing earlier in error! Appreciate the correction and forum/assistance!

Best Regards,

Brandon Kastning
“And one of the elders saith unto me, Weep not: behold, the Lion of the tribe of Juda, the Root of David, hath prevailed to open the book,...” - Revelation 5:5 (KJV)

“And oppress not the widow, nor the fatherless, the stranger, nor the poor; and ...” - Zechariah 7:10 (KJV)

#LetHISPeopleGo

Messages In This Thread
Scraping Wikipedia Article (Name in 1 column & URL in 2nd column) ->CSV! Anyone? - by BrandonKastning - Jan-19-2022, 11:39 PM
