Python Forum

Full Version: Want to scrape a table data and export it into CSV format
Hello
I am trying to scrape table data using the code below. I want the names of all the US states. I have managed to extract all the 'a' tags, but that gives me the text of every link on the page. How can I scrape only the names of the US states?
import bs4
import requests
res = requests.get('https://simple.wikipedia.org/wiki/List_of_U.S._states')
soup = bs4.BeautifulSoup(res.text, 'lxml')
soup.select('a')
for i in soup.select('a'):
	print(i.text)
Your search must be much more specific than just the a tag, as there are a lot of a tags on the site.
Here are some hints, with the loop left out.
from bs4 import BeautifulSoup
import requests

res = requests.get('https://simple.wikipedia.org/wiki/List_of_U.S._states')
soup = BeautifulSoup(res.content, 'lxml')
CSS selectors:
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(3) > th > a')
>>> name
[<a href="/wiki/Alabama" title="Alabama">Alabama</a>]
>>> name[0].text
'Alabama'
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(4) > th > a')
>>> name[0].text
'Alaska'
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(5) > th > a')
>>> name[0].text
'Arizona'
Search by tag name:
>>> flag_name = soup.find_all('th', scope="row")
>>> flag_name[0].text.strip()
'Alabama'
>>> flag_name[1].text.strip()
'Alaska'
>>> flag_name[2].text.strip()
'Arizona' 
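One way the missing loop could look, as a minimal sketch. It runs on a small made-up HTML sample (SAMPLE_HTML and extract_row_headers are hypothetical names, not from the posts above) so it works without a network call, and it uses the built-in html.parser instead of lxml. It assumes the real page still marks each state name with th scope="row", as in the hints.

```python
from bs4 import BeautifulSoup

# Hypothetical sample that mimics the row headers of the states table.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr><th scope="col">Name</th><th scope="col">Capital</th></tr>
    <tr><th scope="row"><a href="/wiki/Alabama" title="Alabama">Alabama</a></th><td>Montgomery</td></tr>
    <tr><th scope="row"><a href="/wiki/Alaska" title="Alaska">Alaska</a></th><td>Juneau</td></tr>
    <tr><th scope="row"><a href="/wiki/Arizona" title="Arizona">Arizona</a></th><td>Phoenix</td></tr>
  </tbody>
</table>
"""

def extract_row_headers(html):
    """Return the stripped text of every <th scope="row"> cell."""
    soup = BeautifulSoup(html, "html.parser")
    # The list comprehension is the loop over all matching header cells.
    return [th.get_text(strip=True) for th in soup.find_all("th", scope="row")]

states = extract_row_headers(SAMPLE_HTML)
print(states)
```

With the real page, the same function could be called on res.content from the requests call above.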
(Oct-19-2019, 11:51 AM) snippsat wrote: Your search must be much more specific than just the a tag [...]
OK, got it, all clear. But how do I export that to a CSV file?
(Oct-20-2019, 10:33 AM) tahir1990 wrote: OK, got it, all clear. But how do I export that to a CSV file?
Show what you have tried.

If you save the names to a list, you can e.g. use the csv module.
import csv

lst = ['Alabama', 'Alaska', 'Arizona']
with open('filename.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(lst) 
Output:
Alabama,Alaska,Arizona
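The example above puts all names on a single row. If one state per row is wanted instead, a sketch along these lines might do. The function names, the "state" header and the temp-folder demo path are my own assumptions for illustration, not something from the thread.

```python
import csv
import os
import tempfile

def write_states(path, states):
    """Write a header row, then one state name per row."""
    # newline='' lets the csv module control line endings itself
    # (this avoids blank lines between rows on Windows).
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["state"])  # hypothetical header name
        writer.writerows([name] for name in states)

def read_rows(path):
    """Read the file back as a list of rows."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))

# Demo: write to a temporary folder so no existing file is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "usstates.csv")
write_states(demo_path, ["Alabama", "Alaska", "Arizona"])
rows = read_rows(demo_path)
print(rows)
```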
(Oct-20-2019, 02:32 PM) snippsat wrote: If you save the names to a list, you can e.g. use the csv module. [...]
OK. To read or write that file I saved it as usstates.csv, but I am getting this error. I think I saved the file in the wrong location. Which directory do I need to save it in? I am using the Python 3.7.4 shell.
Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    with open('usstates.csv', 'r') as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: 'usstates.csv'
If you give no path to usstates.csv, the file needs to be in the same folder as your .py file (more precisely, the folder you run the code from).
Or you can give the full path to where the file is, e.g. r'C:\foo\usstates.csv'.
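To see which folder a bare file name is looked up in, something like this sketch can help. The helper name describe_path is made up for illustration; it only reports where Python would look and whether the file is there.

```python
from pathlib import Path

def describe_path(name):
    """Return the absolute path a name resolves to, and whether it exists."""
    full = Path(name).resolve()  # relative names are resolved against the working directory
    return full, full.exists()

print("Working directory:", Path.cwd())
full, found = describe_path("usstates.csv")
print(full, "exists" if found else "not found")
```

If it prints "not found", either move the file into the working directory shown or pass the full path to open().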
(Oct-21-2019, 02:52 PM) snippsat wrote: If you give no path to usstates.csv, the file needs to be in the same folder as your .py file. [...]

You mean in the folder where I installed Python?
(Oct-21-2019, 04:41 PM) tahir1990 wrote: You mean in the folder where I installed Python?
No, where you have placed usstates.csv.
Example:
with open("usstates.csv", "r") as f:
    print(f.read())
C:\code
λ python my_file.py
Traceback (most recent call last):
  File "my_file.py", line 1, in <module>
    with open("usstates.csv", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'usstates.csv'
Here I try to open usstates.csv, but it is not in the C:\code folder; if it were in this folder there would be no error.
If I give the correct path to where the file is placed, there is no error.
with open(r"C:\foo\usstates.csv", "r") as f:
    print(f.read())
C:\code
λ python my_file.py
hello
(Oct-21-2019, 06:12 PM) snippsat wrote: No, where you have placed usstates.csv. [...]
I have saved the file in C:\Program Files and entered the code below. I am still getting an error.
with open(r"C:\programfiles\usstates.csv","r") as myfile:
	print(myfile.read())

	
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    with open(r"C:\programfiles\usstates.csv","r") as myfile:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\programfiles\\usstates.csv'
There is a space between Program and Files in C:\Program Files\.

Why you would put your CSV file there is an entirely different and legitimate question...
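To make the difference visible, here is a small sketch that only compares the two path strings and does not touch the disk. PureWindowsPath is used so it runs on any OS; the variable names are just for illustration.

```python
from pathlib import PureWindowsPath

# The path from the failing call: no folder named "programfiles" exists.
wrong = PureWindowsPath(r"C:\programfiles\usstates.csv")

# The folder name as Windows spells it, with the space.
right = PureWindowsPath(r"C:\Program Files") / "usstates.csv"

print(wrong.parts)
print(right.parts)
print(str(right))  # this is the string to pass to open() on Windows
```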