Posts: 8
Threads: 2
Joined: Oct 2019
Hello,
I am trying to scrape table data using the code below. I want the names of all the US states. I've managed to extract all the 'a' tags, but that gives me the text of every link on the page. How can I scrape only the names of the US states?
import bs4
import requests
res = requests.get('https://simple.wikipedia.org/wiki/List_of_U.S._states')
soup = bs4.BeautifulSoup(res.text, 'lxml')
soup.select('a')
for i in soup.select('a'):
    print(i.text)
Posts: 7,324
Threads: 123
Joined: Sep 2016
Oct-19-2019, 11:51 AM
(This post was last modified: Oct-19-2019, 11:51 AM by snippsat.)
Your search must be much more specific than just the a tag, as there are a lot of a tags on the page.
Here are some hints, with the loop left out for you to add.
from bs4 import BeautifulSoup
import requests
res = requests.get('https://simple.wikipedia.org/wiki/List_of_U.S._states')
soup = BeautifulSoup(res.content, 'lxml')
CSS selectors:
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(3) > th > a')
>>> name
[<a href="/wiki/Alabama" title="Alabama">Alabama</a>]
>>> name[0].text
'Alabama'
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(4) > th > a')
>>> name[0].text
'Alaska'
>>> name = soup.select('#mw-content-text > div > table > tbody > tr:nth-child(5) > th > a')
>>> name[0].text
'Arizona'
Search by tag name:
>>> flag_name = soup.find_all('th', scope="row")
>>> flag_name[0].text.strip()
'Alabama'
>>> flag_name[1].text.strip()
'Alaska'
>>> flag_name[2].text.strip()
'Arizona'
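The missing loop could look something like the sketch below. To keep it runnable without a network call, it parses a small hand-written HTML sample that only imitates the th scope="row" layout seen in the hints above; SAMPLE_HTML and the helper name state_names are made up for illustration, and 'html.parser' is used here only so the sketch does not depend on lxml being installed.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature of the states table; the real page has one such row per state.
SAMPLE_HTML = """
<table>
  <tr><th scope="col">Name</th><th scope="col">Capital</th></tr>
  <tr><th scope="row"><a href="/wiki/Alabama" title="Alabama">Alabama</a></th><td>Montgomery</td></tr>
  <tr><th scope="row"><a href="/wiki/Alaska" title="Alaska">Alaska</a></th><td>Juneau</td></tr>
  <tr><th scope="row"><a href="/wiki/Arizona" title="Arizona">Arizona</a></th><td>Phoenix</td></tr>
</table>
"""

def state_names(html):
    """Return the stripped text of every <th scope="row"> cell in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    return [th.text.strip() for th in soup.find_all('th', scope='row')]

names = state_names(SAMPLE_HTML)
for name in names:
    print(name)
```

With the real page you would pass res.content from the requests call instead of SAMPLE_HTML. Whether every th with scope="row" on the live page is a state name is an assumption worth checking against the output.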
Posts: 8
Threads: 2
Joined: Oct 2019
OK, got it, all clear. But how do I export that into a CSV file?
Posts: 7,324
Threads: 123
Joined: Sep 2016
(Oct-20-2019, 10:33 AM)tahir1990 Wrote: ok got it all clear. But how to export that into CSV file?
Show what you have tried.
If you save the names to a list, you can, for example, use the csv module.
import csv
lst = ['Alabama', 'Alaska', 'Arizona']
# newline='' stops the csv module from writing extra blank lines on Windows
with open('filename.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(lst)
Output:
Alabama,Alaska,Arizona
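If you would rather have one state per row instead of all names on a single line, a variation along these lines might work. The header name 'state' and the temp-folder location are assumptions made for this sketch, not something from the thread; the file is read back at the end only to show what was written.

```python
import csv
import os
import tempfile

names = ['Alabama', 'Alaska', 'Arizona']

# Hypothetical location: the system temp folder, so the sketch runs anywhere.
csv_path = os.path.join(tempfile.gettempdir(), 'usstates.csv')

with open(csv_path, 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['state'])                   # header row (assumed column name)
    writer.writerows([name] for name in names)   # one single-column row per state

# Read the file back to check its contents.
with open(csv_path, newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))

print(rows)
```

Each row handed to writerows has to be a sequence, which is why every name is wrapped in its own one-element list.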
Posts: 8
Threads: 2
Joined: Oct 2019
OK. In order to read or write that file, I saved it as usstates.csv, but I am getting the error below. I think I have saved the file in the wrong location. In which directory do I need to save it? I am using the Python 3.7.4 shell.
Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    with open('usstates.csv', 'r') as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: 'usstates.csv'
Posts: 7,324
Threads: 123
Joined: Sep 2016
Oct-21-2019, 02:52 PM
(This post was last modified: Oct-21-2019, 02:53 PM by snippsat.)
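If you give no path to usstates.csv, Python looks for it in the current working directory, so when you run your .py file from its own folder, the CSV file needs to be in that same folder.
You can also give the full path to where the file is, e.g. r'C:\foo\usstates.csv'.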
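To see which folder a bare file name is looked up in, you can ask Python for its current working directory. This is only a small sketch; the variable names are illustrative, and it makes no assumption that usstates.csv actually exists there.

```python
from pathlib import Path

# The folder that relative paths such as 'usstates.csv' are resolved against.
cwd = Path.cwd()
print(cwd)

# The full path Python would actually try to open for the bare file name.
resolved = cwd / 'usstates.csv'
print(resolved)

# True only if the file is really there; check this before calling open().
found = resolved.exists()
print(found)
```

In the Python shell the working directory is often not the folder you expect, so printing it first usually explains a FileNotFoundError like the one above.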
Posts: 8
Threads: 2
Joined: Oct 2019
Oct-21-2019, 04:41 PM
(This post was last modified: Oct-21-2019, 04:42 PM by tahir1990.)
You mean in the folder where I installed Python?
Posts: 7,324
Threads: 123
Joined: Sep 2016
Oct-21-2019, 06:12 PM
(This post was last modified: Oct-21-2019, 06:12 PM by snippsat.)
(Oct-21-2019, 04:41 PM)tahir1990 Wrote: You mean in the folder where i installed python ?
No, where you have placed usstates.csv.
Example:
with open("usstates.csv", "r") as f:
    print(f.read())
C:\code
λ python my_file.py
Traceback (most recent call last):
  File "my_file.py", line 1, in <module>
    with open("usstates.csv", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'usstates.csv'
Here I try to open usstates.csv, but it's not in the C:\code folder. If it were in this folder, there would be no error.
If I give the correct path to where the file is placed, there is no error.
with open(r"C:\foo\usstates.csv", "r") as f:
    print(f.read())
C:\code
λ python my_file.py
hello
Posts: 8
Threads: 2
Joined: Oct 2019
I have saved the file in C:\Program Files and entered the code below. I am still getting an error.
with open(r"C:\programfiles\usstates.csv","r") as myfile:
    print(myfile.read())
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    with open(r"C:\programfiles\usstates.csv","r") as myfile:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\programfiles\\usstates.csv'
Posts: 8,168
Threads: 160
Joined: Sep 2016
Oct-22-2019, 08:03 AM
(This post was last modified: Oct-22-2019, 08:04 AM by buran.)
There is a space between Program and Files in C:\Program Files\, so the path in your code points to a folder that does not exist.
Why you would put your CSV file there is an entirely different, and legitimate, question...
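A quick way to see the difference is to compare the two paths piece by piece. The sketch below uses PureWindowsPath, which only manipulates the path text and never touches the disk, so it runs on any system; the variable names are illustrative.

```python
from pathlib import PureWindowsPath

typed = PureWindowsPath(r"C:\programfiles\usstates.csv")     # path from the failing code
actual = PureWindowsPath(r"C:\Program Files\usstates.csv")   # where the file was saved

# The folder component is where the two paths differ.
print(typed.parts[1])
print(actual.parts[1])

# Windows paths compare case-insensitively, but the missing space still matters.
paths_match = typed == actual
case_only_match = PureWindowsPath(r"c:\program files\USSTATES.CSV") == actual
print(paths_match, case_only_match)
```

So upper and lower case would not have caused the error here, but the missing space gives a different folder name altogether.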