Python Forum

Full Version: Extracting url from a csv
Hello, I would like to extract URLs from a CSV file. The idea is to iterate over the column containing the links and download the content of each one into a folder as .txt files.
I tried a Python library called "newspaper", but it doesn't seem to work properly. I think BS4 might be a better fit, but I couldn't get it working.
This is the code I used to extract the content of those urls:

# download every article listed in the url column and save its text
from newspaper import Article
import pandas as pd

data = pd.read_csv('/Users/alexfrandsen14/Desktop/Projects/newspaper3k-scraper/candidate_coverage.csv')

for x in data['url_column_name']:  # replace 'url_column_name' with the actual name in your df
    article = Article(x, language='en')  # x is the url in each row of the column
    article.download()
    article.parse()
    # open a .txt file named after the title of the article (could be long)
    with open(article.title + '.txt', 'w') as f:
        f.write(article.text)

Apparently, Python doesn't detect the "newspaper" module.
Any ideas?
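One thing that may be worth checking for the import error: on Python 3 the package is published on PyPI as newspaper3k, while the module you import is still called newspaper. The sketch below only reports which interpreter is running and whether it can find a given module; the helper name is_installed is made up for illustration.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is running; pip may have installed into a different one.
    print("interpreter:", sys.executable)
    if is_installed("newspaper"):
        print("newspaper is importable")
    else:
        print("newspaper not found; try: python -m pip install newspaper3k")
```

Running pip as "python -m pip" with the same interpreter that runs the script is a common way to avoid installing into the wrong environment.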
I'm also attaching the CSV I want to extract the URLs from.
Greetings
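For the CSV part on its own, here is a minimal sketch that uses only the standard library, so it does not depend on newspaper or BS4 being installed. It assumes the CSV has a header row; the function names, the column name passed in, and the output folder are all hypothetical. Note that it saves the raw page source as text, so pulling out just the article body would still need newspaper or BeautifulSoup as a further step.

```python
import csv
import re
from pathlib import Path
from urllib.request import urlopen


def read_urls(csv_path, column):
    """Return the non-empty values found in `column` of the CSV file."""
    urls = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            url = (row.get(column) or "").strip()
            if url:
                urls.append(url)
    return urls


def safe_filename(url, max_length=80):
    """Turn a URL into a file name that is safe on most filesystems."""
    name = re.sub(r"^https?://", "", url)
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return (name[:max_length] or "page") + ".txt"


def download_all(csv_path, column, out_dir):
    """Fetch every URL in `column` and write the raw response to `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in read_urls(csv_path, column):
        try:
            with urlopen(url, timeout=10) as response:
                raw = response.read()
        except (OSError, ValueError) as err:
            # Network errors, HTTP errors and malformed URLs end up here.
            print(f"skipped {url}: {err}")
            continue
        text = raw.decode("utf-8", errors="replace")
        (out / safe_filename(url)).write_text(text, encoding="utf-8")
```

A call would look like download_all('candidate_coverage.csv', 'url_column_name', 'articles'), with the column name replaced by the real one. Naming files after the URL instead of the article title avoids characters such as "/" that cannot appear in a file name.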

such a thing