I scraped a list of links to topo maps of MI.
The list only has the second half of each link.
The first part of the link, "http://www.dnr.state.mi.us", is not in the list.
It has to be added to each link.
The list looks like this:
links
/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf
/spatialdatalibrary/pdf_maps/topomaps/Adams_Park.pdf
/spatialdatalibrary/pdf_maps/topomaps/Adams_Point.pdf
/spatialdatalibrary/pdf_maps/topomaps/Adamsville.pdf
/spatialdatalibrary/pdf_maps/topomaps/Addis_Creek.pdf
/spatialdatalibrary/pdf_maps/topomaps/Addison.pdf
/spatialdatalibrary/pdf_maps/topomaps/Adrian.pdf
I need each link to look like this, with the first part added to the front:
http://www.dnr.state.mi.us/spatialdatali.../Adair.pdf
I am using Windows 7 and Python 3.7.
I hope there is a way of doing this.
Thank you
Renny
Hello,
I've forgotten how to read lines from a txt file, but if you copy and paste the list into a csv you can then do:
import csv
from csv import reader  # import the modules

list1 = []  # will hold one link per line of the file
with open('myfile.csv', 'r') as data:  # open read-only so you don't lose any data if it goes wrong
    data_read = reader(data)  # prep the reader object
    for row in data_read:  # runs once for every line in the file
        if row:  # skip empty lines
            list1.append(row[0])  # each row is a list; keep its single column as a string
This will save all of the lines to a list, which you can then write to another csv or output in the shell. I'll put both below:
Adding the prefix:
list2 = []  # will hold the full links
for x in range(len(list1)):
    list2.append("http://www.dnr.state.mi.us" + list1[x])  # puts the prefix before every item in the first list and saves to a second list
To another csv:
with open('secondcsv.csv', "w", newline='') as csv2:  # creates the second CSV
    csvWriter = csv.writer(csv2, delimiter=',')  # writer that separates columns with commas
    for x in range(len(list2)):  # will run as many times as there are items in the list
        csvWriter.writerow([list2[x]])  # wrap in a list so the whole link goes into one cell
To output in the shell:
for x in range(len(list1)):
    print("http://www.dnr.state.mi.us" + list1[x])
# or you can just do the below if you created the second list
for x in range(len(list2)):
    print(list2[x])
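One detail worth checking when using the csv module for this: csv.reader hands back every row as a list of strings, even when the file has only one column, so the string has to be pulled out of the row before the prefix can be added. Here is a minimal sketch of that, assuming a one-column file; the io.StringIO object and the names PREFIX, sample, rows and full_links are only stand-ins for illustration and are not from the posts above.

```python
import csv
import io

PREFIX = "http://www.dnr.state.mi.us"

# Stand-in for a one-column CSV file (sample data for illustration only).
sample = io.StringIO(
    "/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf\n"
    "/spatialdatalibrary/pdf_maps/topomaps/Adams_Park.pdf\n"
)

# Each row comes back as a list holding one string, not as a bare string.
rows = list(csv.reader(sample))

# row[0] is the string inside the row; "if row" skips any empty lines.
full_links = [PREFIX + row[0] for row in rows if row]

for link in full_links:
    print(link)
```

Adding the prefix to the row itself (a list) instead of row[0] raises a TypeError, because a str and a list cannot be concatenated.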
Do you not have some command line tools like sed that can help you do this easily, or at least a text editor that can?
You should give it a try, Blue Dog; that's how it works here, and it's not a hard task.
Good effort, jamesaarr, but there is some unnecessary stuff; have a look at
Never use "for i in range(len(sequence))".
You can do it like this: iterate directly over the file object and then use an f-string for the string concatenation.
url = 'http://www.dnr.state.mi.us'
with open('url_refs.txt') as f:
    for line in f:
        print(f'{url}{line.strip()}')
Output:
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adams_Park.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adams_Point.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adamsville.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Addis_Creek.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Addison.pdf
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adrian.pdf
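If the full links should end up in a file rather than on screen, the same loop can write them out. This is only a sketch under some assumptions: the function name add_prefix, the output file name full_urls.txt and the temporary folder are made up for the demo, and the sample input is written by the script itself so it runs anywhere.

```python
import tempfile
from pathlib import Path

url = 'http://www.dnr.state.mi.us'


def add_prefix(in_path, out_path, prefix=url):
    """Write prefix + each non-empty line of in_path to out_path; return the count."""
    count = 0
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:              # iterate directly over the file object
            path = line.strip()       # drop the trailing newline and any stray spaces
            if not path:              # skip blank lines
                continue
            dst.write(f'{prefix}{path}\n')
            count += 1
    return count


# Demo: build a small input file in a temporary folder (hypothetical sample data).
work_dir = Path(tempfile.mkdtemp())
in_file = work_dir / 'url_refs.txt'
out_file = work_dir / 'full_urls.txt'
in_file.write_text(
    '/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf\n'
    '\n'
    '/spatialdatalibrary/pdf_maps/topomaps/Adrian.pdf\n'
)

written = add_prefix(in_file, out_file)
print(written, 'links written to', out_file)
```

For real use, the two paths passed to add_prefix would be the scraped list and whatever output file is wanted.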
snippsat, that worked perfectly. I looked up "line.strip()" and I still don't know how it works, so I am going to have to play around with it. Thank you very much. Thank you too, jamesaarr and ndc85430, for your help.

(Sep-23-2021, 05:47 PM)Blue Dog Wrote: snippsat, that worked perfect. I looked up " line.strip()" and I still don't know how it works
One very useful thing in Python is to take a piece of code out and test it interactively.
>>> # When reading from a file, each line ends with a newline (\n)
>>> url = 'http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf\n'
>>> url
'http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf\n'
>>> print(url)
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf
>>> # strip() takes away the \n
>>> url = url.strip()
>>> url
'http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf'
>>> print(url)
http://www.dnr.state.mi.us/spatialdatalibrary/pdf_maps/topomaps/Adair.pdf
>>>
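To go one step further than the session above: with no arguments, strip() removes whitespace (spaces, tabs, newlines) from both ends of the string and leaves the middle alone, while rstrip('\n') removes only newlines at the right end. A small sketch to play with; the sample strings and variable names are made up for illustration.

```python
# A line as it might come from a file, with stray spaces added on purpose.
line = '  /spatialdatalibrary/pdf_maps/topomaps/Adair.pdf \n'

stripped = line.strip()          # whitespace removed from both ends
right_only = line.rstrip('\n')   # only the trailing newline removed
middle_kept = 'Adams Park\n'.strip()  # whitespace inside the string is untouched

print(repr(stripped))
print(repr(right_only))
print(repr(middle_kept))
```

Using repr() in the print calls shows the quotes and any leftover \n or spaces, which makes the difference easy to see.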