Python Forum
Dynamically Naming Files - Printable Version


Dynamically Naming Files - ovidius - Jan-29-2020

Hello everybody,

I wrote this code, which takes two files, strips every non-digit character from each line so that only the phone numbers remain (lines left with fewer than 10 digits are dropped), removes duplicates, and then compares the two files to find the numbers they have in common.
The code is this:

import re
import csv

filename_list=[]
file1 = input("Please input file1:   ")
filename_list.append(file1)
file2 = input("Please input file2:  ")
filename_list.append(file2)
duplicate_list=[]

def clean_file(filename):
	with open (filename,'r') as f:
		list1=f.readlines()
		for ch in list1:
			result=re.sub('[^0-9]','',ch)
			with open(('{}_clean.csv').format(filename),'a+') as cl:
				if len(result)<10:
					result=result.strip()
				else:
					cl.write(result + '\n')

def clean_duplicates(filename):
	lines_seen = set()
	with open(('{}_clean_dup.csv').format(filename),'w') as rf:
		duplicate_list.append(rf.name)
		for line in open(('{}_clean.csv').format(filename),'r'):
			if line not in lines_seen:
				rf.write(line)
				lines_seen.add(line)

def find_common():
	comp_file1 = open(duplicate_list[0], "r")
	comp_file2 = open(duplicate_list[1], "r")
	result = open("results.csv", "a")
	list1 = comp_file1.readlines()
	list2 = comp_file2.readlines()
	for i in list1:
		for j in list2:
			if i==j:
				result.write(i)

	comp_file1.close()
	comp_file2.close()
	result.close()

for filename in filename_list:
	clean_file(filename)
	clean_duplicates(filename)

find_common()
So the code works, but I have a slight problem: the output files get names like filename.csv_clean.csv and filename.csv_clean_dup.csv.

I tried .rsplit and .rpartition to drop the .csv extension from the original filename, but it doesn't work.

Can anyone help?


RE: Dynamically Naming Files - buran - Jan-29-2020

Using os.path functions:
>>> import os
>>> os.path.split(r'c:\some_folder\some_file.csv')
('c:\\some_folder', 'some_file.csv')
>>> root, file = os.path.split(r'c:\some_folder\some_file.csv')
>>> os.path.join(root, 'new_file.csv')
'c:\\some_folder\\new_file.csv'
>>> root, ext = os.path.splitext(r'c:\some_folder\some_file.csv')
>>> root
'c:\\some_folder\\some_file'
>>> '_'.join((root, 'clean.csv'))
'c:\\some_folder\\some_file_clean.csv'
Or using the pathlib module:
>>> import pathlib
>>> p = pathlib.Path(r'c:\some_folder\some_file.csv')
>>> p.with_name('new_file.csv')
WindowsPath('c:/some_folder/new_file.csv')
>>> p.parent.joinpath(''.join((p.stem, '_clean.csv')))
WindowsPath('c:/some_folder/some_file_clean.csv')
>>>
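
As a minimal sketch of how the pathlib variant above could be wrapped up for reuse: the helper name add_suffix and the file name numbers.csv are hypothetical, not from the thread. The pure path classes are used only so that the example behaves the same on any operating system; in a real script pathlib.Path would be the usual choice.

```python
from pathlib import PurePosixPath, PureWindowsPath


def add_suffix(path, suffix):
    """Return `path` with `suffix` inserted between the stem and the extension.

    Hypothetical helper: 'some_file.csv' + '_clean' -> 'some_file_clean.csv'.
    The parent folder is left unchanged.
    """
    return path.with_name(path.stem + suffix + path.suffix)


# Works with a full Windows path...
print(add_suffix(PureWindowsPath(r'c:\some_folder\some_file.csv'), '_clean'))

# ...and with a bare file name in the current folder.
print(add_suffix(PurePosixPath('numbers.csv'), '_clean_dup'))
```

Both open() and the csv module accept real Path objects directly, so the result would not need converting back to a string.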



RE: Dynamically Naming Files - ovidius - Jan-29-2020

Buran, thank you for the swift reply. My problem is that I cannot incorporate this into my code. As you can see in the code I posted, the .csv extension has to be dropped from the original filename inside the with open statement:
 with open(('{}_clean.csv').format(filename),'a+') as cl:
Also, I only need the filename to change, not the path, because the script runs in the same folder as the two files. So what I need, if it's at all possible, is something that works with string formatting inside the .format(filename) call.
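
For illustration, here is one way the extension could be dropped directly inside the .format() call. This is a sketch, not necessarily what the posters ended up using; the file name numbers.csv is a made-up example. The last line shows the str.rsplit route, assuming the file name contains a dot. Note that rsplit returns a new list and leaves the original string untouched, so its result has to be used.

```python
import os

filename = 'numbers.csv'  # hypothetical input name, as typed at the prompt

# os.path.splitext returns a (root, extension) tuple;
# index 0 is everything before the last dot.
clean_name = '{}_clean.csv'.format(os.path.splitext(filename)[0])
dup_name = '{}_clean_dup.csv'.format(os.path.splitext(filename)[0])

# The same idea with str.rsplit: split once, from the right, on the dot.
alt_name = '{}_clean.csv'.format(filename.rsplit('.', 1)[0])

print(clean_name)
print(dup_name)
print(alt_name)
```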

Again thank you for your help and your time.


RE: Dynamically Naming Files - buran - Jan-29-2020

All of the examples I provided would do. The os.path functions will require the fewest changes to your code.
(Jan-29-2020, 02:28 PM)ovidius Wrote: something that works with string formatting inside the .format(filename)
You can construct the new file name inside the open() call using os.path functions, but for the sake of readability it's better to do it on a separate line, e.g. (with import os added to the imports) replace line 26
for line in open(('{}_clean.csv').format(filename),'r'):
with
new_name = ''.join((os.path.splitext(filename)[0], '_clean.csv'))
for line in open(new_name, 'r'):
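
The same change applies to the other places where the original script builds a name from filename (the open calls in clean_file and clean_duplicates). A minimal sketch of one way to avoid repeating the expression: output_name is a hypothetical helper, and this version of clean_duplicates returns the output path instead of appending it to duplicate_list, which is a deliberate simplification rather than the original behaviour. The demo at the bottom uses a temporary folder and made-up numbers.

```python
import os
import tempfile


def output_name(filename, suffix):
    """Hypothetical helper: drop the extension from `filename`, then add `suffix`."""
    root, _ext = os.path.splitext(filename)
    return root + suffix


def clean_duplicates(filename):
    """Copy <name>_clean.csv to <name>_clean_dup.csv, keeping each line once."""
    lines_seen = set()
    out_path = output_name(filename, '_clean_dup.csv')
    with open(output_name(filename, '_clean.csv'), 'r') as src, \
            open(out_path, 'w') as dst:
        for line in src:
            if line not in lines_seen:
                dst.write(line)
                lines_seen.add(line)
    return out_path


# Demo with made-up data in a temporary folder.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'numbers.csv')
with open(output_name(source, '_clean.csv'), 'w') as f:
    f.write('0123456789\n0123456789\n9876543210\n')

result_path = clean_duplicates(source)
with open(result_path) as f:
    deduped = f.read()
print(os.path.basename(result_path))
```

Opening both files in a single with statement also ensures the input file is closed, which the bare open() in the for loop does not guarantee.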