Python Forum
Dynamically Naming Files
#1
Hello Everybody

I wrote this code, which takes two files, strips every non-digit character from each line so that only the phone numbers remain, removes any duplicates, and then compares the files to find the numbers they have in common.
The code is this:

import re
import csv

filename_list=[]
file1 = input("Please input file1:   ")
filename_list.append(file1)
file2 = input("Please input file2:  ")
filename_list.append(file2)
duplicate_list=[]

def clean_file(filename):
	with open (filename,'r') as f:
		list1=f.readlines()
		for ch in list1:
			result=re.sub('[^0-9]','',ch)
			with open(('{}_clean.csv').format(filename),'a+') as cl:
				if len(result)<10:
					result=result.strip()
				else:
					cl.write(result + '\n')

def clean_duplicates(filename):
	lines_seen = set()
	with open(('{}_clean_dup.csv').format(filename),'w') as rf:
		duplicate_list.append(rf.name)
		for line in open(('{}_clean.csv').format(filename),'r'):
			if line not in lines_seen:
				rf.write(line)
				lines_seen.add(line)

def find_common():
	comp_file1 = open(duplicate_list[0], "r")
	comp_file2 = open(duplicate_list[1], "r")
	result = open("results.csv", "a")
	list1 = comp_file1.readlines()
	list2 = comp_file2.readlines()
	for i in list1:
		for j in list2:
			if i==j:
				result.write(i)

	comp_file1.close()
	comp_file2.close()
	result.close()

for filename in filename_list:
	clean_file(filename)
	clean_duplicates(filename)

find_common()
So the code works, but I have a slight problem: the output files get names like filename.csv_clean.csv and filename.csv_clean_dup.csv.

I tried .rsplit and .rpartition to drop the .csv extension from the initial filename, but it didn't work.

Can anyone help?
#2
Using os module functions:
>>> import os
>>> os.path.split(r'c:\some_folder\some_file.csv')
('c:\\some_folder', 'some_file.csv')
>>> root, file = os.path.split(r'c:\some_folder\some_file.csv')
>>> os.path.join(root, 'new_file.csv')
'c:\\some_folder\\new_file.csv'
>>> root, ext = os.path.splitext(r'c:\some_folder\some_file.csv')
>>> root
'c:\\some_folder\\some_file'
>>> '_'.join((root, 'clean.csv'))
'c:\\some_folder\\some_file_clean.csv'
Or using the pathlib module:
>>> import pathlib
>>> p = pathlib.Path(r'c:\some_folder\some_file.csv')
>>> p.with_name('new_file.csv')
WindowsPath('c:/some_folder/new_file.csv')
>>> p.parent.joinpath(''.join((p.stem, '_clean.csv')))
WindowsPath('c:/some_folder/some_file_clean.csv')
>>>
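The two approaches shown in the interactive session can be wrapped in small helpers. This is only a minimal sketch, not part of the original reply: the function names clean_name_os and clean_name_pathlib and the default suffix are assumptions chosen for illustration. It uses a bare filename so that the result is the same on Windows and on other platforms.

```python
import os
from pathlib import Path


def clean_name_os(filename, suffix="_clean.csv"):
    """Drop the last extension with os.path.splitext, then append suffix.

    Hypothetical helper, named here for illustration only.
    """
    root, _ext = os.path.splitext(filename)
    return root + suffix


def clean_name_pathlib(filename, suffix="_clean.csv"):
    """Same idea with pathlib: keep the parent folder, rebuild the name from the stem.

    Hypothetical helper; returns a Path object rather than a str.
    """
    p = Path(filename)
    return p.with_name(p.stem + suffix)


print(clean_name_os("numbers.csv"))       # numbers_clean.csv
print(clean_name_pathlib("numbers.csv"))  # numbers_clean.csv
```

Both helpers only remove the final extension, so a name without a dot is returned with the suffix simply appended.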
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#3
(Jan-29-2020, 02:06 PM)buran Wrote: using os module functions
Buran, thank you for the swift reply. My problem is that I cannot work this into my code. If you look at the code I posted, the .csv extension has to be dropped from the initial filename inside the with open statement:
 with open(('{}_clean.csv').format(filename),'a+') as cl:
Also, I only need the filename to change, not the path, because the script will run in the same folder as the two files. So what I need, if it's at all possible, is something that works with string formatting inside the .format(filename).

Again thank you for your help and your time.
#4
All of the examples I have provided would do. Obviously, the os.path functions will require fewer changes to your code.
(Jan-29-2020, 02:28 PM)ovidius Wrote: something that works with string formatting inside the .format(filename)
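You can construct the new file name inside the open() call using os.path functions, but for the sake of readability it's better to do it on a separate line. For example, after adding import os at the top of the script, replace line 26 of your code (the for loop in clean_duplicates)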
for line in open(('{}_clean.csv').format(filename),'r'):
with
new_name = ''.join((os.path.splitext(filename)[0], '_clean.csv'))
for line in open(new_name, 'r'):
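The same change applies to the other two '{}_...'.format(filename) calls in the script, so it may be convenient to build every derived name in one place. The following is a sketch under assumptions rather than the poster's code: derived_name is a hypothetical helper, and this version of clean_duplicates returns the output name instead of appending it to the global duplicate_list.

```python
import os


def derived_name(filename, suffix):
    """Return filename with its last extension replaced by suffix (hypothetical helper)."""
    return os.path.splitext(filename)[0] + suffix


def clean_duplicates(filename):
    """Copy <stem>_clean.csv to <stem>_clean_dup.csv, keeping the first copy of each line."""
    source = derived_name(filename, "_clean.csv")
    target = derived_name(filename, "_clean_dup.csv")
    lines_seen = set()
    # Both files are opened in one with statement so they are closed reliably.
    with open(source, "r") as src, open(target, "w") as dst:
        for line in src:
            if line not in lines_seen:
                dst.write(line)
                lines_seen.add(line)
    return target
```

If the name has to be built inside the .format() call itself, as requested earlier in the thread, the equivalent inline expression is '{}_clean.csv'.format(os.path.splitext(filename)[0]).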