Python Forum

I am trying to replace commas with a "^" in a string that is already quoted. Below you can see the input and the desired output. However, my code does not make the replacement.

Any help would be appreciated.

Input:

Id,Category,Description,Date
1,Test,Red Cars,02/12/2017
2,Test,Blue Cars,03/01/2017
3,Test,"Green, big cars",01/05/2016

Output should be:

Id,Category,Description,Date
1,Test,Red Cars,02/12/2017
2,Test,Blue Cars,03/01/2017
3,Test,"Green^ big cars",01/05/2016

import csv

ifile = open('C:/Users/jpilon/Documents/test.csv', 'r')
reader = csv.reader(ifile,delimiter=',')
ofile = open('C:/Users/jpilon/Documents/test_new.csv', 'w')
writer = csv.writer(ofile, delimiter=',')


findlist = ['"*,*"']
replacelist = ['"*^*"']

rep = dict(zip(findlist, replacelist))

def findReplace(find, replace):
   s = ifile.read()
   s = s.replace(find, replace)
   ofile.write(s)

for item in findlist:
   findReplace(item, rep[item])

ifile.close()
ofile.close()

Could just be me, but your input looks exactly like your output.

You never use the reader that the csv module creates.
In line 15 (s = ifile.read()) you read the whole file in as a single string.

So read the file in as a nested list; then you can replace values and keep the CSV structure.

import csv

with open('in.csv') as f:
    reader = csv.reader(f, delimiter=',')
    cars_info = [i for i in reader]

Test:

>>> cars_info
[['Id', 'Category', 'Description', 'Date'],
 ['1', 'Test', 'Red Cars', '02/12/2017'],
 ['2', 'Test', 'Blue Cars', '03/01/2017'],
 ['3', 'Test', 'Green, big cars', '01/05/2016']]
>>> cars_info[3][2]
'Green, big cars'

>>> cars_info[3][2] = "Green^ big cars"
>>> cars_info
[['Id', 'Category', 'Description', 'Date'],
 ['1', 'Test', 'Red Cars', '02/12/2017'],
 ['2', 'Test', 'Blue Cars', '03/01/2017'],
 ['3', 'Test', 'Green^ big cars', '01/05/2016']]
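To round that off, the edited nested list can be written back out with csv.writer. This is only a minimal sketch: io.StringIO stands in for real files so it runs anywhere, the sample data is the table from the first post, and the names source, out and result are made up for the example. Note that once the comma is gone, the writer's default quoting (csv.QUOTE_MINIMAL) no longer puts quotes around the field.

```python
import csv
import io

# Sample data from the first post; StringIO stands in for a real file.
source = io.StringIO(
    'Id,Category,Description,Date\n'
    '1,Test,Red Cars,02/12/2017\n'
    '2,Test,Blue Cars,03/01/2017\n'
    '3,Test,"Green, big cars",01/05/2016\n'
)

# Read every row into a nested list, replacing commas inside each field.
cars_info = [[col.replace(',', '^') for col in row] for row in csv.reader(source)]

# Write the rows back out; by default csv.writer only quotes fields that need it.
# lineterminator='\n' is set here only to keep the demo output simple.
out = io.StringIO(newline='')
csv.writer(out, lineterminator='\n').writerows(cars_info)
result = out.getvalue()
print(result)
```

If the quotes have to stay, passing quoting=csv.QUOTE_ALL to csv.writer quotes every field. That is not exactly the output asked for, but it is still valid CSV.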

Thanks snippsat. I am new to Python, so forgive my ignorance. I understand what you are doing; where I am lost is how to apply it as a global find and replace in my code.

If I were certain that only column 3 could potentially have the double quotes, how would I replace the commas in every field that met that criterion?

Using the csv module is probably the way to go, since it'll handle the quotes for you. But as usual, a regular expression also works:

>>> text = '''
... Id,Category,Description,Date
... 1,Test,Red Cars,02/12/2017
... 2,Test,Blue Cars,03/01/2017
... 3,Test,"Green, big cars",01/05/2016
... '''
>>> import re
>>> regex = re.compile(r'("[^",]*),([^",]*")')
>>> print(regex.sub(r'\1^\2', text))

Id,Category,Description,Date
1,Test,Red Cars,02/12/2017
2,Test,Blue Cars,03/01/2017
3,Test,"Green^ big cars",01/05/2016

The regular expression seemed to work on my example data, but when I tried a larger file with more columns, it did not replace the comma within the quotes.

Was the regular expression somehow pointing to column 2 only?

import re

with open('file.csv') as f:
    s = f.read() + '\n'  # add trailing new line character

regex = re.compile(r'("[^",]*),([^",]*")')

s1 = regex.sub(r'\1^\2', s)

print(s1)

# Note: this overwrites the input file with the modified text.
with open('file.csv', 'w') as f:
    f.write(s1)
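The pattern above is not tied to a particular column. One likely reason it misses fields in a larger file is that it allows exactly one comma between the quotes, so a quoted field with two or more commas never matches. It can also match the separator comma between two adjacent quoted fields. Here is a small sketch of both effects, plus one possible alternative that matches each whole quoted field and replaces the commas in a callback. The sample strings and the names one_comma and quoted_field are made up for illustration, and the alternative assumes every double quote in the file opens or closes a quoted field.

```python
import re

line = '3,Test,"Green, big, fast cars",01/05/2016'

# Pattern from earlier in the thread: exactly one comma allowed between the quotes.
one_comma = re.compile(r'("[^",]*),([^",]*")')
unchanged = one_comma.sub(r'\1^\2', line)        # two commas inside: no match
adjacent = one_comma.sub(r'\1^\2', '"a","b"')    # separator comma gets replaced

# Alternative: match each whole quoted field, then replace every comma inside it.
quoted_field = re.compile(r'"[^"]*"')
fixed = quoted_field.sub(lambda m: m.group(0).replace(',', '^'), line)

print(unchanged)
print(adjacent)
print(fixed)
```

The csv module is still the safer choice, since it already knows the quoting rules.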

Just use the csv module:

import csv

with open('/tmp/input.csv', 'r') as in_file:
    data = csv.reader(in_file, delimiter=',')
    for row in data:
        print([col.replace(',', '^') for col in row])

Output:
['Id', 'Category', 'Description', 'Date']
['1', 'Test', 'Red Cars', '02/12/2017']
['2', 'Test', 'Blue Cars', '03/01/2017']
['3', 'Test', 'Green^ big cars', '01/05/2016']

I did finally figure out how to read the CSV and write the new values to a CSV using my test file. Thanks for all your help!

Now when I try this on a 1 GB CSV file, I run into a memory error. I know there are ways to do this in chunks, but that should be a question in a new thread.

import csv

new_rows_list = []

# Read File
f1 = open('in_file', 'r')
reader = csv.reader(f1, delimiter=',')
for row in reader:
    new_row = ([col.replace(',', '^') for col in row])
    new_rows_list.append(new_row)


# Write File
f2 = open('out_file', 'w')
writer = csv.writer(f2)
writer.writerows(new_rows_list)
f2.close()
f1.close()

Don't store the whole file in memory, just work on it line-by-line:

import csv

with open("in_file", "r", newline="") as f1:
    reader = csv.reader(f1, delimiter=",")
    with open("out_file", "w", newline="") as f2:
        writer = csv.writer(f2)
        for row in reader:
            new_row = [col.replace(",", "^") for col in row]
            writer.writerow(new_row)

(Apr-26-2017, 05:08 AM) nilamo wrote: Don't store the whole file in memory, just work on it line-by-line:

Yep, better.
You can also write it like this; one with statement is enough.

import csv

with open("in.csv", newline="") as f1, open("out.csv", "w", newline="") as f2:
    reader = csv.reader(f1, delimiter=",")
    writer = csv.writer(f2)
    for row in reader:
        new_row = [col.replace(",", "^") for col in row]
        writer.writerow(new_row)
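Coming back to the earlier question about column 3: the same row-by-row loop can be limited to a single column, so commas in other quoted fields are left alone. A rough sketch, not the only way to do it. The function name replace_in_column and its parameters are invented for this example, io.StringIO stands in for the real files, and the "Test, more" value is made-up sample data.

```python
import csv
import io

def replace_in_column(src, dst, column, old=',', new='^'):
    """Copy CSV rows from src to dst, replacing old with new in one column only."""
    # lineterminator='\n' is set here only to keep the demo output simple.
    writer = csv.writer(dst, lineterminator='\n')
    for row in csv.reader(src):
        # Rows that do not have the target column are written out unchanged.
        if column < len(row):
            row[column] = row[column].replace(old, new)
        writer.writerow(row)

src = io.StringIO('Id,Category,Description,Date\n'
                  '3,"Test, more","Green, big cars",01/05/2016\n')
dst = io.StringIO(newline='')
replace_in_column(src, dst, column=2)   # column index 2 is the third column
result = dst.getvalue()
print(result)
```

With real files, pass the two file objects opened with newline="" as in the code above. Only one row is held in memory at a time, so this should cope with large files as well.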

Posts, in order: jmpatx, sparkz_alot, snippsat, jmpatx, nilamo, jmpatx, wavic, jmpatx, nilamo, snippsat